This course is a modular immersion series for people interested in using Python for data-related tasks. It consists of five initial sections that take the learner from zero knowledge to a basic, well-rounded understanding of data analysis with Python. More sections are planned and will be added to the syllabus and GitHub repository as soon as they are reviewed.
The 5 initial sections are:
- Introduction to Python Programming Language - Part 1
- This is intended for newcomers to Python who have little or no knowledge of how to get started. The course and the learning objectives will be reviewed, special considerations for learners will be discussed, and time will be given for questions.
- Introduction to Python Programming Language - Part 2
- For those who are new or need a refresher, this section is meant to expose learners to the Python programming language. Understanding the language is important for applying it in later sections.
- Data Collection
- In this section, we will focus on how to gather and process data for analysis. This will cover a wide range of data formats and ETL techniques.
- Data Analysis
- Once the data has been processed and formatted, this section covers specialized libraries and techniques for data analysis with Python, including basic statistical learning concepts and processes for basic manipulation and analysis.
- Data Visualization
- The final section will cover topics on visualizing your data. This will cover useful libraries that can display data in visualizations ranging from simple charts to complex maps.
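As a rough illustration of how the last three sections fit together, here is a minimal sketch that walks a tiny dataset through collection, processing, analysis, and a plain-text visualization. It assumes only the Python standard library; the sample data, the column names, and the function names (`collect`, `clean`, `analyze`, `visualize`) are hypothetical and are not taken from the course materials, which are expected to use specialized libraries instead.

```python
import csv
import io
import statistics

# Hypothetical sample data standing in for a real file or API response.
RAW_CSV = """city,temperature_f
Lexington,71
Louisville,75
Frankfort,
Bowling Green,78
"""


def collect(text):
    """Data collection: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Processing: drop rows with a missing temperature, convert the rest to float."""
    return [
        {"city": row["city"], "temperature_f": float(row["temperature_f"])}
        for row in rows
        if row["temperature_f"]
    ]


def analyze(rows):
    """Analysis: compute a few simple summary statistics."""
    temps = [row["temperature_f"] for row in rows]
    return {"count": len(temps), "mean": statistics.mean(temps), "max": max(temps)}


def visualize(rows):
    """Visualization: build a plain-text bar chart, one '#' per ten degrees."""
    lines = []
    for row in rows:
        bar = "#" * round(row["temperature_f"] / 10)
        lines.append("{:<14}{}".format(row["city"], bar))
    return "\n".join(lines)


if __name__ == "__main__":
    cleaned = clean(collect(RAW_CSV))
    print(analyze(cleaned))
    print(visualize(cleaned))
```

Keeping each stage in its own function mirrors the section boundaries, so a later section can swap in a richer tool for one stage without touching the others.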
The readings for each section are suggestions; learners are encouraged to find and use alternative materials to accomplish the learning goals for each section. Because some of the texts can be expensive, free online tutorials and blog posts are more than fine for learning and adapting Python to data analysis tasks.
Everyone will work through the materials and help each other in understanding the exercises and challenges over the course of each month. This is an informal learning experience in which students are encouraged to contribute and change course materials as the course progresses.
- Gain a working understanding of the Python programming language
- Gain a working knowledge of how to collect and process data in Python
- Gain a good understanding of how to perform data manipulation and analysis in Python
- Gain a good understanding of how to visualize your analytic results using Python
The immersion course learning group will meet based upon a schedule established during the first meeting. Each meeting, we will:
- Discuss the previous month’s readings
- Share experiences with the materials
- Discuss specific areas where learners found it challenging to use Python or to apply concepts to specific data science tasks
The Bluegrass Developers Guild Slack will be the common online place for learners to discuss the immersion course content. Our in-person meeting location will be decided no less than one week prior to each event (usually this will be known at least a month in advance). All in-person events will be scheduled and posted via the Bluegrass Data Science Group Meetup site: https://www.meetup.com/Bluegrass-Data-Science-Group/
Please go here to join the Bluegrass Developers Guild Slack if you have not already been invited (invitations will be validated ASAP): https://www.bluegrassdevs.org/
Section 1
- Introduction, Setup, Hello World
- Course Outline Expectations
- Operating Systems
- Virtual Environments
- Python 3.X
- Anaconda
- IDEs
- Jupyter Lab/Notebooks
- Spyder
- PyCharm
- Text editors
- VCS
- GitHub
- GitLab
- BitBucket
- Installing
- Virtual Environments
- Resources
- Q&A
Section 2
- Introduction to Python Programming Language
- Basics
- Flow Control
- Functions
- Data Structures
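To give a feel for the Section 2 topics, the following minimal sketch touches on flow control, functions, and data structures. The function names and sample values are hypothetical illustrations, not exercises from the course notebooks.

```python
def describe_number(n):
    """Flow control: choose a label with if / elif / else."""
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    return "positive"


def total_even(numbers):
    """Flow control: loop over a list and add up only the even values."""
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n
    return total


def word_lengths(words):
    """Data structures: build a dictionary mapping each word to its length."""
    return {word: len(word) for word in words}


if __name__ == "__main__":
    print(describe_number(-3))
    print(total_even([1, 2, 3, 4]))
    print(word_lengths(["python", "data"]))
```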
Section 3
- Data Collection
Section 4
- Data Analysis
Section 5
- Data Visualization
Section 1
- Setup
- Review setup instructions for your OS
- Install Python 3.5 or higher
- Install an IDE/text editor of your choice
- Set up a virtual environment
- Create a GitHub account and complete basic tutorials
- Obtain any learning materials you need for learning Python
- Join the Bluegrass Developers Guild Slack
- Beginning with Python
- Create a “Hello World” program in Python
- Prepare for Section 2 (readings)
- Think about a Project
- Create or join a project where you can apply your skills
- Find something at work to automate
- Find a school project
- Find a community organization that may need help
- Find others who may want to work on a project
- Start planning on how to move forward
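For the Section 1 setup and "Hello World" tasks, a first script might look like the minimal sketch below. It prints a greeting, checks the interpreter against the Python 3.5 minimum listed above, and reports whether a virtual environment appears to be active. The helper names are hypothetical, and the environment check is an assumption that holds for environments created with the standard `venv` module; it may not detect other tools such as conda environments.

```python
import sys

# Minimum version taken from the setup list above.
MINIMUM_VERSION = (3, 5)


def greeting(name="World"):
    """Return the classic greeting; str.format keeps this runnable on Python 3.5."""
    return "Hello, {}!".format(name)


def python_is_supported(version_info=sys.version_info):
    """Compare the (major, minor) version pair against the course minimum."""
    return tuple(version_info[:2]) >= MINIMUM_VERSION


def in_virtual_environment():
    """Inside a venv, sys.prefix points at the environment and differs from sys.base_prefix."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print(greeting())
    print("Python version OK:", python_is_supported())
    print("Virtual environment active:", in_virtual_environment())
```

Saving this as a file (for example `hello.py`, a hypothetical name) and running it with `python hello.py` from the activated environment is a quick way to confirm the setup steps worked.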
Section 2
- Python Basics
- Do the exercises in the Jupyter Notebook
- Start reading Python for Data Analysis by Wes McKinney
- See GitHub for Notebooks
Section 3
- Data Collection
- Python for Data Analysis by Wes McKinney
- Chapter 6
- Automate the Boring Stuff
- Chapters 11-14
Section 4
- Data Analysis
- An Introduction to Statistics with Python by Thomas Haslwanter
- Clone the repo into a virtual environment
- Run through all the ipynb_slides
- Run through the ISP and ipynb code samples
- Python for Data Analysis by Wes McKinney
- Chapters 4, 5, 7, 10
- Eventually, read the whole thing (believe me, it’s worth it)
- Work through as much of the code as possible
Section 5
- Data Visualization
- Python for Data Analysis by Wes McKinney
- Chapters 8, 9