Python Data Analysis Training Course
Learn the fundamentals of Data Analysis with Python, and how to apply it practically

Course Overview
Duration
- 10 Half-day lectures or 5 Full Days
What do I need?
- Webinar : A laptop, and a stable internet connection. The recommended minimum speed is around 10 Mbps.
- Classroom Training : A laptop. Please notify us if you are not bringing your own. See the calendar below for the schedule.
Certification
- Attendance : If you have attended 80% of the sessions and completed all the class work, you qualify for the Attendance Certificate. (Course Price : R14 995 based on a minimum of 4 students)
- Competency : Only offered as part of a Coding Bootcamp
Pre-requisites
- You should be at the level of our Python Training Course before attempting this course.
Delivery
- In-Person (Woodmead Classroom)
- Remote (Discord Webinar)
Price
- R14 500
Who will benefit
- Students who have just mastered Python and want to explore the area of Data Analysis.
- Managers and statisticians who want to learn how to use data productively.
What you will learn
- Understand the Python for Data Analysis ecosystem, communities, tools and libraries
- Understand IPython and Jupyter programming environments
- An Overview of the Python Language
- How to use NumPy and array-oriented computing in Python.
- How to read and write datasets with Pandas.
- How to visualise data with Pandas
- How to get access to data
- How to clean and prep data
- How to do Data Wrangling (Join, Combine, Reshape)
- How to visualise data with Pandas, Matplotlib and Seaborn
- How to do Data Aggregation and Group Operations
- How to use different types of analysis and data transformation tools to work with Time series
- How to use modelling toolkits like statsmodels and scikit-learn
- How to do a real-world data analysis project
Course Details
Unit 1
Preliminaries
- Course Overview
- Why Python for Data Analysis?
- Essential Python Libraries
Python Language Basics, IPython, and Jupyter Notebooks
- The Python Interpreter
- IPython Basics
- Python Language Basics
Built-In Data Structures, Functions, and Files
- Data Structures and Sequences
- Functions
- Files and the Operating System
Unit 2
NumPy Basics: Arrays and Vectorized Computation
- The NumPy ndarray: A Multidimensional Array Object
- Pseudorandom Number Generation
- Universal Functions: Fast Element-Wise Array Functions
- Array-Oriented Programming with Arrays
- File Input and Output with Arrays
- Linear Algebra
- Random Walks
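To give a flavour of the NumPy topics above, here is a minimal sketch of a random walk built with vectorised operations instead of a Python loop. The function name `random_walk`, the seed and the step count are illustrative assumptions, not course material.

```python
import numpy as np


def random_walk(n_steps, seed=0):
    """Return the positions of a simple +1/-1 random walk (illustrative helper)."""
    rng = np.random.default_rng(seed)          # seeded pseudorandom generator
    steps = rng.choice([-1, 1], size=n_steps)  # draw every step in one call
    return steps.cumsum()                      # running total gives the position


walk = random_walk(1000)
print("lowest position:", walk.min())
print("highest position:", walk.max())
```

Drawing all the steps at once and taking a cumulative sum is the array-oriented style this unit works towards.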
Getting Started with Pandas
- Introduction to pandas Data Structures
- Essential Functionality
- Summarizing and Computing Descriptive Statistics
Data Loading, Storage, and File Formats
- Reading and Writing Data in Text Format
- Binary Data Formats
- Interacting with Web APIs
- Interacting with Databases
Unit 3
Data Cleaning and Preparation
- Handling Missing Data
- Data Transformation
- Extension Data Types
- String Manipulation
- Categorical Data
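As a rough illustration of the cleaning topics above, the sketch below drops rows with a missing key, tidies string values and fills a missing number. The table and its column names (`city`, `temp`) are made up for this example.

```python
import numpy as np
import pandas as pd

# Made-up raw data with untidy strings and missing values
raw = pd.DataFrame({
    "city": [" cape town", "Durban ", None],
    "temp": [21.5, np.nan, 19.0],
})

# Handling missing data: drop rows that have no city at all
clean = raw.dropna(subset=["city"]).copy()

# String manipulation: trim whitespace and normalise capitalisation
clean["city"] = clean["city"].str.strip().str.title()

# Fill the remaining missing temperature with the column mean
clean["temp"] = clean["temp"].fillna(clean["temp"].mean())

print(clean)
```

Filling with the mean is only one possible choice; the right strategy depends on the dataset.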
Data Wrangling: Join, Combine, and Reshape
- Hierarchical Indexing
- Combining and Merging Datasets
- Reshaping and Pivoting
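The combining and reshaping topics above can be sketched roughly as follows: a left join of two tables, then a pivot from long to wide format. The `sales` and `stores` tables and all their values are invented for illustration.

```python
import pandas as pd

# Invented example tables
sales = pd.DataFrame({
    "store": ["A", "A", "B", "B"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "revenue": [100, 120, 80, 95],
})
stores = pd.DataFrame({
    "store": ["A", "B"],
    "region": ["North", "South"],
})

# Combine: attach each store's region to its sales rows
merged = sales.merge(stores, on="store", how="left")

# Reshape: one row per store, one column per quarter
wide = merged.pivot(index="store", columns="quarter", values="revenue")

print(merged)
print(wide)
```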
Plotting and Visualization
- Matplotlib API
- Plotting with pandas and seaborn
- Other Python Visualization Tools
Unit 4
Data Aggregation and Group Operations
- Let's Think About Group Operations
- Data Aggregation
- Apply: General split-apply-combine
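A small sketch of the split-apply-combine idea, assuming a made-up table of team scores: the data is split by `team`, a function is applied to each group, and the results are combined back into one object.

```python
import pandas as pd

# Made-up scores for two teams
df = pd.DataFrame({
    "team": ["x", "x", "y", "y", "y"],
    "score": [10, 20, 5, 15, 25],
})

# Aggregation: one summary row per group
summary = df.groupby("team")["score"].agg(["mean", "max"])

# Group-wise transformation: subtract each team's own mean from its scores
df["centred"] = df.groupby("team")["score"].transform(lambda s: s - s.mean())

print(summary)
print(df)
```

`agg` returns one row per group, while `transform` returns a result aligned with the original rows.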
Time Series
- Date and Time Data Types and Tools
- Time Series Basics
- Date Ranges, Frequencies, and Shifting
- Time Zone Handling
- Periods and Period Arithmetic
- Resampling and Frequency Conversion
- Moving Window Functions
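To illustrate the resampling and moving window topics above, here is a minimal sketch on a synthetic daily series. The dates and values are arbitrary assumptions chosen to keep the arithmetic easy to follow.

```python
import numpy as np
import pandas as pd

# Synthetic daily series: ten consecutive days starting on a Monday
idx = pd.date_range("2024-01-01", periods=10, freq="D")
ts = pd.Series(np.arange(10, dtype=float), index=idx)

# Resampling: convert daily values to weekly totals (weeks end on Sunday)
weekly = ts.resample("W").sum()

# Moving window: 3-day rolling mean (the first two entries are NaN)
rolling = ts.rolling(window=3).mean()

print(weekly)
print(rolling)
```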
Unit 5
Introduction to Modeling Libraries in Python
- Interfacing Between pandas and Model Code
- Creating Model Descriptions with Patsy
- Introduction to statsmodels
- Introduction to scikit-learn
Data Analysis Project
- Public Datasets
- Baby Names
- Election Commission Database
- USA.gov