Python Data Analysis Training Course

Learn the fundamentals of data analysis with Python and how to apply them in practice

Course Overview

Duration

  • 10 half-day lectures or 5 full days

What do I need?

  • Webinar : A laptop, and a stable internet connection. The recommended minimum speed is around 10 Mbps.
  • Classroom Training : A laptop. Please notify us if you are not bringing your own. See the calendar below for the schedule.

Certification

  • Attendance : If you attend 80% of the sessions and complete all the class work, you qualify for the Attendance Certificate. (Course Price : R14 995 based on a minimum of 4 students)
  • Competency : Only offered as part of a Coding Bootcamp

Pre-requisites

Delivery

  • In-Person (Woodmead Classroom)
  • Remote (Discord Webinar)

Price

  • R14 500

Who will benefit

  • Students who have just mastered Python and want to explore the area of Data Analysis.
  • Managers and statisticians who want to learn how to use data productively.

What you will learn

  • Understand the Python for Data Analysis ecosystem, communities, tools and libraries
  • Understand IPython and Jupyter programming environments
  • An Overview of the Python Language
  • How to use NumPy and array-oriented computing in Python.
  • How to read and write datasets with Pandas.
  • How to visualise data with Pandas
  • How to get access to data
  • How to clean and prep data
  • How to do Data Wrangling (Join, Combine, Reshape)
  • How to visualise data with Pandas, Matplotlib and Seaborn
  • How to do Data Aggregation and Group Operations
  • How to use different types of analysis and data transformation tools to work with time series
  • How to use modelling toolkits like statsmodels and scikit-learn
  • How to do a real-world data analysis project


Course Details

Unit 1

Preliminaries

  • Course Overview
  • Why Python for Data Analysis?
  • Essential Python Libraries

Python Language Basics, IPython, and Jupyter Notebooks

  • The Python Interpreter
  • IPython Basics
  • Python Language Basics

Built-In Data Structures, Functions, and Files

  • Data Structures and Sequences
  • Functions
  • Files and the Operating System

Unit 2

NumPy Basics: Arrays and Vectorized Computation

  • The NumPy ndarray: A Multidimensional Array Object
  • Pseudorandom Number Generation
  • Universal Functions: Fast Element-Wise Array Functions
  • Array-Oriented Programming with Arrays
  • File Input and Output with Arrays
  • Linear Algebra
  • Random Walks

Getting Started with Pandas

  • Introduction to pandas Data Structures
  • Essential Functionality
  • Summarizing and Computing Descriptive Statistics

Data Loading, Storage, and File Formats

  • Reading and Writing Data in Text Format
  • Binary Data Formats
  • Interacting with Web APIs
  • Interacting with Databases

Unit 3

Data Cleaning and Preparation

  • Handling Missing Data
  • Data Transformation
  • Extension Data Types
  • String Manipulation
  • Categorical Data

Data Wrangling: Join, Combine, and Reshape

  • Hierarchical Indexing
  • Combining and Merging Datasets
  • Reshaping and Pivoting

Plotting and Visualization

  • Matplotlib API
  • Plotting with pandas and seaborn
  • Other Python Visualization Tools

Unit 4

Data Aggregation and Group Operations

  • Let's Think About Group Operations
  • Data Aggregation
  • Apply: General split-apply-combine

Time Series

  • Date and Time Data Types and Tools
  • Time Series Basics
  • Date Ranges, Frequencies, and Shifting
  • Time Zone Handling
  • Periods and Period Arithmetic
  • Resampling and Frequency Conversion
  • Moving Window Functions

Unit 5

Introduction to Modeling Libraries in Python

  • Interfacing Between pandas and Model Code
  • Creating Model Descriptions with Patsy
  • Introduction to statsmodels
  • Introduction to scikit-learn

Data Analysis Project

  • Public Datasets
  • Baby Names
  • Election Commission Database
  • USA.gov

Calendar