Python Data Analysis

  • 10 April 2026
Python for Data Analysis

About Course

Python Data Analysis Course

Master the Professional Data Science Stack in 40 Intensive Hours

Pandas · NumPy · Matplotlib · Jupyter | Real-World Data Problems | Intermediate Level

Download Course Curriculum

Get the complete Python Data Analysis Course curriculum, including all five domains, learning outcomes, hands-on project descriptions, and career pathway information.

📥 Download PDF (Curriculum Details)

Ready to turn raw data into actionable insights? The Python Data Analysis Course is a practical, modern introduction to the professional data science stack. In just 40 hours, you'll master the tools and techniques used daily by data analysts and scientists at leading companies worldwide — Pandas, NumPy, Matplotlib, and Jupyter.

This course is ideal for two groups: Python programmers who want to break into data science, and analysts who already work with data but want to leverage Python's power. Importantly, we assume you can write Python code — this is not a Python basics course. Instead, we go straight into the data science tools that will transform how you work with information.

Hands-On from Day One: Rather than lecturing theory, we work through real-world data analysis problems from the very first session. You'll load, clean, reshape, and analyse actual datasets using the same workflows professional data analysts use every day. Consequently, you'll finish the course with practical skills you can apply immediately at work.

Industry-Standard Tools: The course is built around the Python data science stack used by professionals globally — Pandas for data manipulation, NumPy for numerical computing, Matplotlib for visualisation, and Jupyter for interactive, reproducible analysis. These are not academic tools; they're what the industry actually uses.

Thorough, Detailed Examples: We don't rush through topics with trivial examples. Every technique is demonstrated on realistic scenarios: financial data, survey results, time series, and scientific measurements. You'll understand not just what to do, but why it works.

From Manipulation to Visualisation: You'll cover the full data analysis workflow — from loading messy data and dealing with missing values, through joining and reshaping datasets, all the way to producing compelling visualisations and performing group-based aggregations. By the end, you'll be able to tackle complex data problems with confidence.

Course Duration & Structure

  • Total Duration: 40 hours (Full-time intensive or part-time options available)
  • Format: Intensive, hands-on, project-based instruction with live expert guidance
  • Course Structure: Five teaching domains covering the complete data analysis workflow:
    • Domain 1: Python & Jupyter Foundations (8 hours)
    • Domain 2: Pandas — Data Structures & Operations (8 hours)
    • Domain 3: Loading, Storing & File Formats (8 hours)
    • Domain 4: Data Wrangling — Cleaning, Merging & Reshaping (8 hours)
    • Domain 5: Aggregation, Visualisation & Time Series (8 hours)

Pre-requisites

  • Solid Python programming foundation — at least equivalent to our Web Programming Bootcamp
  • Comfortable writing Python functions, loops, conditionals, and working with lists and dictionaries
  • Basic understanding of programming concepts and data structures
  • Comfortable working with the command line and a text editor or IDE

Learning Locations

  • Johannesburg: Hybrid (In-person and online classes)
  • Everywhere Else: Live Online (Full virtual classroom experience globally)

What Do I Need?

  • Laptop / PC: A modern Intel i5 with at least 8GB RAM (Windows, Mac, or Linux)
  • Internet: Stable connection of at least 10 Mbps. For assignments at home, allow 50–100GB data per month
  • Software: All free — we guide you through installing Python, Jupyter, and all required libraries (Pandas, NumPy, Matplotlib) before day one

Certification

  • Attendance Certificate: Awarded upon attending 80% or more of sessions and completing all classwork
  • Competency Certificate: Awarded upon successfully completing all practical data analysis projects with instructor review and approval
  • Portfolio: A collection of completed data analysis notebooks and projects ready to present to employers
  • Lifetime Access: Access to all course materials and Jupyter notebooks is retained after the course completes

Price

  • R14,995
  • Flexible student loan financing options available

Skill Level

  • Intermediate — solid Python knowledge required

Who Will Benefit

  • Python developers wanting to transition into data science and analytics
  • Analysts and reporting professionals looking to leverage Python for data work
  • Web developers expanding into data-driven application development
  • Scientists and engineers who work with data and want a modern Python toolkit
  • Business intelligence professionals wanting to add Python skills
  • Graduates of our Web Programming Bootcamp ready to specialise in data

What You Will Learn

  • Master Pandas, NumPy, Matplotlib, and Jupyter for professional data workflows
  • Load and store data in CSV, Excel, JSON, and HDF5 formats and in SQL databases
  • Clean and prepare messy real-world data for analysis
  • Join, merge, concatenate, and reshape datasets effectively
  • Use groupby for split-apply-combine data aggregations
  • Create informative visualisations with Matplotlib
  • Analyse and manipulate regular and irregular time series data
  • Apply NumPy for fast vectorised numerical operations
  • Use Jupyter notebooks for interactive, reproducible analysis
  • Solve real-world data problems with thorough, professional examples
  • Introduction to Python modelling libraries for data science

Complete Curriculum

40 hours of intensive, hands-on instruction covering the complete professional Python data analysis workflow — from setup and NumPy fundamentals through to visualisation, time series analysis, and an introduction to modelling.

Domain 1: Python & Jupyter Foundations (8 hours)

Build a solid base with NumPy and the Jupyter ecosystem before tackling Pandas

The Jupyter Environment

  • Jupyter Notebook and JupyterLab interface
  • IPython shell for exploratory computing
  • Magic commands and keyboard shortcuts
  • Organising and documenting analysis notebooks
  • Running Python scripts vs interactive notebooks

NumPy Fundamentals

  • ndarray — the core NumPy data structure
  • Creating arrays: zeros, ones, arange, linspace
  • Array indexing, slicing, and boolean selection
  • Array shapes, reshaping, and transposing
  • Vectorised operations and broadcasting
  • Universal functions (ufuncs)
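
The NumPy topics above can be sketched in a few lines. This is a minimal illustration rather than course material; the sensor readings and variable names are made up for the example.

```python
import numpy as np

# Hypothetical data: three daily temperature readings (rows) for two sensors (columns).
readings = np.array([[20.0, 21.5],
                     [22.0, 19.5],
                     [18.0, 20.5]])

# Vectorised arithmetic: convert every element from Celsius to Fahrenheit without a loop.
fahrenheit = readings * 9 / 5 + 32

# Broadcasting: the per-sensor mean has shape (2,) and is stretched across all three rows.
sensor_means = readings.mean(axis=0)
centred = readings - sensor_means

# Boolean selection: keep only the readings above 20 degrees.
warm = readings[readings > 20.0]

# A universal function (ufunc) applied element-wise.
roots = np.sqrt(readings)
```

Each operation works on the whole array at once, which is the main reason NumPy code tends to be both shorter and faster than the equivalent Python loop.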

Advanced NumPy

  • Fancy indexing and advanced selection
  • Sorting arrays and indirect sorts (argsort)
  • Statistical methods: mean, std, sum, cumsum
  • Linear algebra basics with NumPy
  • Random number generation with numpy.random
  • Array file input and output
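
As a rough sketch of indirect sorting, fancy indexing, and random number generation, assuming made-up scores and names:

```python
import numpy as np

# Hypothetical test scores and the matching student names.
scores = np.array([72, 95, 58, 88, 64])
names = np.array(["Ann", "Ben", "Cara", "Dev", "Eli"])

# Indirect sort: argsort returns the positions that would sort the array ascending.
order = np.argsort(scores)

# Fancy indexing with an integer array: reverse the order to list the highest score first.
ranked_names = names[order[::-1]]

# Statistical methods on the array.
total = scores.sum()
running_total = scores.cumsum()

# Reproducible random numbers via the Generator API in numpy.random.
rng = np.random.default_rng(seed=42)
sample = rng.normal(loc=0.0, scale=1.0, size=1000)
```

Seeding the generator makes a simulation repeatable, which matters when a notebook is shared or re-run.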

Domain 2: Pandas — Data Structures & Operations (8 hours)

Master the two core Pandas structures and essential data manipulation operations

Series and DataFrame

  • Series — one-dimensional labelled array
  • DataFrame — two-dimensional tabular structure
  • Index objects and alignment behaviour
  • Creating DataFrames from dicts, lists, and arrays
  • Column selection, addition, and deletion
  • Reindexing and label alignment
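
Index alignment is easiest to see with two Series whose labels only partly overlap. The sketch below uses hypothetical regional sales figures.

```python
import pandas as pd

# Two Series with partly overlapping labels (hypothetical sales figures).
q1 = pd.Series({"north": 100, "south": 80, "east": 60})
q2 = pd.Series({"south": 90, "east": 70, "west": 50})

# Arithmetic aligns on the index; a label missing from either side produces NaN.
total = q1 + q2

# fill_value treats a missing label as 0 instead of propagating NaN.
total_filled = q1.add(q2, fill_value=0)

# Build a DataFrame from a dict of Series, then reindex to an explicit label order.
df = pd.DataFrame({"q1": q1, "q2": q2})
df = df.reindex(["north", "south", "east", "west"])

# Adding a column is a plain assignment.
df["change"] = df["q2"] - df["q1"]
```

Because alignment happens by label and not by position, the order in which the data was loaded does not affect the result.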

Indexing, Selection & Filtering

  • iloc (integer-based) and loc (label-based) indexers
  • Boolean indexing and filtering rows
  • Selecting subsets of rows and columns
  • Duplicate labels and handling them
  • Hierarchical indexing (MultiIndex)
  • Stack and unstack operations
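
The difference between loc, iloc, and boolean filtering can be sketched on a small, made-up DataFrame:

```python
import pandas as pd

# Hypothetical sales records with string row labels.
df = pd.DataFrame(
    {"city": ["Durban", "Cape Town", "Pretoria", "Durban"],
     "year": [2024, 2024, 2025, 2025],
     "sales": [120, 150, 90, 130]},
    index=["a", "b", "c", "d"],
)

by_label = df.loc["b", "sales"]        # label-based: row "b", column "sales"
by_position = df.iloc[0, 2]            # integer-based: first row, third column
durban = df[df["city"] == "Durban"]    # boolean filter on rows

# Combine a row filter with a column subset in one loc call.
subset = df.loc[df["sales"] > 100, ["city", "sales"]]

# Hierarchical index (MultiIndex), then unstack the inner level into columns.
indexed = df.set_index(["city", "year"])["sales"]
wide = indexed.unstack("year")
```

City and year combinations with no record appear as NaN in the unstacked table.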

Essential Pandas Functionality

  • Arithmetic operations and data alignment
  • Applying functions with apply() and map() (applymap() in pandas versions before 2.1)
  • Sorting by index and by values
  • Ranking with rank()
  • Summarising data: describe(), value_counts(), info()
  • Correlation and covariance
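
A short sketch of these operations on a hypothetical product table follows. The row-wise apply is shown only to illustrate the method; the vectorised expression `df["price"] * df["units"]` gives the same result and is usually faster.

```python
import pandas as pd

# Hypothetical product data.
df = pd.DataFrame({
    "product": ["A", "B", "C", "D"],
    "price": [10.0, 25.0, 15.0, 25.0],
    "units": [100, 40, 70, 35],
})

# Row-wise apply: compute revenue for each product.
df["revenue"] = df.apply(lambda row: row["price"] * row["units"], axis=1)

# Sort by values, most expensive first.
by_price = df.sort_values("price", ascending=False)

# Rank prices; tied values share the lowest rank in their group.
df["price_rank"] = df["price"].rank(method="min", ascending=False)

# Summaries.
summary = df["units"].describe()
price_counts = df["price"].value_counts()

# Pearson correlation between price and units sold.
corr = df["price"].corr(df["units"])
```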

Domain 3: Loading, Storing & File Formats (8 hours)

Read and write data in every format professionals encounter

Text & Delimited Formats

  • Reading CSV and TSV files with read_csv()
  • Handling headers, custom delimiters, and encoding
  • Parsing dates and type inference
  • Handling large files with chunking
  • Writing to CSV and text files
  • Reading fixed-width format files
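
The sketch below reads a semicolon-delimited file, parses dates, and processes the same data in chunks. An in-memory buffer with made-up rows stands in for a real file on disk.

```python
import io

import pandas as pd

# In-memory stand-in for a semicolon-delimited file (hypothetical data).
raw = io.StringIO(
    "date;region;amount\n"
    "2025-01-01;north;100.5\n"
    "2025-01-02;south;200.0\n"
    "2025-01-03;north;50.25\n"
    "2025-01-04;south;75.0\n"
)

# Custom delimiter plus date parsing for the "date" column.
df = pd.read_csv(raw, sep=";", parse_dates=["date"])

# Chunking: process the file two rows at a time instead of loading it all at once.
raw.seek(0)
total = 0.0
for chunk in pd.read_csv(raw, sep=";", chunksize=2):
    total += chunk["amount"].sum()

# Write back out as a standard comma-separated file (here to a string buffer).
out = io.StringIO()
df.to_csv(out, index=False)
```

With a real file, the buffer is replaced by a path, and a much larger chunksize would normally be used.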

Binary & Structured Formats

  • Excel files — reading and writing with openpyxl
  • HDF5 format for high-performance storage
  • JSON data — reading from APIs and files
  • Python pickle format for serialisation
  • Feather and Parquet formats for data engineering

Databases & Web Data

  • Connecting to SQL databases with SQLAlchemy
  • Reading SQL query results into DataFrames
  • Writing DataFrames back to SQL tables
  • Fetching data from web APIs
  • Parsing HTML tables from web pages
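
A database round trip can be sketched as follows. The curriculum lists SQLAlchemy; to keep this example free of extra installs it uses the standard-library sqlite3 module instead, which pandas also accepts as a connection. The table and column names are hypothetical.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database so the example leaves no files behind.
conn = sqlite3.connect(":memory:")

# Hypothetical orders table.
orders = pd.DataFrame({
    "order_id": [1, 2, 3],
    "customer": ["ann", "ben", "ann"],
    "amount": [250.0, 120.0, 80.0],
})

# Write the DataFrame to a SQL table.
orders.to_sql("orders", conn, index=False)

# Read an aggregated query result back into a DataFrame.
query = """
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    ORDER BY customer
"""
result = pd.read_sql(query, conn)

conn.close()
```

With SQLAlchemy, the connection object would typically be replaced by an engine, while the to_sql and read_sql calls stay the same.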

Domain 4: Data Wrangling — Cleaning, Merging & Reshaping (8 hours)

Transform messy real-world data into clean, analysis-ready datasets

Handling Missing Data

  • Detecting missing values with isnull() and notnull()
  • Filtering out missing data with dropna()
  • Filling missing values with fillna()
  • Forward and backward filling strategies
  • Interpolation methods
  • Understanding NaN vs None vs pd.NA
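
The main missing-data strategies can be compared side by side on one small Series. The values are made up for the example.

```python
import numpy as np
import pandas as pd

# A Series with two gaps.
s = pd.Series([1.0, np.nan, 3.0, np.nan, 5.0])

missing_mask = s.isna()          # isnull() is an alias of isna()
dropped = s.dropna()             # remove the missing entries
filled_zero = s.fillna(0)        # replace them with a constant
forward = s.ffill()              # carry the last valid value forward
backward = s.bfill()             # pull the next valid value backward
interpolated = s.interpolate()   # linear interpolation by default
```

Which strategy is appropriate depends on the data: forward filling suits slowly changing measurements, while a constant fill can distort averages.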

Data Transformation

  • Removing duplicates with drop_duplicates()
  • Replacing values with map() and replace()
  • Renaming axes and columns
  • Discretisation and binning with cut() and qcut()
  • Detecting and filtering outliers
  • String manipulation with str accessor
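
A typical cleaning sequence might look like the sketch below: normalise strings first so that duplicates become detectable, then drop them, rename, and bin. The data, bin edges, and labels are illustrative assumptions.

```python
import pandas as pd

# Hypothetical records with inconsistent spacing and capitalisation.
df = pd.DataFrame({
    "name": ["  Alice ", "bob", "bob", "CAROL"],
    "age": [23, 35, 35, 61],
})

# String cleaning with the str accessor.
df["name"] = df["name"].str.strip().str.title()

# Remove exact duplicate rows and renumber the index.
df = df.drop_duplicates().reset_index(drop=True)

# Rename a column.
df = df.rename(columns={"name": "full_name"})

# Discretise ages into labelled bins: (0, 30], (30, 60], (60, 120].
df["age_group"] = pd.cut(
    df["age"], bins=[0, 30, 60, 120], labels=["young", "middle", "senior"]
)
```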

Merging, Joining & Reshaping

  • Database-style joins with merge() — inner, left, right, outer
  • Concatenating DataFrames with concat()
  • Combining datasets with combine_first()
  • Reshaping with pivot() and pivot_table()
  • Melting from wide to long format with melt()
  • Stack and unstack for hierarchical reshaping
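
The join types and the wide/long reshaping steps can be sketched with two small, made-up tables:

```python
import pandas as pd

# Hypothetical customers and orders; customer 2 has no orders, order customer 4 is unknown.
customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Ann", "Ben", "Cara"]})
orders = pd.DataFrame({"customer_id": [1, 1, 3, 4], "amount": [50, 20, 70, 10]})

# Inner join keeps only matching keys; left join keeps every customer.
inner = customers.merge(orders, on="customer_id", how="inner")
left = customers.merge(orders, on="customer_id", how="left")

# Outer join keeps everything; indicator=True records where each row came from.
outer = customers.merge(orders, on="customer_id", how="outer", indicator=True)

# Wide to long with melt, then back to wide with pivot.
wide = pd.DataFrame({"name": ["Ann", "Ben"], "jan": [10, 30], "feb": [20, 40]})
long = wide.melt(id_vars="name", var_name="month", value_name="sales")
back = long.pivot(index="name", columns="month", values="sales")
```

The `_merge` column produced by the indicator is a quick way to audit unmatched keys after a join.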

Domain 5: Aggregation, Visualisation & Time Series (8 hours)

Summarise, visualise, and analyse data — including time series and an introduction to modelling

GroupBy & Aggregation

  • GroupBy mechanics — split-apply-combine pattern
  • Iterating over groups
  • Applying multiple aggregation functions
  • Custom aggregation with agg()
  • Transformation with transform()
  • Pivot tables and cross-tabulations
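
Split-apply-combine can be sketched on a hypothetical sales table as follows:

```python
import pandas as pd

# Hypothetical sales by region and product.
df = pd.DataFrame({
    "region": ["north", "north", "south", "south", "south"],
    "product": ["A", "B", "A", "A", "B"],
    "sales": [100, 150, 80, 120, 50],
})

# Split by region, apply sum, combine into one Series.
totals = df.groupby("region")["sales"].sum()

# Several aggregations at once.
stats = df.groupby("region")["sales"].agg(["mean", "max"])

# Custom aggregation: the range of sales within each region.
spread = df.groupby("region")["sales"].agg(lambda s: s.max() - s.min())

# transform returns a result aligned to the original rows,
# which makes it easy to compute each row's share of its group total.
df["share"] = df["sales"] / df.groupby("region")["sales"].transform("sum")

# Pivot table: regions as rows, products as columns.
table = df.pivot_table(index="region", columns="product", values="sales", aggfunc="sum")
```

The key distinction is that agg reduces each group to one row, while transform keeps the original shape.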

Data Visualisation with Matplotlib

  • Matplotlib figure and axes architecture
  • Line charts, bar charts, and histograms
  • Scatter plots and pair plots
  • Customising titles, labels, and legends
  • Subplots and multi-panel figures
  • Plotting directly from Pandas DataFrames
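
The figure and axes model can be sketched with a two-panel figure. The data is synthetic, and the "Agg" backend line is only there so the sketch runs without a display; inside Jupyter it can be omitted.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

# One figure containing two axes side by side.
fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: a labelled line chart.
ax_line.plot(x, np.sin(x), label="sin(x)")
ax_line.set_title("Line chart")
ax_line.set_xlabel("x")
ax_line.set_ylabel("value")
ax_line.legend()

# Right panel: a histogram of synthetic, normally distributed values.
rng = np.random.default_rng(0)
ax_hist.hist(rng.normal(size=500), bins=20)
ax_hist.set_title("Histogram")

fig.tight_layout()
```

Working with explicit axes objects, not the implicit pyplot state, keeps multi-panel figures easier to control.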

Time Series Analysis

  • Date and time data types in Python
  • DatetimeIndex and time-based indexing
  • Date range generation with date_range()
  • Resampling and frequency conversion
  • Rolling and expanding window calculations
  • Handling time zones and irregular time series
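
A minimal time series sketch, assuming ten days of made-up daily values:

```python
import numpy as np
import pandas as pd

# Ten consecutive days with the values 0.0 to 9.0.
idx = pd.date_range("2025-01-01", periods=10, freq="D")
ts = pd.Series(np.arange(10, dtype=float), index=idx)

# Time-based indexing: slicing by date strings includes both endpoints.
first_week = ts["2025-01-01":"2025-01-07"]

# Resampling: aggregate daily values into five-day totals.
five_day = ts.resample("5D").sum()

# Rolling window: three-day moving average (the first two values are NaN).
rolling_mean = ts.rolling(window=3).mean()

# Time zones: mark the naive timestamps as UTC, then convert to local time.
localised = ts.tz_localize("UTC").tz_convert("Africa/Johannesburg")
```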

Introduction to Modelling

  • Interfacing Pandas with Scikit-learn
  • Preparing DataFrames for machine learning pipelines
  • Introduction to statsmodels for statistical analysis
  • Patsy for model formulas
  • Where to go next: machine learning courses and resources

What You'll Master

NumPy & Numerical Computing

  • ndarray creation and manipulation
  • Vectorised operations for performance
  • Broadcasting and shape manipulation
  • Statistical functions and linear algebra
  • Random number generation for simulation

Pandas Data Manipulation

  • Series and DataFrame mastery
  • Flexible indexing and selection
  • Sorting, ranking, and summarising
  • Applying functions across rows and columns
  • Working with hierarchical (MultiIndex) data

Data Loading & Storage

  • CSV, Excel, JSON, and HDF5 formats and SQL databases
  • Reading from web APIs and HTML tables
  • Writing clean data back to files and databases
  • Efficiently handling large datasets with chunking

Data Cleaning & Preparation

  • Detecting and handling missing values
  • Removing duplicates and fixing inconsistencies
  • String cleaning and text processing
  • Outlier detection and treatment
  • Discretisation and feature binning

Merging & Reshaping

  • Database-style joins (inner, left, right, outer)
  • Concatenating and combining datasets
  • Pivoting from long to wide format
  • Melting from wide to long format
  • Stack and unstack for hierarchical data

Aggregation & GroupBy

  • Split-apply-combine workflows
  • Custom aggregation functions
  • Pivot tables and cross-tabulations
  • Group-wise transformations
  • Summarising large datasets efficiently

Data Visualisation

  • Matplotlib figure and axes control
  • Line, bar, scatter, and histogram charts
  • Multi-panel subplot layouts
  • Plotting directly from Pandas
  • Professional chart formatting and labelling

Time Series & Modelling

  • DatetimeIndex and time-based operations
  • Resampling and rolling window calculations
  • Handling irregular and timezone-aware data
  • Preparing data for Scikit-learn pipelines
  • Introduction to statistical modelling with statsmodels

Your Career Path After This Course

Completing the Python Data Analysis Course opens doors to high-demand roles in data science, analytics, and business intelligence. Python data skills are among the most sought-after in the technology industry — and this course gives you the practical foundation employers are looking for.

Career Opportunities

  • Data Analyst: Clean, analyse, and present data insights to business stakeholders
  • Junior Data Scientist: Build analysis pipelines and predictive models using Python
  • Business Analyst: Use Python to automate reporting and analyse business performance
  • Data Engineer: Build and maintain data pipelines using Pandas and Python tools
  • Analytics Developer: Develop data-driven applications and dashboards
  • Research Analyst: Apply Python data tools to scientific or academic research

Salary Potential in South Africa

  • Junior Data Analyst: R250,000 – R380,000 per year
  • Data Analyst (mid-level): R380,000 – R600,000 per year
  • Senior Data Analyst: R600,000 – R900,000+ per year
  • Data Scientist: R700,000 – R1,200,000+ per year
  • Salaries vary by industry, company size, and experience level

Next Steps at Code College

  • Many students progress to our Machine Learning with Python course after this foundation
  • Others combine data skills with our Full-Stack Web Developer Bootcamp to build data-driven applications
  • Advanced students pursue specialisations in deep learning, NLP, or data engineering
  • This course also serves as excellent preparation for industry certifications in data science and analytics

Frequently Asked Questions

What is data analysis with Python?

Python data analysis involves using libraries like Pandas, NumPy, and Matplotlib to load, clean, transform, and interpret datasets. Python has become the leading language for data science because it combines ease of use with an incredibly powerful ecosystem of data tools — allowing analysts to automate workflows that would take hours in Excel, and handle datasets far too large for spreadsheet software.

Why should I learn Python for data analysis?

Python is one of the most widely used languages among professional data analysts and data scientists worldwide. Its ecosystem (Pandas, NumPy, Matplotlib, Scikit-learn) covers everything from cleaning data to building machine learning models. It's used by banks, tech companies, research institutions, and government agencies. Learning Python for data analysis opens doors to well-paid, high-demand careers globally.

Do I need prior Python experience?

Yes — this course requires solid Python programming knowledge. You should be comfortable with functions, loops, conditionals, lists, and dictionaries. We recommend completing our Web Programming Bootcamp or equivalent beforehand. This course does not teach Python basics; it teaches you to apply Python to data science problems.

What tools and libraries will I learn?

You'll master the professional Python data science stack: NumPy for fast numerical computing, Pandas for data manipulation and analysis, Matplotlib for data visualisation, and Jupyter for interactive notebooks. You'll also get an introduction to Scikit-learn and statsmodels for modelling, and tools like SQLAlchemy for database connectivity.

What are the prerequisites?

  • Completion of — or equivalent experience to — our Web Programming Bootcamp
  • Solid Python programming fundamentals (functions, loops, lists, dictionaries)
  • Comfortable working with the command line or terminal
  • A modern laptop with at least 8GB RAM
  • Stable internet connection (10 Mbps minimum)

Is this a hands-on course?

Absolutely. This is a practical, project-based course built around real datasets and real analysis problems. Every domain includes hands-on exercises in Jupyter notebooks. You'll clean actual messy data, build visualisations from real datasets, and produce complete analysis notebooks you can keep and present as portfolio work.

What learning materials are provided?

You'll receive comprehensive Jupyter notebooks with all examples, curated datasets for hands-on practice, electronic course notes, and access to a structured learning portal. All materials remain accessible to you after the course — there's no expiry on your access.

Can I access course content after completion?

Yes. Your access to all course materials, notebooks, and datasets remains active for as long as you need it. We'll check with you periodically before removing access, so you won't lose your materials unexpectedly.

Will I receive a certificate?

  • Attendance Certificate: Awarded upon attending 80% or more of sessions and completing all classwork
  • Competency Certificate: Awarded upon successfully completing all practical data analysis projects with instructor sign-off

How is this course different from a general Python course?

This course is specifically focused on data analysis — not general-purpose Python development. We go deep on Pandas, NumPy, Matplotlib, and Jupyter from a data analyst's perspective. Every example, every exercise, and every project is grounded in real data analysis scenarios. If you already know Python and want data skills, this is the direct path.

What career opportunities follow this course?

  • Data Analyst
  • Junior Data Scientist
  • Business Analyst
  • Data Engineer
  • Analytics Developer
  • Research Analyst
  • Further study in machine learning, deep learning, and advanced data science

What is the course price and are there payment options?

The course is priced at R14,995. Flexible student loan financing options are available — contact us at info@codecollege.co.za or WhatsApp +27 83 600 2765 for details.

Pricing

Invest in a high-demand data science skill set

Python Data Analysis Course

R14,995

  • Duration: 40 hours (Full-time intensive or part-time)
  • Skill Level: Intermediate (Python experience required)
  • Format: Hands-on, project-based, live instruction
  • Locations: Johannesburg (Hybrid) | Live Online (Global)
  • Language: English
  • Updated: January 2026
  • Financing: Flexible student loan options available

Course Schedule & Calendar

View upcoming Python Data Analysis Course dates below. New classes start regularly — both full-time intensive and part-time options are listed.

Can't find a date that works? Contact us to discuss custom scheduling options or to be added to our waitlist for upcoming sessions.

Ready to Master Data Analysis with Python?

Join Python developers and analysts who have transformed their capabilities — and their careers — with professional data science skills.

Who will benefit

  • Learn to manipulate, process, clean, and crunch datasets in Python
  • Work through practical case studies that show you how to solve a broad set of data analysis problems effectively
  • Learn the latest versions of Pandas, NumPy, and Jupyter in the process
  • Use the Jupyter notebook and the IPython shell for exploratory computing
  • Learn basic and advanced features in NumPy
  • Get started with data analysis tools in the pandas library
  • Use flexible tools to load, clean, transform, merge, and reshape data
  • Create informative visualisations with Matplotlib
  • Apply the pandas groupby facility to slice, dice, and summarise datasets
  • Analyse and manipulate regular and irregular time series data
  • Learn how to solve real-world data analysis problems with thorough, detailed examples

Course Content

R14,995.00
30-Day Money-Back Guarantee
  • Updated: 10 April 2026
  • Skill Level: Intermediate
  • Language: English
  • Course Duration: 40h

Target Audience

  • Python programmers with solid fundamentals who want to learn data analysis