Machine Learning in Python
This four-day course teaches you how to use Python for modern data analysis and machine learning, progressing from core programming and data manipulation to statistical modelling and predictive analytics.
Prerequisites
Some familiarity with programming concepts (in any language) is assumed.
Expected Outcomes
By the end of the course, you will be able to use Python confidently for processing, analysing, modelling, and visualising real-world data, with particular emphasis on time-series and structured datasets. You will have hands-on experience using Python for scripting, data manipulation, and visualisation across a range of common data sources, including CSV files, Excel spreadsheets, SQL databases, JSON data, and REST API endpoints.
You will have built and applied predictive models using widely used machine-learning techniques such as regression, classification, and clustering, and learned how to evaluate and select models using appropriate validation and diagnostic tools. You will understand how Python’s data-science ecosystem fits together in practice, and will be well placed to continue developing and applying these skills in your day-to-day analytical, quantitative, or machine-learning work.
Course Syllabus
Session 1: Python Basics
Day 1 introduces Python through hands-on examples, focusing on the core language features and data types needed to write clear, reliable scripts and small automation tools.
- Why Python? What’s possible?
- The Jupyter notebook for rapid prototyping
- Modules and packages
- Core Python concepts, introduced through examples
- Essential data types: strings, tuples, lists, dictionaries
- Raising and handling exceptions
- Worked example: retrieving real-time data from a REST web API
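As a rough illustration of the kind of script Day 1 builds towards, the sketch below combines dictionaries, exception handling, and JSON parsing for data retrieved from a web API. It is not taken from the course materials: the URL, the `parse_reading` and `fetch_reading` helpers, and the `station` and `temp_c` field names are all hypothetical.

```python
import json
from typing import Optional
from urllib.error import URLError
from urllib.request import urlopen

# Hypothetical endpoint, for illustration only
API_URL = "https://example.com/api/latest-reading"


def parse_reading(payload: str) -> dict:
    """Parse a JSON payload and return the two fields we care about."""
    try:
        data = json.loads(payload)
    except json.JSONDecodeError as exc:
        # Re-raise as a simpler exception type for callers to handle
        raise ValueError(f"response is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    try:
        return {"station": data["station"], "temp_c": float(data["temp_c"])}
    except KeyError as exc:
        raise ValueError(f"missing field: {exc}") from exc


def fetch_reading(url: str = API_URL, timeout: float = 5.0) -> Optional[dict]:
    """Fetch and parse one reading; return None if the request fails."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return parse_reading(response.read().decode("utf-8"))
    except (URLError, TimeoutError) as exc:
        print(f"request failed: {exc}")
        return None
```

Keeping the parsing step separate from the network call means it can be tried out in a notebook with a hand-written JSON string before pointing the script at a real service.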
Session 2: Handling, Analysing, and Presenting Data in Python
Python offers highly productive tools such as Polars for working with many kinds of data. Day 2 gives a thorough introduction to analysing and visualising data:
- Reading and writing essential data formats: CSV, Excel, SQL, time-series (others on request)
- Selecting and filtering data in Polars
- Data fusion: joining datasets
- Aggregation with “group by” operations; pivot tables
- Visualisation and statistical graphics with Plotly Express
- Preview: turning Python analysis into interactive dashboards with Streamlit
Session 3: Further Data Analytics
Day 3 shows you in depth how to manipulate time-series and matrix/vector data. It then gives examples of Monte Carlo simulation, interpolation, linear regression, outlier and anomaly detection, and clustering:
- Introduction to NumPy for manipulating vector and matrix data: data types, powerful indexing, reshaping, ufuncs
- Monte Carlo simulation and applications
- Linear regression
- Outlier and anomaly detection with pyod; applications to time-series
- Clustering with scikit-learn, with applications
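Two of the topics above, Monte Carlo simulation and linear regression, can be sketched in a few lines of NumPy. This is an illustrative example on synthetic data rather than course material; the sample sizes, seed, and "true" slope and intercept are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Monte Carlo: estimate pi from the fraction of uniformly random points
# in the unit square that fall inside the quarter-circle of radius 1
n = 200_000
points = rng.random((n, 2))
inside = (points ** 2).sum(axis=1) <= 1.0  # elementwise ufunc, then a row-wise reduction
pi_estimate = 4.0 * inside.mean()

# Linear regression: least-squares fit of y = slope * x + intercept
# to noisy synthetic data generated with slope 2 and intercept 1
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)
design = np.column_stack([x, np.ones_like(x)])  # one column for x, one for the constant term
(slope, intercept), *_ = np.linalg.lstsq(design, y, rcond=None)

print(f"pi is roughly {pi_estimate:.3f}")
print(f"fitted line: y = {slope:.2f} * x + {intercept:.2f}")
```

Both parts rely on vectorised array operations instead of explicit Python loops, which is the main habit the NumPy material aims to build.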
Session 4: Machine Learning
Day 4 gives you a practical and comprehensive introduction to machine learning for inferring complex models from data, with examples drawn from a range of industries, including time-series and spatial datasets:
- Intuition behind ML; overview of the ML package ecosystem in Python
- Nonlinear regression; application to time-series forecasting
- Classification; application to diagnosis, AI systems, satellite imagery, ...
- Validation and model selection; diagnostic tools; yellowbrick
- Feature engineering and selection
- Deploying machine learning models in production
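As one possible illustration of validation and model selection, the sketch below compares two classifiers by 5-fold cross-validation in scikit-learn. The dataset is synthetic and the choice of candidate models is an assumption made for the example, not a description of what the course uses.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic classification data, for illustration only
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# Candidate models; the scaler sits inside the pipeline so that it is
# fitted on the training folds only during cross-validation
candidates = {
    "logistic": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Mean accuracy over 5 cross-validation folds for each candidate
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in candidates.items()
}
best_name = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: mean CV accuracy {score:.3f}")
print(f"selected model: {best_name}")
```

Comparing models on held-out folds rather than on the training data gives a fairer estimate of how each one is likely to perform on new data.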
Personal help
We are happy to offer one-on-one problem-solving sessions after each day of the training, where you can ask questions about the course content and exercises or about specific problems you face in your work and how to solve them. If you would like us to prepare for this in advance, you are welcome to send us background information before the course.
Other information
Format:
Courses are conducted online via video meeting using Python Charmers' cloud notebook server for sharing code with the trainer(s).
Computer:
Hardware: we recommend at least 8 GB of RAM and a webcam, and preferably also multiple screens and a quiet room (or a headset microphone).
Software: a modern browser: Chrome, Firefox, or Safari (not IE or Edge); and Zoom.
Coding: we have a cloud-based coding server that supports running code and sharing code with the trainer(s).
Timing:
Most courses will run from 9:00 to roughly 17:00 (AEST/AEDT) each day, with breaks of 50 minutes for lunch and 20 minutes each for morning and afternoon tea.
Certificate of completion:
We will provide you with a certificate if you complete the course and successfully answer the majority of the exercise questions.
Materials:
You will have access to all the course materials via the cloud server. We will also send you a bound copy of the course notes, cheat sheets, and a USB stick containing the materials, exercise solutions, and further resources.
