05/08/2023 - 05/10/2023
10:00 AM - 06:00 PM
Online Virtual Class
USD $2,090.00
06/26/2023 - 06/28/2023
10:00 AM - 06:00 PM
Online Virtual Class
USD $2,090.00

Although efficient numerical computing is not Python's strong point, the language has become the lingua franca of data engineering and machine learning projects thanks to its expressiveness and developer-friendliness.


Two Python libraries, NumPy and pandas, make it a popular choice for data scientists, machine learning practitioners, data engineers, and data analysts. NumPy provides C-level computing efficiency, while pandas, which builds on NumPy where possible, offers an immediately productive, table-oriented data processing and analysis experience.
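
As a rough illustration of this division of labor (a minimal sketch, not taken from the course materials; the sample data and variable names are hypothetical), the same calculation can be written as a Python loop, as a single vectorized NumPy expression, and as a labeled pandas column:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not from the course materials.
prices = np.array([10.0, 12.5, 9.75, 11.0])
quantities = np.array([3, 1, 4, 2])

# Pure-Python approach: an explicit loop over element pairs.
loop_totals = [p * q for p, q in zip(prices, quantities)]

# NumPy approach: one vectorized expression evaluated in compiled code.
vector_totals = prices * quantities

# pandas wraps NumPy-backed columns in a labeled, table-oriented structure.
orders = pd.DataFrame({"price": prices, "quantity": quantities})
orders["total"] = orders["price"] * orders["quantity"]
```

All three forms produce the same totals; the vectorized forms avoid per-element interpreter overhead, which tends to matter as arrays grow.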


Mastering NumPy and pandas, however, requires time and dedication, and the learning experience can at times be frustrating. This comprehensive, hands-on NumPy and pandas training course aims to help aspiring data practitioners learn these libraries.

Objectives

•    Unlock the power and efficiencies of NumPy and pandas
•    Learn the core and advanced features of both Python libraries
•    Become comfortable in navigating the related APIs

Topics

  • NumPy
    o    NumPy efficiencies
    o    Reshaping and flattening ndarrays
    o    Axis-aware functionality 
    o    Vectorization and broadcasting
    o    Iterators
    o    Random number generation
    o    Statistical distribution functionality
    o    Linear algebra functions

  • pandas
    o    Series and DataFrame APIs
    o    Accessing data
    o    Filtering data
    o    Data aggregation operations
    o    Data descriptive statistics
    o    Dealing with missing data
    o    Fine-tuning column data types
    o    Pivot tables and crosstabs
    o    Working with time series
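
To give a flavor of a few of the topics listed above, the following sketch (an illustration under assumed sample data, not an excerpt from the course) shows an axis-aware aggregate, broadcasting, and a pivot table. The array contents and the `sales` table are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical 2-D array: 3 rows (samples) by 2 columns (features).
data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0]])

# Axis-aware aggregation: axis=0 collapses the rows, giving one mean per column.
col_means = data.mean(axis=0)   # shape (2,)

# Broadcasting: the (2,) vector is stretched across all 3 rows of the (3, 2) array.
centered = data - col_means

# Hypothetical sales records used to build a pivot table.
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West"],
    "product": ["A", "B", "A", "B"],
    "units": [5, 7, 3, 9],
})

# One row per region, one column per product, summing the units in each cell.
pivot = sales.pivot_table(index="region", columns="product",
                          values="units", aggfunc="sum")
```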

Audience

Data Practitioners, Business Analysts, Software Engineers, and IT Architects

Prerequisites

Participants should have a working knowledge of Python (take a moment to review our Python 3 course) and be familiar with core statistical concepts (variance, correlation, etc.).

Duration

3 days

Outline for Comprehensive NumPy and pandas Training

Chapter 1 - Comprehensive NumPy

  • Doing the Labs and Hands-on Exercises
  • NumPy
  • The Python and C Connection
  • NumPy Characteristics
  • NumPy Efficiencies
  • The ndarray Object vs Python Sequence
  • The ndarray Data Structure Visually
  • The First Take on NumPy Arrays and the array() Method
  • Getting Help
  • The np.info() Function
  • The arange() Method
  • Hands-on Exercises
  • Re-Shaping, Take 1
  • Re-Shaping with Order
  • "Smart" Reshaping
  • Hands-on Exercises
  • Array Slicing
  • Array Slicing Visually
  • 2-D Array Slicing
  • Slicing and Stepping Through
  • Getting Last Row and Last Column
  • Indexing with Arrays of Indices
  • Hands-on Exercises
  • Understanding NumPy Types
  • Commonly Used Platform-Portable ndarray Numeric Data Types
  • Other Data Types
  • Unicode Strings
  • Example of a Boolean Array
  • Changing the Data Type using astype()
  • Hands-on Exercises
  • Commonly Used Array Metrics
  • Example of Getting Common Array Metrics
  • What is An ndarray Axis?
  • Commonly Used Aggregate (Reduction) Functions
  • Axis-Aware Aggregate Functions Visually
  • The NaN Value
  • The nan_to_num() Function
  • NaN in Aggregate Functions
  • The NaN-Tolerant Functions
  • The inf Value
  • The inf-Related Functions
  • Checking for Valid Numbers in an ndarray
  • Hands-on Exercises
  • The newaxis Attribute
  • Flattening the Matrices
  • The ravel() Method
  • Changing Order When Flattening with ravel()
  • ravel(): Things to be Aware of ...
  • The flatten() Method
  • Flattening with reshape(-1)
  • Flattening Using the [:,-1] Operator
  • Hands-on Exercises
  • Understanding Little-Endian and Big-Endian Byte Encodings
  • Handling Little-Endian and Big-Endian Byte Encodings in NumPy
  • Creating "Dummy" Arrays
  • "Dummy" Arrays Visually
  • The "Dummy-Like" Arrays
  • Hands-on Exercises
  • Generating Data Points with linspace()
  • Building Coordinate Matrices with meshgrid()
  • The view() Function
  • The copy() Function
  • The Issue of Shallow Copies of Python Lists
  • The True "Deep Copy"
  • Vectorization
  • Vectorization Visually
  • Broadcasting
  • Broadcasting Visually
  • Hands-on Exercises
  • Array Arithmetic Operations
  • Filtering
  • Hands-on Exercises
  • The any() and all() Functions
  • Combining Arrays
  • Examples of Combining Arrays
  • The append() Function
  • Hands-on Exercises
  • The insert() Function
  • The delete() Function
  • Hands-on Exercises
  • I/O Operations
  • Examples of I/O Operations
  • I/O Operations Considerations
  • Memory-Mapped Files
  • Hands-on Exercises
  • Using unique() and repeat()
  • Sundry Functions
  • Support for Generating Random Numbers
  • Seeding
  • The NumPy Random Generator's Methods
  • Generating Random Numbers
  • Distributions
  • Drawing Samples from the Poisson Distribution
  • The histogram() Function
  • Example of Using histogram()
  • Descriptive Statistics
  • Hands-on Exercises
  • Sorting Arrays
  • Sorting Examples
  • Understanding argsort()
  • The argmin() and argmax() Functions
  • Hands-on Exercises
  • The vectorize() Function
  • The Iterator Object
  • Example of Using NumPy Iterator
  • The Linear Algebra Functions
  • Matrix Operations
  • Matrix Operations (Cont'd)
  • The Norm Concept
  • Calculating the L2 Norm
  • Hands-on Exercises
  • Summary
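
Several of the Chapter 1 topics (reshaping, ravel() versus flatten(), and the NaN-tolerant functions) can be sketched in a few lines. This is an illustrative example under assumed inputs, not one of the course's hands-on exercises.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

# "Smart" reshaping: -1 lets NumPy infer that dimension from the array size.
tall = grid.reshape(-1, 2)          # shape (3, 2)

# ravel() returns a view when the memory layout allows it; flatten() always copies.
raveled = grid.ravel()
flattened = grid.flatten()
raveled[0] = 99                     # also changes grid[0, 0] here, since raveled is a view
flattened[1] = -1                   # grid is unaffected, since flattened is a copy

# NaN propagates through ordinary aggregates; the nan* variants skip it.
readings = np.array([1.0, np.nan, 3.0])
plain_sum = readings.sum()          # nan
tolerant_sum = np.nansum(readings)  # 4.0
```

The view-versus-copy distinction is the usual reason ravel() needs care: writing through the result can silently modify the original array.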

Chapter 2 - Comprehensive pandas

  • Doing the Labs and Hands-on Exercises
  • What is pandas?
  • The Main Features and Capabilities
  • The Core High-Level Data Structures
  • The Series Object
  • Understanding the View and Copy Aspects of the Input Data
  • Example of a Series Object
  • Accessing Values and Indexes in the Series Object
  • The Index Property
  • Using the Series Index as a Lookup Key
  • Useful Series Methods
  • The Series Object Supports NumPy Array Operations
  • Can I Pack a Python Dictionary into a Series?
  • Hands-on Exercises
  • The DataFrame Object
  • The DataFrame's Value Proposition
  • Creating a DataFrame
  • Example of Creating a pandas DataFrame from a NumPy Array
  • Creating a pandas DataFrame from a Python Dictionary
  • Plugging In Your Own Index
  • Example of Using Your Own Index
  • Getting DataFrame Metrics
  • Creating a Column with Auto-Incremented Values
  • The DataFrame info() Method
  • The describe() Method
  • Example of a describe() Output when called on a DataFrame
  • Example of a describe() Output when called on a Series
  • Accessing DataFrame Columns
  • Accessing DataFrame Rows
  • Renaming DataFrame Columns
  • Hands-on Exercises
  • Accessing DataFrame Cells
  • The iloc[] Property
  • Examples of Using the iloc[] DataFrame Property
  • Using a Function in iloc
  • The Type of Object iloc Returns
  • The loc[] Property
  • Examples of Using loc[]
  • Hands-on Exercises
  • Filtering in DataFrames
  • Examples of DataFrame Filtering
  • Using any() and all() with loc[]
  • The filter() Method
  • DataFrames are Mutable via Object Reference!
  • Iterating over DataFrame's Contents
  • Example of Iterating over DataFrame's Contents
  • Hands-on Exercises
  • The Axes
  • Deleting Rows and Columns
  • More on the drop() DataFrame Method
  • Examples of Using the drop() Method
  • Adding a New Column to a DataFrame
  • Appending/Concatenating DataFrame and Series Objects
  • The concat() Method
  • Using the concat() Method
  • Reindexing
  • Re-indexing Series and DataFrames
  • Joining DataFrames
  • Understanding the get_dummies() DataFrame Method
  • Example of Using the get_dummies() Method
  • Hands-on Exercises
  • What are Descriptive Statistics?
  • Calculating Descriptive Statistics and Summary Measures in pandas
  • Calculations Along Axes
  • Examples of Axis-Specific DataFrame Operations
  • The nlargest() and nsmallest() Methods
  • Hands-on Exercises
  • Dealing with Missing Data
  • Getting Information About the Missing Data (NaN)
  • Dropping Rows/Columns with NaNs
  • The dropna() Function
  • Examples of Using dropna()
  • Interpolating Missing Data in pandas
  • Examples of Interpolating Missing Data
  • The fillna() Method
  • Examples of Using fillna()
  • Dropping Duplicate Rows
  • The apply() Function
  • Example of Using the apply() Function
  • Hands-on Exercises
  • Sorting DataFrame Values
  • Hands-on Exercises
  • The pandas I/O: Reading Methods
  • Reading From CSV Files
  • The pandas I/O: Writing Methods
  • Writing to a CSV File
  • Writing to the System Clipboard
  • Hands-on Exercises
  • Minimizing DataFrames' Memory Footprint
  • The Default Type Inferences
  • Fine-Tuning Column Data Types
  • What May Go Wrong with Converting Numbers
  • Data Aggregation and Grouping in pandas
  • Sample Data Set
  • The pandas.core.groupby.SeriesGroupBy Object
  • Grouping by Two or More Columns
  • Emulating the SQL WHERE Clause
  • The Pivot Tables
  • Another Example of Data Pivoting
  • Cross-Tabulation
  • The cumsum() Method
  • Hierarchical Indexing and MultiIndex Object
  • Examples of Creating and Using a Hierarchical Index (MultiIndex)
  • Data Aggregation Using a MultiIndex
  • Hands-on Exercises
  • Time Series Defined
  • Handling Time Series Data
  • Handling Time Series in pandas
  • Example of Converting Text Timestamps into Datetime Objects
  • Converting a Text Column Representing Dates
  • Using the datetime Object as a DataFrame Index
  • Generating Date Ranges
  • Example of Using date_range()
  • Hands-on Exercises
  • Data Visualization
  • Summary
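
A handful of the Chapter 2 topics (datetime conversion, iloc[] and loc[], fillna(), and grouping) fit together as in the sketch below. The data set and column names are hypothetical and chosen only for illustration; they are not from the course labs.

```python
import numpy as np
import pandas as pd

# Hypothetical data set with text timestamps and one missing value.
df = pd.DataFrame({
    "day": ["2023-05-08", "2023-05-09", "2023-05-10", "2023-05-10"],
    "city": ["Oslo", "Oslo", "Rome", "Oslo"],
    "temp": [11.0, np.nan, 21.0, 13.0],
})

# Convert the text column to datetime values and use it as the index.
df["day"] = pd.to_datetime(df["day"])
df = df.set_index("day")

# iloc[] selects by integer position; loc[] selects by label or boolean mask.
first_temp = df["temp"].iloc[0]
oslo = df.loc[df["city"] == "Oslo"]

# Fill the missing reading with the column mean, then aggregate per city.
df["temp"] = df["temp"].fillna(df["temp"].mean())
mean_by_city = df.groupby("city")["temp"].mean()
```

Filling with the column mean is only one of several strategies; dropna() or interpolate() may suit other data better.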

Lab Exercises

Lab 1. Learning the Colab Jupyter Notebook Environment
Lab 2. Comprehensive NumPy
Lab 3. Comprehensive pandas