This tutorial is adapted from Web Age course Applied Data Science with Python.
This tutorial provides quick overview of Python modules and highpower features, NumPy library, pandas library, SciPy library, scikitlearn library, Jupyter notebooks and Anaconda distribution.
1.1 Importing Modules
Command

Comments 
from lib import x
from lib import x as y

Importing a single object from a module(lib) allows direct referencing of the object, e.g
x() or y() Introduces potential variable name collision with other module imports 
from lib import * 
Not clean — potential variable name clobbering 
import lib as alias  You import the whole module giving it an alias; use the ‘.’ dot prefix to access objects in it:
alias.x() That’s the preferred way to import modules 
1.2 Listing Methods in a Module
dir(module_alias), e.g. dir(np) or dir('module_name'), e.g. dir('sys')
1.3 Creating Your Own Modules
# my_utils.py def foo(a, b): return a + b
├── my_library # directory │ ├── __init__.py # empty file, without it, Python will not allow import from a directory │ ├── my_utils.py # your module (def …)
from my_library import my_utils my_utils.foo(4, 2)
def foo(a, b): return a + b if __name__ == "__main__": # Application entry point is here
1.4 Random Numbers
import random # can alias the import random.random() # a float in a range of [0, 1) random.randint(x,y) # an integer in a range of [x,y]
Notes:
Python uses the Mersenne Twister as the pseudorandom number generator. It produces 53bit precision floats and has a generation cycle of 2**199371 (the cycle is 6002 digits long). The underlying implementation in C, which is both fast and threadsafe.
1.5 Zipping Lists
a = [1,2,3,4,5] b = [10,20,30,40,50] [str(x) + ':' + str(y) for x, y in zip(a,b)] Output: ['1:10', '2:20', '3:30', '4:40', '5:50']
1.6 List Comprehension
y = 10 [ i**2 + y for i in range (1, 20) if (i % 2 == 0) ] Output: [14, 26, 46, 74, 110, 154, 206, 266, 334]
1.7 Python Data ScienceCentric Libraries
1.8 NumPy
1.9 NumPy Arrays
import numpy as np
# Aliasing it as np – the accepted convention
# Simple arrays:
a1 = np.array ([1,2,3,4,5])
# Takes in a Python list as input and outputs a numpy.ndarray object
a2 = np.arange(5)
# A numpy.ndarray object holding [0, 1, 2, 3, 4]
# A matrix structure (note the nested brackets)
m = np.array ([[1,2,3,4], [5,6,7,8]])
m.shape
Output: (2, 4)
1.10 Select NumPy Operations
10 * np.arange(5) + np.arange(5) Output: array([ 0, 11, 22, 33, 44]) np.cos( np.pi * np.array([0,1,2,3])) Output: array([ 1., 1., 1., 1.])
a = np.array ([1,2,3,4,5]) a [a % 2 == 0] Output: array([2, 4])
m = np.array ([1,2,3,4,5,6,7,8]).reshape(4,2) Output: a 4x2 matrix
1.11 SciPy
Notes:
The SciPy module includes the following subpackages:
constants: physical constants and conversion factors (since version 0.7.0[5]) cluster: hierarchical clustering, vector quantization, Kmeans fftpack: Discrete Fourier Transform algorithms integrate: numerical integration routines interpolate: interpolation tools io: data input and output lib: Python wrappers to external libraries linalg: linear algebra routines misc: miscellaneous utilities (e.g. image reading/writing) ndimage: various functions for multidimensional image processing optimize: optimization algorithms including linear programming signal: signal processing tools sparse: sparse matrix and related algorithms spatial: KDtrees, nearest neighbors, distance functions special: special functions stats: statistical functions weave: tool for writing C/C++ code as Python multiline strings
1.12 pandas
1.13 Creating a pandas DataFrame
import pandas as pd import numpy as np # You build a DataFrame as a matrix (array of arrays) m = np.array ([1,2,3,4,5,6,7,8]).reshape(4,2) df = pd.DataFrame(m, columns = ["Col1", "Col2"]) # Now you have this structure: Col1 Col2 0 1 2 1 3 4 2 5 6 3 7 8
1.14 Fetching and Sorting Data
df.Col1 # or df['Col1'] # Col1 values with the their indexes as the leftmost column Output: 0 1 1 3 2 5 3 7 Name: Col1, dtype: int32 df.iloc[1, :] # Getting the second row via its index Output: Col1 3 Col2 4 Name: 1, dtype: int32 df.Col2.sort_values(ascending=False) # Sorting values in descending order (ascending order is default) Output: 3 8 2 6 1 4 0 2 Name: Col2, dtype: int32
1.15 Scikitlearn
1.16 Matplotlib
1.17 Python Dev Tools and REPLs
1.18 IPython
1.19 Jupyter
jupyter notebook
Notes:
The name Jupyter is an indirect acronym of the three core languages it was designed for: JUlia, PYThon, and R.
1.20 Jupyter Operation Modes
1.21 Jupyter Common Commands
Notes:
You can preview and edit the command shortcuts by navigating to Help > Edit Keyboard Shortcuts using the menu bar. Unfortunately, for now, Jupyter does not support macros / scripting.
1.22 Anaconda
Notes:
The conda package manager supports the following commands:
clean  Remove unused packages and caches 
config  Modify configuration values in .condarc. This is modeled after the git config command. Writes to the user .condarc file (C:\Users\Mikhail\.condarc) by default 
create 
Create a new conda environment from a list of specified packages

help  Displays a list of available conda commands and their help strings 
info  Display information about current conda install 
install  Installs a list of packages into a specified conda environment 
list  List linked packages in a conda environment 
package  Lowlevel conda package utility. (EXPERIMENTAL) 
remove  Remove a list of packages from a specified conda environment 
uninstall  Alias for conda remove. See conda remove –help. 
search  Search for packages and display associated information. The input is a MatchSpec, a query language for conda packages. 
update  Updates conda packages to the latest compatible version. This command accepts a list of package names and updates them to the latest versions that are compatible with all other packages in the environment. Conda attempts to install the newest versions of the requested packages. To accomplish this, it may update some packages that are already installed, or install additional packages. To prevent existing packages from updating, use the –noupdatedeps option. This may force conda to install older versions of the requested packages, and it does not prevent additional dependency packages from being installed. If you wish to skip dependency checking altogether, use the ‘–force’ option. This may result in an environment with incompatible packages, so this option must be used with great caution 
upgrade  Alias for conda update. See conda update –help 