Python for Data Science
This chapter provides a quick overview of:
• Python modules and high-power features
• NumPy library
• pandas library
• SciPy library
• scikit-learn library
• Jupyter notebooks
• Anaconda distribution
Webinar
Applied Data Science using Python
Courses
WA2715 Applied Data Science with Python
This intensive training course covers both the theory and the practice of using Python for Data Science, Business Analytics, and Data Logistics, including the related core concepts and terminology. A variety of hands-on labs (listed at the bottom of this outline) help attendees reinforce what they have learned.
TOPICS
• Applied Data Science and Business Analytics
• Common Data Science algorithms for supervised and unsupervised machine learning
• NumPy, pandas, Matplotlib, scikit-learn
• Python REPLs
• Jupyter notebooks
• Data analytics life-cycle phases
• Data repairing and normalizing
• Data aggregation and grouping
• Data visualization
Data Science and ML Algorithms in scikit-learn
In this chapter, participants will learn about some of the algorithms and common analytical methods used in Data Science and Machine Learning (ML), including:
• Terminology
• Dimensionality reduction
• k-Nearest Neighbors
• Decision Trees
• Support Vector Machines (SVMs)
• Naive Bayes Classifier
• Cluster Analysis with k-Means
• Regression Analysis
• Time-Series Analysis
What, if any, data science or machine learning initiatives have been undertaken in your organization?
• Could you share some of the successes / failures?
• What are some of the insights you would like to share with the class?
Types of Machine Learning
• There are three main types of machine learning (ML):
• unsupervised learning
• supervised learning, and
• reinforcement learning
• We will be dealing only with the unsupervised and supervised learning types
• For reference: the goal of reinforcement learning is to train an algorithm to select actions that maximize a domain-specific gain or minimize a cost, learning from the feedback its actions produce (which loosely emulates the way humans learn)
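The difference between the supervised and unsupervised types can be illustrated with a minimal sketch. It uses only the Python standard library rather than scikit-learn, and the toy data, the function names (fit_nearest_centroid, predict, two_means), and the choice of a nearest-centroid classifier and a two-cluster k-means are assumptions made for illustration, not material from the course.

```python
from statistics import mean

# Hypothetical toy data: one numeric feature per observation.
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
# Labels are available only in the supervised setting.
labels = ["low", "low", "low", "high", "high", "high"]


def fit_nearest_centroid(xs, ys):
    """Supervised: use the known labels to compute one centroid per class."""
    return {
        label: mean(x for x, y in zip(xs, ys) if y == label)
        for label in set(ys)
    }


def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def two_means(xs, iterations=10):
    """Unsupervised: split xs into two clusters without using any labels."""
    centers = [min(xs), max(xs)]  # simple deterministic initialization
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            clusters[nearest].append(x)
        # Keep the previous center if a cluster ends up empty.
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


# Supervised: learn from labeled examples, then classify a new value.
centroids = fit_nearest_centroid(values, labels)
prediction = predict(centroids, 4.5)

# Unsupervised: discover the grouping from the values alone.
centers, clusters = two_means(values)
```

In the supervised case the labels drive the model, so a new value can be mapped to a named class. In the unsupervised case the algorithm only finds groups; deciding what each group means is left to the analyst.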
Cloudera and Hortonworks announce merger
Cloudera and Hortonworks announced their merger late last week. The merger combines the two top providers of Hadoop to enterprise customers. Their market had been shrinking under pressure from the major cloud providers offering big data solutions, and the merger allows the two companies to combine their strengths.
At Web Age Solutions, we have been providing Hadoop training on both platforms to Fortune 500 customers for the last five years. Of course, we will now be looking at consolidating these courses.
WA2341 Hadoop Programming on the Cloudera Platform
WA2622 Hadoop Programming on the Hortonworks Data Platform for Managers