SQL Notebooks in Databricks
This tutorial is adapted from the Web Age course https://www.webagesolutions.com/courses/WA3208-programming-on-azure-databricks-with-pyspark-sql-and-scala. In this tutorial, you will learn how to create and use SQL Notebooks in Databricks…
Implement an AWS Lambda using .NET
In this tutorial, you will create an AWS Lambda that exposes an ASP.NET service. The tutorial will follow these steps: Install and configure required tools…
Learning the CoLab Jupyter Notebook Environment
This tutorial is adapted from the Web Age course https://www.webagesolutions.com/courses/WA3174-pragmatic-python-programming. Google Colaboratory (CoLab) is a free Jupyter notebook interactive development environment (REPL) hosted in Google’s…
Robust Python Programming Techniques
This tutorial is adapted from the Web Age course https://www.webagesolutions.com/courses/WA3174-pragmatic-python-programming. 1.1 Defining Robust Programming: We will define Robust Programming as a collection of assorted programming…
Future Trends in Data Science and Data Engineering
This tutorial is adapted from the Web Age course https://www.webagesolutions.com/courses/WA3169-data-science-and-data-engineering-in-2022. 1.1 Big Trends in 2021: 2021 was a very incremental year in terms of breakthroughs,…
Defining Data Science for Architects
This tutorial is adapted from the Web Age course https://www.webagesolutions.com/courses/WA3057-data-science-and-data-engineering-for-architects. 1.1 What is Data Science?: Data science focuses on the extraction of knowledge and business…
Introduction to Pandas for Architects
This tutorial is adapted from the Web Age course https://www.webagesolutions.com/courses/WA3057-data-science-and-data-engineering-for-architects. 1.1 What is pandas?: pandas (https://pandas.pydata.org/) is an open-source library that provides high-performance, memory-efficient, easy-to-use…
Data Visualization in Python for Architects
This tutorial is adapted from the Web Age course https://www.webagesolutions.com/courses/WA3057-data-science-and-data-engineering-for-architects. 1.1 Why Do I Need Data Visualization?: The common wisdom states that seeing is believing…
An AWS CLI / Node.js Script for Terminating EC2 Instances
The AWS Command Line Interface (CLI) is a powerful scripting platform written in Python that uses the AWS Cloud’s RESTful management API for performing various…
Using k-means Machine Learning Algorithm with Apache Spark and R
In this post, I will demonstrate the use of the k-means clustering algorithm in R and in Apache Spark. Apache Spark (hereinafter Spark) offers two implementations…