SQL Notebooks in Databricks

August 11, 2022

This tutorial is adapted from the Web Age course https://www.webagesolutions.com/courses/WA3208-programming-on-azure-databricks-with-pyspark-sql-and-scala. In this tutorial, you will learn how to create and use SQL Notebooks in Databricks…

Implement an AWS Lambda using .NET

March 29, 2022

In this tutorial, you will create an AWS Lambda that exposes an ASP.NET service. The tutorial will follow these steps: Install and configure required tools…

Learning the CoLab Jupyter Notebook Environment

March 29, 2022

This tutorial is adapted from the Web Age course https://www.webagesolutions.com/courses/WA3174-pragmatic-python-programming. Google Colaboratory (CoLab) is a free Jupyter notebook interactive development environment (REPL) hosted in Google’s…

Robust Python Programming Techniques

March 29, 2022

This tutorial is adapted from the Web Age course https://www.webagesolutions.com/courses/WA3174-pragmatic-python-programming. 1.1 Defining Robust Programming: We will define Robust Programming as a collection of assorted programming…

Future Trends in Data Science and Data Engineering

February 7, 2022

This tutorial is adapted from the Web Age course https://www.webagesolutions.com/courses/WA3169-data-science-and-data-engineering-in-2022. 1.1 Big Trends in 2021: 2021 was a very incremental year in terms of breakthroughs,…

Defining Data Science for Architects

December 30, 2021

This tutorial is adapted from the Web Age course https://www.webagesolutions.com/courses/WA3057-data-science-and-data-engineering-for-architects. 1.1 What is Data Science? Data science focuses on the extraction of knowledge and business…

Introduction to Pandas for Architects

December 29, 2021

This tutorial is adapted from the Web Age course https://www.webagesolutions.com/courses/WA3057-data-science-and-data-engineering-for-architects. 1.1 What is pandas? pandas (https://pandas.pydata.org/) is an open-source library that provides high-performance, memory-efficient, easy-to-use…

Data Visualization in Python for Architects

December 29, 2021

This tutorial is adapted from the Web Age course https://www.webagesolutions.com/courses/WA3057-data-science-and-data-engineering-for-architects. 1.1 Why Do I Need Data Visualization? The common wisdom states that: Seeing is believing…

An AWS CLI / Node.js Script for Terminating EC2 Instances

June 6, 2017

The AWS Command Line Interface (CLI) is a powerful scripting platform written in Python that uses the AWS Cloud’s RESTful management API for performing various…

Using k-means Machine Learning Algorithm with Apache Spark and R

January 31, 2017

In this post, I will demonstrate the usage of the k-means clustering algorithm in R and in Apache Spark. Apache Spark (hereinafter Spark) offers two implementations…