An AWS CLI / Node.js Script for Terminating EC2 Instances

The AWS Command Line Interface (CLI) is a powerful scripting platform written in Python that uses the AWS Cloud’s RESTful management API to perform operational tasks such as creating S3 buckets and deleting EBS volumes.

In this post, I will show you how to terminate EC2 instances from your local computer using the AWS CLI wrapped up as a Node.js app.

You need these four things:
Read the rest of this entry »


Top 5 IT Skills Needed for Digital Transformation


Front End Developers with skills for the mobile age

Our Mobile Web Development academy can teach developers to embrace mobile development skills:

  • Responsive Web Development
  • HTML5, CSS3, AngularJS, and jQuery
  • iOS and Android programming


Agile Development Skills will continue to be integral to digital transformation.

Digital transformation needs an agile development team who can adapt, deliver, and diversify.

Key roles include:

  • Scrum Masters who act as facilitators for an agile development team, helping to keep the plan cohesive and drive production forward.
  • Product Owners who provide the goal for the scrum team, defining the finishing line for each project.
  • Coaches who help development teams become agile by working with them to implement agile principles.


Solution Architects who find technical solutions for business problems.

Key skills include:

  • Full-stack knowledge to give solution architects a more thorough understanding of the possible ways to solve problems.
  • Cloud knowledge as an integral path to finding solutions.
  • Microservices development skills for creating lightweight solutions for modular services.

Machine Learning and Data Analytics Skills to power new technology

Machine learning and data analytics are finding new applications, such as self-driving cars and online shopping recommendations.


DevOps Engineers to help drive digital transformation within a business

Engineers must have a unique skillset to claim the title DevOps:

  • Incredible communication skills
  • An understanding of the needs of ALL stakeholders in a business
  • A drive to meet business goals
  • Process re-engineering experience
  • Coding and scripting knowledge
  • An understanding of agile and lean principles


Configuration Management – Just Do It!

Configuration Management, as applied in DevOps, is the practice of using tools to manage the configuration of our technical architecture. Put simply, we document the desired state of one or more servers in a machine-readable form, and then use a configuration management tool (e.g., Chef, Puppet, Salt, or Ansible) to set up the real servers to match that configuration.
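As a rough illustration of the desired-state idea (not a description of how Chef, Puppet, Salt, or Ansible work internally), here is a minimal Python sketch. The settings dictionaries and the function names plan_changes and apply_state are hypothetical, invented for this example.

```python
from typing import Dict, List


def plan_changes(desired: Dict[str, str], actual: Dict[str, str]) -> List[str]:
    """List the actions needed to make `actual` match `desired`.

    Keys that exist only in `actual` are treated as unmanaged and left alone.
    """
    actions = []
    for key, want in sorted(desired.items()):
        have = actual.get(key)
        if have is None:
            actions.append(f"add {key}={want}")
        elif have != want:
            actions.append(f"change {key}: {have} -> {want}")
    return actions


def apply_state(desired: Dict[str, str], actual: Dict[str, str]) -> Dict[str, str]:
    """Return a new state in which every managed key has its desired value."""
    new_state = dict(actual)   # copy, so the caller's view of the server is untouched
    new_state.update(desired)  # overwrite or add the managed keys
    return new_state
```

The property worth noticing is idempotence: once apply_state has run, calling plan_changes against the result yields an empty list, so repeating the run changes nothing.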

Why? As with many things, you might think “I barely have time to set up this server manually, never mind write a script for it”. But the reality is, in today’s world, we’re probably going to have to do the same setup multiple times. In enterprise environments, the software engineering process will require deployment to several environments (development, integration test, user acceptance test, performance test, then finally production). It’s critical that all these environments are identical (so far as is practical). This is a key tenet of DevOps operating practices: we can’t apply learnings from one environment to others if the environments are different. Even in a simpler environment, you’re probably going to replace your servers in a fairly short time frame.

Once upon a time, we used to buy server hardware and nurse it along for years. Nowadays, hardware and system software end up obsolete in a matter of months, not years. If you’ve done your CM work properly, it’s faster to deploy to a fresh OS image than to patch an existing image. And if you’ve embraced virtualization or a cloud environment, you’ll find it’s easier to replace a server than to troubleshoot it.

There’s another advantage to Configuration Management:  The system state is documented and version controlled.  There’s no need to do archaeological work to find out what your technical architecture has evolved to – you just look at the configuration scripts.

So here’s your challenge for the next year: Stop managing your servers by hand. Set up an open-source Configuration Management system and create a configuration repository (I’m partial to Ansible.  See for some tips to get started). Every time you find yourself ssh-ing or logging in to a server, stop yourself and create a playbook, recipe, or run-list in your CM repository. Make it a habit to use manual tools only to examine the server state, not to change anything. Run your servers the way developers write code: If it isn’t in version control, it didn’t happen!

For that matter, do the same thing on your laptop or workstation.  You’ll soon find that you’re much less stressed about maintaining your configuration or upgrading your hardware.


Using k-means Machine Learning Algorithm with Apache Spark and R

In this post, I will demonstrate the usage of the k-means clustering algorithm in R and in Apache Spark. Apache Spark (hereinafter Spark) offers two implementations of the k-means algorithm: one is packaged with its MLlib library; the other one exists in Spark’s package. While both implementations are currently more or less functionally equivalent, the Spark ML team recommends using the package by showcasing its support for pipeline processing (inspired by the scikit-learn Machine Learning library in Python) and its versatile DataFrame data structure (probably inspired by R’s data.frame, a matrix-like structure similar to tables in relational databases, and also wildly popular in Python’s pandas library).

k-means clustering is an example of an unsupervised ML algorithm: you are only required to give the computer a hint as to how many clusters (classes of objects) you expect to be present in your data set. The algorithm then uses your data as the training data set to build a model and tries to figure out the boundaries of those clusters. After that, you can proceed to the classification phase with your test data.

With k-means, you essentially have your computer (or a cluster of computers) partition your data into Voronoi cells, where the cells represent the identified clusters.
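To make the Voronoi idea concrete, here is a minimal pure-Python sketch of the classic iterative k-means procedure on 2D points. It is an illustration only, not the R or Spark code used in the post; the function names and the choice to pass the starting centroids in explicitly are assumptions made for this example. Assigning each point to its nearest centroid is exactly the partition into Voronoi cells.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two 2D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid whose Voronoi cell contains `point`."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def k_means(points: Sequence[Point],
            initial_centroids: Sequence[Point],
            max_iterations: int = 100) -> Tuple[List[Point], List[int]]:
    """Return (centroids, labels) after iterating until the centroids stop moving."""
    centroids = list(initial_centroids)
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                mean_x = sum(p[0] for p in members) / len(members)
                mean_y = sum(p[1] for p in members) / len(members)
                new_centroids.append((mean_x, mean_y))
            else:
                new_centroids.append(old)  # keep an empty cluster where it was
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels
```

Once the centroids are fitted, classifying a new point is a single call to nearest(new_point, centroids).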
Read the rest of this entry »


Angular 2 Property and Event Bindings

Have you noticed that many of the directives built into AngularJS are missing in Angular 2.0? Well, there are two reasons for that: “property bindings” and “event bindings”. The various binding types in Angular 2 remove the need for many of the directives built into the prior version of the framework.
Here are a few examples of Angular 2.0 property bindings:
Purpose | AngularJS Directive | Angular 2 Property Binding
Hide/unhide an element | ng-hide = “expression” |
Disable an element (e.g., a button) | ng-disabled = “expression” |
Set href for an anchor tag | ng-href = “expression” |
Set src for an image tag | ng-src = “expression” | [src] = “expression”
Here “expression” can take a number of forms:
Example Expression | Refers to:
 | A component property named myVariable
“2 + 2”, “myVar * 3” | An expression that will be evaluated by Angular
 | A call to a method in the component
In addition to property bindings, Angular 2 includes event binding syntax, which also replaces various AngularJS directives. Here are some examples:
Purpose | AngularJS Directive | Angular 2 Event Binding
Bind code to a click event | ng-click = “expression” | (click) = “expression”
Bind code to input keyup event | ng-keyup = “expression” | (keyup) = “expression”
Bind code to mouseover event | ng-mouseover = “expression” | (mouseover) = “expression”
Bind code to submit event | ng-submit = “expression” | (submit) = “expression”
Here an “expression” is typically either an Angular expression or a component method call.
The addition of property and event binding syntax in Angular 2 opens up binding to all DOM properties and events and dramatically reduces the number of built-in directives the Angular development team needs to maintain.
For more information on property binding see:
For more information on event binding see:


Spark RDD Performance Improvement Techniques (Post 2 of 2)

In this post we will review the more important aspects related to RDD checkpointing. We will continue working on the over500 RDD we created in the previous post on caching.

You will remember that checkpointing is the process of truncating an RDD’s lineage graph and saving its materialized version to a persistent store.
Read the rest of this entry »


Spark RDD Performance Improvement Techniques (Post 1 of 2)

Spark offers developers two simple and quite efficient techniques to improve the performance of RDDs and of operations against them: caching and checkpointing.

Caching allows you to save a materialized RDD in memory, which greatly speeds up iterative or multi-pass operations that need to traverse the same data set over and over again (e.g., in machine learning algorithms).

Read the rest of this entry »


Apache Spark class development complete

Last week I completed development of our two-day class teaching Apache Spark, which will be integrated into our Big Data and Data Science classes after the QA cycle.
I will be posting fragments of the material with additional comments and notes that should give you a taste of what the new content is all about and help you see whether it can help you in your work.
Stay tuned!


SparkR on CDH and HDP

Spark added support for R back in version 1.4, and you can use it in Spark Standalone mode.

Big Hadoop distros that bundle Spark, like Cloudera’s CDH and Hortonworks’ HDP, have varying degrees of support for R. For the time being, CDH has opted out of supporting R (the latest CDH 5.8.x version does not even have the sparkR binaries), while HDP (versions 2.3.2, 2.4, … ) includes SparkR as a technical preview and bundles some R-related components, like the sparkR script. Making it all work (if that is possible at all at present) is another story, and making it run on YARN may be a novel the size of War and Peace. So you can view this more as a demonstration of Hortonworks’ commitment to Spark, and we are left with the original supported language triad: Scala, Python, and Java.


Spring Boot Training Available

The Spring framework has been a highly popular framework for Java applications. So popular, in fact, that it is pretty much the de facto standard among Java application frameworks. One issue many projects run into, though, is simply getting a new Spring project started, given all the different features and configuration it might require.

This is where the Spring Boot project can help. Spring Boot makes it easy to create production-grade Spring applications that “just run”. The following features, taken from the Spring Boot site, make it easy to get a Spring project going and focus more on the “What is this project supposed to do?” instead of the “How do we get this project set up?”

  • Create stand-alone Spring applications
  • Embed Tomcat, Jetty or Undertow directly (no need to deploy WAR files)
  • Provide opinionated ‘starter’ POMs to simplify your Maven configuration
  • Automatically configure Spring whenever possible
  • Get out of the way quickly as requirements start to diverge from the defaults
  • Provide production-ready features such as metrics, health checks and externalized configuration
  • Absolutely no code generation and no requirement for XML configuration

Since many clients have asked about Spring Boot recently, we have added a 2-day Spring Boot training course that can help you start learning how to use this very useful project. You can find the outline here:

WA2511 Spring Boot Training

Spring Boot is definitely one of the newer Spring features that prove this application framework isn’t going away anytime soon; it continues to grow!

