How to Use Python’s Functional Programming Capabilities

This tutorial is adapted from the Web Age course Practical Python 3 Programming.

In this tutorial, you will learn how to use Python’s functional programming capabilities. 

Part 1 – Create a New Python 3 Jupyter Notebook

1. Launch Jupyter Notebook 

2. Create a new Python 3 notebook and name it Functional Programming

Part 2 – Understanding the map Function

First things first, let’s define our data model. In our case, it will be a simple, short list of integers.

1. Enter the following command:

data_model = [1,2,3,4,5,6]

Assume that we have a requirement to scale (multiply) every element of our list by 10.

Before we show how to use Python’s built-in map function, let’s see the functionally equivalent code that uses the traditional looping and list comprehension idioms.

2. Enter the following code for multiplying list elements by 10 using traditional looping:

map_loop_result = []
scale = 10
for i in data_model:
    map_loop_result.append(i * scale)
print(map_loop_result)

You should see the following output:

[10, 20, 30, 40, 50, 60]

An important thing to keep in mind here is that the data we process must fit in machine memory and that processing is done on a single thread of execution.

3. Enter the following code for the same operation as above using the list comprehension idiom:

map_result_lc = [x * scale for x in data_model]
map_result_lc

You should see the same output:

[10, 20, 30, 40, 50, 60]

The same crucial constraint applies here as with the looping code above: data processing happens in the memory of a single machine on a single thread of execution.
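The memory half of that constraint can be relaxed with a generator expression, which produces values one at a time instead of building the whole list. The following is a minimal sketch of that idea; the variable names scaled_lazily, first_value, and remaining_total are illustrative and not part of the original lab.

```python
data_model = [1, 2, 3, 4, 5, 6]
scale = 10

# A generator expression yields one scaled value at a time
# instead of building the whole result list in memory.
scaled_lazily = (x * scale for x in data_model)

first_value = next(scaled_lazily)      # pulls only the first element: 10
remaining_total = sum(scaled_lazily)   # consumes the rest: 20 + 30 + 40 + 50 + 60

print(first_value, remaining_total)    # 10 200
```

Processing still happens on a single thread; only the need to hold the full result in memory goes away.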

Now it is time to see the map function in action.

Data processing with the map function means uniformly applying one function to every element of the data model, as if you touched each element with a magic wand, so that the existing data model is mapped into the intended one. Because each element is processed independently of the others, the map pattern lends itself to being distributed across many computers holding very large datasets. Python’s built-in map itself, however, still runs on a single thread on one machine.

4. Enter the following command:

list(map(lambda x: x * scale, data_model))

You should see the same output as we got previously.

We need to wrap the output of the map function in list() to make it usable, because map in Python 3 returns a lazy iterator; in Python 2, map returned a list directly.
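One practical consequence of map returning an iterator is that the result can be consumed only once. The short sketch below illustrates this; the names mapped, first_pass, and second_pass are made up for the example.

```python
data_model = [1, 2, 3, 4, 5, 6]
scale = 10

mapped = map(lambda x: x * scale, data_model)  # nothing is computed yet

first_pass = list(mapped)    # consumes the iterator
second_pass = list(mapped)   # the iterator is already exhausted

print(first_pass)   # [10, 20, 30, 40, 50, 60]
print(second_pass)  # []
```

If you need the values more than once, keep the list rather than the map object.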

We could have defined the lambda (anonymous) function outside of map to increase code readability, like so:

ml = lambda x: x * scale
list(map(ml, data_model))

Lambdas are anonymous: they are defined without a name of their own (only the lambda keyword), and their return value is the result of evaluating the single expression in their body.
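To make the difference between a lambda and a regular function concrete, here is a small comparison. The function scale_value is a hypothetical named equivalent introduced only for this illustration.

```python
scale = 10

ml = lambda x: x * scale       # anonymous function bound to the variable ml

def scale_value(x):            # hypothetical named equivalent
    return x * scale

print(ml.__name__)             # <lambda>
print(scale_value.__name__)    # scale_value
print(ml(3), scale_value(3))   # 30 30
```

Both callables behave the same when passed to map; the named version simply shows up with a more helpful name in tracebacks.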

In its simplest form, the map function takes only the function to apply and the dataset to work on. Some languages make map a method of an object-based data structure (e.g. JavaScript’s Array object exposes a map method, which makes the code simpler). The implementation of map can be specific to the environment. In distributed computing, there may be multiple discrete map processes scheduled to run on a cluster, each applying the same lambda function to one part of a dataset that has been split evenly across the cluster so that the map processes can reach their parts (that’s what happens in the Hadoop data processing model).
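Python’s built-in map also accepts more than one iterable, in which case the function receives one element from each. The sketch below assumes a second, made-up list called weights to show this.

```python
data_model = [1, 2, 3, 4, 5, 6]
weights = [10, 100, 1000]  # hypothetical second dataset, shorter on purpose

# With two iterables, the function receives one element from each.
# map stops as soon as the shortest iterable runs out.
weighted = list(map(lambda x, w: x * w, data_model, weights))

print(weighted)  # [10, 200, 3000]
```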

The possibility of going massively parallel is a major benefit of the functional style: because the mapped function treats every element independently, the same logic can be scaled out to run on thousands of computers processing Big Data.
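The split-apply-combine idea can be imitated on one machine. The following is only a sketch under stated assumptions: it uses a thread pool from Python’s standard library as a stand-in for a cluster, and the helper scale_chunk and the chunk size of 2 are invented for this example. It is not how Hadoop itself is implemented.

```python
from concurrent.futures import ThreadPoolExecutor

data_model = [1, 2, 3, 4, 5, 6]
scale = 10

def scale_chunk(chunk):
    """Apply the same mapping logic to one part of the dataset."""
    return [x * scale for x in chunk]

# Split the dataset into parts of two elements each (hypothetical chunk size).
chunk_size = 2
chunks = [data_model[i:i + chunk_size] for i in range(0, len(data_model), chunk_size)]

# Each chunk is mapped independently; Executor.map returns results in input order.
with ThreadPoolExecutor(max_workers=3) as pool:
    mapped_chunks = list(pool.map(scale_chunk, chunks))

# Stitch the partial results back together.
distributed_result = [value for chunk in mapped_chunks for value in chunk]

print(distributed_result)  # [10, 20, 30, 40, 50, 60]
```

The point of the sketch is that no chunk needs to know about any other chunk, which is what makes the pattern easy to spread across workers.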

Part 3 – Filtering

Data filtering is one of the most frequently used operations in data processing. In Python, filtering can also be done in the functional style with the built-in filter function.

Let’s say we have a requirement to go through the data model and retain only the even numbers, discarding the odd ones.

Again, we start with the functionally equivalent code that uses traditional looping and list comprehension idioms.

1. Enter the following code that is based on the traditional looping idiom:

filter_loop_result = []
for i in data_model:
    if i % 2 == 0:
        filter_loop_result.append(i)
print(filter_loop_result)

You should see the following output:

[2, 4, 6]

And again, an important thing to keep in mind here is that all the data must fit in machine memory and that processing is done on a single thread of execution.

2. Enter the following code for the same operation as above using the list comprehension idiom:

filter_result_lc = [i for i in data_model if i % 2 == 0]
filter_result_lc

You should see the same output:

[2, 4, 6]

3. And here is the equivalent functional-style code using the filter function:

is_even = lambda x: x % 2 == 0
list(filter(is_even, data_model))

When run, this code produces the same list containing only the even numbers in the data model.

Note: You could also inline the lambda function in the filter function, like so:

list(filter(lambda x: x % 2 == 0, data_model))
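Like map, filter returns a lazy iterator in Python 3, and it also accepts None in place of a function. The sketch below shows both behaviors; the regular function keep_even and the list mixed are sample names and data made up for the illustration.

```python
data_model = [1, 2, 3, 4, 5, 6]

def keep_even(n):
    """Predicate: True for even numbers."""
    return n % 2 == 0

# filter is lazy: it returns an iterator, not a list.
evens = filter(keep_even, data_model)
evens_list = list(evens)
print(evens_list)  # [2, 4, 6]

# Passing None as the function keeps only the truthy elements.
mixed = [0, 1, "", "a", None, 2]  # hypothetical sample data
truthy = list(filter(None, mixed))
print(truthy)  # [1, 'a', 2]
```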

Part 4 – The reduce Function

Reduction (a.k.a. folding) is an aggregation action that computes a single scalar value from the data at hand, such as a sum, minimum, maximum, or count. The Python reduce function, which plays this role, takes two parameters: a function of two arguments and the data to reduce (an optional third parameter supplies an initial value).

Note: Python’s reduce function processes the elements strictly from left to right, so on a single machine the order is guaranteed. In distributed settings, where the order in which data items and partial results reach the reduce step cannot be guaranteed, a reduction gives a consistent answer only if the combining operation is commutative and associative.

Mathematically, using addition as the example, these properties can be formulated as follows:

commutative: x + y = y + x

associative: (x + y) + z = x + (y + z)
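To see why these properties matter, the sketch below reduces two halves of the data separately and then combines the partial results, roughly as a distributed system might. It uses the reduce function from the functools module together with the standard operator module; the split into left and right halves is an arbitrary choice made for this example.

```python
from functools import reduce
import operator

data_model = [1, 2, 3, 4, 5, 6]
left, right = data_model[:3], data_model[3:]

# Addition: reducing the halves separately and combining the partial
# results agrees with a single left-to-right pass.
whole_sum = reduce(operator.add, data_model)
combined_sum = reduce(operator.add, left) + reduce(operator.add, right)
print(whole_sum, combined_sum)    # 21 21

# Subtraction is neither commutative nor associative, so the same
# split-and-combine strategy changes the answer.
whole_diff = reduce(operator.sub, data_model)
combined_diff = reduce(operator.sub, left) - reduce(operator.sub, right)
print(whole_diff, combined_diff)  # -19 3
```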

Imagine that you have received a not-so-challenging requirement to find the sum of elements in the data model.

As usual, we will start with the traditional way to solve the problem.

1. Enter the following code:

sum_elem = 0
for x in data_model:
    sum_elem += x

print(sum_elem)

You should see the scalar value 21 (recall that our data model is [1,2,3,4,5,6]).

Note: You could also use the built-in sum function: sum(data_model).

And now it is time for the reduce() function to enter the scene.

2. Enter the following code:

from functools import reduce
reduce(lambda x, y: x + y, data_model)

Note: In Python 3, the reduce function was moved to the functools module, which we need to import here (the map and filter functions remain built-ins).

You get back the expected 21 as the sum of all the data model’s elements.
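The reduce function also accepts an optional third argument that seeds the accumulator. The sketch below shows its effect, including what happens with an empty input; the variable names are illustrative only.

```python
from functools import reduce

data_model = [1, 2, 3, 4, 5, 6]

# The optional third argument seeds the accumulator.
total_from_100 = reduce(lambda x, y: x + y, data_model, 100)
print(total_from_100)  # 121

# With an initial value, an empty input simply returns that value...
empty_total = reduce(lambda x, y: x + y, [], 0)
print(empty_total)  # 0

# ...whereas without one, reduce raises TypeError on an empty input.
raised_type_error = False
try:
    reduce(lambda x, y: x + y, [])
except TypeError:
    raised_type_error = True
print(raised_type_error)  # True
```

Supplying an initial value is a simple way to make a reduction safe for data that may turn out to be empty.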

To see how reduce pulls the data model’s elements, let’s create a regular Python function that will allow us to print the incoming data and use it instead of the lambda function. This is possible as long as your regular function honors the protocol between the reduce function and its client: the expected signature (there should be two input parameters), the processing logic, and the return value.

Here is how you can write this function.

3. Enter the following code:

def rd(x, y):
    print(x, y)
    return x + y
reduce(rd, data_model)

When you run the code in the cell, you should get the following messages printed:

1 2
3 3
6 4
10 5
15 6

21
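If you want the intermediate values themselves rather than printed traces, the standard library’s itertools.accumulate returns them as an iterator. This is an alternative technique, not part of reduce itself; a minimal sketch:

```python
from itertools import accumulate

data_model = [1, 2, 3, 4, 5, 6]

# accumulate yields every intermediate value of the running sum
# (addition is its default operation).
running_totals = list(accumulate(data_model))

print(running_totals)      # [1, 3, 6, 10, 15, 21]
print(running_totals[-1])  # 21, the same value reduce returns
```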

Now let’s find the maximum element in our data model.

Python offers the built-in max function, which does this job in one shot: max(data_model). As an exercise, let’s get the same result with reduce.

4. Enter the following code:

max_reduce = lambda x, y: x if x > y else y
reduce(max_reduce, data_model)

You should get 6.

The lambda body uses Python’s ternary conditional expression (x if x > y else y), which can be inlined because it is a single expression.
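The same pattern adapts to other “pick one of two” reductions by changing the condition. The sketch below finds the minimum and, as a further illustration, the longest word in a sample sentence; min_reduce, words, and longest are names chosen for this example.

```python
from functools import reduce

data_model = [1, 2, 3, 4, 5, 6]

# Flipping the comparison in the conditional expression yields the minimum.
min_reduce = lambda x, y: x if x < y else y
smallest = reduce(min_reduce, data_model)
print(smallest)  # 1

# The condition can compare any property, for example word length.
words = "A quick brownish fox jumps on the lazy dog".split()
longest = reduce(lambda a, b: a if len(a) >= len(b) else b, words)
print(longest)  # brownish
```

In everyday code, the built-ins min(data_model) and max(words, key=len) give the same answers more directly.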

Part 5 – Sorting

In this part of the lab, we will contrast the outcomes of the list’s .sort() method and the built-in sorted() function, both of which can optionally take a lambda for applying some processing logic.

The biggest difference between the two is that .sort() sorts in place, reordering the elements of the source list, while the sorted() built-in function keeps the original list intact.
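A quick way to see the difference is to look at what each call returns. The sketch below uses a small made-up list called words.

```python
words = ["pear", "apple", "fig"]  # hypothetical sample data

# sorted() returns a new list and leaves its input untouched.
new_list = sorted(words)
print(new_list)  # ['apple', 'fig', 'pear']
print(words)     # ['pear', 'apple', 'fig']

# list.sort() reorders the list itself and returns None,
# so assigning its result is a common mistake.
result = words.sort()
print(result)    # None
print(words)     # ['apple', 'fig', 'pear']
```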

1. Enter the following command:

list_of_words = "A quick brownish fox jumps on the lazy dog".split() 

This command splits the silly sentence above on whitespace, creating a list of words.

2. Enter the following command:

sorted_in_place = list_of_words
sorted_in_place.sort(); sorted_in_place

You should see the following output:

['A', 'brownish', 'dog', 'fox', 'jumps', 'lazy', 'on', 'quick', 'the']

Now, let’s print list_of_words, the list from which we created the sorted_in_place list that was sorted in place.

It was mutated as well! The reason is that sorted_in_place is just an alias for list_of_words: both names refer to the same list object. This “feature” can easily ruin a novice programmer’s day, and not only in Python.

The solution that avoids this unfortunate tight coupling between the variables is to use the list’s .copy() method, like so:

sorted_in_place = list_of_words.copy()
# now perform sorted_in_place.sort() sorting ...
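The difference between an alias and a copy can be checked with the is operator. In the sketch below, alias and snapshot are illustrative names.

```python
list_of_words = "A quick brownish fox jumps on the lazy dog".split()

alias = list_of_words            # second name for the same list object
snapshot = list_of_words.copy()  # independent shallow copy

print(alias is list_of_words)     # True
print(snapshot is list_of_words)  # False

snapshot.sort()
print(list_of_words[:3])  # ['A', 'quick', 'brownish'], original order kept
print(snapshot[:3])       # ['A', 'brownish', 'dog']
```

Note that .copy() makes a shallow copy, which is sufficient here because the list holds immutable strings.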

Make it a rule to avoid mutating objects in place unless you have a specific reason to. In-place mutation exists in many languages as a way to conserve machine memory; in the post-COBOL era, memory is rarely scarce enough to justify the surprising side effects, so you can unshackle yourself from the Y2K syndrome.

3. Repeat the previously run command to restore the original list:

list_of_words = "A quick brownish fox jumps on the lazy dog".split()

4. Enter the following command:

sorted(list_of_words)

This command will create a sorted list without affecting the source list.

You can reverse sorting to make it descending with this command:

sorted(list_of_words, reverse=True)

Now, it is time for lambda.

Let’s see how we can sort the list by the length of its elements so that you get the following list:

['A', 'on', 'fox', 'the', 'dog', 'lazy', 'quick', 'jumps', 'brownish']

5. The above order can be achieved with the following command:

sorted(list_of_words, key=lambda w: len(w))
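The key function can return any comparable value, including a tuple for multi-level ordering. The sketch below shows two variations: passing len directly, and a tuple key that breaks length ties alphabetically. The variable names are chosen for the illustration.

```python
list_of_words = "A quick brownish fox jumps on the lazy dog".split()

# Passing len directly is equivalent to key=lambda w: len(w).
by_length = sorted(list_of_words, key=len)
print(by_length)
# ['A', 'on', 'fox', 'the', 'dog', 'lazy', 'quick', 'jumps', 'brownish']

# A tuple key sorts by length first, then alphabetically (ignoring case)
# among words of the same length.
by_length_then_alpha = sorted(list_of_words, key=lambda w: (len(w), w.lower()))
print(by_length_then_alpha)
# ['A', 'on', 'dog', 'fox', 'the', 'lazy', 'jumps', 'quick', 'brownish']
```

In the first result, words of equal length keep their original relative order because Python’s sort is stable.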

The help page for sorted() is shown below.

Signature: sorted(iterable, /, *, key=None, reverse=False)
Docstring:
Return a new list containing all items from the iterable in ascending order.

A custom key function can be supplied to customize the sort order, and the reverse flag can be set to request the result in descending order. 
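The key and reverse parameters can be combined. A minimal sketch, using the same sentence as above, that sorts the words from longest to shortest:

```python
list_of_words = "A quick brownish fox jumps on the lazy dog".split()

# Sort by length, longest first. Words of equal length keep their
# original relative order even when reverse=True.
longest_first = sorted(list_of_words, key=len, reverse=True)

print(longest_first)
# ['brownish', 'quick', 'jumps', 'lazy', 'fox', 'the', 'dog', 'on', 'A']
```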

Part 6 – Review

In this tutorial, you learned about Python’s built-in functional programming capabilities.
