
Robust Python Programming Techniques

March 29, 2022 by Mikhail Vladimirov
Category: Data Science and Business Analytics

This tutorial is adapted from the Web Age course https://www.webagesolutions.com/courses/WA3174-pragmatic-python-programming.

1.1 Defining Robust Programming

We will define Robust Programming as a collection of assorted programming techniques, methods, practices, and libraries that can help you write programs with the following properties:
- more compliant with the software specs,
- producing predictable outcomes,
- more resistant to (unavoidable) human errors, software defects, and hardware failures, and
- more readily lending themselves to troubleshooting
Some of these techniques fall under the category of defensive programming, which is (at least, in spirit) close to defensive driving practices
Other techniques aim at helping you move your software into production and at supporting an evidence-based postmortem analysis in case of program faults

1.2 Assertions

Assertions are a mechanism that helps software developers write more robust code
They are part and parcel of the Defensive Programming philosophy
Assertions are boolean expressions that ensure that a specific (often critical) assumption about the program’s state is not violated
Program execution proceeds if the assertion evaluates to True (the program is in a normal state); otherwise, an AssertionError exception with the supplied error message is raised
- Assertions are usually used to validate input parameters passed to a function, an object method, or to verify some fundamental business conditions

1.3 The assert Statement in Python

Python assertions are programmed with the assert statement:

assert <context-specific boolean expression>, <error message>

Example:

assert age > 0, 'Age must be a positive number over 0'

In the above example, an AssertionError exception will be raised for any age value that is zero or negative
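As a minimal sketch of the statement in use, the hypothetical function below (its name, default value, and message are illustrative assumptions, not part of the course material) guards its input with an assertion and shows what the caller sees when the assumption is violated:

```python
def years_until_retirement(age, retirement_age=65):
    """Hypothetical helper used only to illustrate assert."""
    # The assertion documents and enforces the assumption about age.
    assert age > 0, 'Age must be a positive number'
    return max(retirement_age - age, 0)


print(years_until_retirement(40))  # the assertion passes

try:
    years_until_retirement(0)      # 0 > 0 is False, so the assertion fails
except AssertionError as err:
    print(f'Assertion failed: {err}')
```

Catching AssertionError is done here only to display the message; in normal code a failed assertion is usually left to stop the program so that the defect gets noticed.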

1.4 What is Unit Testing and Why Should I Care?

Unit testing focuses on testing individual units of code or single components for compliance with software specifications before proceeding to integration testing
Commonly, units under testing are functions and/or classes
Testing of code is performed using assertions
Suites (blocks or collections) of unit tests are used to test larger codebases, such as libraries and modules
Unit testing is usually automated and made an integral part of the continuous software delivery process

Notes:

Testing multiple components working together is known as integration testing.

1.5 Unit Testing and Test-driven Development

Unit testing underpins the test-driven development (TDD) practice, which is a software delivery process of converting software requirements into a series of verifiable fine-grained tests that must be passed by the new code
- TDD promotes the idea of writing unit tests before the actual new code (to be verified) has been written
- TDD has its roots in Extreme Programming and the wider Agile movement
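A rough sketch of the test-first workflow, using Python's standard unittest module, might look as follows. The function apply_discount and its rules are hypothetical, chosen only to illustrate the idea: the tests are written first to capture the requirement, and the function is then implemented until they pass.

```python
import unittest


# Step 1: express the requirement as tests before the code exists.
class ApplyDiscountTest(unittest.TestCase):
    def test_ten_percent_discount(self):
        self.assertAlmostEqual(apply_discount(200.0, 10), 180.0)

    def test_negative_percent_is_rejected(self):
        with self.assertRaises(ValueError):
            apply_discount(200.0, -5)


# Step 2: write just enough code to make the tests pass.
def apply_discount(price, percent):
    if not 0 <= percent <= 100:
        raise ValueError('percent must be between 0 and 100')
    return price * (1 - percent / 100)


# Run the tests programmatically and collect the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ApplyDiscountTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print('All tests passed:', result.wasSuccessful())
```

In a real TDD cycle the tests would be run once before step 2 to confirm that they fail, which is omitted here to keep the sketch in a single runnable file.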

1.6 TDD Benefits

Forces developers to write code that better matches software requirements
Reduces chances for software defects
Facilitates adding new functionality and regression testing
Acts as a form of evidence-based documentation
Improves code maintenance
Increases confidence in software to be delivered

1.7 Unit Testing in Python

Python’s unit testing framework is backed by the unittest module (https://docs.python.org/3/library/unittest.html)
- Inspired by Java’s JUnit framework
- Part of the Python standard library since version 2.1
The main features include:
- Test automation,
- Sharing of setup and shutdown code for tests,
- Aggregation of tests into collections (suites),
- Independence of the unit tests from the test-run reporting framework
Note: There are other unit test frameworks you can use in Python, e.g. pytest, a third-party framework with a simpler syntax for writing tests

Notes:

Unit tests concepts (from unittest’s documentation):

test fixture – A test fixture represents the preparation needed to perform one or more tests, and any associated cleanup actions. This may involve, for example, creating temporary or proxy databases, directories, or starting a server process.

test case – A test case is the individual unit of testing. It checks for a specific response to a particular set of inputs. unittest provides a base class, TestCase, which may be used to create new test cases.

test suite – A test suite is a collection of test cases, test suites, or both. It is used to aggregate tests that should be executed together.

test runner – A test runner is a component which orchestrates the execution of tests and provides the outcome to the user. The runner may use a graphical interface, a textual interface, or return a special value to indicate the results of executing the tests.

1.8 Steps for Creating a Unit Test in Python

Create a Python file
In the file, import the unittest standard library
Create a class, e.g. MyUnitTest, that inherits from the unittest.TestCase class like so:

class MyUnitTest(unittest.TestCase):

Add methods to the MyUnitTest class that contain assertions of some assumptions as per software specs; method names must be prefixed with the test_ string and take a single parameter, self, which is a reference to the test case instance:

    def test_transaction_reversal(self):
        # Add one or more valid assertion functions

Note: Some assertion functions used in the Unit Test class are listed in the notes below
Add the command-line entry point with the unittest.main() call

Notes:

The unittest main assertions:

assertEqual(first, second, msg=None)
assertNotEqual(first, second, msg=None)
assertTrue(expr, msg=None)
assertFalse(expr, msg=None)
assertIs(first, second, msg=None)
assertIsNot(first, second, msg=None)
assertIsNone(expr, msg=None)
assertIsNotNone(expr, msg=None)
assertIn(member, container, msg=None)
assertNotIn(member, container, msg=None)
assertRaises(exception, callable, *args, **kwds)
assertRaises(exception, *, msg=None)
assertGreater(first, second, msg=None)
assertGreaterEqual(first, second, msg=None)
assertLess(first, second, msg=None)
assertLessEqual(first, second, msg=None)
...
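To show a few of these methods side by side, here is a small illustrative test case (the class name and the checked values are made up for this sketch). It includes both forms of assertRaises listed above: the callable form and the context-manager form.

```python
import unittest


class AssertionMethodsDemo(unittest.TestCase):
    def test_membership_and_identity(self):
        colors = ['red', 'green']
        self.assertIn('red', colors)
        self.assertNotIn('blue', colors)
        self.assertIsNone(dict().get('missing'))

    def test_comparisons(self):
        self.assertGreater(len('robust'), 3)
        self.assertLessEqual(abs(-2), 2)

    def test_raises_callable_form(self):
        # assertRaises(exception, callable, *args): int('abc') must fail.
        self.assertRaises(ValueError, int, 'abc')

    def test_raises_context_manager_form(self):
        # The same kind of check written with a with-block.
        with self.assertRaises(ZeroDivisionError):
            1 / 0


suite = unittest.defaultTestLoader.loadTestsFromTestCase(AssertionMethodsDemo)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print('All tests passed:', result.wasSuccessful())
```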

1.9 Running the Unit Tests

Unit tests are run from the command line as regular Python files:

python my_unit_tests.py

When you run your Unit Test Python file, the progress messages are printed to the console reporting any failed assertions in test methods, e.g.:

F
======================================================================
FAIL: test_concat (__main__.SimpleUnitTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File ".\unit_test.py", line 14, in test_concat
    self.assertEqual(len(concat(str1, str2)), 1 + len(str1) + len(str2))
AssertionError: 10 != 11

Successful tests pass silently (each one shows up only as a dot in the progress line)
The total number of tests, including the failed ones, is reported at the end of the process, e.g.:

Ran 5 tests in 0.002s
FAILED (failures=2)

1.10 A Unit Test Example

import unittest


# Two functions to be unit tested
def concat(str1, str2):
    return str1 + str2


def absolute_val(number):
    return abs(number)


class SimpleUnitTest(unittest.TestCase):
    def test_concat(self):
        str1 = "some"
        str2 = "string"
        self.assertEqual(len(concat(str1, str2)),
                         len(str1) + len(str2))

    def test_absolute_val(self):
        for n in [-2, -1, 0, 1, 2]:
            self.assertGreaterEqual(
                absolute_val(n), n,
                "The absolute value should be >= the value itself.")


if __name__ == '__main__':
    unittest.main()

Notes:

Unit testing Python classes involves using the testing harness (test fixtures) in the form of the setUp(self) and (optionally) tearDown(self) Unit Test methods as shown in the code snippet below:

import unittest

# Widget is the class under test; it is assumed to be defined or imported elsewhere


class MyWidgetTestCase(unittest.TestCase):
    def setUp(self):
        self.widget = Widget('The widget to be unit tested')

    # Test methods for the Widget's methods go here, e.g.
    def test_widget_base_price(self):
        self.assertEqual(self.widget.base_price(), 99.99,
                         'Wrong widget base price!')

Note:

The setUp() and tearDown() methods are called once for every test method: setUp() runs before the test method and tearDown() runs after it.

1.11 Errors

There are two types of Python errors:
- Syntax Errors
  - Detected by the Python parser (and usually flagged by your IDE or REPL) during the code parsing phase, before the program gets executed
- Runtime Errors
  - Raised by the Python runtime (VM) during code execution (when the program is running)
    - Uncaught (not properly handled) exceptions cause the Python program to stop
    - Proper handling of runtime errors is considered an essential coding practice going a long way toward making your programs robust and resilient to unexpected runtime situations
  - For the list of built-in run-time errors, visit http://bit.ly/PY_ERR
- Yes, you can create and throw (raise) your own exceptions like so:
  - raise Exception('Your error message as text')
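Beyond raising the generic Exception, you can define your own exception classes. The following sketch assumes a hypothetical banking scenario (the class InsufficientFundsError and the function withdraw are invented for illustration) and shows a custom exception that carries extra context for the handler:

```python
class InsufficientFundsError(Exception):
    """Hypothetical application-specific exception."""

    def __init__(self, balance, amount):
        super().__init__(f'Cannot withdraw {amount}; balance is {balance}')
        self.balance = balance
        self.amount = amount


def withdraw(balance, amount):
    if amount > balance:
        raise InsufficientFundsError(balance, amount)
    return balance - amount


try:
    withdraw(100, 250)
except InsufficientFundsError as err:
    print(f'Handled: {err}')
```

Deriving from Exception (rather than BaseException) keeps the new class within the part of the hierarchy that ordinary except Exception handlers catch.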

Notes:

The class hierarchy for built-in exceptions:

BaseException
 +-- SystemExit
 +-- KeyboardInterrupt
 +-- GeneratorExit
 +-- Exception
      +-- StopIteration
      +-- StopAsyncIteration
      +-- ArithmeticError
      |    +-- FloatingPointError
      |    +-- OverflowError
      |    +-- ZeroDivisionError
      +-- AssertionError
      +-- AttributeError
      +-- BufferError
      +-- EOFError
      +-- ImportError
      |    +-- ModuleNotFoundError
      +-- LookupError
      |    +-- IndexError
      |    +-- KeyError
      +-- MemoryError
      +-- NameError
      |    +-- UnboundLocalError
      +-- OSError
      |    +-- BlockingIOError
      |    +-- ChildProcessError
      |    +-- ConnectionError
      |    |    +-- BrokenPipeError
      |    |    +-- ConnectionAbortedError
      |    |    +-- ConnectionRefusedError
      |    |    +-- ConnectionResetError
      |    +-- FileExistsError
      |    +-- FileNotFoundError
      |    +-- InterruptedError
      |    +-- IsADirectoryError
      |    +-- NotADirectoryError
      |    +-- PermissionError
      |    +-- ProcessLookupError
      |    +-- TimeoutError
      +-- ReferenceError
      +-- RuntimeError
      |    +-- NotImplementedError
      |    +-- RecursionError
      +-- SyntaxError
      |    +-- IndentationError
      |         +-- TabError
      +-- SystemError
      +-- TypeError
      +-- ValueError
      |    +-- UnicodeError
      |         +-- UnicodeDecodeError
      |         +-- UnicodeEncodeError
      |         +-- UnicodeTranslateError

1.12 The try-except-finally Construct

Catching and handling runtime errors in Python is done using the try-except[-finally] construct, which is similar to analogous constructs found in other languages
Example of handling a (division by zero) runtime error:

try:
    # A naive way to write off a credit card debt ...
    10000 / 0
except Exception as e:
    print('Exception: ', e)
finally:
    print('Continuing, for better or for worse ...')

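In the above except code block, e (which can be named as you see fit) is an instance of the built-in exception ZeroDivisionError. The check below holds inside the except block only, because Python unbinds e when the block ends:

isinstance(e, ZeroDivisionError)  # True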

The finally part of the try-except-finally construct is optional

Notes:

The simplest try-except construct (which should be avoided, as it catches every exception, including KeyboardInterrupt and SystemExit, and “swallows” the error) looks as follows:

try:
    ...
except:
    print('Something went crazy wrong ...')

If you want to indicate success in the try-except construct, use else, like so:

try:
    ...
except:
    print('Processing failed ...')
else:
    print('Processing succeeded!')
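To see how the clauses interact, consider the sketch below. The function parse_ratio is a hypothetical helper written for this illustration; it catches two specific exception types instead of using a bare except, and records which clauses ran for each input.

```python
def parse_ratio(numerator, denominator):
    """Hypothetical helper showing all four clauses in one place."""
    steps = []
    try:
        ratio = int(numerator) / int(denominator)
    except ZeroDivisionError:
        steps.append('except: division by zero')
        ratio = None
    except ValueError as err:
        steps.append(f'except: bad input ({err})')
        ratio = None
    else:
        steps.append('else: no exception was raised')
    finally:
        steps.append('finally: always runs')
    return ratio, steps


for args in [('10', '4'), ('10', '0'), ('ten', '4')]:
    print(args, '->', parse_ratio(*args))
```

The else clause runs only when the try block finishes without an exception, while finally runs in every case.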

1.13 What’s Wrong with this Error-Handling Code?

   
def scale_it(original_size, scale):
  new_size = -1
  try:
    new_size = original_size / scale
  except Exception as e:
    pass 
  return new_size

Notes:

The code shown above illustrates a situation where an error is simply swallowed, giving the developer no feedback as to why the scale_it() function in some cases (for example, when scale is 0) returns a nonsensical -1.

Here is an extract from the Python source code that employs the same “bad” idiom, which, however, makes total sense as it simply tries to dynamically plug in suitable functionality:

try:
    # OpenSSL's scrypt requires OpenSSL 1.1+
    from _hashlib import scrypt
except ImportError:
    pass
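One possible way to repair scale_it() is sketched below. It is an illustration only, under the assumption that callers can deal with a None result; the logger name is arbitrary. The except clause is narrowed to the errors we expect, and the failure is logged with its context rather than silently discarded.

```python
import logging

logger = logging.getLogger('scaling')  # arbitrary logger name


def scale_it(original_size, scale):
    try:
        return original_size / scale
    except (ZeroDivisionError, TypeError) as e:
        # Record what went wrong and with which inputs.
        logger.error('Cannot scale %r by %r: %s', original_size, scale, e)
        return None


print(scale_it(10, 4))
print(scale_it(10, 0))
```

Returning None instead of -1 keeps the error signal from being mistaken for a legitimate (if odd) size value.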

1.14 Life after an Exception

So, you’ve got an error, caught an exception, but you need to recover and continue, right? Here are a few how-to suggestions:
- Breaking your potentially “monolithic” code into specialized, narrowly focused functions is the first step in the right direction
- If a function’s purpose is to carry out some data processing and not return any results back to the function’s caller, the error message should include some business context to help with assessing the overall impact of the error (working on this error-handling aspect will generally help you write better code as you gain a deeper insight into the business/processing logic)
- Errors in production-ready applications must be logged — not just printed to console; logging is discussed a bit later
- If a function returns some value, decide on the return value signaling an error inside the function; the common practice is to use None

Notes:

None evaluates to False in a boolean context, so your function caller code can handle the None return value as follows:

result = some_function()
if result:
    print('Processing ', result)
else:
    print('Nothing to process due to a previous error.')

or, more precisely (the truthiness check above also treats valid but “empty” results such as 0, '' or [] as errors):

if result is None:
    print('Nothing to process due to a previous error.')
else:
    print('Processing ', result)
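The difference between the two checks matters when a legitimate result can be falsy. In the sketch below, count_matches is a hypothetical function that returns a count (possibly 0) on success and None on error; testing with is None keeps the valid result 0 apart from the error signal.

```python
def count_matches(items, wanted):
    """Hypothetical function: returns a count, or None on error."""
    try:
        return sum(1 for item in items if item == wanted)
    except TypeError:
        return None  # items was not iterable


for result in (count_matches([1, 2, 3], 9), count_matches(None, 9)):
    if result is None:
        print('Nothing to process due to a previous error.')
    else:
        print('Processing', result)
```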

In some cases, you may want to delegate error processing logic to the caller (in the code below, func_with_exception() does not handle exceptions internally):

def caller():
    result = 'some default result value'
    try:
        result = func_with_exception()  # may throw an exception
    except Exception as ex:
        print(f'Got an error: {ex}')
    # At this point, result holds either the value returned by
    # func_with_exception() (the Happy Path execution flow)
    # or the 'some default result value' default value.
    return result

1.15 Assertions vs Errors (Exceptions)

An assertion is a statement that ensures the correctness of any developer assumptions made while writing the program (e.g. verification of input parameters, fundamental business logic assumptions, etc.)
You can disable all the assertion statements in your program by using the -O command-line flag, like so:

$ python -O my_program.py

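All the assertions in my_program.py will be removed when the code is compiled (no bytecode is generated for them), so they are never evaluated; for this reason, do not rely on assert alone for checks that must always run in production, such as validating untrusted input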
Exceptions are runtime errors that may be caused by a software defect, hardware/network problem, insufficient space to allocate a new object on the heap, etc.
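The effect of the -O flag can be approximated inside a single script with the optimize argument of the built-in compile() function. This is a sketch for demonstration purposes (the source string and the set_age function are made up): optimize=0 keeps assert statements, while optimize=1 drops them in the same way python -O does.

```python
source = "assert False, 'never checked'\nstatus = 'assertion skipped'"

# optimize=0 compiles the code the way plain "python" would.
try:
    exec(compile(source, '<demo>', 'exec', optimize=0), {})
    outcome_plain = 'no error'
except AssertionError as err:
    outcome_plain = f'AssertionError: {err}'

# optimize=1 compiles it the way "python -O" would: asserts are dropped.
namespace = {}
exec(compile(source, '<demo>', 'exec', optimize=1), namespace)
outcome_optimized = namespace['status']

print(outcome_plain)
print(outcome_optimized)


def set_age(age):
    """Validation that must survive -O uses a regular exception."""
    if age <= 0:
        raise ValueError('Age must be a positive number')
    return age
```

The set_age() function shows the usual alternative for checks that must always run: an explicit condition that raises an exception.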

1.16 What is Logging and Why Should I Care?

Logging is a process of recording runtime events happening in the program during its execution and its runtime state
Logging events are commonly published as text messages sent to any of the following logging destinations:
- The console, a local file, over the network to a remote database or a messaging system, etc.
Logs (the persisted blocks of log messages) are critical for regulatory compliance, monitoring the health of long-running processes, troubleshooting and conducting an exhaustive postmortem analysis of failed programs, as a source for (real-time) data analytics, etc.

1.17 A Simple Print Statement vs Logging

In its most basic form, logging can be done using the print() function, which prints a message to the console; this “poor man’s logging” is commonly used for rudimentary debugging and program flow tracing needs

print (f'The current value of x is {x}')

Production-level logging requires the use of specialized logging libraries, e.g. the logging module in Python, that enable developers to control a variety of logging aspects such as:
- Different logging destinations, like file, console, remote SQL database, etc.,
- Log message timestamps,
- Different logging levels,
- Runtime context,
- etc.

1.18 Logging Levels

Events (log messages) being logged have a varying degree of importance from the developer’s (or program’s) perspective
In Python’s logging module, the level of importance is expressed as a numeric value shown in the table below (higher numbers indicate higher weight or importance given to a message logged at that level)

Level      Numeric value
CRITICAL   50
ERROR      40
WARNING    30
INFO       20
DEBUG      10
NOTSET     0

You can look up the numeric value associated with a particular logging level using the logging level name attributes, e.g.: logging.DEBUG, logging.INFO, etc.
Developers can define their own logging levels, which need to have their own associated numeric values
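The numeric values can be inspected directly, and a custom level can be registered with logging.addLevelName(). In the sketch below, the NOTICE level, its value of 25, and the logger name are assumptions made for illustration.

```python
import logging

for name in ('DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'):
    print(f'{name:<8} = {getattr(logging, name)}')

# A hypothetical custom level that sits between INFO (20) and WARNING (30).
NOTICE = 25
logging.addLevelName(NOTICE, 'NOTICE')

logger = logging.getLogger('levels_demo')  # arbitrary logger name
logger.setLevel(NOTICE)
print(logging.getLevelName(NOTICE))
print(logger.isEnabledFor(logging.INFO))     # INFO (20) is below NOTICE (25)
print(logger.isEnabledFor(logging.WARNING))  # WARNING (30) is above NOTICE (25)
```

Messages at the custom level can then be emitted with logger.log(NOTICE, 'message text').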

Notes:

The design of the logging module in Python mostly borrows from the ideas implemented in the Log4j logging level system used in Java.

1.19 The Logger Hierarchy

By default, the Python logging system provides you with the default logger (the object that performs logging) called root
You can create your own loggers, which are instances of the Logger class, by calling the logging.getLogger(<your logger name>) factory function and then configuring them per your needs
The logger hierarchy is encoded in logger names using their dot-separated names designating logger positions in the hierarchy, e.g. “a”, “a.b”, “a.b.c”, or “who”, “who.let”, “who.let.the”, “who.let.the.dogs”, “who.let.the.dogs.out”, etc.
- Loggers that are further down in the hierarchical list are children (descendants) of loggers higher up in the list, e.g. a logger with the name ‘a.b.c’ is a descendant of loggers ‘a’ and ‘a.b’
- All user-created loggers are descendants of the root default logger
- A logger hierarchy helps with establishing the base configuration settings and further customizing logging messages down the hierarchy
In-depth discussion of loggers and other logging capabilities is beyond the scope of this module
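The parent-child relationships can nevertheless be observed in a few lines. The sketch below reuses the 'a', 'a.b', 'a.b.c' names from the text and shows that a level set on an ancestor is inherited by descendants that have no level of their own.

```python
import logging

parent = logging.getLogger('a')
child = logging.getLogger('a.b')
grandchild = logging.getLogger('a.b.c')

# Each logger's parent is the logger one step up in the dotted name.
print(grandchild.parent.name)   # a.b
print(parent.parent.name)       # root

# A level set on an ancestor is inherited by descendants that
# have no level of their own.
parent.setLevel(logging.ERROR)
print(logging.getLevelName(grandchild.getEffectiveLevel()))  # ERROR

# Calling getLogger() again with the same name returns the same object.
print(logging.getLogger('a.b') is child)  # True
```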

Notes:

For a good overview of the Python logging system, visit https://coralogix.com/blog/python-logging-best-practices-tips/

For a cook-book approach to standing up logging, see https://docs.python.org/3/howto/logging-cookbook.html#logging-cookbook

1.20 The Logging Levels

Python documentation details logging levels as follows (in increasing order of severity):

DEBUG - Detailed information, typically of interest only when diagnosing problems; messages at this level are logged with the logging.debug() method.
INFO - Confirmation that things are working as expected; messages at this level are logged with the logging.info() method.
WARNING (The default) - An indication that something unexpected happened, or indicative of some problem in the near future (e.g. ‘disk space low’). The software is still working as expected. Messages at this level are logged with the logging.warning() method.
ERROR - Due to a more serious problem, the software has not been able to perform some function; messages at this level are logged with the logging.error() method.
CRITICAL - A serious error, indicating that the program itself may be unable to continue running; messages at this level are logged with the logging.critical() method.

1.21 Setting the Logging Level

Messages (events) of any level below the currently configured logging level will be ignored, for example:
- Setting the logging level at ERROR will only allow log messages of the ERROR and CRITICAL type to be registered with the logging system
You set the logging level of the default logger (root) in your program using the logging.basicConfig() function:

import logging
logging.basicConfig(level=logging.INFO)

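You can also let users choose the logging level from the command line, e.g. with a --log=DEBUG option; note that this is not a built-in Python flag: your program has to parse the option itself and pass the resulting level to basicConfig()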
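Python has no built-in command-line flag for the logging level, so an option such as --log has to be parsed by the program itself. One way to do this is sketched below with argparse; the option name --log and the function configure_logging are assumptions chosen for this example.

```python
import argparse
import logging


def configure_logging(argv):
    """Set the root logger level from a --log option (assumed name)."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--log', default='WARNING',
                        help='logging level name, e.g. DEBUG or INFO')
    args = parser.parse_args(argv)

    # Translate the level name into its numeric value.
    numeric_level = getattr(logging, args.log.upper(), None)
    if not isinstance(numeric_level, int):
        raise ValueError(f'Invalid log level: {args.log}')

    # force=True (Python 3.8+) replaces any earlier root configuration.
    logging.basicConfig(level=numeric_level, force=True)
    return numeric_level


# In a real script you would pass sys.argv[1:] instead of a literal list.
level = configure_logging(['--log=debug'])
logging.debug('Debug logging is now enabled (level %d)', level)
```

With this in place, the program could be started as python my_program.py --log=DEBUG.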

Notes:

Once invoked, the basicConfig() function sets the logging configuration for the duration of the running program: calling basicConfig() again with a different set of configuration parameters has no effect while the root logger already has handlers configured, unless you pass force=True (available since Python 3.8).
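This behavior of basicConfig() is easy to verify. The short sketch below (variable names are illustrative) configures the root logger, repeats the call with a different level, and then repeats it once more with force=True, which requires Python 3.8 or later.

```python
import logging

root = logging.getLogger()

# force=True (Python 3.8+) gives the demo a known starting configuration.
logging.basicConfig(level=logging.INFO, force=True)

# A second plain call is silently ignored: the root logger already has a handler.
logging.basicConfig(level=logging.DEBUG)
level_after_second_call = logging.getLevelName(root.level)

# Passing force=True removes the existing handlers and reconfigures.
logging.basicConfig(level=logging.DEBUG, force=True)
level_after_forced_call = logging.getLevelName(root.level)

print(level_after_second_call)   # INFO
print(level_after_forced_call)   # DEBUG
```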

1.22 Configuring Logging Messages

The default logging message format is rather uninformative:

<LOGGINGLEVEL>:root:<Your logged message>

Configuring the default root logger is done (as mentioned earlier) using the logging.basicConfig() function, which enables you to do the following:
- Format the text of your log messages using the format named parameter, e.g.:

format='%(levelname)s:%(asctime)s -- %(message)s'

- Designate the log file as the destination for your log messages using the filename named parameter
- And much more; some of the basicConfig() configuration parameters are listed in the notes below

Notes:

While it is possible (as shown in this tutorial) to programmatically configure the logging message details, the best practice is to use configuration files; you can learn more about logging configuration here: https://docs.python.org/3/library/logging.config.html#module-logging.config

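Basic Configuration Supported Keyword Arguments:

filename   Log to a file with the given name instead of the console stream
filemode   Mode used to open the file given in filename (defaults to 'a')
format     Format string for the handler
datefmt    Date/time format, as accepted by time.strftime()
style      Style of the format string: '%', '{' or '$' (defaults to '%')
level      Root logger level
stream     Stream used to initialize the StreamHandler (incompatible with filename)
handlers   Iterable of already created handlers to add to the root logger
force      If true, remove and close existing root handlers before applying the configuration

Source: https://docs.python.org/3/library/logging.html#logging.basicConfig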

The LogRecord object configuration parameters.

The LogRecord object represents the log record and allows you to programmatically configure the attributes of the logging message (https://docs.python.org/3/library/logging.html#logrecord-attributes)

Some of the more important log record configuration parameters are:

lineno – Source line number where the logging call was issued (if available)

msecs – Millisecond portion of the time when the LogRecord was created.

Example of a more advanced log record configuration:

import logging

log_format = ('%(levelname)s:%(asctime)s.%(msecs)03d -%(filename)s:%(lineno)d'
              ' -- %(message)s')
logging.basicConfig(format=log_format, datefmt='%Y-%m-%d %H:%M:%S',
                    level=logging.INFO)
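Formatting can also be attached to an individual logger through a handler and a Formatter object, which avoids touching the root configuration. The following is a minimal sketch in which the logger name, the function process_order, and the order number are invented for the demo; an in-memory buffer stands in for the console or a log file so the formatted line can be inspected.

```python
import io
import logging

buffer = io.StringIO()  # stands in for a console or a log file

handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter(
    '%(levelname)s:%(name)s:%(funcName)s -- %(message)s'))

logger = logging.getLogger('format_demo')  # arbitrary logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger


def process_order():
    logger.info('Order %s accepted', 'A-1001')


process_order()
print(buffer.getvalue(), end='')
```

The funcName attribute used here is one of the LogRecord attributes documented alongside lineno and msecs.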

1.23 Example of Using Logging

Code below performs logging to the console:

import logging
logging.error('This error message will be printed')
logging.warning('This warning message will be printed')
logging.info('Info-level messages are ignored by default') 
logging.debug('Debug-level messages are ignored by default')
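Logging and error handling are often combined. The sketch below assumes a hypothetical safe_divide function and routes the log output to an in-memory buffer so it can be printed at the end; inside the except block it calls logging.exception(), which logs at the ERROR level and appends the traceback. Note that force=True in basicConfig() requires Python 3.8 or later.

```python
import io
import logging

buffer = io.StringIO()
logging.basicConfig(stream=buffer, level=logging.INFO,
                    format='%(levelname)s:%(message)s', force=True)


def safe_divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        # logging.exception() logs at ERROR level and appends the traceback.
        logging.exception('Division failed for a=%s, b=%s', a, b)
        return None


logging.info('Result: %s', safe_divide(10, 4))
safe_divide(10, 0)
print(buffer.getvalue())
```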

1.24 Logging in Python: December 9, 2021 Update

The design of Python’s logging module (https://docs.python.org/3/library/logging.html) is modeled after the popular Java-based Apache Log4j OSS logging library
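Note: Critical vulnerabilities in Java’s Log4j library (known as Log4Shell) were announced on December 9, 2021. They stem from Log4j’s own message lookup feature, which Python’s logging module does not share; however, Python applications (in particular, Python-based web frameworks) that interact with Java components running a vulnerable Log4j version may still be exposed
- Review all the aspects of the known vulnerabilities and apply the available security patches to any affected Log4j-based components in your stack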

1.25 Hands-On Activities

Complete the activities listed in the Logging part of the Robust Programming lab.

1.26 Summary

In this tutorial, we covered a number of techniques and methods that can help you write programs that are:
- Robust,
- Compliant with the software specs,
- Predictable at runtime and lending themselves to root cause analysis of failures
The topics we discussed included:
- Making assertions,
- Unit testing,
- Error handling, and
- Logging