Tips for writing numerical code in Python 3
Bayes Server has an advanced library (API) for Bayesian networks which can be called by many different languages including Python.
In this article we list some tips for writing numerical code in Python. The tips are not specific to the Bayes Server API, but apply to writing Python code in general. In fact, many of the tips apply to other languages also.
Some tips are accompanied by Python source code, which can easily be run in a Python 3 environment such as Jupyter notebooks, Google Colab, or a Python IDE such as PyCharm or VSCode.
We do not consider packages such as numpy, scipy or GPU enablers, but rather things to watch in basic numerical code.
We often compare the results in Python 3 to other languages, for those coming from different languages or converting code.
Contents
- Divide by zero
- Signed zero
- Infinity
- Rounding errors
- Numerical equality
- Not a number (NaN)
- Epsilon
- Big numbers
Divide by zero
When using Python 3, unlike in many other languages, calculations involving floating point numbers can raise exceptions.
To illustrate this, see the code below, where we divide a number by zero.
import math
import sys
print(sys.version)
if sys.version_info.major < 3:
raise Exception('Python 3+ required')
def positive_float_divide_by_zero():
"""
Test that dividing a positive floating point number
by zero raises an exception in Python, as opposed to
many languages which return positive infinity
"""
numerator = 12.0
denominator = 0.0
print(type(numerator))
print(type(denominator))
try:
result = numerator / denominator # in other languages such as Java or C#, this will equal +Infinity.
except ZeroDivisionError:
print('ZeroDivisionError raised')
def negative_float_divide_by_zero():
"""
Test that dividing a negative floating point number
by zero raises an exception in Python, as opposed to
many languages which return negative infinity
"""
numerator = -12.0
denominator = 0.0
try:
result = numerator / denominator # in other languages such as Java or C#, this will equal -Infinity.
except ZeroDivisionError:
print('ZeroDivisionError raised')
def zero_float_divide_by_zero():
"""
Test that dividing zero by zero using floating point
numbers raises an exception in Python, as opposed to many other
languages which will return NaN.
"""
numerator = 0.0
denominator = 0.0
try:
result = numerator / denominator # in other languages such as Java or C#, the result will be NaN
except ZeroDivisionError:
print('ZeroDivisionError raised')
def positive_int_divide_by_zero():
"""
Test that dividing a positive integer by zero raises an exception,
as it does by default in many other languages
"""
numerator = 12
denominator = 0
print(type(numerator))
print(type(denominator))
try:
result = numerator / denominator # in other languages such as Java or C#, this will also raise an exception (unless unchecked)
except ZeroDivisionError:
print('ZeroDivisionError raised')
Signed zero
Most machines today use floating point arithmetic that conform to IEEE 754. They therefore allow the number zero to be positive (+0) or negative (-0). Although a test for equality on +0 and -0 will return true, they can lead to very different results when used in subsequent calculations, as demonstrated in the code sample below.
This is less of an issue with Python, as it is with many other languages, as divide by zero is not allowed.
import numpy as np
def signed_zero():
"""
Demonstrates signed zero.
"""
pos_zero = +0.0
neg_zero = -0.0 # or 0.0 / -1.0
print(pos_zero)
print(neg_zero)
assert 0 == np.sign(neg_zero)
assert pos_zero == neg_zero # a test for equality will return true
# it is possible in Python to then get different values, although much harder than with the likes of Java and C# where floating point div by zero is allowed
pos_val = math.copysign(3.0, pos_zero)
neg_val = math.copysign(3.0, neg_zero)
print(pos_val)
print(neg_val)
assert pos_val != neg_val
# in some languages, such as Java and C#, the following asserts would both return true.
# however, in Python we will get a ZeroDivisionError
try:
assert np.isposinf(1.0 / pos_zero)
except ZeroDivisionError:
print('ZeroDivisionError raised')
try:
assert np.isneginf(1.0 / neg_zero)
except ZeroDivisionError:
print('ZeroDivisionError raised')
Infinity
Infinity is used to signal an overflow condition, i.e. when a number is too big (positive or negative) to be represented by the data type you are using. Infinite values are common in numerical programming, and cannot simply be ignored. It is also important to be aware of some subtleties when manipulating infinite values. The code below shows how infinite values can be created and some important properties, such as the fact that Infinity - Infinity does not equal zero.
Note that we can test for infinity using math.isinf(), which will return true for both positive and negative infinity, or we can use np.isposinf() or np.isneginf() if the sign is important.
import numpy as np
def infinite_values():
"""
Demonstrates how to generate infinite values.
"""
assert np.isposinf(float('inf'))
assert np.isneginf(float('-inf'))
assert np.isposinf(1e250 * 1e250)
assert np.isneginf(1e250 * -1e250)
# In other languages such as Java and C#, you could also use the following
# 1.0 / 0.0 // for positive infinity
# -1.0 / 0.0 // for negative infinity
assert np.isposinf(sys.float_info.max * 2.0)
import numpy as np
def inf_properties():
"""
Demonstrates some properties of infinite values to be aware of.
"""
pos_inf = float('inf')
neg_inf = float('-inf')
assert 0.0 != (pos_inf - pos_inf)
assert 0.0 != (pos_inf + neg_inf)
assert np.isnan(pos_inf / pos_inf)
Rounding errors
The limited size of a floating point data type, means that numbers and arithmetic using floating point representations can only be performed with limited accuracy.
For example 0.5 - 0.4 - 0.1 will not exactly equal 0.0, as shown in the code below. The difference between the approximate value and the expected mathematical quantity is call a rounding error (or round off error). This is standard behavior in programming languages, and cannot be ignored when writing scientific code.
There are a number of important side effects of rounding errors, which software engineers should be aware of. As described above, since 0.5-0.4-0.1 does not exactly equal zero when using floating point arithmetic, we must take care when testing for equality. Another problem can be caused when round off errors accumulate, often leading to disastrous consequences (e.g. the European Space Agency Ariane rocket launched in 1996).
Rounding errors can also be caused when converting from one data type to another.
def rounding_error():
"""
Demonstrates rounding errors due to the precision of floating point numbers and arithmetic.
"""
x = 0.5 - 0.4 - 0.1
print(x)
print(x == 0.0)
assert x != 0.0
Numerical equality
As explained in the section on rounding errors floating point arithmetic has limited accuracy, meaning that 0.5 - 0.4 - 0.1 will not exactly equal zero. Therefore care must be taken, testing for equality when numerical programming.
- If possible, remove the test for equality altogether.
- If possible change the test to use either <= or >=.
- Test that the value lies within a certain tolerance, in either absolute or relative (percentage difference) terms.
Note however that testing whether a value lies within certain bounds, is not always straight forward, as the choice of the tolerance, and whether to test the absolute or relative difference will depend on how many calculations have been performed, and the magnitude of the numbers involved.
In some cases (e.g. unit tests) a simple small absolute tolerance can be used, but in more complex situations Knuth's algorithm for testing for equality, or a thorough numerical analysis may be required.
Not a number (NaN)
NaN represents the result of an invalid or undefined calculation.
The code below demonstrates a number of ways to generate NaN, and how we can determine if a number is NaN.
When writing numerical code, it is important to consider all eventualities. For example, every time a division is performed, is it possible that the code might divide by 0.0 and raise an exception?
Although out of scope of this article, it should be mentioned that in general programming, there are actually two types of NaN. The first is called a signalling NaN (SNaN or 1.#INF) denoting that the operation was invalid, and the second is a quiet NaN (QNaN or –1.#IND), denoting that the operation is undefined.
import numpy as np
def nan_values():
"""
Demonstrate some different ways of generating Not a Number (NaN)
"""
pos_inf = float('inf')
assert np.isnan(float('nan'))
# assert np.isnan(0.0 / 0.0) // this will raise a ZeroDivisionError in Python, but would be allowed in Java, C# and may other languages
# assert np.isnan(math.sqrt(-1.0)) // this will raise a math domain error, but may return NaN in other languages
assert np.isnan(pos_inf / pos_inf)
assert np.isnan(pos_inf - pos_inf)
def nan_equality():
"""
Demonstrate NaN equality and testing for NaN
"""
x = float('nan')
assert x != float('nan')
assert np.isnan(x)
Epsilon
Often numerical code uses a constant known as machine epsilon, or the square root of machine epsilon, to compensate for the effects of rounding errors. Machine epsilon equals the smallest number such that (1.0 + machine epsilon) != 1.0.
Some languages have types with a constant defined called Epsilon which represents the smallest positive float value that is greater than zero.
These two constants have very different values, but can be confused, especially when converting code from one language to another.
For more information see:
- sys.float_info.epsilon
- np.finfo(float).eps
- np.finfo(np.float32).eps
Big numbers
Unlike in many other languages which have a special class (often called BigNumber) to handle very large integer values, Python 3 has built in support with the integer data type.
From the Python 3 docs: 'Integers have unlimited precision'