Exploring Randomness in Python: A Detailed Overview
Understanding Randomness in Python
Randomness plays a crucial role in various fields, including software development and data science. However, computers struggle with true randomness. Because a computer executes deterministic instructions, any algorithm we write produces an outcome that is fully determined by its inputs. As a result, an algorithm alone cannot generate genuinely random numbers. Instead, it can produce pseudo-random numbers: sequences that look random but are computed by a fixed procedure.
A pseudo-random number generator typically begins with a "seed" value and follows a defined pattern. Consequently, the same seed always reproduces the same sequence of numbers, and a different seed produces a different sequence. Python uses the Mersenne Twister algorithm as the core pseudo-random generator of its built-in random module. This article takes a detailed look at this module and its functions.
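To make the idea of "a seed plus a defined pattern" concrete, here is a minimal sketch of a linear congruential generator. This is a toy for illustration only, not the Mersenne Twister that Python uses; the function name lcg and the constants are assumptions picked for this example.

```python
from itertools import islice


def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Toy generator: yield an endless stream of floats in [0, 1)."""
    state = seed
    while True:
        # The "defined pattern": each state is computed from the previous one.
        state = (a * state + c) % m
        yield state / m


first = list(islice(lcg(1), 3))
second = list(islice(lcg(1), 3))
other = list(islice(lcg(2), 3))

print(first == second)  # True: same seed, same sequence
print(first == other)   # False: different seed, different sequence
```

Nothing in the loop is random. The seed is the only thing that decides which sequence comes out.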
"One thing that traditional computer systems aren't good at is coin flipping." — Steve Ward, Professor of Computer Science and Engineering at MIT.
Seed Management for Randomness
When no seed is given, Python seeds its generator from the operating system's randomness source (see os.urandom()) if one is available, and falls back to the current system time otherwise. Either way, the starting point varies from run to run. Python also lets us set the seed explicitly with the random.seed() function. To illustrate this, we can set the same seed value twice and observe the outcomes:
>>> import random
>>> random.seed(1)
>>> random.random()
0.13436424411240122
>>> random.seed(1)
>>> random.random()
0.13436424411240122
The results remain consistent when the seed is identical, demonstrating why it is referred to as "pseudo-random."
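The same behavior can be sketched without touching the module-wide generator, by creating separate random.Random instances. The helper name simulate and the seed values below are assumptions made for this example.

```python
import random


def simulate(seed, n=5):
    """Return n pseudo-random floats from a generator private to this call."""
    rng = random.Random(seed)  # independent of the module-level generator
    return [rng.random() for _ in range(n)]


run_a = simulate(seed=42)
run_b = simulate(seed=42)
run_c = simulate(seed=43)

print(run_a == run_b)  # True: identical seeds reproduce the sequence
print(run_a == run_c)  # False: a different seed gives a different sequence
```

A private instance can be handy in tests, because seeding it does not affect other code that calls the module-level functions.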
Two other noteworthy functions are random.getstate() and random.setstate(). The former retrieves the current internal state of the generator, while the latter sets a specific generator state. These methods can be particularly useful:
>>> import random
>>> state = random.getstate()
>>> random.random()
0.2550690257394217
>>> random.random()
0.49543508709194095
>>> random.setstate(state)
>>> random.random()
0.2550690257394217
As shown, we can reliably obtain the same "random" number by manipulating the generator's state.
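One possible use of these two functions is to checkpoint a generator and replay a stretch of its output later. The sketch below assumes a private random.Random instance; the variable names are illustrative.

```python
import random

rng = random.Random()  # seeded automatically, so values differ between runs

checkpoint = rng.getstate()                    # remember where the generator is
first_pass = [rng.random() for _ in range(3)]

rng.setstate(checkpoint)                       # rewind to the checkpoint
second_pass = [rng.random() for _ in range(3)]

print(first_pass == second_pass)  # True: the replay matches exactly
```

Unlike re-seeding, this works even when the original seed is unknown.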
Generating Random Numbers
In general, there are two main types of numbers we need to create: integers and floating-point numbers.
Generating Random Integers
To produce a random integer within a specified range, we can use the randrange(start, stop, step) method:
>>> import random
>>> random.randrange(1, 3)
2
>>> random.randrange(1, 3)
1
>>> random.randrange(1, 3)
1
Like Python's range() function, this method excludes the stop value. We can also pass a step value, which is convenient for restricting the result to evenly spaced values. For example, the following call returns a random even integer from 0 to 10 inclusive:
>>> random.randrange(0, 11, 2)
8
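As a quick check of the step behavior, the following sketch draws many values and confirms that each one belongs to range(0, 11, 2). The fixed seed is arbitrary and only keeps the run repeatable.

```python
import random

rng = random.Random(0)  # arbitrary fixed seed for a repeatable run
allowed = range(0, 11, 2)  # 0, 2, 4, 6, 8, 10

draws = [rng.randrange(0, 11, 2) for _ in range(1000)]

print(all(value in allowed for value in draws))  # True
print(all(value % 2 == 0 for value in draws))    # True
```

In effect, randrange(start, stop, step) picks one element of range(start, stop, step) without building the range in memory.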
Alternatively, we can use the randint(a, b) function, which includes both endpoints and behaves like randrange(a, b + 1):
>>> import random
>>> random.randint(1, 3)
2
>>> random.randint(1, 3)
3
>>> random.randint(1, 3)
1
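The difference between the two functions comes down to whether the upper bound can be returned. This sketch collects the distinct values each call produces over many draws; the seed and draw count are arbitrary.

```python
import random

rng = random.Random(0)  # arbitrary fixed seed for a repeatable run

randrange_values = {rng.randrange(1, 3) for _ in range(1000)}
randint_values = {rng.randint(1, 3) for _ in range(1000)}

print(3 in randrange_values)        # False: the stop value is excluded
print(randint_values <= {1, 2, 3})  # True: results never leave [1, 3]
print(3 in randint_values)          # True once a 3 has been drawn
```

With 1000 draws, missing any of the three possible randint values is vanishingly unlikely.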
Generating Random Floating-Point Numbers
The random() function generates a random floating-point number between 0 and 1 (excluding 1):
>>> random.random()
0.6596661752737053
If we want a float within a specific range, we can use the uniform(a, b) method. Whether the endpoint b itself can be returned depends on floating-point rounding:
>>> random.uniform(0.5, 3.1)
2.6947163565041437
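The Python documentation describes the result of uniform(a, b) in terms of the expression a + (b - a) * random(). The sketch below applies that expression by hand and compares it with uniform() on an identically seeded generator. The helper name scaled_uniform is an assumption for this example.

```python
import math
import random


def scaled_uniform(rng, a, b):
    """Stretch a float from [0, 1) onto the range between a and b."""
    return a + (b - a) * rng.random()


manual_rng = random.Random(123)   # two generators with the same seed
builtin_rng = random.Random(123)

manual = [scaled_uniform(manual_rng, 0.5, 3.1) for _ in range(5)]
builtin = [builtin_rng.uniform(0.5, 3.1) for _ in range(5)]

print(all(0.5 <= x <= 3.1 for x in manual))                  # True
print(all(math.isclose(m, u) for m, u in zip(manual, builtin)))
```

The second line is expected to print True on CPython, where uniform() is implemented with this expression.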
Generating Random Numbers Based on Distributions
At times, we need to generate random numbers according to specific statistical distributions. Python provides built-in methods for commonly used distributions, which are particularly useful in data science. For instance, the gauss(mu, sigma) method returns a random floating-point number drawn from a Gaussian distribution with mean mu and standard deviation sigma:
>>> random.gauss(0, 1)
0.7388503877433976
Several additional methods are available for generating numbers based on various statistical distributions, including:
- betavariate(): Beta distribution
- expovariate(): Exponential distribution
- gammavariate(): Gamma distribution
- lognormvariate(): Log-normal distribution
- normalvariate(): Normal distribution
- vonmisesvariate(): von Mises distribution
- paretovariate(): Pareto distribution
- weibullvariate(): Weibull distribution
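One way to sanity-check a distribution method is to draw many samples and compare their summary statistics with the parameters that were passed in. The sketch below does this for gauss(0, 1); the seed and sample size are arbitrary choices.

```python
import random
import statistics

rng = random.Random(7)  # arbitrary fixed seed for a repeatable run

samples = [rng.gauss(0, 1) for _ in range(10_000)]

mean = statistics.fmean(samples)
stdev = statistics.stdev(samples)

# Both values should land close to the requested mu=0 and sigma=1.
print(f"mean={mean:.2f} stdev={stdev:.2f}")
```

The same check can be adapted to the other methods in the list by comparing against the theoretical mean of the chosen distribution.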
Randomly Selecting Items
There are three useful methods for selecting items randomly:
- The choice() method selects a single item from a non-empty sequence, such as a list or tuple:
>>> leaders = ['Yang', 'Tim', 'Elon']
>>> random.choice(leaders)
'Yang'
- The sample() method chooses multiple distinct items at random, without replacement:
>>> leaders = ['Yang', 'Tim', 'Elon']
>>> random.sample(leaders, 2)
['Elon', 'Yang']
- The choices() method selects k items with replacement and lets us assign a different weight to each item:
>>> leaders = ['Yang', 'Tim', 'Elon']
>>> random.choices(leaders, weights=[3, 1, 1], k=5)
['Tim', 'Elon', 'Yang', 'Yang', 'Yang']
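The practical difference between sample() and choices() can be sketched as follows: sample() never returns the same position twice, while choices() may repeat items and follows the given weights. The seed and the number of draws are arbitrary.

```python
import random
from collections import Counter

rng = random.Random(3)  # arbitrary fixed seed for a repeatable run
leaders = ['Yang', 'Tim', 'Elon']

# sample(): two different items, never the same position twice.
picked = rng.sample(leaders, 2)
print(len(set(picked)) == 2)  # True

# choices(): repeats are allowed, and weights skew the frequencies.
counts = Counter(rng.choices(leaders, weights=[3, 1, 1], k=10_000))
print(counts.most_common(1)[0][0])  # 'Yang', the item with weight 3
```

With weights of 3, 1, and 1, 'Yang' is expected in roughly three fifths of the draws.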
Shuffling Sequences
Shuffling is straightforward with the shuffle() method:
>>> nums = [1, 2, 3, 4, 5]
>>> random.shuffle(nums)
>>> nums
[4, 3, 2, 5, 1]
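The shuffle() method modifies the list in place and returns None rather than a new list. It requires a mutable sequence such as a list; passing an immutable sequence such as a tuple raises a TypeError. To get a shuffled copy of an immutable sequence, use random.sample(x, k=len(x)) instead.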
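Because shuffle() works in place, a common slip is to assign its return value. The sketch below shows what that return value is, and one way to get a shuffled copy while leaving the original list untouched. The variable names are illustrative.

```python
import random

rng = random.Random(5)  # arbitrary fixed seed for a repeatable run

nums = [1, 2, 3, 4, 5]
result = rng.shuffle(nums)
print(result)                             # None: the list was changed in place
print(sorted(nums) == [1, 2, 3, 4, 5])    # True: same items, new order

# For a shuffled copy, sample every item and keep the original as it is.
original = [1, 2, 3, 4, 5]
shuffled_copy = rng.sample(original, k=len(original))
print(original)                           # [1, 2, 3, 4, 5]
print(sorted(shuffled_copy) == original)  # True
```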
Key Takeaways
While computers cannot produce truly random numbers, they can generate pseudo-random numbers that suffice for most development scenarios. Python's built-in random module offers a variety of tools for managing randomness, including:
- Controlling the seed of the pseudo-random generator
- Generating random numbers
- Randomly selecting items from a sequence
- Shuffling mutable sequences