Pareto Distribution in Python
The Pareto distribution is a heavy-tailed, power-law distribution associated with the Pareto principle, often referred to as the 80/20 rule. This principle states that roughly 20% of causes lead to 80% of the results, a pattern commonly seen in economics, business, and the natural sciences.
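As a rough check of the 80/20 figure, the sketch below relies on a standard result for the classical Pareto distribution that is not stated above: for a shape a > 1, the top fraction p of the population holds a share p ** (1 - 1/a) of the total. The helper name top_share is purely illustrative.

```python
import math

def top_share(p, a):
    """Share of the total held by the top fraction p (classical Pareto, a > 1)."""
    return p ** (1 - 1 / a)

# Shape for which the top 20% hold exactly 80%: a = log(5) / log(4)
a_80_20 = math.log(5) / math.log(4)

print(round(a_80_20, 3))                  # 1.161
print(round(top_share(0.2, a_80_20), 3))  # 0.8
print(round(top_share(0.2, 2), 3))        # 0.447
```

Under this assumption, the 80/20 split corresponds to a shape of about 1.16. With a=2, as used in the programs in this article, the top 20% would hold closer to 45% of the total.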
Key Parameters:
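- a (shape parameter): Controls how quickly the tail of the distribution decays and must be positive. Higher values of 'a' make the distribution steeper, concentrating samples near the lower end and thinning out the tail.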
- size: Specifies the output shape of the array containing random samples, for example (2, 3) for a 2x3 array. If it is omitted, a single value is returned.
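The effect of size can be sketched as follows. This example assumes NumPy's newer Generator interface (numpy.random.default_rng) rather than the legacy random.pareto function used elsewhere in this article; the seed value is arbitrary and only makes the run repeatable.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, for repeatability only

single = rng.pareto(a=2)               # no size: one scalar sample
vector = rng.pareto(a=2, size=4)       # int: 1-D array of 4 samples
matrix = rng.pareto(a=2, size=(2, 3))  # tuple: 2x3 array

print(np.ndim(single), vector.shape, matrix.shape)  # 0 (4,) (2, 3)
```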
Program:
Drawing Samples from a Pareto Distribution
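Here’s how you can generate random numbers following a Pareto distribution using Python’s NumPy library. Note that random.pareto draws from the Pareto II (Lomax) distribution, so the samples start at 0 rather than 1; adding 1 to them gives the classical Pareto distribution with scale 1.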
from numpy import random
# Draw samples from Pareto distribution with shape parameter a=2 and size 2x3
x = random.pareto(a=2, size=(2, 3))
print(x)
Output (the values are random, so they will differ on each run):
[[0.38457645 0.4588947 0.39020769]
 [1.13945125 0.08436215 0.66941928]]
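Because NumPy's pareto samples come from the Pareto II (Lomax) distribution, values below 1 such as those above are expected. A minimal sketch of converting them to a classical Pareto distribution is shown here: add 1 and multiply by a scale m. The scale m = 3 and the seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # arbitrary seed

a, m = 2.0, 3.0  # shape and scale; m = 3 is an arbitrary example value
lomax = rng.pareto(a, size=10_000)  # Pareto II (Lomax): support starts at 0
classical = (lomax + 1) * m         # classical Pareto: support starts at m

print(lomax.min() >= 0)      # True
print(classical.min() >= m)  # True
```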
Visualization: Pareto Distribution
To visualize the distribution of the samples, we can use the Matplotlib and Seaborn libraries:
Program:
from numpy import random
import matplotlib.pyplot as plt
import seaborn as sns
# Generate 1000 random samples
samples = random.pareto(a=2, size=1000)
# Create a histogram using Seaborn
sns.displot(samples, kde=False)
plt.title("Pareto Distribution (a=2)")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.show()
Explanation:
- The histogram shows that most values are concentrated at the lower end, with a long tail stretching towards higher values, which is the signature of the Pareto distribution.
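- By adjusting the shape parameter (a), you can control the spread and steepness of the distribution: smaller values of a produce a heavier tail with more extreme values, while larger values pull the samples towards zero.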
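The effect of the shape parameter on the tail can be checked numerically. The sketch below compares the fraction of samples above a threshold with the Lomax tail probability (1 + x) ** (-a); the threshold of 3, the sample count, and the seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # arbitrary seed
threshold = 3.0
n = 100_000

tail = {}
for a in (1, 2, 5):
    samples = rng.pareto(a, size=n)
    # Fraction of samples that land above the threshold
    empirical = float(np.mean(samples > threshold))
    # Tail probability of the Lomax distribution: P(X > x) = (1 + x) ** (-a)
    theoretical = (1 + threshold) ** (-a)
    tail[a] = (empirical, theoretical)
    print(f"a={a}: empirical={empirical:.4f}, theoretical={theoretical:.4f}")
```

The empirical fractions should fall as a grows, which matches the thinner tail seen in a histogram for larger shape values.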