Working with Sets in NumPy
In mathematics, a set is a collection of unique elements, meaning it contains no duplicates.
Sets support operations such as union, intersection, difference, and symmetric difference. NumPy provides a dedicated function for each of these, and every one of them operates on arrays.
Let’s explore how to perform set operations using NumPy!
Creating Sets in NumPy
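NumPy doesn’t have a dedicated set data type, but you can treat an array as a set by removing its duplicates with np.unique(), which returns the unique values in sorted order.
Tip: NumPy set operations are designed for 1-D arrays; multi-dimensional inputs are generally flattened before the operation is applied.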
Program:
Removing Duplicate Elements from an Array
import numpy as np
arr = np.array([1, 1, 1, 2, 3, 4, 5, 5, 6, 7])
unique_arr = np.unique(arr)
print(unique_arr)
Output:
[1 2 3 4 5 6 7]
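As a small extension of the program above, np.unique() can also report how often each value occurs through its return_counts parameter. The sketch below reuses the same array; the variable names values and counts are chosen here for illustration only.

```python
import numpy as np

arr = np.array([1, 1, 1, 2, 3, 4, 5, 5, 6, 7])

# return_counts=True additionally returns how many times each unique value appears
values, counts = np.unique(arr, return_counts=True)

print(values)  # [1 2 3 4 5 6 7]
print(counts)  # [3 1 1 1 2 1 1]
```

The two returned arrays line up position by position, so counts[0] is the number of times values[0] appears in the original array.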
Finding Union of Sets
The union of two sets contains every element that appears in either set, with each element listed once. Use np.union1d() for this operation; it returns the result as a sorted 1-D array.
Program:
Union of Two Arrays
import numpy as np
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([3, 4, 5, 6])
union_arr = np.union1d(arr1, arr2)
print(union_arr)
Output:
[1 2 3 4 5 6]
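np.union1d() accepts exactly two arrays. One possible way to combine more than two is to fold the function over a sequence with functools.reduce, as in the sketch below. The third array, arr3, is a hypothetical addition for illustration and does not appear in the example above.

```python
from functools import reduce

import numpy as np

arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([3, 4, 5, 6])
arr3 = np.array([6, 7, 8])  # hypothetical third array, added for illustration

# reduce applies np.union1d pairwise: union1d(union1d(arr1, arr2), arr3)
union_all = reduce(np.union1d, (arr1, arr2, arr3))

print(union_all)  # [1 2 3 4 5 6 7 8]
```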
Finding Intersection of Sets
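The intersection returns the elements that are common to both sets. Use np.intersect1d() to find it. Passing assume_unique=True tells NumPy that both inputs are already free of duplicates, which lets it skip the deduplication step; leave it at the default (False) if the arrays may contain repeated values.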
Program:
Intersection of Two Arrays
import numpy as np
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([3, 4, 5, 6])
intersection_arr = np.intersect1d(arr1, arr2, assume_unique=True)
print(intersection_arr)
Output:
[3 4]
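Sometimes it helps to know where the common elements sit in each input. In NumPy 1.15 and later, np.intersect1d() accepts a return_indices parameter for this. The sketch below reuses the arrays from the program above; the names common, idx1, and idx2 are illustrative.

```python
import numpy as np

arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([3, 4, 5, 6])

# return_indices=True also returns the positions of the common values
# in arr1 and in arr2 (first occurrence of each value)
common, idx1, idx2 = np.intersect1d(arr1, arr2, return_indices=True)

print(common)  # [3 4]
print(idx1)    # [2 3]  -> arr1[2] == 3, arr1[3] == 4
print(idx2)    # [0 1]  -> arr2[0] == 3, arr2[1] == 4
```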
Finding Difference Between Sets
The difference returns elements present in the first set but not in the second. Use np.setdiff1d() for this.
Program:
Difference of Two Arrays
import numpy as np
set1 = np.array([1, 2, 3, 4])
set2 = np.array([3, 4, 5, 6])
diff_arr = np.setdiff1d(set1, set2, assume_unique=True)
print(diff_arr)
Output:
[1 2]
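Unlike union and intersection, the difference depends on the order of its arguments. The sketch below shows both directions on the same arrays and, as an alternative, builds the same result from a boolean membership mask with np.isin(). The variable names are illustrative.

```python
import numpy as np

set1 = np.array([1, 2, 3, 4])
set2 = np.array([3, 4, 5, 6])

# Elements of set1 that are not in set2
only_in_set1 = np.setdiff1d(set1, set2)
# Swapping the arguments gives the elements of set2 that are not in set1
only_in_set2 = np.setdiff1d(set2, set1)

print(only_in_set1)  # [1 2]
print(only_in_set2)  # [5 6]

# np.isin returns a boolean mask marking which elements of set1 occur in set2
mask = np.isin(set1, set2)
print(mask)          # [False False  True  True]
print(set1[~mask])   # [1 2]
```

The mask-based version keeps the original order and any duplicates of set1, whereas np.setdiff1d() returns sorted, unique values by default.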
Finding Symmetric Difference
The symmetric difference returns elements that are in either of the sets but not in both. Use np.setxor1d().
Program:
Symmetric Difference of Two Arrays
import numpy as np
set1 = np.array([1, 2, 3, 4])
set2 = np.array([3, 4, 5, 6])
sym_diff_arr = np.setxor1d(set1, set2, assume_unique=True)
print(sym_diff_arr)
Output:
[1 2 5 6]
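The symmetric difference can also be understood as the union of the two one-sided differences. The sketch below checks that identity on the same arrays; it is only meant to illustrate the definition, and np.setxor1d() remains the direct way to compute the result.

```python
import numpy as np

set1 = np.array([1, 2, 3, 4])
set2 = np.array([3, 4, 5, 6])

sym_diff = np.setxor1d(set1, set2)

# (set1 minus set2) united with (set2 minus set1)
rebuilt = np.union1d(np.setdiff1d(set1, set2), np.setdiff1d(set2, set1))

print(rebuilt)                            # [1 2 5 6]
print(np.array_equal(sym_diff, rebuilt))  # True
```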