Toolbox: Name and Frequency of unique elements from a List in Python
January 23, 2023
Sometimes we need the names and counts of the unique elements in a list or array. Usually, we use the pandas method .value_counts(), which returns a frequency table. In some cases, though, pandas is not convenient. Maybe our processing pipeline uses only numpy (a common scenario with ML models), but at some intermediate data processing step we need to check the counts of categorical values. My current use case: I perform clustering and want to log how many values were assigned to each label.
The function for this task is short:
    from typing import Dict, Iterable

    import numpy as np


    def get_categories_frequency(ds: Iterable) -> Dict:
        """
        Function calculates how many times each unique category appears in a list.

        Parameters
        ----------
        ds : Iterable
            List-like object with categorical data.

        Returns
        -------
        : Dict
            Dict with ``{category: frequency}`` records.
        """
        lbls, counts = np.unique(ds, return_counts=True)
        zipped = dict(zip(lbls, counts))
        return zipped
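To see the same idea on the clustering use case, here is a minimal sketch. The `labels` array is made-up illustration data, not output from a real model. The `.tolist()` calls are an optional addition that is not part of the function above: they convert NumPy scalars to plain Python ints, which tends to make logging and JSON serialization simpler.

```python
import numpy as np

# Hypothetical cluster labels (illustration only); -1 stands for "noise".
labels = np.array([0, 1, 1, 2, 0, 1, -1])

# np.unique returns the sorted unique values and, with return_counts=True,
# how many times each of them occurs.
lbls, counts = np.unique(labels, return_counts=True)

# Optional: .tolist() turns NumPy scalars into native Python ints.
frequency = dict(zip(lbls.tolist(), counts.tolist()))

print(frequency)  # {-1: 1, 0: 2, 1: 3, 2: 1}
```

Because `np.unique` sorts the unique values, the resulting dict is ordered by label rather than by frequency.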