Toolbox: Check if text from a DataFrame is part of another phrase with Python and Pandas
Imagine that we have a database of specific words. We expect them to be part of a longer piece of text; for example, they might appear in a URL. Pandas has the method isin(), but it checks only for exact matches. Another method, str.contains(), checks whether the strings in our Series contain a specific phrase. However, we have the reverse problem: we want to find out whether the strings in our Series are part of another text fragment.
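To see why neither built-in method fits, here is a minimal sketch that applies both to a sample Series. The word list and the URL are illustrative values only, and `regex=False` is passed so that the URL is treated as a literal string rather than a regular expression.

```python
import pandas as pd

series = pd.Series(['apple', 'orange', 'berry'])
phrase = 'https://www.apple.com/'

# isin() compares whole values: no element is equal to the full URL.
exact = series.isin([phrase])

# str.contains() asks whether each element contains the phrase,
# which is the opposite direction of what we need.
contains = series.str.contains(phrase, regex=False)

print(exact.tolist())     # [False, False, False]
print(contains.tolist())  # [False, False, False]
```

Both checks return False for every row, even though 'apple' clearly appears inside the URL.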
How do we do it? A minimal solution is to apply a lambda expression to each element:
import pandas as pd

texts = ['apple', 'orange', 'berry']
phrase = 'https://www.apple.com/'

series = pd.Series(texts)
# For each element, check whether it occurs inside the phrase.
test_output = series.apply(lambda x: x in phrase)
print(test_output)
0     True
1    False
2    False
dtype: bool
The output is a boolean Series that we can later use as a mask to select the records of interest.
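As a short sketch of that last step, the boolean Series can be used to index the original one. The names `mask`, `matches`, and `ci_mask` are illustrative, and the case-insensitive variant is an optional extra, included on the assumption that the stored words and the phrase may differ in letter case.

```python
import pandas as pd

texts = ['apple', 'orange', 'berry']
phrase = 'https://www.apple.com/'
series = pd.Series(texts)

# Boolean mask: True where the element occurs inside the phrase.
mask = series.apply(lambda x: x in phrase)

# Keep only the rows where the mask is True.
matches = series[mask]
print(matches.tolist())  # ['apple']

# Optional variant: ignore letter case on both sides of the comparison.
phrase_lower = phrase.lower()
ci_mask = series.str.lower().apply(lambda x: x in phrase_lower)
```

The same mask works on a DataFrame: build it from the relevant column and pass it to the DataFrame's indexer to filter whole rows.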