How Can I Group Words Of A String Based On A List Of Words?

I have a list of words and a string. I want to build a new list from them: whenever a word of the string is in the list, and the words that follow it in the string are also in the list, they should be joined and stored as a single element of the new list. For the inputs below, the result I want is new_list.

keyword_list = ['individual', 'fixed', 'treatments', 'deposit', 'health',
                'millions', 'panic', 'decision', 'policy', 'insurance', 'account']

string1 = 'i want to buy individual insurance policy and you can get upto 2 millions for the cover do not panic i also want to open fixed deposit account'

new_list = ['individual insurance policy',
            'millions', 'panic', 'fixed deposit account']

Answer

You can group consecutive words of the string based on their presence in the keyword_list and join each group with " ".

>>> data = 'i want to buy individual insurance policy and you can get upto 2 millions for the cover do not panic i also want to open fixed deposit account'
>>> keyword_list = ['individual', 'fixed', 'treatments', 'deposit', 'health',
...                 'millions', 'panic', 'decision', 'policy', 'insurance', 'account']

Now, let's convert the keyword_list to a set so that the lookups will be faster.

>>> keys = set(keyword_list)

Now, let's group the words in data based on their presence in keys, like this:

>>> from itertools import groupby
>>> [" ".join(grp) for res, grp in groupby(data.split(), keys.__contains__) if res]
['individual insurance policy', 'millions', 'panic', 'fixed deposit account']

groupby calls keys.__contains__ on every element of the iterable passed to it, in our case data.split(), and puts consecutive elements that give the same result into one group. Each group is yielded together with that result (res), which is True for a run of keywords and False for a run of other words. Since we are interested only in the words that are present in keys, we filter with if res in the list comprehension.
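
The same idea can be wrapped in a small reusable function. This is only a sketch of the approach above: the name group_keywords and its parameters are made up for illustration and are not part of the original answer. Like the one-liner, it assumes that words are separated by whitespace and that matching is exact, so it is case-sensitive and does not strip punctuation.

```python
from itertools import groupby


def group_keywords(text, keywords):
    """Join runs of consecutive keyword words in text into phrases.

    group_keywords is a hypothetical helper name used for illustration.
    """
    keys = set(keywords)  # set membership tests are fast
    phrases = []
    # groupby yields (result, group) pairs, one per run of consecutive
    # words for which keys.__contains__ returns the same value.
    for is_keyword, group in groupby(text.split(), key=keys.__contains__):
        if is_keyword:
            phrases.append(" ".join(group))
    return phrases


keyword_list = ['individual', 'fixed', 'treatments', 'deposit', 'health',
                'millions', 'panic', 'decision', 'policy', 'insurance', 'account']
string1 = ('i want to buy individual insurance policy and you can get upto 2 '
           'millions for the cover do not panic i also want to open fixed '
           'deposit account')

new_list = group_keywords(string1, keyword_list)
```

Because groupby only merges neighbouring elements, keywords separated by a non-keyword word (such as millions and panic here) end up as separate entries in new_list.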

source: stackoverflow.com