Flatten Nested List In Pandas Containing Nan

I have a table like this:

index | country
---------------
1     | [nan]
2     | [nan, DE]
3     | [nan, [IT, DE]]
4     | [[FR]]
5     | [[AE], nan, [AE,  MT], [MX]]

I need to turn this column into flat lists of unique values without NaNs:

index | country
---------------
1     | []
2     | [DE]
3     | [IT, DE]
4     | [FR]
5     | [AE, MT, MX]

As a first step, I tried to flatten the lists with this function:

df.applymap(lambda x: [z for y in x for z in y])

But I get the following error:

TypeError: 'float' object is not iterable

I tried several other functions that I found in another SO question, but they all give me the same error.

Answer

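The error occurs because nan is a float, and your comprehension tries to iterate over every element of the list. This generator only recurses into elements that are iterable, so it should work for arbitrarily nested lists: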

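from collections.abc import Iterable

def flatten(l):
    """Recursively yield the leaf elements of an arbitrarily nested iterable."""
    for el in l:
        # Recurse into iterables, but treat strings and bytes as atomic values.
        if isinstance(el, Iterable) and not isinstance(el, (str, bytes)):
            yield from flatten(el)
        else:
            # Scalars, including float NaN, are yielded unchanged.
            yield el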
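
As a quick sanity check, here is a small sketch that runs the generator on one nested value shaped like row 5 of the question, next to the comprehension from the question. The names `sample`, `flat` and `error_message` are only for illustration.

```python
from collections.abc import Iterable

def flatten(l):
    for el in l:
        if isinstance(el, Iterable) and not isinstance(el, (str, bytes)):
            yield from flatten(el)
        else:
            yield el

nan = float('nan')
sample = [['AE'], nan, ['AE', 'MT'], ['MX']]  # illustrative value

# flatten() skips the recursion for nan because a float is not iterable.
flat = list(flatten(sample))
print(flat)  # ['AE', nan, 'AE', 'MT', 'MX']

# The comprehension from the question iterates over every element,
# so it fails as soon as it reaches the float.
error_message = None
try:
    [z for y in sample for z in y]
except TypeError as exc:
    error_message = str(exc)
print(error_message)  # 'float' object is not iterable
```

Duplicates and NaN are still present at this point; they are removed in a separate step.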

Recreating your df:

import pandas as pd

# Note: the first row combines rows 1 and 2 of the question's table.
df = pd.DataFrame([[[[float('nan')],[float('nan'), 'DE']]],
                   [[[float('nan'), ['IT', 'DE']]]],
                   [[[['FR']]]],
                   [[[['AE'], float('nan'), ['AE', 'MT'], ['MX']]]]],columns=['country'])

# Flatten each cell, deduplicate with set(), then drop the NaN entries.
df['country'] = (
    df['country']
    .apply(lambda x: list(set(flatten(x))))
    .apply(lambda x: [i for i in x if str(i) != 'nan'])
)

This gives the following output (the order of elements within each list may vary, because sets are unordered):

    country
0   [DE]
1   [IT, DE]
2   [FR]
3   [AE, MT, MX]
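
If the order of first appearance matters, one possible variant deduplicates with `dict.fromkeys` instead of `set` and drops NaN with an explicit float check. This is a sketch, not part of the original answer: the helper name `flat_unique` is hypothetical, and the sample frame is an assumption that mirrors the five rows of the question's table.

```python
from collections.abc import Iterable
import math

import pandas as pd

def flatten(l):
    for el in l:
        if isinstance(el, Iterable) and not isinstance(el, (str, bytes)):
            yield from flatten(el)
        else:
            yield el

def flat_unique(cell):
    """Flatten one cell, drop NaN, keep first-seen order (hypothetical helper)."""
    values = (v for v in flatten(cell)
              if not (isinstance(v, float) and math.isnan(v)))
    # dict keys are unique and keep insertion order.
    return list(dict.fromkeys(values))

nan = float('nan')
# Assumed sample data, one cell per row of the question's table.
df = pd.DataFrame({'country': [
    [nan],
    [nan, 'DE'],
    [nan, ['IT', 'DE']],
    [['FR']],
    [['AE'], nan, ['AE', 'MT'], ['MX']],
]})

df['country'] = df['country'].apply(flat_unique)
print(df['country'].tolist())
# [[], ['DE'], ['IT', 'DE'], ['FR'], ['AE', 'MT', 'MX']]
```

Doing both steps in one function also avoids the second `apply` pass over the column.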
source: stackoverflow.com