
Getting All Unique Strings From A List Of Nested Lists And Tuples

- 1 answer

Is there a fast way to get the unique elements, especially the strings, from a list or tuple of nested lists and tuples? Strings like 'min' and 'max' should be removed. The lists and tuples could be nested in any possible way. The only elements that always have the same form are the tuples at the core, like ('a',0,49), which contain the strings.

For example, a list or tuple like these:

lst1=[[(('a',0,49),('b',0,70)),(('c',0,49))],
     [(('c',0,49),('e',0,70)),(('a',0,'max'),('b',0,100))]]

tuple1=([(('a',0,49),('b',0,70)),(('c',0,49))],
     [(('c',0,49),('e',0,70)),(('a',0,'max'),('b',0,100))]) 

Desired output:

uniquestrings = ['a','b','c','e']

What I tried so far:

flat_list = list(sum([item for sublist in x for item in sublist],()))

But this does not reach the "core" of the nested object.
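To illustrate the problem, here is a small sketch of what that expression produces. It assumes that x is the lst1 shown above, which the question does not state explicitly. The comprehension and sum() each remove one level of nesting, so the innermost tuples survive wherever the data is nested more deeply:

```python
lst1 = [[(('a', 0, 49), ('b', 0, 70)), (('c', 0, 49))],
        [(('c', 0, 49), ('e', 0, 70)), (('a', 0, 'max'), ('b', 0, 100))]]

x = lst1  # assumption: x refers to the example list

# The comprehension flattens the outer list of lists,
# and sum(..., ()) concatenates the resulting tuples.
flat_list = list(sum([item for sublist in x for item in sublist], ()))

# Some elements are still tuples, others are already bare values.
for element in flat_list:
    print(repr(element))
```

Only ('c',0,49) is fully unpacked, because (('c',0,49)) is just a parenthesized tuple and therefore sits one level higher than the others.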

Answer

This will get every string inside the given iterable, regardless of how deeply it is nested:

def isIterable(obj):
    # kudos: https://stackoverflow.com/a/1952481/7505395
    try:
        iter(obj)
        return True
    except TypeError:
        # iter() raises TypeError for objects that are not iterable
        return False

# shortcut
def isString(x):
    return isinstance(x, str)

def chainme(iterab):
    # strings are iterable too, so skip those from chaining
    if isIterable(iterab) and not isString(iterab):
        for a in iterab:
            yield from chainme(a)
    else: 
        yield iterab

lst1=[[(('a',0,49),('b',0,70)),(('c',0,49))],
     [(('c',0,49),('e',0,70)),(('a',0,'max'),('b',0,100))]]

tuple1=([(('a',0,49),('b',0,70)),(('c',0,49))],
     [(('c',0,49),('e',0,70)),(('a',0,'max'),('b',0,100))]) 


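for k in [lst1, tuple1]:
    # keep only the strings from the flattened elements
    strings = [x for x in chainme(k) if isString(x)]
    print(strings)
    # set() removes duplicates, sorted() gives a stable order
    print(sorted(set(strings)))
    print()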

Output:

['a', 'b', 'c', 'c', 'e', 'a', 'max', 'b'] # list
['a', 'b', 'c', 'e', 'max']                # sorted set of list

['a', 'b', 'c', 'c', 'e', 'a', 'max', 'b']
['a', 'b', 'c', 'e', 'max']
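
The output above still contains 'max', which the question wanted removed. A minimal sketch of one way to drop such strings follows. The names iter_leaves, unique_strings and the exclude parameter are hypothetical and not part of the original answer. The recursion is the same idea as chainme, written with collections.abc.Iterable instead of a try/except check:

```python
from collections.abc import Iterable


def iter_leaves(obj):
    """Yield every non-container element of an arbitrarily nested iterable."""
    # Strings are iterable too, so treat them as leaves to avoid endless recursion.
    if isinstance(obj, Iterable) and not isinstance(obj, (str, bytes)):
        for item in obj:
            yield from iter_leaves(item)
    else:
        yield obj


def unique_strings(nested, exclude=("min", "max")):
    """Return the sorted unique strings in nested, skipping those in exclude."""
    excluded = set(exclude)
    return sorted({x for x in iter_leaves(nested)
                   if isinstance(x, str) and x not in excluded})


lst1 = [[(('a', 0, 49), ('b', 0, 70)), (('c', 0, 49))],
        [(('c', 0, 49), ('e', 0, 70)), (('a', 0, 'max'), ('b', 0, 100))]]

tuple1 = ([(('a', 0, 49), ('b', 0, 70)), (('c', 0, 49))],
          [(('c', 0, 49), ('e', 0, 70)), (('a', 0, 'max'), ('b', 0, 100))])

print(unique_strings(lst1))    # ['a', 'b', 'c', 'e']
print(unique_strings(tuple1))  # ['a', 'b', 'c', 'e']
```

Building a set directly skips the intermediate list. Whether this counts as "fast" for large inputs has not been measured here.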
source: stackoverflow.com