Pandas: Filtering Rows With Data In Particular Column On The Separator Basis

- 1 answer

I am trying to return only the rows where the value in the director column contains the separator '|'. However, the filter has no effect and all rows are returned. What is the likely cause?

I tried the following:

hb_dctr = df_updated[df_updated['director'].str.contains('|')]
hb_dctr

But it shows the following:

id      popularity   budget     Cast                        director
135397  32.985763    150000000  Chris Pratt|Irrfan Khan     Colin Trevorrow
76341   28.419936    150000000  Tom Hardy|Charlize Theron   George Miller
76757   6.189369     176000003  Mila Kunis|Channing         Lana Wachowski|Lilly Wachowski

It should only show the rows with id 135397 and 76341.

Answer

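Escape | because it is a special regex character (alternation, "or"). Unescaped, the pattern matches the empty string, so every row passes the filter: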

df1 = df[df.director.str.contains(r"\|")]
print(df1)
      id  popularity     budget                      Cast  \
2  76757    6.189369  176000003  Mila Kunis|Channing Lana   

                    director  
2  Wachowski|Lilly Wachowski  
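
To see why the unescaped pattern kept every row, the behaviour can be reproduced with Python's re module directly. This is a minimal sketch, assuming the three director values from the question; the variable names are illustrative only:

```python
import re

# Director values copied from the question's sample rows.
directors = [
    "Colin Trevorrow",
    "George Miller",
    "Lana Wachowski|Lilly Wachowski",
]

# An unescaped '|' means "empty pattern OR empty pattern", which matches
# the empty string at the start of any input, so every value matches.
unescaped = [bool(re.search("|", d)) for d in directors]

# Escaping the pipe (in a raw string) makes it a literal character.
escaped = [bool(re.search(r"\|", d)) for d in directors]

print(unescaped)  # [True, True, True]
print(escaped)    # [False, False, True]
```

The raw-string prefix keeps Python from interpreting the backslash itself, so the regex engine receives the two characters \| unchanged.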

To select the rows that do not contain the separator, invert the mask with ~:

df2 = df[~df.director.str.contains(r"\|")]
print(df2)
       id  popularity     budget                       Cast         director
0  135397   32.985763  150000000    Chris Pratt|Irrfan Khan  Colin Trevorrow
1   76341   28.419936  150000000  Tom Hardy|Charlize Theron    George Miller

Details:

print(df.director.str.contains(r"\|"))
0    False
1    False
2     True
Name: director, dtype: bool

print(~df.director.str.contains(r"\|"))
0     True
1     True
2    False
Name: director, dtype: bool
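
An alternative that avoids escaping altogether is to turn regex matching off. The sketch below is self-contained and assumes a DataFrame rebuilt from the rows in the question (the Cast, popularity and budget columns are left out for brevity); the names has_sep, with_sep and without_sep are illustrative:

```python
import pandas as pd

# Rebuilt from the sample rows in the question; other columns omitted.
df = pd.DataFrame({
    "id": [135397, 76341, 76757],
    "director": [
        "Colin Trevorrow",
        "George Miller",
        "Lana Wachowski|Lilly Wachowski",
    ],
})

# regex=False makes str.contains treat the pattern as a literal substring,
# so the pipe needs no escaping. na=False treats a missing director as
# "no separator" instead of propagating NaN into the boolean mask.
has_sep = df["director"].str.contains("|", regex=False, na=False)

with_sep = df[has_sep]       # rows whose director contains '|'
without_sep = df[~has_sep]   # rows whose director does not

print(with_sep["id"].tolist())     # [76757]
print(without_sep["id"].tolist())  # [135397, 76341]
```

Whether to prefer this over the escaped regex is a matter of taste; the literal form may be easier to read when the separator is the only thing being searched for.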
source: stackoverflow.com