Split Two Tags And Append Them Separately In Bs4 Python

I have a TR[2] whose content is dynamic, and I try to select it like this:

self.soup.select("#detail > tbody > tr > td:nth-of-type(2)")

I want every td[3] in it. Their content varies: a cell may contain only a string, or both a string and an <a href> tag. I want to collect the string in one variable and the text of the <a> tag in another. The important part is that for a td with no <a>, I want to append None, because both variables must have the same length and indexing so that I can zip them correctly for further use. Here is an example:

<td class='bolt'>
  "the text I want"
  <br>
  <a href='Javascript:void(0);'>the other text i want</a>
</td>

After appending, the variables should look like this:

event = ["the text I want"]
vessel = ["the other text i want"]

Another possible td:

<td class='bolt'>
   "another string we need"
</td>

And the final result:

event = ["the text I want","another string we need"]
vessel = ["the other text i want", None]  # None or an empty value

Answer

If each cell contains one or two text nodes (as described in the question), you can use:

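vessel = []
event = []
for td in self.soup.select("#detail > tbody > tr > td:nth-of-type(2)"):
    # Non-empty text nodes of the cell, with surrounding whitespace stripped
    texts = [s.strip() for s in td.strings if s.strip()]
    event.append(texts[0])
    # Pad with None so a cell without an <a> still yields a second value
    vessel.append((texts + [None])[1])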

print(event)
['"the text I want"', '"another string we need"']
print(vessel)
['the other text i want', None]
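
For anyone who wants to try this outside the original class, here is a minimal self-contained sketch of the same idea. The sample table is hypothetical: it is modelled on the two cells from the question, and the `id="detail"` table, the first `<td>` in each row, and the explicit `<tbody>` are assumptions made so the selector matches. It uses `stripped_strings`, which yields the same non-empty, stripped text nodes as the list comprehension above.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup modelled on the two cells in the question.
HTML = """
<table id="detail">
  <tbody>
    <tr>
      <td>1</td>
      <td class="bolt">
        "the text I want"
        <br>
        <a href="Javascript:void(0);">the other text i want</a>
      </td>
    </tr>
    <tr>
      <td>2</td>
      <td class="bolt">
        "another string we need"
      </td>
    </tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(HTML, "html.parser")

event = []
vessel = []
for td in soup.select("#detail > tbody > tr > td:nth-of-type(2)"):
    # stripped_strings yields each non-empty text node with whitespace trimmed.
    texts = list(td.stripped_strings)
    event.append(texts[0])
    # Pad with None so a cell without an <a> still yields a second value.
    vessel.append((texts + [None])[1])

# Both lists have the same length, so they can be zipped safely.
pairs = list(zip(event, vessel))
print(pairs)
```

Note that `html.parser` does not insert a missing `<tbody>`, so the `> tbody >` step in the selector only matches if the markup actually contains one.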

Let me know if there are more complex cases.
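
One possible variant for more complex cells, offered as a sketch rather than as part of the original answer: instead of relying on the order of the text nodes, look up the `<a>` explicitly and take the event text only from the text nodes that are direct children of the `<td>`. The helper name `split_cell` and the sample markup (including the third row, where the link comes first) are hypothetical.

```python
from bs4 import BeautifulSoup, NavigableString

# Hypothetical sample markup; the third row puts the link before the text.
HTML = """
<table id="detail">
  <tbody>
    <tr><td>1</td><td class="bolt">"the text I want"<br><a href="#">the other text i want</a></td></tr>
    <tr><td>2</td><td class="bolt">"another string we need"</td></tr>
    <tr><td>3</td><td class="bolt"><a href="#">linked first</a><br>"text after the link"</td></tr>
  </tbody>
</table>
"""


def split_cell(td):
    """Return (event, vessel) for one cell; vessel is None when there is no <a>."""
    link = td.find("a")
    vessel = link.get_text(strip=True) if link else None
    # Keep only text nodes that are direct children of the td,
    # so the link text is never mistaken for the event text.
    direct = [
        child.strip()
        for child in td.children
        if isinstance(child, NavigableString) and child.strip()
    ]
    event = " ".join(direct) if direct else None
    return event, vessel


soup = BeautifulSoup(HTML, "html.parser")
cells = soup.select("#detail > tbody > tr > td:nth-of-type(2)")
results = [split_cell(td) for td in cells]

# Unzip into the two parallel lists used in the question.
event = [e for e, _ in results]
vessel = [v for _, v in results]
print(event)
print(vessel)
```

With this approach the two lists stay aligned even when the `<a>` appears before the plain text, at the cost of a slightly longer loop body.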

source: stackoverflow.com