Python: Splitting a Sentence Based on Several Tokens
I want to split a sentence based on several keywords:
p = r'(?:^|\s)(standard|of|total|sum)(?:\s|$)'
re.split(p,'10-methyl-Hexadecanoic acid of total fatty acids')
This outputs:
['10-methyl-Hexadecanoic acid', 'of', 'total fatty acids']
Expected output: ['10-methyl-Hexadecanoic acid', 'of', 'total', 'fatty acids']
I am not sure why the regular expression does not split on the token 'total'.
Answer
You may use whitespace lookarounds instead of the consuming groups:
import re
p = r'(?<!\S)(standard|of|total|sum)(?!\S)'
s = '10-methyl-Hexadecanoic acid of total fatty acids'
print([x.strip() for x in re.split(p,s) if x.strip()])
# => ['10-methyl-Hexadecanoic acid', 'of', 'total', 'fatty acids']
Details
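- (?<!\S)(standard|of|total|sum)(?!\S) matches and captures into Group 1 any of the listed words when it is enclosed by whitespace or sits at the start/end of the string. Because the lookarounds only check the surrounding whitespace without consuming it, adjacent keywords such as "of total" both match; the original (?:\s|$) consumed the space after "of", leaving none for (?:^|\s) to match before "total".
- The comprehension gets rid of blank items (if x.strip()), and x.strip() trims whitespace from each non-blank item.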
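To make the difference concrete, here is a minimal sketch that runs both patterns over the sentence from the question and then wraps the lookaround approach in a small helper. The function name split_on_keywords and the use of re.escape to build the alternation are assumptions added for illustration; they are not part of the original answer.

```python
import re

s = '10-methyl-Hexadecanoic acid of total fatty acids'

# Pattern from the question: the trailing (?:\s|$) consumes the space after
# 'of', so no whitespace is left for (?:^|\s) to match in front of 'total'.
consuming = r'(?:^|\s)(standard|of|total|sum)(?:\s|$)'

# Pattern from the answer: the whitespace is only checked, never consumed.
lookaround = r'(?<!\S)(standard|of|total|sum)(?!\S)'

print(re.findall(consuming, s))   # ['of']
print(re.findall(lookaround, s))  # ['of', 'total']


def split_on_keywords(text, keywords):
    """Split text on whole-word keywords and keep the keywords.

    Hypothetical helper: it builds the same lookaround pattern from a list,
    escaping each keyword so regex metacharacters are treated literally.
    """
    alternation = '|'.join(re.escape(k) for k in keywords)
    pattern = r'(?<!\S)({})(?!\S)'.format(alternation)
    return [part.strip() for part in re.split(pattern, text) if part.strip()]


print(split_on_keywords(s, ['standard', 'of', 'total', 'sum']))
# ['10-methyl-Hexadecanoic acid', 'of', 'total', 'fatty acids']
```

Building the pattern from a list keeps the keyword set in one place, which may be convenient if the tokens change often.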
source: stackoverflow.com