Python Splitting A Sentence Based On Several Tokens
I want to split a sentence based on several keywords:
import re
p = r'(?:^|\s)(standard|of|total|sum)(?:\s|$)'
re.split(p, '10-methyl-Hexadecanoic acid of total fatty acids')
Actual output: ['10-methyl-Hexadecanoic acid', 'of', 'total fatty acids']
Expected output: ['10-methyl-Hexadecanoic acid', 'of', 'total', 'fatty acids']
I am not sure why the regular expression does not split on the token 'total'.
You may use
import re
p = r'(?<!\S)(standard|of|total|sum)(?!\S)'
s = '10-methyl-Hexadecanoic acid of total fatty acids'
print([x.strip() for x in re.split(p, s) if x.strip()])
# => ['10-methyl-Hexadecanoic acid', 'of', 'total', 'fatty acids']
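- (?<!\S)(standard|of|total|sum)(?!\S) will match and capture into Group 1 any of the listed words when it is surrounded by whitespace or sits at the start/end of the string. Because lookarounds do not consume the whitespace, the space between 'of' and 'total' stays available for the next match. The original pattern consumed that space while matching ' of ', which is why it missed 'total'.
- The list comprehension gets rid of blank items (if x.strip()), and x.strip() trims whitespace from each non-blank item.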
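To make the difference concrete, here is a minimal sketch that runs both patterns on the same input and wraps the lookaround version in a reusable helper. The helper name split_on_keywords and its use of re.escape are illustrative assumptions, not part of the original answer.

```python
import re

def split_on_keywords(sentence, keywords):
    """Split `sentence` on whole-word `keywords`, keeping the keywords.

    Hypothetical helper for illustration only.
    """
    # re.escape guards against keywords that contain regex metacharacters.
    alternation = "|".join(re.escape(k) for k in keywords)
    # Lookarounds assert a whitespace/string boundary without consuming it.
    pattern = rf"(?<!\S)({alternation})(?!\S)"
    # Drop blank pieces and trim whitespace from the rest.
    return [part.strip() for part in re.split(pattern, sentence) if part.strip()]

s = "10-methyl-Hexadecanoic acid of total fatty acids"
keywords = ["standard", "of", "total", "sum"]

# Pattern from the question: matching ' of ' consumes the space before 'total',
# so 'total' no longer has a leading whitespace character to match against.
consuming = r"(?:^|\s)(standard|of|total|sum)(?:\s|$)"
print(re.split(consuming, s))
# ['10-methyl-Hexadecanoic acid', 'of', 'total fatty acids']

print(split_on_keywords(s, keywords))
# ['10-methyl-Hexadecanoic acid', 'of', 'total', 'fatty acids']
```

Building the alternation from a list keeps the keywords in one place if more tokens need to be added later.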