
R String Split On Parentheses, Keeping The Parentheses In The Split With Its Content


I am trying to split strings of the format

x <- "A(B)C"

where A, B and C could be empty strings or any sequence of characters other than parentheses. The parentheses are always present, and I want to keep them around the characters they enclose, so that the result would be:

"A" "(B)" "C"

So far my best try was:

strsplit(x, "(?<=\\))|(?=\\()", perl = TRUE)
[[1]]
[1] "A"  "("  "B)" "C"

but that keeps the opening parenthesis separate. Any ideas?


Answer

You can use

x <- "A(B)C"
library(stringr)
str_extract_all(x, "\\([^()]*\\)|[^()]+")

Details:

  • \([^()]*\) - a literal (, then zero or more characters other than ( and ), then a literal )
  • | - or
  • [^()]+ - one or more characters other than ( and ).
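
If adding a stringr dependency is undesirable, the same pattern should also work in base R with gregexpr() and regmatches(). The following is a minimal sketch, assuming the input contains no nested parentheses, as described in the question; split_keep_parens is a hypothetical helper name used only for illustration, not part of any package.

```r
# Hypothetical helper: base R equivalent of the str_extract_all() call above.
split_keep_parens <- function(x) {
  # Match either a parenthesised group or a run of non-parenthesis characters.
  pattern <- "\\([^()]*\\)|[^()]+"
  # gregexpr() finds every match; regmatches() extracts them (one list element per input string).
  regmatches(x, gregexpr(pattern, x, perl = TRUE))
}

split_keep_parens("A(B)C")[[1]]  # "A" "(B)" "C"
split_keep_parens("(B)C")[[1]]   # "(B)" "C"
split_keep_parens("A()")[[1]]    # "A" "()"
```

Because this approach extracts matches instead of splitting, an empty A or C produces no element at all, not an empty string. If the result must always have three elements, the missing positions would need to be padded separately.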
source: stackoverflow.com