
Regex from stringr::str_detect works, but the same regex from tidyselect::matches returns an error

- 1 answer

I'm confused about this inconsistency in the tidyverse and am not sure what's going on.

Test data:

test <- data.frame(test_gibberish = 1,
                   test_prob_gibberish = 2)

I now want to check whether there is a column that ends with "_gibberish" but is not preceded by "_prob".

This one works and returns the correct result:

stringr::str_detect(names(test), "(?<!_prob)_gibberish$")
[1]  TRUE FALSE

However, this one returns an error, despite using exactly the same regex:

test |> 
  dplyr::select(tidyselect::matches("(?<!_prob)_gibberish$"))

Error in `dplyr::select()`:
! invalid regular expression '(?<!_prob)_gibberish$', reason 'Invalid regexp'
Run `rlang::last_error()` to see where the error occurred.
Warning message:
In grep(needle, haystack, ...) :
  TRE pattern compilation error 'Invalid regexp'

Is my regex wrong? Is stringr wrong? Is tidyselect wrong?


Answer

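Your regex is fine. According to ?tidyselect::matches, perl = FALSE by default, so the pattern is handed to R's default TRE engine (the one named in your warning), which does not support lookbehind. stringr uses the ICU engine, which does, and that is why the same pattern works in str_detect(). The signature is: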

matches(match, ignore.case = TRUE, perl = FALSE, vars = NULL)

Setting perl = TRUE makes the call work:

test |> 
  dplyr::select(tidyselect::matches("(?<!_prob)_gibberish$", perl = TRUE))

Output:

  test_gibberish
1              1

The negative lookbehind (?<!_prob) is valid Perl-compatible (PCRE) syntax, so the pattern compiles once perl = TRUE is set.
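
The engine difference can be reproduced without dplyr or tidyselect. The sketch below is only an illustration, not tidyselect's actual implementation: it assumes that base grepl() behaves like the grep() call shown in the warning, and the variable names (cols, pattern, default_result, perl_result) are made up for the example.

```r
# Column names and pattern taken from the question
cols <- c("test_gibberish", "test_prob_gibberish")
pattern <- "(?<!_prob)_gibberish$"

# Default engine (perl = FALSE, i.e. TRE): the lookbehind fails to compile.
# The error is caught and replaced by NULL; the accompanying warning is silenced.
default_result <- tryCatch(
  suppressWarnings(grepl(pattern, cols)),
  error = function(e) NULL
)

# PCRE engine (perl = TRUE): the lookbehind is accepted.
perl_result <- grepl(pattern, cols, perl = TRUE)

print(is.null(default_result))  # TRUE: the default engine rejected the pattern
print(perl_result)              # TRUE FALSE, same as the str_detect() result
```

If you switch engines often, it may be simpler to always pass perl = TRUE whenever a pattern contains lookarounds.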

source: stackoverflow.com