Split Data Frame String Column Into Multiple Columns (comma Separated Characters)

I am trying to split a column of comma-separated characters into several columns (one column per distinct character overall). I read similar questions such as this:

Split data frame string column into multiple columns

But the solutions were based on a small number of possible characters, so that the new columns could be named in advance, before the column was split.

This is my example:

subject <- c(1,2,3)
letters <- c("a, b, f, g", "b, g, m, l", "g, m, z")

df1 <- data.frame(subject, letters)

df1

  subject    letters
1       1 a, b, f, g
2       2 b, g, m, l
3       3    g, m, z

The desired result would be:

  subject a b f g l m z
1       1 1 1 1 1 0 0 0
2       2 0 1 0 1 1 1 0
3       3 0 0 0 1 0 1 1

Thanks in advance for any help.

Answer

One option is to use str_split, unnest_longer and table:

subject <- c(1,2,3)
letters <- c("a, b, f, g", "b, g, m, l", "g, m, z")

df1 <- data.frame(subject, letters)


library(tidyverse)

df1 %>%
  mutate(letters = str_split(letters, ', ')) %>%
  unnest_longer(letters) %>%
  table
#>        letters
#> subject a b f g l m z
#>       1 1 1 1 1 0 0 0
#>       2 0 1 0 1 1 1 0
#>       3 0 0 0 1 0 1 1

Created on 2022-02-10 by the reprex package (v2.0.0)
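
For comparison, the same split-then-tabulate idea can be sketched in base R, without the tidyverse. This is only an illustration: the names parts, counts and wide are made up here, and stringsAsFactors = FALSE is set explicitly on the assumption that the code might run on R versions before 4.0, where strings default to factors.

```r
df1 <- data.frame(
  subject = c(1, 2, 3),
  letters = c("a, b, f, g", "b, g, m, l", "g, m, z"),
  stringsAsFactors = FALSE
)

# Split each string on a comma followed by optional whitespace
parts <- strsplit(df1$letters, ",\\s*")

# Repeat each subject once per letter, then cross-tabulate subject by letter
counts <- table(
  subject = rep(df1$subject, lengths(parts)),
  letters = unlist(parts)
)

# Turn the table into a plain data frame with one indicator column per letter;
# the subject ids are recovered from the table's row names
wide <- cbind(
  subject = as.numeric(rownames(counts)),
  as.data.frame.matrix(counts)
)
```

Here wide is an ordinary data frame with a subject column followed by one column per distinct letter, which may be closer to the layout asked for than a table object.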


Seeing some of the other answers, separate_rows is a better solution here:

df1 %>%
  separate_rows(letters) %>%
  table
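
The call above returns a table object rather than a data frame. If a data frame with a subject column and one 0/1 column per letter is wanted, as in the question, one possible sketch is to follow separate_rows with pivot_wider. The helper column present and the result name wide are invented for this example, and the explicit sep pattern is my assumption about the delimiter (a comma plus optional spaces).

```r
library(dplyr)
library(tidyr)

df1 <- data.frame(
  subject = c(1, 2, 3),
  letters = c("a, b, f, g", "b, g, m, l", "g, m, z")
)

wide <- df1 %>%
  # One row per subject/letter pair
  separate_rows(letters, sep = ",\\s*") %>%
  # Marker column that becomes the cell value after reshaping
  mutate(present = 1L) %>%
  # One column per distinct letter; missing combinations are filled with 0
  pivot_wider(
    names_from  = letters,
    values_from = present,
    values_fill = 0L,
    names_sort  = TRUE
  )
```

The new columns are created from whatever letters occur in the data, so they do not have to be named in advance. names_sort = TRUE orders them alphabetically; without it they appear in order of first occurrence.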
source: stackoverflow.com