Dplyr Questions
Find monthly plane/aircraft usage from the nycflights13 data set
I would like to find the monthly usage of all the aircraft (based on tailnum). Let's say this is required for some kind of maintenance activity that
error when using filter function to remove rows with missing values in one column
I get the following error message when using the filter() function to remove rows with missing values in the column "medecin":
get the no. of observations in every level of factor after grouping by factors
In the Arthritis data from the 'vcd' package, after grouping by treatment and sex, I would like to get the number and percentage of observations in every
tidyverse solution for multiplying columns by a vector
I looked for solutions here: multiply columns
Creating a loop that full joins dataframes in R
I have six dataframes: a2020, b2020, a2021, b2021, a2022, and b2022. All six dataframes have a common key variable called
How to find the earliest date across multiple columns in R (Issue with NAs)
I have 3 date columns (class-date) and I want to create a new column that will have the earliest of the 3 dates. This is the code I used
Mutate data to juxtapose repeat measurements
Let's pretend I am measuring the distance grasshoppers can jump pre- and post-treatment. This is just for fun; the real measurement
How many of each distinct item in a column is assigned to each distinct item of another column?
I am trying with the dplyr functions group_by(), summarize(), and count(), but I cannot figure out
R mutate across with function, case_when and data masking to parse timestamps
I am trying to parse some timestamps (character vectors) as datetimes using R mutate and case_when. Dummy
remove repeated records only for some rows with dplyr
I'd like to remove records repeated twice or more times (based on variables: id, start, drug), but only for
reorder x-axis with heatmap in R
I am used to using ggplot2, so I have only used fct_inorder() to reorder my axes. It should be gate 0 - 3 then full. How do you do this with a
Mutate multiply columns based on conditional and column name
I have a dataframe with the following structure (see example). The dots after the operatedin2007 column signify multiple columns with the same
Create empty columns with mutate, column names from a vector
There is a data frame df containing one column. df <- data.frame(pseudonym = c("a", "b", "c",
How to filter by multiple range of dates in R?
Thank you, experts, for the previous answers (How to filter by range of dates in R?)
Data cleaning & subsetting in nested list
I couldn't find any previous questions that address these steps in a nested list. My own attempts haven't gotten me anywhere either! I have a
Convert dataframe so that it can be consumed by geom_line()
I need to convert a dataframe to a tibble-like structure so that it can be consumed by geom_line() (ggplot2) to create a line plot.
How to use dynamic column name in quantile function in R?
I have trouble with a dynamic variable that I want to use in the quantile function. I tried to use paste0 for the column name, that
Create lagged variables for consecutive time points only using R
I have an unbalanced panel (with unequally spaced measurement points) and would like to create a lagged variable of x by group
R adding a new column in a data frame by recursively filling 1:3 value based on another column
I have a dataframe, e.g.: df <- data.frame(group = rep(c("a", "b", "c"), times = c(4, 4, 6)) >
How to automatically transform columns into objects in R?
I need each column of my dataset to become an object named after that column and containing its values. I know
How to copy a value from one column to a new column if column A > column B?
I am looking to create a new column called true_water_on in the dataframe trial_a. study_id randomisation
Plotting monthly average over time of a column grouped by another column - RStudio
I have a dataframe df_have that looks like this: arrival_date cust_id wait_time_mins cust_priority
R handle vector values in addition to scalar values with rounding function
Round2 <- function(x, y) { if(is.na(x)) { return ('') } return(paste0(round(100 * x, 0), '% ', round(100 * y, 0), '%')) } mydf
How to add a row to a dataframe modifying only some columns
In order to prepare the data for plotting, I need to add a new row to the data. I have this dataframe: df
Custom function to loop over a list
I have a working custom function but am not sure how to make it loop over a list of inputs. Looks like I need to understand apply() and the like
How to change component's state using setState
React beginner here! I have a component with a state, like this: this.state = { values: [], isloading: true,
Referring to columns by name in for loop
I have a loop in R that loops over columns in a dataframe las_ref for those columns where the name matches a value in a vector
Combine smallest elements in one category 'Other' in a pie chart R
I am trying to plot a pie chart with only 3 segments. I want the 2 largest elements and after that all the smallest elements in one category
Add column with row number count of grouped column values while excluding NA's
I am trying to add a column to a data frame with a row number by groups. I do not want to count NAs. I have grouped by day and want to add a
R+Tidyverse: Tibbles don't appear to store milliseconds
I have a CSV with many values. Among them are times stored like this: 1:34.434. Using readr, I form
How to reactively extract column of values from tibble for plotting?
I have an app that allows the user to stratify data, and select the point-in-time to stratify. A function (stratdata(...)) in the
filter() function having issues in filtering rows using selectInput(multiple)
The following code filters rows of co2 with specific names from the 'plant' column. However, I have been having trouble doing this. When I use
De-aggregate a data frame
There have been many similar questions (e.g.
paste()ing columns whose names are stored in a variable
How do I create a new column from columns whose names are contained in a character vector? Given these two variables: data
add coloured dot in new column using dplyr
I want to add a colored dot in a new column based on the value of another column. I have: data <- tibble( one =
How do you convert an integer into a date (format YYYY) in R
I have a big data set, filled with dates that are in integer form, and in different types of integer form (yyyymmdd, yyyymm and yyyy). I just
Set Preceding and Succeeding rows
I want all the values in column values prior to the string "%" to be flagged as "yes", else "no". It
How to input all 50 states in the United States without having to type them all in R?
I am trying to delete the rows in my dataset that have a U.S. state name. I hope there is a way to input them into the code without typing all 50 of
Loop R merge multiple data frames together
I am trying to merge multiple datasets (left_join) together inside a loop. Here is what the data looks like: fr1
Chi-Square test in R using dplyr
I would like to perform a chi-square test in R using dplyr. Specifically, I would like to investigate whether there is a difference in customer
Generate Variable Inside Loop Dplyr
I am trying to generate a variable inside a loop using mutate. This variable should simply be the concatenation of the word
r create a unique numeric value for every id in column
I have a dataset with a long list of random ids like this. id h001 h00a h00m b00a bb0b ab0a aa0b aa0b
R: How to extract the factor levels as numeric from a column and assign it to a new column using tidyverse?
Suppose I have a data frame, df df = data.frame(name = rep(c("a", "b", "c"), each = 4))
Recoded missing cells to NA but still showing up in table
I'm trying to build a table using gtsummary. I'm converting columns to factors, then recoding the factors, then assigning values ("
Count unique strings that only occur in a single group based on all possible groups
I have the following df a = data.frame(pa = c("a", "a", "a", "b",
R using a list of column names to remove unwanted character from data
I have read several CSVs into a tibble and am working to clean up the data. I have several columns where I need to remove the '%' character from
left_join produces NAs when key has spaces
I'm getting an unexpected pattern of NAs from a left join. The data come from
Dividing a number within a month with the last observation in the previous month using dplyr
I am struggling with finding the correct way of achieving the relative return within a month using the last observation in the previous month.
Pivot longer in dplyr for multiple value columns
I have the following table that can be generated using this code structure(list(total = c(9410, 12951.1794783802), op =
R: Set next row to NA in group_by
I want to set the next row i+1 in the same column to NA if there is already an NA in row i and then do this by groups. Here is my attempt:
Which is the fastest way to derive the conditional minimum value of an R data frame column?
Suppose we have this data frame: > data id period_1 values 1 1 2020-03 -5 2 1 2020-04 25 3 2 2020-01 35 4 2
Conditional sampling by group based on sample mean
I am trying to use R to make a bunch of different trivia quizzes. I have a large dataset (quiz_df) containing numerous questions divided into
select_at() drop some vars, pull some to front and then everything() in one call?
For example, I want to drop field mpg, select carb so that it's first, then just everything that's left over in their
Is there a R function or better way to sum rows and columns cumulatively?
Let's suppose a data frame df1 <- data.frame(a = c(1,2,3), b = c(4,5,6), c = c(7,8,9), d = c(10,11,12)) I want
How do I get the aggregate number of a variable against another variable in R? Both these variables are non-numeric
I have this dataset, and I am trying to create a new variable (n_commitments) that will give me an aggregate number of paragraphs per country. I
Replace column values in a df with matching index with new values in R
I have df, containing 2 variables, df and val. df contains numbers from 1-255 and val is random numbers generated. I also have new_vals that is a
How to filter a dataframe using a preset vector in R
In the dataframe below, I want to filter column code using a preset vector x # dataframe set.seed(123)
In R, how can I filter out specific values in an array using dplyr's piping operator (%>%)?
How can I use the dplyr/magrittr piping operator (%>%) to filter/subset an input array and remove a
Filter dataframe within a group with one column meeting an AND condition in R
I have the following dataframe for which I need to filter only those rows that have both an "intake" and "discharge" per group
R, rate of growth, different time
I have the following example dataset for a biology project. I want to compute the rate of growth of number between 4th January and 2nd of
How to get the difference between rows based on a first observation?
I have two countries with different starting and ending years. I have the mean earnings of different social classes. I would like to have the
Break a small sentence in multiple rows with a single string each in R dplyr
I have a data frame that looks like this library(tidyverse) data=data.frame(pos=c(172367,10),
How to pass column name as an arguments in the function
I am trying to convert repetitive ggplot code into a function. I am almost done, but my only concern is that I need to pass the application as a
Errors in counting + combining bing sentiment score variables in Tidytext?
I'm doing sentiment analysis on a large corpus of text. I'm using the bing lexicon in tidytext to get simple binary pos/neg classifications, but
New column based on if col1 is a substring of col2
I'm trying to make a new column based on whether one column is a substring of another. Using if_else & grepl works with a constant, but not
Extract strings based on a database in R dplyr
From my data I want to extract the strings that are between the l and r string from my database. My database includes 4 different l and r
Creating new column by splitting a `chr` column, finding unique values, sorting them, removing certain values, and combining them back into one string
I'm working in R, using tidyverse and dplyr functions to generate new columns, but I'm running into a wall when trying
Efficient iteration to create df columns based on newly created columns (alternative to loop)
I need to create multiple columns in a data frame based on the new columns. For this aim, I have a loop that works fine but requires quite a lot
How to find the maximum value within each group and then recode all other values in the group as zero?
I have a data frame with the following simplified structure: df <- data.frame(id = c(1,1,1,2,2,2,3,3,3,4,4,4), value =
keep first row after calculating difference between rows with dplyr::lag
My question is similar to this
number of matches from another dataframe
Help me, please. I have 2 dataframes, and I want to add to df1 an additional column with the number of matches in df2 for the pattern in "pep"
How to continue tidyr/dplyr/tidyverse %>% commands to the next line?
I have a long command with a long line of dplyr/tidyr commands: object %>% mutate() %>% select() %>% separate() %>%
Create a column for each group when the other column changes
I have the following sample dataset where I want to create the number_of_renewals column. Basically, when the value for
group_by: different behavior mutate and summarize
I have a dataframe with revenue numbers for different clients in different years. I want to calculate the total revenue for a client in a year.
Identify missing number in a new column
I have a data frame called 'dat' that has two separate columns to denote the order that participants completed a study. It looks like the
Comparing each row to a solutions vector with the correct answer and converting the original values based on the solutions in dplyr
I have one data frame containing the original answers and one dataframe that has the solutions to these answers. I want to compare the answers
Expand dataset by count column in Dplyr
I have a dataset as follows: library(tidyverse) df <- data.frame( report_date = c("2020-03-14",
How to get the highest value in a column depending on three other columns?
The aim is to get the highest educational value between two partners in a household by disregarding the educational level of the children. The
How to create a column from different dataframes?
I have df1: idcode random 11 8 2 9 3 10 18 3 21 2 6 4 9 5 10 4
Aggregation and mean calculation with dplyr
I have a chunk of code that aggregates timestamps of a
How to create a single column from multiple?
I have df1: account score1 score2 score3 score4 score5 score6 random random2 23 f30 g1 g5 h10
in r dplyr select the first observation by group and create a list
I would like to select the first observation by group and create a list column. For example: df <- structure(list(grp =
Read in CSV files and Add a Column with File name
Assume you have 2 files as follows. file_1_october.csv file_2_november.csv The files have identical columns. So I
How to summarize a unique value of a numeric variable within dplyr
Data: structure(list(month_name = c("september", "september", "september", "september",
Counting and then summing string variable within specific time in long data frame
I have a dataset like this: structure(list(participant_id = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2), group = c(2, 2, 2, 2, 2,
Filtering by multiple columns at once in `dplyr`
Here is some sample data library(tidyverse) data <- matrix(runif(20), ncol = 4) colnames(data) <- c("mt100",
replace one column with another using regex matching in R
I am working with some survey data and I would like to replace the contents of one survey item/column with another survey item, while keeping
Create a rule condition based on count and dates
I'd like to create a rule condition based on count and dates. For this, I try: # package library(dplyr) # open data
Add sample size to data frame after aggregating using R
I have a data frame with plot numbers, and independently-taken data for 4 test subjects as shown below: data <-
How to create a Weighted Sum Score based on a second dataset for specific variables
I have to create a weighted sum score (wsum) based on several variables. For instance, mydata has three variables (a, b, and c). I
How best to do this pivot operation in R
Below is the sample data and the desired outcome. This is a much simplified version of the actual data set. In the actual data set, there are 20
Groupby one column and calculate lag difference of monthly, quarterly mixed data's current period values with previous one using R
Assuming I have panel data as follows, which was edited from
Create new column based on existing columns whose names are stored in another column (dplyr)
Consider the following dataset: df <- tibble(v1 = 1:5, v2= 101:105, v3 = c("v1", "v2", "v1",
How do I calculate the difference between two dates in dplyr (in days)? R
I need to calculate the difference in days between two dates in R. The two dates are in different columns and I just need to create another column
How to match corresponding values to part of string (before and after space)?
I have two dataframes, and want to add values from the 2nd one to the 1st one according to string values, but use partial string matching if there
Unexpected dplyr::bind_rows() behavior
Short version: I'm encountering an error with dplyr::bind_rows() which I don't understand. I want to split my data based on
Removing all text before a certain character for all variables in R
I need to remove all text before the last "/" in a variable's name for every variable in a data frame. suppose this is the data
Creating Crosstable with Multiple Variables Summarized by Row Categories
I am interested in summarizing several outcomes by sample categories and presenting it all in one table. Something with output that resembles:
dplyr summarise character time variable
I have data that looks like this id time 456 0:00:01 456 0:02:05 123 0:00:14 756 0:03:47 756 0:01:56 756 0:00:01
Convert a list of vectors of different lengths to a data frame
Couldn't find any solution to this question online, but apologies if I missed it. I have a list of several vectors (all