How To Compare Each Row Of A Data Frame To Each Row Of Another Dataframe And Calculate Overlap
I have two dataframes with start and end time. I want to compare each row of df2 to each row of df1 and calculate the overlap.
df1 # start end #1 5 15 #2 20 28 #3 46 68 #4 80 87 df2 # start end #1 20 40 #2 65 85
So the results should be a vector with results
overlaping_duration_1= 8 (overlap from df2 row 1 with df1 row 1) overlaping_duration_2= 3+5 = overlap from df2 row 2 with df1 row 3 + overlap from df2 row 2 with df1 row 4
I tried it with an ifelse approach and cover the different conditions. This is only for the first row for df2.
overlap = ifelse ( df2$start <= df1$start & df1$start <= df2$end & df2$end <= df1$end, df2$end-df1$start, 0) overlap2 = ifelse ( df2$start <= df1$start & df1$end <= df2$end, df1$end-df1$start, 0) overlap3 = ifelse ( df1$start < df2$start & df2$end <= df1$end, df2$end-df2$start, 0) overlap4 = ifelse ( df1$start < df2$start & df2$start <= df1$end & df1$end <= df2$end, df1$end-df2$start, 0)
Afterwards the different overlap vectors can be merged. This could be applied over a
for loop over df2.
This approach is quite cumbersome. Is there a more comfortable way?
library(data.table) df1 = data.table(start=c(5,20,46,80),end=c(15,28,68,87)) df2 = data.table(start=c(20,65), end=c(40,85)) # add row identifer (`rn`) and dummy var (`id`) for cartesian join df1[,`:=`(id=1, rn=.I)] df2[,`:=`(id=1, rn=.I)] # do full join df = df1[df2,on="id", allow.cartesian=T] # estimate overlap, by row result = df[,overlap:=.(min(i.end,end)-max(i.start,start)), by=1:nrow(df)] # retain positive overlaps, and sum by df2 row number result[overlap>0, .(total = sum(overlap)), by = .(rn=i.rn)]
rn total 1: 1 8 2: 2 8
Update: You can also avoid the full join, by instead keying
end and using
library(data.table) df1 = data.table(start=c(5,20,46,80),end=c(15,28,68,87)) df2 = data.table(start=c(20,65), end=c(40,85)) setkey(df2,start,end) df = foverlaps(df1,df2[,rn:=.I], nomatch=NULL) df[,overlap:=.(min(i.end,end)-max(i.start,start)), by=1:nrow(df)][, .(total =sum(overlap,na.rm=T)),by=rn]
rn total 1: 1 8 2: 2 8
- → OctoberCMS Backend Loging Hash Error
- → "failed to open stream" error when executing "migrate:make"
- → OctoberCMS - How to make collapsible list default to active only on non-mobile
- → Create plugin that makes objects from model in back-end
- → October CMS Plugin Routes.php not registering
- → OctoberCMS Migrate Table
- → How to install console for plugin development in October CMS
- → OctoberCMS Rain User plugin not working or redirecting
- → October CMS Custom Mail Layout
- → October CMS - How to correctly route
- → October CMS create a multi select Form field
- → How to update data attribute on Ajax complete
- → October CMS - Conditionally Load a Different Page