R - Get range of adjacent rows with the same value
I have the data frame below. The first column is the position and the last is the level. I want to output the range of positions covered by adjacent rows that have the same number. Rows with level '2' should be ignored. Can anyone help?
The input:

    1      3
    10000  3
    20000  3
    30000  1
    40000  2
    50000  2
    60000  2
    70000  3
    80000  1
    90000  1

The output:

    1-2999     3
    3000-3999  1
    7000-7999  3
    8000-9999  1
Here's a method using the chaining functions of dplyr. Here's the sample data:
    dd <- structure(list(
      pos   = c(1L, 10000L, 20000L, 30000L, 40000L, 50000L, 60000L, 70000L, 80000L, 90000L),
      level = c(3L, 3L, 3L, 1L, 2L, 2L, 2L, 3L, 1L, 1L)),
      .names = c("pos", "level"), class = "data.frame", row.names = c(NA, -10L))
    dd <- dd[order(dd$pos), ]  # make sure the rows are sorted by position
If the difference to the next pos is always 10000, you can do
    library(dplyr)
    dd %>%
      arrange(pos) %>%
      mutate(run = cumsum(c(0, diff(level)) != 0)) %>%  # new run id whenever level changes
      subset(level != 2) %>%                            # drop the ignored level
      group_by(run) %>%
      summarise(level = max(level), start = min(pos), end = max(pos) + 9999) %>%
      select(-run)
    #   level start   end
    # 1     3     1 29999
    # 2     1 30000 39999
    # 3     3 70000 79999
    # 4     1 80000 99999
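To see how the `cumsum(c(0, diff(level)) != 0)` step labels runs, here is a small base R sketch on the same `level` values. The `rle()` comparison is an addition for illustration and is not part of the answer's method.

```r
# Level column from the sample data, already sorted by position
level <- c(3L, 3L, 3L, 1L, 2L, 2L, 2L, 3L, 1L, 1L)

# TRUE wherever the level differs from the previous row
# (the leading 0 means the first row never counts as a change)
changed <- c(0, diff(level)) != 0

# Counting the changes cumulatively gives one id per run of equal adjacent values
run <- cumsum(changed)
print(run)
# 0 0 0 1 2 2 2 3 4 4

# Base R's rle() reports the same runs as lengths and values
r <- rle(level)
print(r$lengths)  # 3 1 3 1 2
print(r$values)   # 3 1 2 3 1
```

Because the run ids are assigned before the level 2 rows are dropped, the two separate runs of level 3 keep different ids and are not merged into one range.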
Otherwise, take each range's end from the position of the next row:
    dd %>%
      arrange(pos) %>%
      mutate(run = cumsum(c(0, diff(level)) != 0), nextpos = lead(pos)) %>%
      subset(level != 2) %>%
      group_by(run) %>%
      summarise(level = max(level), start = min(pos), end = max(nextpos) - 1) %>%
      select(-run)
    #   level start   end
    # 1     3     1 29999
    # 2     1 30000 39999
    # 3     3 70000 79999
    # 4     1 80000    NA
This version can't calculate the distance to the next group for the last group, so the last end is NA.
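If an end is needed for the last group as well, one possible sketch is to give `lead()` a `default` value. The width of the final interval is not known from the data, so `last_width` below is a hypothetical parameter and an assumption, not something stated in the question.

```r
library(dplyr)

dd <- data.frame(
  pos   = c(1L, 10000L, 20000L, 30000L, 40000L, 50000L, 60000L, 70000L, 80000L, 90000L),
  level = c(3L, 3L, 3L, 1L, 2L, 2L, 2L, 3L, 1L, 1L)
)

# ASSUMPTION: the last interval is as wide as the others (hypothetical value)
last_width <- 10000L

res <- dd %>%
  arrange(pos) %>%
  mutate(
    run     = cumsum(c(0, diff(level)) != 0),
    # the last row has no next position, so fall back to an assumed one instead of NA
    nextpos = lead(pos, default = max(pos) + last_width)
  ) %>%
  filter(level != 2) %>%
  group_by(run) %>%
  summarise(level = max(level), start = min(pos), end = max(nextpos) - 1) %>%
  select(-run)

print(res)
```

With that assumed width the last range ends at 99999, the same as in the fixed-step version. A different `last_width` only changes the end of the final range.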