r - Values of the wrong group are used when using plot() within a data.table() in RStudio


I want to generate a divided diagram. The upper section of the diagram should use the values of group a, and the lower one the values of group b. I am using data.table() for this. Here is the code I used to generate the example and set up the graphical output:

    library(data.table)
    set.seed(23)
    example <- data.table('group' = rep(c('a', 'b'), each = 5), 'value' = runif(10))
    layout(1:2)
    par('mai' = rep(.5, 4))

When running the following lines in the usual R console, the correct values are used for plotting. When running the same code in RStudio, the values of the second group are used for both diagrams:

    example[, plot(value, ylim = c(0, 1)), by = group]        # example 1
    example[, .SD[plot(value, ylim = c(0, 1))], by = group]   # example 2

When adding a comma in the subset of the data.table .SD[] of example 2, the correct output is generated in RStudio as well:

    example[, .SD[, plot(value, ylim = c(0, 1))], by = group] # example 3

When using barplot() rather than plot(), RStudio uses the correct values as well:

    example[, barplot(value, ylim = c(0, 1)), by = group]     # example 4

Did I overlook something, or is this a bug?

System: Windows 7, RStudio Desktop v0.98.1091, R 3.1.2, data.table 1.9.4

Nice catch (+1'd already)! In my case, example 3 doesn't produce the right plot (OS X 10.10.1, R 3.1.2, RStudio 0.98.1091).

The difference between the R console/GUI and RStudio here is the plotting device. RStudio seems to be using its native graphics device, RStudioGD, whereas the R console/GUI uses quartz.

By debugging graphics:::plot.default I was able to narrow down the issue to the function plot.xy(). This function calls different graphics devices (as shown above).

Initiating, for example, quartz by calling the function quartz() and then running the code works fine!
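For readers not on OS X, a roughly equivalent step might be to ask R for a device other than RStudioGD. This is only a sketch: it assumes an R version whose dev.new() accepts the noRStudioGD argument, and it has not been checked against the RStudio version discussed here.

```r
library(data.table)

set.seed(23)
example <- data.table(group = rep(c("a", "b"), each = 5), value = runif(10))

# Assumption: dev.new() supports noRStudioGD in the installed R version.
# It opens the platform's default device (quartz, windows, X11, or a file
# device in a non-interactive session) instead of RStudio's own device.
dev.new(noRStudioGD = TRUE)

# layout() and par() are per-device settings, so set them after opening it.
layout(1:2)
par(mai = rep(.5, 4))

example[, plot(value, ylim = c(0, 1)), by = group]
```

If the plots come out right on that device, that points at the device rather than at data.table's grouping.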

FWIW, the issue can be reproduced using dplyr as well:

    require(dplyr)
    df = as.data.frame(example)
    my_fun = function(x) {plot(x, ylim=c(0,1)); 1L }
    df %>% group_by(group) %>% summarise(my_fun(value))

This will result in the same wrong plot.

This is due to the way subgroups are handled in data.table (and I think dplyr should be doing it the same way as data.table), as you can see by:

    example[, print(sapply(.SD, address)), by=group]
    #         value
    # "0x105bbf5b8"
    #         value
    # "0x105bbf5b8"
    # Empty data.table (0 rows) of 1 col: group

data.table allocates .SD for the largest group and internally reuses that memory for each subgroup to avoid repetitive memory alloc/dealloc, for efficiency. I'm not sure (shooting in the dark here), but it seems that RStudioGD doesn't let go of the pointer linked to the subgroup, so when the data in the subgroup gets updated, the plot gets updated too. You can verify this by doing:

    # on RStudioGD
    debug(graphics:::plot.default)
    set.seed(23)
    example <- data.table('group' = rep(c('a', 'b'), each = 5), 'value' = runif(10))
    layout(1:2)
    par('mai' = rep(.5, 4))
    example[, plot(value, ylim = c(0, 1)), by = group] # example 1
    undebug(graphics:::plot.default)

Keep hitting Enter, and you'll see that the first plot is plotted right, but when the second plot is added, the first plot changes as well. This may be a consequence of the recent changes in R v3.1+, which shallow copies function arguments rather than deep copying them (again, shooting in the dark here).
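The suspected mechanism can be imitated without any graphics device. In this sketch, two hypothetical lists (kept_ref and kept_copy, not part of the original code) stand in for a consumer that holds on to the vector it was given. Whether kept_ref ends up showing group b's values for both groups depends on data.table reusing the buffer as described, so treat that part as an assumption to check on your own setup.

```r
library(data.table)

set.seed(23)
example <- data.table(group = rep(c("a", "b"), each = 5), value = runif(10))

kept_ref  <- list()  # stores whatever `value` refers to inside each group
kept_copy <- list()  # stores an explicit copy made inside each group

example[, {
  kept_ref[[group]]  <<- value        # may alias data.table's reused buffer
  kept_copy[[group]] <<- copy(value)  # independent vector, safe to keep
  NULL
}, by = group]

# If the buffer is reused, kept_ref$a may now show group b's values.
print(kept_ref)
# The explicit copies keep each group's own values.
print(kept_copy)
```

Comparing the two printed lists shows whether holding a reference across groups is unsafe on a given installation.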

You can temporarily fix this by explicitly copying value:

    example[, plot(copy(value), ylim = c(0, 1)), by = group] # example 1

This will produce the right plot.
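Another way to hand plot() a fresh vector each time, assuming the reused group buffer really is the cause, is to skip by-group evaluation and subset explicitly. This is a sketch of an alternative, not something tested in the original thread; the loop variable g and the vector vals are names introduced here for illustration.

```r
library(data.table)

set.seed(23)
example <- data.table(group = rep(c("a", "b"), each = 5), value = runif(10))

layout(1:2)
par(mai = rep(.5, 4))

# Each subset creates a new vector, so no two plots share the same memory.
for (g in unique(example$group)) {
  vals <- example[group == g, value]
  plot(vals, ylim = c(0, 1), main = g)
}
```

The loop gives up the compact by = group form, so the copy() version is probably preferable when the grouped call does other work as well.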

