R - Values of the wrong group are used when using plot() within a data.table() in RStudio
I want to generate a divided diagram. In the upper section of the diagram the values of group a should be used, and in the lower one the values of group b. I am using data.table() to do this. Here is the code I used to generate the example and to set up the graphical output:
library(data.table)
set.seed(23)
example <- data.table('group' = rep(c('a', 'b'), each = 5), 'value' = runif(10))
layout(1:2)
par('mai' = rep(.5, 4))
When running the following lines in the usual R console, the correct values are used for plotting. When running the same code in RStudio, the values of the second group are used for both diagrams:
example[, plot(value, ylim = c(0, 1)), by = group] # example 1
example[, .SD[plot(value, ylim = c(0, 1))], by = group] # example 2
When adding a comma in the subset of the data.table .SD[] of example 2, the correct output is generated in RStudio as well:
example[, .SD[, plot(value, ylim = c(0, 1))], by = group] # example 3
When using barplot() rather than plot(), RStudio uses the correct values as well:
example[, barplot(value, ylim = c(0, 1)), by = group] # example 4
Did I overlook something, or is this a bug?
System: Windows 7, RStudio Desktop v0.98.1091, R 3.1.2, data.table 1.9.4
Nice catch (+1'd already)! In my case, example 3 doesn't produce the right plot either (OS X 10.10.1, R 3.1.2, RStudio 0.98.1091).
The difference between the R console/GUI and RStudio here is the plotting device. RStudio seems to be using its native graphics device, RStudioGD, while the R console/GUI uses quartz.
By debugging graphics:::plot.default I was able to narrow the issue down to the function plot.xy(). That function calls the different graphics devices (as shown above).
By initiating, for example, the quartz device by calling the function quartz() before running the code, everything works fine!
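quartz() only exists on OS X. As a minimal sketch, the helper below picks the name of the platform's native on-screen device, so that the same trick can be tried on Windows or Linux. The helper name native_device_name is hypothetical, and whether windows() or x11() avoids the problem in the same way quartz() does is an assumption that was not tested here.

```r
# Hypothetical helper: name of the platform's native on-screen device.
native_device_name <- function() {
  switch(Sys.info()[["sysname"]],
         Darwin  = "quartz",   # OS X, as used in this answer
         Windows = "windows",  # assumption: Windows counterpart
         "x11")                # fallback for Linux and other Unix-alikes
}

# Usage (interactive session only): open the device before plotting,
# then run layout(), par() and the data.table call as before.
# do.call(native_device_name(), list())
```

Opening a device this way makes it the active device, so the subsequent plot() calls go to it rather than to RStudioGD.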
FWIW, the issue can be reproduced using dplyr as well:
require(dplyr)
df = as.data.frame(example)
my_fun = function(x) {plot(x, ylim=c(0,1)); 1L }
df %>% group_by(group) %>% summarise(my_fun(value))
This will result in the same wrong plot.
This is due to the way subgroups are handled in data.table (and I think dplyr should be doing it the same way as data.table), as you can see by:
example[, print(sapply(.SD, address)), by=group]
#         value
# "0x105bbf5b8"
#         value
# "0x105bbf5b8"
# Empty data.table (0 rows) of 1 col: group
data.table assigns the largest group to .SD and internally reuses that memory for each subgroup, to avoid repetitive memory alloc/dealloc, for efficiency. I'm not sure (shooting in the dark here), but it seems that RStudioGD doesn't let go of the pointer linked to the subgroup, so when the data in the subgroup gets updated, the plot gets updated too. You can verify this by doing:
# on RStudioGD
debug(graphics:::plot.default)
set.seed(23)
example <- data.table('group' = rep(c('a', 'b'), each = 5), 'value' = runif(10))
layout(1:2)
par('mai' = rep(.5, 4))
example[, plot(value, ylim = c(0, 1)), by = group] # example 1
undebug(graphics:::plot.default)
Keep hitting enter, and you'll see that the first plot is plotted right. When the second plot is added, the first plot changes as well. This may be a consequence of the recent changes in R v3.1+ that shallow-copy function arguments rather than deep-copying them (again, shooting in the dark here).
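The suspected mechanism can also be probed without any graphics device. The following is a minimal sketch, assuming that a reference to value kept from inside j behaves like the pointer RStudioGD is suspected of holding on to. The names kept_ref and kept_copy are made up for illustration, and the result of the last comparison may depend on the data.table version, so no output is claimed for it.

```r
library(data.table)

set.seed(23)
example <- data.table(group = rep(c("a", "b"), each = 5), value = runif(10))

kept_ref  <- list()  # stores `value` as handed to j, without copying
kept_copy <- list()  # stores an explicit deep copy of `value`

example[, {
  kept_ref[[.GRP]]  <<- value
  kept_copy[[.GRP]] <<- copy(value)
  NULL
}, by = group]

# The deep copies hold each group's own values.
identical(kept_copy[[1]], example$value[1:5])

# If this returns FALSE, the uncopied reference to group a was
# overwritten when the shared memory was refilled for group b.
identical(kept_ref[[1]], example$value[1:5])
```

If the second comparison is FALSE while the first is TRUE, that is consistent with the explanation above: anything that keeps hold of value without copying it sees the data of the group processed last.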
You can temporarily fix this by explicitly copying value:
example[, plot(copy(value), ylim = c(0, 1)), by = group] # example 1
This will produce the right plot.
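Another way to avoid the shared memory is to split the values outside of data.table's grouping, so that every group lives in its own vector before plot() is called. This is only a sketch along the same lines as the copy() fix. It has not been verified on RStudioGD, and the variable name vals is made up for illustration.

```r
library(data.table)

set.seed(23)
example <- data.table(group = rep(c("a", "b"), each = 5), value = runif(10))

# split() returns a list with one separately allocated vector per group,
# so no group shares memory with another.
vals <- split(example$value, example$group)

layout(1:2)
par(mai = rep(.5, 4))
for (g in names(vals)) {
  plot(vals[[g]], ylim = c(0, 1), main = g)
}
```

The price is that the grouping no longer happens inside data.table, which is fine for ten rows but gives up the memory reuse that makes by = group efficient on large tables.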