There are many great things about the open nature but it also leads to a very diverse experience, depending on which subreddit and thread you visit. Discussions range from asinine to profound, and casual visitors are often quick to generalize their limited evaluation to the whole site.
It's amazing to think that R (or S) has had data frames since the 70s and only now are other languages implementing them. There are some quirks of course, and pandas introduced some convenient features. But the R community has also provided its own improvements in the way of data.table, and now, dplyr.
The data.table package by Matt Dowle definitely deserves a mention! It's fast and I like the indexing functionality it provides. The benchmark timings are pretty impressive.
I should have mentioned you (arun_sriniv) as the co-developer of data.table! Thanks for all the hard work.
And yes, memory usage will be interesting, as that is the bottleneck when it comes to large datasets. I am working on something along those lines. Will post something soon :)
How about PyCharm or Eclipse+PyDev (I've personally heard more praise for the former)? I use emacs and ess or python-mode so can't comment on the IDEs too much but being able to use the same platform for both has been convenient for me.
I agree, I've read other gripes about R function documentation but it's one of the better ones for community software. Python's documentation seems focused on implementation from a programmer's perspective, but often not as helpful for actual application of the function.
I understood that at least a part of the larger programming community felt that the non-concurrent GC was limiting its future growth and shopped elsewhere for a production language.
In the R Help Desk 2004 (http://www.r-project.org/doc/Rnews/Rnews_2004-1.pdf), Gabor Grothendieck recommends chron over POSIXct classes on account of the time zone conversions which occur when the tz attribute of the latter object is not "GMT". Will this not be a problem with lubridate? Thanks in advance.
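To make the concern concrete, here is a minimal sketch of the general effect, written in Python rather than R. It is only an analogy and says nothing about how chron, POSIXct, or lubridate actually behave; the fixed UTC-5 offset and the variable names are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# One instant stored in UTC (loosely analogous to a date-time whose tz is "GMT").
instant = datetime(2004, 1, 1, 2, 30, tzinfo=timezone.utc)

# A hypothetical fixed-offset zone five hours behind UTC (assumption for illustration).
utc_minus_5 = timezone(timedelta(hours=-5))

# Rendering the same instant in that zone moves the wall-clock time back five hours.
local = instant.astimezone(utc_minus_5)

same_instant = (instant == local)        # still the same point in time
utc_date = instant.date().isoformat()    # calendar date as seen in UTC
local_date = local.date().isoformat()    # calendar date as seen in the UTC-5 zone

print(same_instant, utc_date, local_date)
```

The two values compare equal as instants, yet their calendar dates differ, which is the sort of surprise a time-zone-free class avoids.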
I understand there have been studies showing that people don't estimate angles as well as linear distances, but people go too far out of their way to avoid pie charts when part-to-whole data are shown. The 1D mosaic plot is effectively a stacked bar chart, but I suspect there is also some bias toward the largest component there, such that the relative proportions of the smaller components are not well discriminated.
Neither pies nor mosaic charts should be used if you need exact readings of the data, though mosaic plots do have the advantage of being rectangular, making segments much easier to label with values than the segments of pie charts.