> R is somewhat slower
Actually it's MUCH slower, to the point of being entirely unusable for very large datasets (even ~100GB). True, much of MATLAB's speed comes from using a highly optimized BLAS (Intel's Math Kernel Library). But that's not the whole story. R lacks JIT optimization, and numerous attempts to add it were unsuccessful. In fact it's so bad that Ross Ihaka, one of R's creators, proposed to
"simply start over and build something better". See http://xianblog.wordpress.com/2010/09/13/simply-start-over-a...
They're both designed around completely in-memory arrays, which are passed around by-value with a copy-on-write scheme.
For R there is the bigmemory package for mmapped arrays. And the "compiler" JIT package has been included since R 2.13.
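For anyone unfamiliar with the idea behind mmapped arrays, here is a rough sketch. It is written in Python using only the standard library, not in R, and it is not the API of any R package; the names `create_backing_file`, `set_item`, `get_item` and `demo` are made up for illustration. The assumption is simply that the array lives in a file on disk and the OS pages in only the regions you actually touch, so the whole thing never has to fit in RAM at once.

```python
import mmap
import os
import struct
import tempfile

# One little-endian float64 element (8 bytes). Hypothetical layout for this sketch.
ITEM = struct.Struct("<d")


def create_backing_file(path, n_items):
    """Create a zero-filled file large enough for n_items float64 values."""
    with open(path, "wb") as f:
        f.truncate(n_items * ITEM.size)


def set_item(path, index, value):
    """Write one element through a memory map instead of loading the file."""
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as mm:
        ITEM.pack_into(mm, index * ITEM.size, value)


def get_item(path, index):
    """Read one element through a read-only memory map."""
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        return ITEM.unpack_from(mm, index * ITEM.size)[0]


def demo():
    """Build a 1,000,000-element file-backed array, set and read one element."""
    n_items = 1_000_000
    fd, path = tempfile.mkstemp(suffix=".bin")
    os.close(fd)
    try:
        create_backing_file(path, n_items)
        set_item(path, 123_456, 3.5)
        return get_item(path, 123_456), get_item(path, 0), os.path.getsize(path)
    finally:
        os.remove(path)
```

Calling `demo()` writes a single value into the middle of an 8 MB file-backed array and reads it back without ever holding the full array in a Python object. Whether this approach is fast enough at the 100GB+ scale is a separate question that this sketch does not answer.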
I've seen that link before. See above re: one group's willingness to talk about the shortcomings versus another organization's preference to paper over them with marketing.
"They're both designed around completely in-memory arrays, which are passed around by-value with a copy-on-write scheme." True, but that doesn't invalidate my point. The datasets I typically work with are quite large, 300GB-1TB (I have 2TB of RAM on my main server). I've tried both R and MATLAB, and R has been a disaster. Even plotting, say, 10 million points on a graph is a pain.
TL;DR: don't use R if you work with large datasets.