R: Speeding up R code

From MathWiki

Some notes on speeding up R code

From the R-help mailing list, October 18, 2005:

>> Hi R-users:
>> Yesterday I ran some R code for 9 hours and it showed no sign of
>> stopping. Then I interrupted it and found it had completed 82.5%.
>> This morning I decided to wait another 11 hours to see what would
>> happen. Wait a minute, I had heard that transforming a data.frame to a
>> matrix makes R code faster. So I made that modification in my R code.
>> Oooh, the new code finished within 30 minutes!!
>> Are there any other tips for speeding up an R program? Or could someone
>> point me to some documents or websites on R code optimization?
>> #OS: Win XP, CPU: Pentium IV, 3.20G, Memory: 1G
>> #for() loop: 1000*1616*3*41, 3 data.frames (dim = c(1616,5), c(1616),
>> c(1616) respectively)

- As you found, indexing operations on matrices are much faster than on data frames.

- Avoid growing allocations: calculate the size you need, then allocate it all at once.

- Vectorize calculations.

- Use Rprof() to identify where your code is spending its time, and concentrate your efforts on that area. Perhaps translate some essential routines into compiled C or Fortran.

- For a smaller improvement that might not suit your application, convert factors to their numeric codes.

- Break up long calculations into smaller pieces, so you can write out intermediate values. This doesn't necessarily speed it up, but it lets you stop and restart the calculation. It may also make it more suited to running on a cluster of computers instead of just one.

- Limit your use of memory so you don't end up using a swap file. Do this by only keeping objects that will be used later, removing others. (With the size of objects you were working with this may not be an issue.)

Duncan Murdoch
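As a rough illustration of the matrix-versus-data-frame point in the reply above (this example is not from the original thread), the sketch below times element-by-element indexing on both structures. The data are made up; only the 1616 x 5 shape is borrowed from the question, and the helper name sum_elements is hypothetical. Actual timings will vary by machine and R version.

```r
# Made-up numeric data with the 1616 x 5 shape mentioned in the question.
set.seed(1)
df <- as.data.frame(matrix(rnorm(1616 * 5), nrow = 1616))
m  <- as.matrix(df)  # one-time conversion; works cleanly because all columns are numeric

# Hypothetical helper: visit every element one at a time, as an inner for() loop might.
sum_elements <- function(x) {
  total <- 0
  for (i in seq_len(nrow(x))) {
    for (j in seq_len(ncol(x))) {
      total <- total + x[i, j]
    }
  }
  total
}

t_df <- system.time(s_df <- sum_elements(df))
t_m  <- system.time(s_m  <- sum_elements(m))

# Both versions compute the same sum; only the indexing cost differs.
cat("data frame elapsed:", t_df["elapsed"], "s\n")
cat("matrix elapsed:    ", t_m["elapsed"], "s\n")
```

Note that as.matrix() on a data frame with mixed column types produces a character matrix, so the conversion is only a drop-in replacement when the columns share one type.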
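The advice about avoiding growing allocations can be sketched as follows. The function names grow and prealloc are hypothetical, and the squared-index calculation is only a stand-in for real work.

```r
# Growing: every c() call allocates a new vector and copies the old one into it.
grow <- function(n) {
  out <- numeric(0)
  for (i in seq_len(n)) {
    out <- c(out, i^2)
  }
  out
}

# Preallocating: the final size is known up front, so allocate once and fill in place.
prealloc <- function(n) {
  out <- numeric(n)
  for (i in seq_len(n)) {
    out[i] <- i^2
  }
  out
}

n <- 20000
t_grow     <- system.time(a <- grow(n))
t_prealloc <- system.time(b <- prealloc(n))
cat("growing elapsed:      ", t_grow["elapsed"], "s\n")
cat("preallocated elapsed: ", t_prealloc["elapsed"], "s\n")
```

The same idea applies to lists and matrices: create them at full size with vector("list", n) or matrix(NA_real_, nrow, ncol) and then fill them, rather than calling rbind() or cbind() inside the loop.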
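A minimal sketch of what "vectorize calculations" can mean in practice, using an arbitrary made-up formula: the loop and the vectorized expression give the same result, but the vectorized form pushes the iteration into R's compiled internals.

```r
x <- seq(0, 1, length.out = 1616)  # made-up input

# Loop version: one interpreted iteration per element.
loop_result <- numeric(length(x))
for (i in seq_along(x)) {
  loop_result[i] <- 3 * x[i]^2 + 1
}

# Vectorized version: arithmetic operators work on whole vectors at once.
vec_result <- 3 * x^2 + 1

# Column summaries have vectorized forms too: colMeans() replaces a loop over columns.
m <- matrix(x, ncol = 4)
loop_means <- numeric(ncol(m))
for (j in seq_len(ncol(m))) {
  loop_means[j] <- mean(m[, j])
}
vec_means <- colMeans(m)
```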
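For the Rprof() suggestion, a basic profiling session might look like the sketch below. The functions slow_part, fast_part and analysis are hypothetical stand-ins for real code, and the example assumes an R build with profiling support (the default on most platforms). The workload is deliberately long enough for the sampling profiler to collect some samples.

```r
# Hypothetical workload: one expensive step and one cheap step.
slow_part <- function(reps) {
  for (i in seq_len(reps)) sort(runif(1e5))
  invisible(NULL)
}
fast_part <- function(reps) {
  for (i in seq_len(reps)) sum(runif(10))
  invisible(NULL)
}
analysis <- function() {
  slow_part(200)
  fast_part(200)
}

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file)   # start sampling the call stack
analysis()
Rprof(NULL)        # stop profiling

# by.self ranks functions by the time spent in their own code.
prof <- summaryRprof(prof_file)
print(head(prof$by.self))
```

The functions at the top of by.self are the ones worth optimizing first, or the candidates for rewriting in C or Fortran.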
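Converting factors to their numeric codes, as suggested in the reply, can be done with as.integer(). The sketch below uses a made-up factor and keeps the level labels so the codes can be mapped back after the heavy computation.

```r
# Made-up factor with an explicit level order.
f <- factor(c("low", "high", "mid", "high", "low"),
            levels = c("low", "mid", "high"))

codes  <- as.integer(f)  # integer position of each value within levels(f)
labels <- levels(f)      # keep these to translate codes back later

# Work with the plain integer vector inside the loop, then recover the labels.
recovered <- labels[codes]
print(codes)
print(recovered)
```

The codes depend on the order of the levels, so the same label can map to a different integer in another factor; this is one reason the trick may not suit every application.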
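One way to break a long calculation into restartable pieces is to save each piece's result to disk and skip pieces that already have a file. This is only a sketch: process_chunk, the chunk size, and the file naming scheme are all assumptions made for illustration.

```r
# Hypothetical unit of work; replace with the real per-chunk calculation.
process_chunk <- function(idx) sum(sqrt(idx))

# Split 1000 made-up work items into 10 chunks of 100.
chunks  <- split(1:1000, rep(1:10, each = 100))
out_dir <- file.path(tempdir(), "chunk_results")
dir.create(out_dir, showWarnings = FALSE)

for (k in seq_along(chunks)) {
  out_file <- file.path(out_dir, sprintf("chunk_%02d.rds", k))
  if (file.exists(out_file)) next          # finished in an earlier run; skip it
  saveRDS(process_chunk(chunks[[k]]), out_file)
}

# Combine the intermediate results once every chunk file exists.
files   <- list.files(out_dir, pattern = "^chunk_[0-9]+[.]rds$", full.names = TRUE)
results <- vapply(files, readRDS, numeric(1))
total   <- sum(results)
```

If the job is interrupted, rerunning the script resumes at the first missing chunk. Because the chunks are independent, the loop body could also be handed to separate machines, each writing its own files.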
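Limiting memory use by dropping objects that are no longer needed might look like this minimal sketch. The object names and sizes are made up for illustration.

```r
# Made-up large intermediate object.
big <- matrix(rnorm(1e6), ncol = 100)
print(object.size(big), units = "MB")

# Keep only the small summary that later steps need.
col_means <- colMeans(big)

rm(big)          # remove the large object from the workspace
invisible(gc())  # optional: R collects garbage automatically, but this triggers it now
```

Wrapping intermediate steps in functions has a similar effect, since local objects become eligible for garbage collection once the function returns.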
