Performance

Optimizing for performance

Memory usage

R objects live in R’s memory space, and their size is unknown to Python. As a result, Python does not always seem to garbage collect often enough when large objects are involved. This can cause a transient increase in memory usage when large objects are overwritten in loops. Reaching the system’s memory limit appears to trigger garbage collection, but one may wish to trigger the collection explicitly.

import gc
gc.collect()

As a concrete example, consider the code below. It has been used elsewhere as the sole benchmark of Python-to-R bridges, unfortunately without taking into account the specifics of the Python and R garbage collection mechanisms. With explicit garbage collection, the outcome of the benchmark changes dramatically, probably putting rpy2 back as the fastest, most memory-efficient, and most versatile Python-to-R bridge.

import rpy2.robjects
import gc

r = rpy2.robjects.r

r("a <- NULL")
for i in range(20):
    rcode = "a <- rbind(a, seq(1000000) * 1.0 * %d)" % i
    r(rcode)
    print(r("sum(a)"))
    # explicit garbage collection
    gc.collect()
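
Collecting on every iteration is the simplest option, but a full collection has a cost of its own. A minimal sketch of collecting only every few iterations follows. The helper name run_with_periodic_gc and its parameters are hypothetical and not part of rpy2, and the step callable stands in for whatever work overwrites a large R object.

```python
import gc


def run_with_periodic_gc(step, n_iter, every=5):
    """Call step(i) for i in range(n_iter), collecting garbage periodically.

    Hypothetical helper (not part of rpy2): `step` is any callable doing
    one iteration of work, `every` is the number of iterations between
    explicit collections.
    """
    if every < 1:
        raise ValueError("every must be >= 1")
    n_collections = 0
    for i in range(n_iter):
        step(i)
        if (i + 1) % every == 0:
            gc.collect()  # explicit garbage collection
            n_collections += 1
    return n_collections


# Stand-in workload; with rpy2 this could be a call such as r(rcode).
results = []
n_collections = run_with_periodic_gc(results.append, n_iter=20, every=5)
print(n_collections)
```

Whether a less frequent collection is acceptable depends on how much memory each iteration leaves behind, so the value of every is something to tune rather than a recommendation.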

Low-level interface

The high-level layer rpy2.robjects brings a lot of convenience, such as class mappings and interfaces, but obviously at a cost in terms of performance. This cost is negligible for common usage, but compute-intensive programs that cross the Python-to-R bridge back and forth a very large number of times will notice it.

For those cases, the rpy2.rinterface low-level layer gets the programmer closer to R’s C-level interface, making rpy2 faster than R code itself, as shown below.

A simple benchmark

As a simple benchmark, we took a function that sums all the elements of a numeric vector.

pure R:

function(x) {
  total <- 0
  for (elt in x) {
    total <- total + elt
  }
  return(total)
}

pure Python:

def py_sum(x):
    total = 0
    for elt in x:
        total += elt
    return total

We ran this function over different types of sequences (same length, same values):

import random

n = 20000
x_list = [random.random() for i in range(n)]

import array
x_array = array.array('f', x_list)

import numpy
x_numpy = numpy.array(x_list, 'f')

import rpy2.robjects as ro
x_floatvector = ro.FloatVector(x_list)
x_sexpvector = ro.rinterface.SexpVector(x_floatvector)

All results are expressed relative to the pure R implementation; the Speedup column indicates how many times faster the code runs.

Function         Sequence                            Speedup
pure R                                               1
pure python      rpy2.rinterface.SexpVector          6.8
pure python      rpy2.robjects.vectors.FloatVector   0.6
pure python      list                                9.1
pure python      array.array                         8.8
pure python      numpy.array                         1.2
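
A speedup in the table above is simply the baseline time divided by the measured time. The sketch below shows one way such ratios could be computed with the standard timeit module. It is not the harness used for the published numbers: the helper relative_speedups is hypothetical, only standard-library sequences are timed, and the baseline is the Python list rather than pure R, so the printed values are not comparable to the table.

```python
import array
import random
import timeit


def py_sum(x):
    total = 0
    for elt in x:
        total += elt
    return total


def relative_speedups(func, sequences, baseline, repeat=5, number=10):
    """Return {name: baseline_time / time} using best-of-`repeat` timings."""
    timings = {}
    for name, seq in sequences.items():
        timer = timeit.Timer(lambda seq=seq: func(seq))
        timings[name] = min(timer.repeat(repeat=repeat, number=number))
    return {name: timings[baseline] / t for name, t in timings.items()}


n = 20000
x_list = [random.random() for _ in range(n)]
sequences = {
    "list": x_list,
    "array.array": array.array("f", x_list),
}
speedups = relative_speedups(py_sum, sequences, baseline="list")
for name, value in sorted(speedups.items()):
    print("%-12s %.2f" % (name, value))
```

Taking the minimum over several repeats is a common way to reduce noise from other processes; the actual ratios depend on the machine and the Python version.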

Iterating through a list is likely the fastest form of iteration in Python, which explains why the pure Python sum over a list is the fastest implementation here. Note that the iterating sum is about 9 times faster in Python than in R.

The type of object one iterates through matters a great deal for speed. The poorest performer is rpy2.robjects.vectors.FloatVector, which is almost twice as slow as pure R. This is expected, since the iteration relies on R-level mechanisms, to which the penalty for using a higher-level interface must be added. On the other hand, using an rpy2.rinterface.SexpVector provides an almost 7x speedup, making the use of R through rpy2 faster than using R from R. This was again expected, as the lower-level interface is closer to the C API for R.

More surprisingly, iterating through a numpy.array is only slightly faster than pure R.

Using the popular bytecode optimizer psyco, we ran our benchmark function again.

psyco:

import psyco

psy_sum = psyco.proxy(py_sum)

Function         Sequence                            Speedup
psyco            rpy2.rinterface.SexpVector          14.4
psyco            rpy2.robjects.vectors.FloatVector   0.6
psyco            list                                27.1
psyco            array.array                         19.4
psyco            numpy.array                         1.5

When using psyco, we can achieve a 14x speedup when looping over an R vector (the vector lives in the R memory space) and summing its elements from rpy2, compared to doing the same operation in pure R.

Finally, to put the earlier benchmarks in perspective, it is fair to note that both Python and R have a builtin sum function that calls C-compiled code, and to compare their performance.

Function         Sequence                            Speedup
builtin python   rpy2.rinterface.SexpVector          14.9
builtin python   rpy2.robjects.vectors.FloatVector   0.6
builtin python   list                                32.7
builtin python   array.array                         26.1
builtin python   numpy.array                         1.3
builtin R                                            133.2
numpy.array.sum  numpy.array                         272.2
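
The gap between a hand-written loop and the C-compiled builtin can be checked on the Python side alone. The following is a small sketch, assuming only the standard library and a plain list; it first checks that both approaches give the same total up to floating-point rounding, then reports how much faster the builtin is on the current machine.

```python
import math
import random
import timeit


def py_sum(x):
    total = 0
    for elt in x:
        total += elt
    return total


x_list = [random.random() for _ in range(20000)]

# Both implementations should agree up to floating-point rounding.
loop_total = py_sum(x_list)
builtin_total = sum(x_list)
agree = math.isclose(loop_total, builtin_total, rel_tol=1e-9)

# Best-of-five timings, each over ten calls.
t_loop = min(timeit.repeat(lambda: py_sum(x_list), repeat=5, number=10))
t_builtin = min(timeit.repeat(lambda: sum(x_list), repeat=5, number=10))
print("results agree:", agree)
print("builtin sum vs. pure Python loop: %.1fx" % (t_loop / t_builtin))
```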

The builtin Python implementation on a list is only about twice as fast as the pure Python implementation on an rpy2.rinterface.SexpVector accelerated with psyco.

numpy.array.sum is about twice as fast as its R counterpart, although it is important to remember that the R version handles missing values.
