Note: this script is available in full on GitHub: github.com/nickmalleson/surf/blob/master/projects/extras/R_Python.Rmd

This is a small example of how to use R and Python together in the same R Markdown document.

It uses the feather library, which lets R and Python share data frames by writing them to binary files.

Initialise

library(feather) # To allow python and R to share data

Create some data with R

For this example we’ll just create some random numbers to be manipulated in R and Python (code from here).

Make a data frame with two columns and 1000 rows.

data <- data.frame(replicate(2, sample(0:10, 1000, replace=TRUE)))

par(mfrow=c(2,1))
hist(data$X1)
hist(data$X2)

Write the data using feather

Write out the data frame (data) using feather so that it can be read in and manipulated by Python.

write_feather(data, path="./data.feather")

Read and process the data in Python

Now the neat bit: use Python to read the data from R, do something with it (we’ll just make a new column), and then write it back out.

Note that in the R Markdown document, you can specify a particular Python engine to use. For example, the following would use an Anaconda environment called ‘py35’:

{python engine.path="/Users/nick/anaconda2/envs/py35/bin/python"}

You’ll have to change that to match your own setup if you want to run this script.

Whichever Python environment you use, it must have the feather library installed. You can install it as normal, with something like:

pip install feather-format
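If the Python chunk fails with an import error, it is often because the interpreter named in engine.path is not the one that pip installed into. The following is a minimal sketch of a check you could run in a Python chunk; the helper name module_available is hypothetical and not part of the original script.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if `name` can be imported by the current interpreter.

    Hypothetical helper for illustration only.
    """
    return importlib.util.find_spec(name) is not None


# Show which interpreter is running, to compare against engine.path.
print("Python executable:", sys.executable)

# "feather" is the module that the feather-format package provides.
if module_available("feather"):
    print("feather is installed")
else:
    print("feather is missing; try: pip install feather-format")
```

If the printed executable is not the one you expected, adjust engine.path or install feather-format into the environment that is actually being used.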


import feather

# Read in the data frame
data = feather.read_dataframe("./data.feather")

# Create a new column by multiplying the first two columns:
data.loc[:,'X3'] = data.loc[:,'X1'] * data.loc[:,'X2']

# Write out the data using feather
feather.write_dataframe(data, "./data.feather")
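The processing step above can also be wrapped in a small function, which makes it easier to test without touching any files. This is only a sketch: the function name add_product_column and its parameters are assumptions made for illustration, and the small data frame stands in for whatever feather.read_dataframe would return.

```python
import pandas as pd


def add_product_column(df, left="X1", right="X2", out="X3"):
    """Return a copy of df with a new column out = left * right.

    Hypothetical helper; the column names default to those used in this post.
    """
    missing = [col for col in (left, right) if col not in df.columns]
    if missing:
        raise KeyError(f"missing columns: {missing}")
    result = df.copy()  # leave the caller's data frame untouched
    result[out] = result[left] * result[right]
    return result


# Small stand-in for the data frame read from ./data.feather
example = pd.DataFrame({"X1": [0, 3, 10], "X2": [5, 4, 10]})
processed = add_product_column(example)
print(processed["X3"].tolist())  # [0, 12, 100]
```

Checking for the input columns first gives a clearer error if the file written by R does not have the expected X1 and X2 columns.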

Re-read the data in R

Now that we’ve processed it in Python, re-read it using R.

new.data <- read_feather("./data.feather")

hist(new.data$X3)

That’s it!

OK, that’s a very simple example, but being able to use either language in the same R Markdown document, and to share data between them, is extremely powerful.