




M4 - Chair of Math. Statistics


2009-02-18, Vinzenz Erhardt, Version 2.7
 
 
-  Create Packages for R -
- Manual -

 
 

"Rule number 1: read the manual!
 Rule number 2: start by learning rule number 1!"  ;)
 
 
 
General notes:

 
* This manual explains how to create an R package in Unix/Linux for Unix/Linux. A Windows version can be created by CRAN if the package is published on the CRAN website. Creating a Windows version yourself is very difficult and requires the installation of Perl and other software.
 
* A detailed description is given in 'Writing R Extensions', available at
http://cran.r-project.org/doc/manuals/R-exts.pdf

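* All functions and variables must have names that do not conflict with internal R functions. Moreover, a '.' in a function name can be interpreted as a separator between the name of a generic function and a class: a function called 'loglikelihood' is okay, whereas 'log.likelihood' is not, because it would be taken for a method of the internal function 'log' for objects of class 'likelihood'!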
 
* Whenever possible, avoid using global variables. If they are necessary, do not use the assignment 'a <<- b'. Instead, write and read the global variable explicitly:
assign("a", b, envir = .GlobalEnv)
a <- get("a", envir = .GlobalEnv)
In general, long and cryptic names are to be preferred for global variables, since this avoids accidental name conflicts with variables defined by the user. For example, use 'Xinput' rather than 'X'.
 
* A complete PDF manual for your package will automatically be created by CRAN as soon as your package is published online. If you do not wish to publish the package, a fragmented version is already created in the folder <package.name>.Rcheck when you run 'R CMD check ...' (see below).
 
 
 
Step-by-step Description:
 
1.) In your Unix/Linux console: create a folder in which your packages will be installed, for example 'myrlibrary':
mkdir myrlibrary
OR:  find the library path(s) already set in your R installation:
          start R
          type '.libPaths()' and remember the path(s) listed
          quit R
 
2.) Create a folder in which your package will be generated, for instance 'myrpackages'. Change to this directory:
mkdir myrpackages
cd myrpackages
 
3.) Start R and load all functions the package will use into the workspace, for example by running their definitions or with 'source()'.
 
4.) In R: use function package.skeleton() in order to automatically generate the general folder and code framework for your package, for example
package.skeleton(list=c("<function1>","<function2>", ... ,"<functionn>"), name="<package.name>")
Here, '<function1>','<function2>', ... ,'<functionn>' are the required functions, '<package.name>' will be the name of the package.

5.) Now a directory <package.name> with subfolders has been created by 'package.skeleton()'. Change to <package.name>/man and fill in the documentation in the corresponding files. Be careful when using characters like '_' or '^', since these may be misinterpreted as LaTeX commands. Keywords: only keywords from a fixed list may be used. Run
file.show(file.path(R.home("doc"), "KEYWORDS"))
in R to see which ones you can use.
 
6.) Fill in <package.name>/DESCRIPTION. For license you may use: 'GPL (>= 3)'.

7.) In order to hide functions not directly accessed by the user, create a text file named <package.name>/NAMESPACE (without the extension .txt) and list all functions which should be VISIBLE in
export(<function1>,<function2>)
Now only the functions '<function1>' and '<function2>' will be visible and may be used directly. All other functions can only be accessed by code inside the package.
Example (can be executed after the installation, i.e. Step 10.)):
library(<package.name>)
function1          # visible
function3          # invisible since not exported in NAMESPACE
<package.name>:::function3   # access it anyway via ':::'
 
8.) Console: build the package:
R CMD build <package.name>
 
9.) Console: check the package:
R CMD check <package.name>
 
The check must finish without any errors, warnings or notes. If any occur, debug the code and go back to 8.). In order to have your package published on CRAN, it has to pass the check routine completely on the latest R version. Otherwise it will not be accepted.
 
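10.) Install the package on your own system. Console:
R CMD INSTALL -l <TARGET DIRECTORY> <package.name>_<version>.tar.gz
(here <TARGET DIRECTORY> is your desired target directory from Step 1.), and <package.name>_<version>.tar.gz is the file created in Step 8.). If <TARGET DIRECTORY> is not among the paths listed by '.libPaths()', load the package in R using library(<package.name>, lib.loc="<TARGET DIRECTORY>").)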
 
 

Incorporating C code in your R code:

* Note that C uses the concept of pointers. Many tutorials on this can be found on the web. A pointer variable 'p' holds an address in the RAM (which changes from session to session), and '*p' refers to the actual value stored at that address. For an ordinary variable 'v', '&v' provides its address.

* Use the return type 'void' (a function returning nothing). The output we are interested in is written to the RAM instead: the output variable is also an argument of the function (in our example it will be 'out').

* Use only pointers as function arguments. The arguments of the C function are therefore declared with asterisks, i.e. use 'double*', 'int*' etc.
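
The pointer notation and the 'void plus output argument' convention described above can be illustrated by the following minimal sketch. The names 'times_two' and 'pointer_basics' are hypothetical and chosen only for this illustration; the code uses plain C types, so no R header files are needed to compile it.

```c
#include <assert.h>

/* Hypothetical routine in the style described above: return type void,
   pointer arguments only, result written through the last argument. */
void times_two(double *in, double *out)
{
    *out = 2.0 * (*in);
}

/* Illustration of the pointer notation; returns 1 if every relation
   stated in the comments holds. */
int pointer_basics(void)
{
    double v = 2.5;         /* ordinary variable holding a value      */
    double *p = &v;         /* &v is the address of v; p stores it    */
    double result = 0.0;

    assert(*p == 2.5);      /* *p is the value stored at that address */
    *p = 4.0;               /* writing through p changes v itself     */
    times_two(&v, &result); /* pass addresses instead of values       */

    return p == &v && v == 4.0 && result == 8.0;
}
```

Because the caller hands over addresses, the function can write its result directly into memory owned by the caller, which is the mechanism '.C' relies on.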

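* If you introduce your own variables in the C function, for example by using 'int x;', their memory is released automatically when the function returns; do not call 'free(x)' on them. Only memory that you allocate dynamically, for example with 'malloc', has to be released at the end of the code using 'free'.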
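
As a minimal sketch of memory handling inside such a function: variables declared like 'int i;' live only as long as the function runs and need no cleanup, whereas a block obtained from 'malloc' has to be handed back with 'free'. The routine 'mean_square' below is a hypothetical example, not part of any package; it assumes the caller passes the vector length through an 'int*' argument.

```c
#include <stdlib.h>

/* Hypothetical routine: writes the mean of the squares of
   x[0], ..., x[*n - 1] to *out. */
void mean_square(double *x, int *n, double *out)
{
    int i;                 /* automatic variable: no free() needed */
    double sum = 0.0;      /* automatic variable: no free() needed */
    double *sq;            /* will point to dynamically allocated memory */

    *out = 0.0;
    if (*n <= 0)
        return;            /* nothing to do, nothing allocated yet */

    sq = malloc((size_t)(*n) * sizeof *sq);
    if (sq == NULL)
        return;            /* allocation failed: leave *out at 0 */

    for (i = 0; i < *n; i++)
        sq[i] = x[i] * x[i];
    for (i = 0; i < *n; i++)
        sum += sq[i];
    *out = sum / *n;

    free(sq);              /* release exactly what malloc() handed out */
}
```

The temporary array is not strictly required for this computation; it is only there to show one allocation paired with one release.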

* Example: we want to incorporate a C function in R which adds two double numbers. Create a text file 'plus.c' and edit it.

* The header files needed for this purpose are included at the beginning of the file. For our example, the code reads:
#include <R.h>
#include <Rmath.h>
#include <Rdefines.h>
#include <Rinternals.h>

/* adds the two numbers a and b point to and stores the sum where out points to */
void plus(double *a, double *b, double *out)
{
    *out = *a + *b;
}

* Save the file and compile this function in the Console using
R CMD SHLIB plus.c
Errors in the C code will be reported. If the code is okay, two files 'plus.o' and 'plus.so' will be created in the directory.

* In R: load the compiled code using 'dyn.load' and use the routine '.C' to access the function, i.e.
dyn.load("plus.so")
x <- 5.234
y <- 87.234
output <- double(1)            # placeholder for the result
aplusb <- .C("plus",
             as.double(x),
             as.double(y),
             as.double(output))[[3]]
aplusb

* In our case, the first two arguments are the input arguments, and the third one is used for storing the output. '.C' returns a list of all its arguments as they are after the C function has run; since we use the third argument as output, we refer to it by '[[3]]'. If the function we want to use is part of another package, we can use the argument 'PACKAGE' to specify this package (refer to the help for '.C' on how to do this).

* Since the variables x and y are double, we pass them wrapped in 'as.double'. In the C function, the corresponding arguments are of type 'double *'. Corresponding types are:


R storage mode    C type

logical           int *
integer           int *
double            double *
complex           Rcomplex *
character         char **
raw               unsigned char *
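
To illustrate the type correspondence for whole vectors, here is a minimal sketch with two hypothetical routines, 'vec_plus' and 'flag_positive'. Since a pointer does not carry the length of the vector it points to, the sketch assumes that the length is passed as an additional integer argument. Only plain C types are used, so no R header files are required.

```c
/* Hypothetical routine: element-wise sum of two double vectors.
   R side: double -> double *, integer -> int * */
void vec_plus(double *a, double *b, int *n, double *out)
{
    int i;
    for (i = 0; i < *n; i++)
        out[i] = a[i] + b[i];
}

/* Hypothetical routine: sets flags[i] to 1 where x[i] > 0 and to 0
   otherwise, and stores the number of positive entries in *count.
   R side: logical -> int *, integer -> int * */
void flag_positive(double *x, int *n, int *flags, int *count)
{
    int i;
    *count = 0;
    for (i = 0; i < *n; i++) {
        flags[i] = x[i] > 0.0;   /* a comparison yields 0 or 1 in C */
        *count += flags[i];
    }
}
```

After compiling with 'R CMD SHLIB' and loading with 'dyn.load', such a routine could be called from R for example as .C("vec_plus", as.double(a), as.double(b), as.integer(length(a)), out = double(length(a)))$out. Naming the output argument allows extracting it by name instead of by position.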

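* One can also use '.Call' instead of '.C'. With '.Call', R objects are passed to the C function directly and the C function returns an R object, whereas '.C' only passes the basic types listed above. For details on the exact differences please refer to the help.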

* More details can be found in the manual 'Writing R Extensions' available at
http://cran.r-project.org/doc/manuals/R-exts.pdf

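* Embedding this code in an R package: after running 'package.skeleton()', simply create a subfolder 'src' in the folder <package.name>, i.e. in the directory where the folders 'man' and 'R' are located. Copy the file with the C code (here 'plus.c') into this folder and continue with Step 5.). The code is then compiled automatically when the package is installed. If the package has a NAMESPACE file (Step 7.)), add the line 'useDynLib(<package.name>)' to it, so that the compiled code is loaded together with the package and 'dyn.load' is no longer needed.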



Package upload to CRAN:
 
- Upload the .tar.gz file to
ftp://cran.R-project.org/incoming/
(according to the manual with 'anonymous' as login and your e-mail address as password, although this does not seem to be necessary).
- Send an e-mail announcing the upload to cran@r-project.org; the addressee is Kurt Hornik in Vienna.
 
 



