Managing Private R Packages with Packrat and MiniCRAN

Sep 24, 2017 · 687 words · 4 minutes read · minicran, package management, packrat, R

Package management in R can be difficult at times. In 2013 RStudio released packrat, a reproducible package management tool. packrat creates a list of precise package versions (with their dependencies) and stores the packages and sources in a subdirectory. When you share your project, it’ll contain all the dependencies needed to run the code.

If your project is just a data analysis, packrat probably works great for you because your dependencies are minimal and hosted in public repositories. However, many data science projects depend on private packages.

packrat Problems

packrat tries really hard to install packages for you and uses devtools to install packages from non-CRAN repositories, such as GitHub or Bitbucket.

Although devtools has supported installing packages from private GitHub repositories using a personal access token (PAT) since v1.5 (Apr 7, 2014), it still does not support Bitbucket PATs.

Authenticating is trivial in an interactive session, where you can simply pass your Bitbucket credentials to install_bitbucket(..., auth_user = <username>, password = <password>). However, it becomes problematic if you try to use packrat for your project.

Unfortunately, there isn’t a way to provide your non-GitHub repository credentials to devtools when using packrat. Without being able to authenticate, packrat fails to install private packages.

miniCRAN is a package from Revolution Analytics that facilitates creation of a private CRAN-like mirror. We can use miniCRAN to create our own repository that hosts our private packages and point packrat to use it.

miniCRAN

Start by installing miniCRAN on the system where you want to host your private repository.

# install.packages("devtools")
devtools::install_github("RevolutionAnalytics/miniCRAN")

Initialization

If there are packages you want the repository to serve, pass them as a character vector to the pkgs argument. I specify a nonexistent package, _. This causes miniCRAN to set up the correct directory structure without wasting time downloading a package. Alternatively, you can create the directories yourself.

library(miniCRAN)
dir.create(path = "/path/<repository name>")
makeRepo(
    pkgs = "_",
    path = "/path/<repository name>/",
    type = c("source", "mac.binary", "win.binary"),
    Rversion = getRversion()
)

At this point the repository should be initialized with this layout:

$ tree
.
├── bin
│   ├── macosx
│   │   └── contrib
│   │       └── 3.4
│   └── windows
│       └── contrib
│           └── 3.4
└── src
    └── contrib

9 directories, 0 files

If you passed package names to the pkgs argument, you should see their tar.gz files under src/contrib.
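For the do-it-yourself route, here is a minimal sketch in base R that creates the same directory layout as the tree above. The helper name init_repo_dirs and the temporary path are made up for illustration. The sketch assumes the R 3.4-era layout shown here (one contrib directory per major.minor version) and does not write any PACKAGES index files.

```r
# Hypothetical helper: create a CRAN-like directory skeleton by hand.
init_repo_dirs <- function(repo_path, r_version = getRversion()) {
  # Binary directories are keyed by major.minor, e.g. "3.4" for R 3.4.1.
  minor <- sub("^(\\d+\\.\\d+).*$", "\\1", as.character(r_version))
  dirs <- file.path(repo_path, c(
    file.path("src", "contrib"),
    file.path("bin", "windows", "contrib", minor),
    file.path("bin", "macosx", "contrib", minor)
  ))
  for (d in dirs) {
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(dirs)
}

# Example run against a throwaway location instead of /path/<repository name>.
demo_repo <- file.path(tempdir(), "demo-repo-skeleton")
created <- init_repo_dirs(demo_repo)
```

Passing r_version explicitly lets you prepare directories for an R version other than the one running the script.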

The next step is to add your own private packages.

Add Your Private Packages

Edit the package’s DESCRIPTION file and add your repository name:

Repository: <repository name>
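If you would rather script this edit than make it by hand, base R's read.dcf and write.dcf can handle it. The following is only a sketch: set_repository_field, the demo package name, and "my-repo" are placeholders, and it works on a temporary DESCRIPTION file rather than a real package.

```r
# Hypothetical helper: add or overwrite the Repository field in a DESCRIPTION.
set_repository_field <- function(description_path, repo_name) {
  desc <- read.dcf(description_path)
  if ("Repository" %in% colnames(desc)) {
    desc[1, "Repository"] <- repo_name
  } else {
    desc <- cbind(desc, Repository = repo_name)
  }
  # Note: write.dcf may re-wrap long fields such as Description.
  write.dcf(desc, file = description_path)
  invisible(desc)
}

# Demo on a throwaway DESCRIPTION file (placeholder contents).
demo_desc <- file.path(tempdir(), "DESCRIPTION")
writeLines(c("Package: demopkg", "Version: 1.0", "Title: Demo Package"), demo_desc)
set_repository_field(demo_desc, "my-repo")
```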

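Then run R CMD build <package name> and use addLocalPackage to copy the resulting tar.gz into <repository name>/src/contrib. Here pkgPath is the directory that contains the built tarball, and path is the repository root.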

addLocalPackage(
    pkgs = "<package name>",
    pkgPath = "/path",
    path = "/path/<repository name>")

The repository directory now has the package files in src/contrib.

.
├── bin
│   ├── macosx
│   │   └── contrib
│   │       └── 3.4
│   └── windows
│       └── contrib
│           └── 3.4
└── src
    └── contrib
        ├── PACKAGES
        ├── PACKAGES.gz
        ├── PACKAGES.rds
        └── <package_name>_1.0.tar.gz

9 directories, 4 files
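One way to check that the index actually lists your package is to query it with available.packages. The sketch below is an illustration under assumptions: repo_lists_package is a made-up helper, it assumes a Unix-like absolute path, and the demo uses a hand-written PACKAGES file in a temporary directory as a stand-in for a real miniCRAN repository.

```r
# Hypothetical helper: does the source index of a local repository list `pkg`?
repo_lists_package <- function(repo_path, pkg) {
  repo_uri <- paste0("file://", normalizePath(repo_path, winslash = "/"))
  index <- available.packages(contriburl = contrib.url(repo_uri, type = "source"))
  pkg %in% rownames(index)
}

# Stand-in repository: only an index file, written by hand for the demo.
demo_repo <- file.path(tempdir(), "demo-repo-index")
demo_contrib <- file.path(demo_repo, "src", "contrib")
dir.create(demo_contrib, recursive = TRUE, showWarnings = FALSE)
writeLines(c("Package: demopkg", "Version: 1.0"),
           file.path(demo_contrib, "PACKAGES"))

found <- repo_lists_package(demo_repo, "demopkg")
missing_pkg <- repo_lists_package(demo_repo, "notthere")
```

Against a real repository you would call repo_lists_package("/path/<repository name>", "<package name>") after addLocalPackage has run.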

Installing Packages From The Repository

Now that the package is part of the new repository, we need to tell R about the repository. You can do this by adding the repository URI to options(repos = ...). If you want to permanently add your private repository, you can edit your .Rprofile.

# For a local repository, use a file:// URI with the absolute path
# (file:// followed by /path gives three slashes)
options(repos = c(getOption("repos"),
                  "<repository name>" = "file:///path/<repository name>"))

Finally, open your packrat project and install your package from source.

install.packages("<package name>", type = "source")
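For the permanent .Rprofile setup mentioned earlier, something along these lines could work. It is a sketch, not a canonical recipe: to_file_uri is a made-up helper, "my-repo" is a placeholder name, and the temporary directory stands in for /path/<repository name>.

```r
# Hypothetical helper: turn a local directory into a file:// URI.
to_file_uri <- function(path) {
  abs_path <- normalizePath(path, winslash = "/", mustWork = TRUE)
  # Windows paths such as C:/repo need a leading slash after the scheme.
  if (!startsWith(abs_path, "/")) {
    abs_path <- paste0("/", abs_path)
  }
  paste0("file://", abs_path)
}

# Stand-in for the real repository directory.
repo_dir <- file.path(tempdir(), "my-repo")
dir.create(repo_dir, showWarnings = FALSE)

# Append the private repository while keeping the existing entries.
repos <- getOption("repos")
repos["my-repo"] <- to_file_uri(repo_dir)
options(repos = repos)
```

Assigning by name rather than concatenating means re-running the snippet updates the entry instead of adding a duplicate.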

Closing Thoughts

miniCRAN makes it easy to set up a private repository that packrat can use to install private packages. However, there is still a bit of overhead. For example, you still need a way to keep the repository in sync with changes to the package source. Both GitHub and Bitbucket offer integrations for continuous deployment, so it wouldn’t be difficult to automatically upgrade package versions in the repository.
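As a rough idea of what such a deployment step might run, here is a sketch that builds a package and adds it to the repository. The function publish_package and its arguments are hypothetical. It assumes miniCRAN is installed and that the job has write access to the repository, and it is defined here but not invoked.

```r
# Hypothetical deployment step: build a source tarball, then add it to the repo.
publish_package <- function(pkg_dir, repo_path, build_dir = tempfile("build-")) {
  dir.create(build_dir, recursive = TRUE, showWarnings = FALSE)
  pkg_dir <- normalizePath(pkg_dir, mustWork = TRUE)
  repo_path <- normalizePath(repo_path, mustWork = TRUE)
  pkg_name <- unname(read.dcf(file.path(pkg_dir, "DESCRIPTION"), fields = "Package")[1, 1])

  # R CMD build writes the tarball into the current working directory.
  old_wd <- setwd(build_dir)
  on.exit(setwd(old_wd), add = TRUE)
  status <- system2(file.path(R.home("bin"), "R"),
                    c("CMD", "build", shQuote(pkg_dir)))
  if (status != 0) {
    stop("R CMD build failed for ", pkg_name)
  }

  # Same call as in the manual steps, pointed at the fresh build directory.
  miniCRAN::addLocalPackage(pkgs = pkg_name, pkgPath = build_dir, path = repo_path)
}
```

A CI job could call publish_package("<package name>", "/path/<repository name>") after each tagged release. How older tarballs in src/contrib should be handled is left out of this sketch.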

In many ways, we’re just putting more band-aids on the wound. It would be better if R had stronger package management at the project level. Still, with a little bit of work it’s not too difficult to put together a solution.