Package management in R can be difficult at times. In 2013 RStudio released packrat, a reproducible package management tool. packrat creates a list of precise package versions (with their dependencies) and stores the packages and sources in a subdirectory. When you share your project, it’ll contain all the dependencies needed to run the code.
If your project is just a data analysis, packrat probably works great for you because your dependencies are minimal and hosted in public repositories. However, many data science projects use private packages.
packrat tries really hard to install packages for you and uses devtools to install packages from non-CRAN repositories, such as GitHub or Bitbucket.
Although devtools has supported installing packages from private GitHub repositories using a personal access token (PAT) since v1.5 (Apr 7, 2014), it still does not support Bitbucket PATs.
This is trivial during interactive sessions, where you can simply pass your Bitbucket credentials to install_bitbucket(..., auth_user = <username>, password = <password>). However, it becomes problematic if you try to use packrat for your project.
Unfortunately there isn’t a way to provide your non-GitHub repository credentials to devtools when using packrat. Without being able to authenticate, packrat fails to install private packages.
miniCRAN is a package from Revolution Analytics that facilitates creation of a private CRAN-like mirror. We can use miniCRAN to create our own repository that hosts our private packages and point packrat to use it.
Start by installing miniCRAN on the system where you want to host your private repository.
# install.packages("devtools")
devtools::install_github("RevolutionAnalytics/miniCRAN")
If there are packages you want to be available, pass them as a character vector to the pkgs argument of makeRepo. I specify a nonexistent package, _. This causes miniCRAN to set up the correct directory structure without wasting time downloading a package. Alternatively, you can create the directories yourself.
library(miniCRAN)

dir.create(path = "/path/<repository name>")
makeRepo(
  pkgs = "_",
  path = "/path/<repository name>/",
  type = c("source", "mac.binary", "win.binary"),
  Rversion = getRversion()
)
At this point the repository should be initialized with this setup:
$ tree
.
├── bin
│   ├── macosx
│   │   └── contrib
│   │       └── 3.4
│   └── windows
│       └── contrib
│           └── 3.4
└── src
    └── contrib

9 directories, 0 files
If you passed package names to the pkgs argument, you should see their tar.gz files in the repository's source package directory.
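For the do-it-yourself route mentioned above, a minimal base-R sketch might look like the following. The helper name init_repo_skeleton is hypothetical, and the layout simply mirrors the tree shown above (a source directory plus macOS and Windows binary directories keyed on the R major.minor version). Newer R releases may expect a different macOS binary path, so treat this as an illustration rather than a replacement for makeRepo.

```r
# Hypothetical helper (not part of miniCRAN): create an empty CRAN-like
# directory skeleton matching the tree shown above.
init_repo_skeleton <- function(repo_path, r_version = getRversion()) {
  # Binary directories are keyed on major.minor only, e.g. "3.4" for 3.4.1
  minor <- sub("^([0-9]+\\.[0-9]+).*$", "\\1", as.character(r_version))
  dirs <- file.path(
    repo_path,
    c(
      file.path("src", "contrib"),
      file.path("bin", "macosx", "contrib", minor),
      file.path("bin", "windows", "contrib", minor)
    )
  )
  for (d in dirs) {
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(dirs)
}

# Usage: build the skeleton in a temporary location and list what was created
repo <- file.path(tempdir(), "my-repo")
created <- init_repo_skeleton(repo, r_version = "3.4.1")
list.dirs(repo, recursive = TRUE, full.names = FALSE)
```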
The next step is to add your own private packages.
Add Your Private Packages
Edit the package DESCRIPTION file and add your repository name:
Repository: <repository name>
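If you would rather script this edit than open the file by hand, something like the following could work. It is only a sketch: set_repository_field is a hypothetical helper, and it relies on base R's read.dcf() and write.dcf(), which may re-wrap long fields when the file is written back.

```r
# Hypothetical helper: add or update the Repository field in a DESCRIPTION file.
set_repository_field <- function(desc_file, repo_name) {
  desc <- read.dcf(desc_file)  # one-row character matrix, one column per field
  if ("Repository" %in% colnames(desc)) {
    desc[1, "Repository"] <- repo_name
  } else {
    desc <- cbind(desc, Repository = repo_name)
  }
  write.dcf(desc, file = desc_file)
  invisible(desc)
}

# Usage on a throwaway DESCRIPTION file
desc_file <- tempfile("DESCRIPTION")
writeLines(c("Package: mypkg", "Version: 1.0"), desc_file)
set_repository_field(desc_file, "my-repo")
cat(readLines(desc_file), sep = "\n")
```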
Then build the package with R CMD build <package name> and use addLocalPackage to copy the resulting tar.gz file into the repository.
addLocalPackage( pkgs = "<package name>", pkgPath = "/path", path = "/path/<repository name>")
The repository directory now contains the package files:
.
├── bin
│   ├── macosx
│   │   └── contrib
│   │       └── 3.4
│   └── windows
│       └── contrib
│           └── 3.4
└── src
    └── contrib
        ├── PACKAGES
        ├── PACKAGES.gz
        ├── PACKAGES.rds
        └── <package_name>_1.0.tar.gz

9 directories, 4 files
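The PACKAGES index files in the listing are what let R treat the directory as a repository. As a rough base-R illustration of the copy-and-index step (an assumption about the general approach, not miniCRAN's actual implementation), the hypothetical helper below copies a source tarball into place and regenerates the index with tools::write_PACKAGES().

```r
# Hypothetical helper: copy a built source tarball into a CRAN-like repository
# and rebuild the source package index.
add_tarball_to_repo <- function(tarball, repo_path) {
  contrib <- file.path(repo_path, "src", "contrib")
  dir.create(contrib, recursive = TRUE, showWarnings = FALSE)
  file.copy(tarball, contrib, overwrite = TRUE)
  # Rewrites PACKAGES and PACKAGES.gz (and PACKAGES.rds on recent R versions);
  # returns the number of packages described in the index
  tools::write_PACKAGES(contrib, type = "source")
}

# Usage (placeholders as in the text above):
# add_tarball_to_repo("/path/<package name>_1.0.tar.gz", "/path/<repository name>")
```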
Installing Packages From The Repository
Now that the package is part of the new repository, we need to tell R about the repository. You can do this by adding the repository URI to options(repos = ...). If you want to permanently add your private repository, you can edit your .Rprofile.
# If your repository path is local, use file:// followed by the absolute path
options(repos = c(getOption("repos"), "<repository name>" = "file:///path/<repository name>"))
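Getting the number of slashes in a file URI right is easy to slip on, so it may help to build the URI from the directory path instead of typing it. The following is a minimal sketch, assuming the repository lives on the local filesystem; add_local_repo is a hypothetical helper name, and the same lines could be placed in your .Rprofile.

```r
# Hypothetical helper: register a local CRAN-like repository under a given name.
add_local_repo <- function(name, path) {
  abs_path <- normalizePath(path, winslash = "/", mustWork = TRUE)
  # Absolute Unix paths already start with "/"; Windows paths (C:/...) do not
  if (!startsWith(abs_path, "/")) {
    abs_path <- paste0("/", abs_path)
  }
  uri <- paste0("file://", abs_path)
  repos <- getOption("repos")
  repos[name] <- uri  # adds the entry, or replaces an existing one with that name
  options(repos = repos)
  invisible(uri)
}

# Usage (placeholders as in the text above):
# add_local_repo("<repository name>", "/path/<repository name>")
```

Assigning by name rather than appending with c() keeps the option from accumulating duplicate entries if the profile is sourced more than once.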
Finally, open your packrat project and install your package from source.
install.packages("<package name>", type = "source")
miniCRAN makes it easy to set up a private repository that packrat can use to install private packages. However, there is still a bit of overhead. For example, you still need a way to keep the repository in sync with changes to the package source. Both GitHub and Bitbucket offer integrations for continuous deployment, so it wouldn’t be difficult to automatically upgrade package versions in the repository.
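As one illustration of the check such an integration might run, the sketch below compares the version in a package's DESCRIPTION with the version recorded in the repository's PACKAGES index. The helper name repo_is_stale is hypothetical, and how you trigger a rebuild when it returns TRUE is left to your deployment setup.

```r
# Hypothetical helper: TRUE when the package source is newer than what the
# repository index advertises (or when the package is not published at all).
repo_is_stale <- function(desc_file, packages_file) {
  src <- read.dcf(desc_file, fields = c("Package", "Version"))
  if (!file.exists(packages_file)) {
    return(TRUE)
  }
  # The PACKAGES index uses the same DCF format as DESCRIPTION
  index <- read.dcf(packages_file, fields = c("Package", "Version"))
  published <- index[index[, "Package"] == src[1, "Package"], "Version"]
  if (length(published) == 0) {
    return(TRUE)
  }
  package_version(src[1, "Version"]) > max(package_version(published))
}

# Usage (placeholders as in the text above):
# repo_is_stale("<package name>/DESCRIPTION",
#               "/path/<repository name>/src/contrib/PACKAGES")
```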
In many ways, we’re just putting more band-aids on the wound. It would be better if R had stronger package management at the project level. Still, with a little bit of work it’s not too difficult to put together a solution.