The staged.dependencies
package simplifies the development process for developing a set of inter-dependent R packages. In each repository of the set of R packages you are co-developing you should specify a staged_dependencies.yaml
file containing upstream (i.e. those packages your current repo depends on) and downstream (i.e. those packages which depend on your current repo’s package) dependency packages within your development set of packages.
The set of packages you are co-developing are internal dependencies. Other R packages they depend on are named external dependencies.
By using a git branching naming convention in your development work, staged.dependencies
makes it straightforward to:
RCMDCHECK_ARGS
and passes it as arguments to R CMD check
.The package also provides:
By default, the directory ~/.staged.dependencies
as well as a dummy config file are created whenever the package is loaded and the directory does not exist. This is where checked out repositories are cached. To use a different location, you can set the option staged.dependencies._storage_dir
before loading the package. Note that on Windows, the path ~/.staged.dependencies
may be a subdirectory of One Drive and this can lead to problems.
Note staged.dependencies requires a git signature setup that can be checked with git2r::default_signature(".")
. If this is not setup it can be created using git2r::config(git2r::repository("."), user.name = <<name>>, user.email = <<email>>)
or via the git cli using, for example git config --global user.email <email>
.
If your internal packages require personal access tokens then these must be available to staged.dependencies.
There are two approaches:
~/.Rprofile
file with the following:Sys.setenv("GITHUB_PAT" = "<token>") # if using https://github.com
Sys.setenv("GITLAB_PAT" = "<token>") # if using https://gitlab.com
options(
staged.dependencies.token_mapping = c(
"https://github.com" = "GITHUB_PAT",
"https://gitlab.com" = "GITLAB_PAT",
<<another_remote>> = <<token env variable>>
...
)
)
~/.Renviron
file,and add the following into your ~/.Rprofile
file:
The easiest way to use staged.dependencies
, once installed, is to checkout a repository and within the RStudio use the addins:
You can run the actions of the addins explicitly and change the default arguments (see function documentation for further details)
# create dependency_structure object from a checked out repo...
x <- dependency_table(project = "../stageddeps.electricity", verbose = 1)
# ... or directly from a remote
x <- dependency_table(project = "openpharma/stageddeps.electricity@https://github.com",
project_type = "repo@host",
ref = "main",
verbose = 1)
# print and plot it
print(x)
if (require("visNetwork")) plot(x)
#install upstream dependencies
install_deps(x, verbose = 1)
# check downstream packages
check_downstream(x, verbose = 1, check_args = c("--no-manual"))
Remember to restart the R session after installing packages that were already loaded. There are additional functions (e.g. a shiny app to view the dependency graph and choose which packages to install) in the package. See the function documentation for more details.
staged.dependencies
knows which branches to checkout for each of your projects due to a git branch convention. The default branch naming convention staged.dependencies
expects throughout your repositories is described by the examples below:
a@b@c@d@main
, staged.dependencies
checks for branches in each repository in the following order: a@b@c@d@main
, b@c@d@main
, c@d@main
, d@main
and main
.feature1
that involves repoB, repoC
with the dependency graph repoA --> repoB --> repoC
, where repoA
is an additional upstream dependency. One then notices that feature1
requires a fix in repoB
, so one creates a new branch fix1@feature1@main
on repoB
. The setup can be summarized as follows:repoA: devel
repoB: fix1@feature1@main, feature1@main, main
repoC: feature1@main, main
A PR on repoB fix1@feature1@main --> feature1@main
takes repoA: main, repoB: fix1@feature1@main, repoC: feature1@main
. This can be checked by setting ref = fix1@feature1@main
and running check_downstream
on either repoC
or check_downstream
on repoB
(which adds repoC
to its downstream dependencies). A PR on either repoB
or repoC
feature1@main --> main
takes repoA: main, repoB:feature1@main, repoC: feature1@main
, setting ref = feature1@main
.
By setting ref = <<tag_name>>
, staged.dependencies
will checkout each repository at the tagged commit (or by default main
if tag does not exist, though this can be overridden with the fallback
argument to dependency_table
). The check for tag name takes priority over the branch procedure described above.
If you are working locally on a collection of packages, you can modify the ~/.staged.dependencies/config.yaml
to automatically pick up the local packages that should be taken instead of their remote ones. The file ~/.staged.dependencies/config.yaml
describes the available options. For the current project, the local rather than remote version is always taken.
staged_dependencies.yaml
file
Given a feature and a git repository, the package determines the branch to checkout according to the branch naming convention described above. To get the upstream or downstream dependencies, it inspects the staged_dependencies.yaml
file. This file is of the form:
---
upstream_repos:
- repo: Roche/rtables
host: https://github.com
- repo: Roche/respectables
host: https://github.com
downstream_repos:
- repo: example1/repo1
host: https://github.example.com
- repo: example1/repo2
host: https://code.example.com
current_repo:
repo: openpharma/staged.dependencies
host: https://github.com
It contains all the information that is required to fetch the repos. The current_repo
lists the info to fetch the project itself. All three top level fields are required although upstream_repos
and downstream_repos
can be empty.
Each package is cached in a directory whose name is based on repo
and a hash of repo, host
. If it exists, it fetches. Otherwise, it clones. The repositories only have remote branches, the default tracking branch is deleted. Otherwise, there would be problems if the default branch is deleted from the remote, checked out locally, so git fetch --prune
would not remove it and the local branch would be among the available branches considered in determine_branch
.
If a repo is part of the local packages, it is not fetched, but instead copied to the cache dir. A commit is created with all files (including staged, unstaged and untracked, but excluding git-ignored) files. Given such a cache directory, the commit hash is added to the DESCRIPTION
file and the package is installed, unless the package is already installed with the same commit hash. When calling install_deps
and similar functions, the current project is always added to the local_repos
so that its local rather than remote version is installed. For this, the current_repo
field must be correct, so downstream dependencies do not fetch the remote version.
The staged_dependencies.yaml
files are only used to discover upstream and downstream packages. These are called internal packages. Then, the true dependency graph based on the DESCRIPTION
files is constructed. All listed upstream dependencies that are not internal are called external dependencies. There exists a function to check that the staged_dependencies.yaml
files are consistent: if package x
lists y
as a downstream package, then y
should list x
as an upstream package.
When testing the addins, note that they run in a separate R process, so they pick up the currently installed version of staged.dependencies
rather than the one loaded with devtools::load_all()
.
git2r::clone
may fail. Check that the git host is reachable (VPN may be needed) and that the access token has read privileges for the repositories.
When you run remotes::install_deps
for a package that was installed with this package, issues may arise because not all repositories are publicly accessible. Make sure to provide auth tokens.
Users of Macs may need to install the oskeyring
package.
For developing this package, we use renv
. If you set environment variables such as GITHUB_PAT
in your ~/.Rprofile
, they will not be available because the project .Rprofile
does not source the global ~/.Rprofile
. To enable this, put the following into ~/.Renviron
:
RENV_CONFIG_USER_PROFILE=TRUE
Make sure that ~/.Rprofile
does not set lib paths as this may interfere with renv
. Also see the renv doc.
This package can also be tested on a more complex setup of packages available at: https://github.com/openpharma/stageddeps.elecinfra https://github.com/openpharma/stageddeps.electricity https://github.com/openpharma/stageddeps.food https://github.com/openpharma/stageddeps.garden https://github.com/openpharma/stageddeps.house https://github.com/openpharma/stageddeps.water
You can check that the right branches of the packages are installed with:
stageddeps.house::get_house()
stageddeps.water::get_water()
stageddeps.garden::get_garden()
stageddeps.food::get_food()
stageddeps.elecinfra::get_elecinfra()
stageddeps.electricity::get_electricity()