An R package can be viewed as a set of functions, of which only a part are exposed to the user. In this blog post we shall concentrate on the functions that are not exposed to the user, so-called internal functions: what are they, how does one handle them in one’s own package, and how can one explore them?

Internal functions 101

What is an internal function?

It’s a function that lives in your package, but that isn’t surfaced to the user. You could also call it an unexported function or a helper function, as opposed to exported or user-facing functions.

For instance, in the usethis package there’s a base_and_recommended() function that is not exported.

# doesn't work
library("usethis")
base_and_recommended()

## Error in base_and_recommended(): could not find function "base_and_recommended"

usethis::base_and_recommended()

## Error: 'base_and_recommended' is not an exported object from 'namespace:usethis'

# works
usethis:::base_and_recommended()

##  [1] "base"       "boot"       "class"      "cluster"    "codetools" 
##  [6] "compiler"   "datasets"   "foreign"    "graphics"   "grDevices" 
## [11] "grid"       "KernSmooth" "lattice"    "MASS"       "Matrix"    
## [16] "methods"    "mgcv"       "nlme"       "nnet"       "parallel"  
## [21] "rpart"      "spatial"    "splines"    "stats"      "stats4"    
## [26] "survival"   "tcltk"      "tools"      "utils"

As a user, you shouldn’t use unexported functions of another package in your own code.
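To explore which functions a package keeps internal, one possible approach is to compare everything defined in its namespace with what it exports. The sketch below uses the stats package only because it ships with R; any installed package name could be substituted.

```r
# Everything defined in the package namespace, exported or not.
ns <- asNamespace("stats")
all_objects <- ls(ns, all.names = TRUE)

# Names the package makes available to users.
exported <- getNamespaceExports("stats")

# What is left is internal; keep only the functions.
internal <- setdiff(all_objects, exported)
is_fun <- vapply(
  internal,
  function(nm) is.function(get(nm, envir = ns, inherits = FALSE)),
  logical(1)
)
internal_funs <- internal[is_fun]

# Peek at a few of them.
head(internal_funs)
```

This is meant for reading and learning; it does not make the internal functions any safer to depend on.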

Why not export all functions?

There are at least these two reasons:

  • In a package you want to provide your users with an API that is useful and stable. You can vouch for a few functions that serve the package’s main goals, are documented enough, and that you’d only change with great care if need be. If your package users rely on an internal function that you decide to ditch when re-factoring code, they won’t be happy, so only export what you want to maintain.

  • If all packages exposed all their internal functions, the user environment would be flooded and the namespace conflicts would be out of control.
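With roxygen2, exporting is opt-in: only functions carrying the @export tag end up in the NAMESPACE file. A minimal sketch, in which say_hello() and greeting_prefix() are hypothetical names made up for illustration:

```r
#' Say hello
#'
#' Hypothetical user-facing function: the @export tag below makes
#' roxygen2 add export(say_hello) to the NAMESPACE file.
#' @param name a character string
#' @export
say_hello <- function(name) {
  paste0(greeting_prefix(), name, "!")
}

# Hypothetical internal helper: no @export tag, so it stays out of NAMESPACE
# and can be renamed or removed without breaking users' code.
greeting_prefix <- function() {
  "Hello, "
}

say_hello("R")
```

Here only say_hello() would be part of the API you commit to maintaining.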

Why write internal functions?

Why write internal functions instead of having everything in one block of code inside each exported function?

When writing R code in general there are several reasons to write functions, and it is the same within R packages: you can re-use a bit of code in several places (e.g. an epoch converter used for the output of several endpoints from a web API), and you can give it a self-explaining name. Any function defined in your package is usable by other functions of your package (unless it is defined inside a function of your package, in which case only that parent function can use it).
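As a rough illustration of the epoch converter idea, here is a sketch in which every name (convert_epoch(), parse_user(), parse_event(), label_event()) and the shape of the API responses are assumptions invented for the example:

```r
# Hypothetical internal helper: seconds since 1970-01-01 -> POSIXct in UTC.
convert_epoch <- function(x) {
  as.POSIXct(x, origin = "1970-01-01", tz = "UTC")
}

# Two hypothetical user-facing functions sharing the helper.
# `response` stands in for the parsed output of a web API endpoint.
parse_user <- function(response) {
  data.frame(
    name = response$name,
    created = convert_epoch(response$created_at)
  )
}

parse_event <- function(response) {
  # Defined inside parse_event(), so only parse_event() can use it.
  label_event <- function(type) paste0("event:", type)
  data.frame(
    event = label_event(response$type),
    time = convert_epoch(response$timestamp)
  )
}

parse_user(list(name = "ada", created_at = 0))
parse_event(list(type = "login", timestamp = 86400))

# The nested helper is not visible at the top level.
exists("label_event")
```

If the API changed its timestamp format, only convert_epoch() would need to be updated.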

Having internal functions also means you can test these bits of code on their own. That said, if you test internals too much, re-factoring your code will mean breaking tests.
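A test for an internal function might look like the sketch below, assuming testthat is the testing framework and using a hypothetical convert_epoch() helper. In a real package the test would live under tests/testthat/ and could call the helper directly, without :::, because testthat runs package tests with access to the package namespace.

```r
# Hypothetical internal helper, defined here so the sketch runs on its own.
convert_epoch <- function(x) {
  as.POSIXct(x, origin = "1970-01-01", tz = "UTC")
}

# Guarded so the sketch still runs where testthat is not installed.
if (requireNamespace("testthat", quietly = TRUE)) {
  testthat::test_that("convert_epoch() returns UTC date-times", {
    res <- convert_epoch(0)
    testthat::expect_true(inherits(res, "POSIXct"))
    testthat::expect_equal(format(res, "%Y-%m-%d", tz = "UTC"), "1970-01-01")
  })
}
```

Testing what the helper returns, rather than how it computes it, leaves more room for later re-factoring.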

To find blocks of code that could be replaced with a function used several times, you could use the dupree package, whose planned enhancements include highlighting or printing the similar blocks.

When not to write internal functions?

There is a balance to be found between writing your own helpers for everything and only depending on external code. You can watch this excellent talk on the topic.

Where to put internal functions?

You could save internal functions used in one function only in the R file defining that function, and internal functions used in several other functions in a single utils.R file or in specialized utils-dates.R, utils-encoding.R files. Choose a system that helps you and your collaborators find the internal functions easily; R will never have trouble finding them as long as they’re somewhere in the R/ directory. 😉

Another possible approach to helper functions used in several packages is to pack them up in a package such as Yihui Xie’s xfun. So then they’re no longer internal functions. 😵

How to document internal functions?

You should at least add a few comments in their code as usual. Best practice recommended in the tidyverse style guide and the rOpenSci dev guide is to document them with roxygen2 tags like other functions, but to use the @noRd tag to prevent manual pages from being created.

#' Compare x to 1
#' @param x an integer
#' @noRd
is_one <- function(x) {
  x == 1
}
