R: Programming the R language

From MathWiki

A frequent problem encountered by users of R is that of generating models with a program. Models are expressed with a formula in the R language. We know how to manipulate ordinary character text with simple tools such as 'paste' and 'sub', but manipulating the language itself is less straightforward. There are also pitfalls in the way some functions use the language in their output: functions such as print.summary.lm depend not only on the objects passed to 'lm' but also on the names under which they were passed, because the printed output includes the call as it was written.
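To see why names matter, here is a minimal sketch using the built-in 'mtcars' data, chosen only for illustration. The fitted object stores the call as it was written, so a formula passed through a variable appears under the variable's name rather than its contents.

```r
# Assumption: 'mtcars' (shipped with R) stands in for the user's own data.
f <- mpg ~ wt                                  # formula held in a variable
fit.direct   <- lm(mpg ~ wt, data = mtcars)    # formula written in the call
fit.indirect <- lm(f, data = mtcars)           # formula passed by name

# Both fits have the same coefficients ...
all.equal(coef(fit.direct), coef(fit.indirect))

# ... but the stored calls differ, and summary() prints the stored call.
call.direct   <- deparse(fit.direct$call)
call.indirect <- deparse(fit.indirect$call)
print(call.direct)     # "lm(formula = mpg ~ wt, data = mtcars)"
print(call.indirect)   # "lm(formula = f, data = mtcars)"
```

A model fitted inside a loop therefore tends to report an uninformative call unless the actual variable names are placed in the call before it is evaluated.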

Here are two approaches to generating models programmatically.

Character manipulation

One approach is to use character tools (e.g. paste, sub, gsub) to create a command in a character string, say 'cmd'. The command can then be executed with

> eval( parse ( text = cmd ) )
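As a small sketch of this idea, the following builds a command from pieces with 'paste' and then evaluates it. The names 'y1' and 'total' are made up for the illustration.

```r
y1 <- c(2, 4, 6)                       # hypothetical data
varname <- "y1"                        # name of the variable, as text

# Assemble the command as a character string.
cmd <- paste("total <- sum(", varname, ")", sep = "")
print(cmd)                             # "total <- sum(y1)"

# Parse the text into R code and run it.
eval(parse(text = cmd))
print(total)                           # 12
```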

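The following example fits the same model to each of a list of dependent variables by substituting each variable name for the placeholder 'SAR' in the model string. It assumes a data frame 'dd' containing these variables along with 'Occasion' and 'SubID'.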

   library(nlme)
 
   scales <- c(
       "sart3err",
       "sartokrt",
       "sartoksd",
       "sarprfrt",
       "sarprfsd",
       "sarpofrt",
       "sarpofsd")

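   # Model template: 'SAR' is a placeholder for the dependent variable
   model <- "lme( SAR ~ Occasion, dd, random = list( SubID = pdDiag( form = ~ 1 + Occasion)), na.action = na.omit)"
   
   # Replace the first occurrence of 'from' with 'to' in the template, then parse and evaluate it
   Ev <- function(from,to,model) eval( parse ( text = sub(from,to,model)))
   
   # Fit the model once per dependent variable, storing each fit under the variable's name
   fits <- list()
   for ( nn in scales ) {
       fits[[nn]] <- Ev( "SAR", nn, model )
   }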
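
The same pattern can be tried without the original data. The sketch below is an analogue, not the original analysis: it assumes the built-in 'mtcars' data in place of 'dd' and 'lm' in place of 'lme', with 'RESP' as a made-up placeholder.

```r
# Assumptions: 'mtcars' replaces the data frame 'dd' and 'lm' replaces 'lme',
# so the sketch runs without extra packages or data.
responses <- c("mpg", "hp", "qsec")
model <- "lm( RESP ~ wt, data = mtcars )"     # 'RESP' is the placeholder

Ev <- function(from, to, model) eval(parse(text = sub(from, to, model)))

fits <- list()
for (nn in responses) {
    fits[[nn]] <- Ev("RESP", nn, model)
}

# Each stored call names the actual response variable.
calls <- sapply(fits, function(fit) deparse(fit$call))
print(calls)
```

Because 'sub' works on text and replaces only the first match, the placeholder should be a string that occurs nowhere else in the template.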

Programming with the language

Bill Venables, in a posting to the R list on August 1, 2006, offers a brilliantly simple approach, along with a few short tools, for manipulating models in R. Other postings in the same thread explain how other attempts to solve the problem fail in various ways.

The idea behind Venables' approach is akin to writing a macro in R with terms that can be substituted in each invocation of the macro. Venables defines three short functions to manipulate the language and then illustrates their use.

We can create an unevaluated expression in R with the 'quote' function:

> Ex <- quote( x <- z )

This creates the expression 'x <- z' and saves it in 'Ex' without evaluating it. We would like to change 'x' or 'z' to something else, either a different name or a value, and then invoke the resulting expression with 'eval'. If 'Ex' were a character string we could easily see how to change 'x' or 'z' to something else. Venables' tools make it nearly as easy to do the same thing with expressions. The function 'subst', defined by

> subst <- function(Command, ...) do.call("substitute", list( Command, list(...)))

allows easy substitution as in the following examples:

> Ex
x <- z
> subst( Ex, z = 1)
x <- 1
> subst( Ex, x = as.name("x.2"))
x.2 <- z

To illustrate the differences among a reference to an object, its name in a character string and its name as a name in the R language, consider the following:

> subst( Ex, x = xx)
Error in do.call("substitute", list(Command, list(...))) : 
       object "xx" not found

> subst( Ex, x = "xx")
"xx" <- z
 
> subst( Ex, x = as.name("xx"))
xx <- z

> subst( Ex, x = as.name("xx"), z = 1)
xx <- 1

> val <- 44
> subst( Ex, x = as.name("xx"), z = val)
xx <- 44

> subst( Ex, x = as.name("xx"), z = "val")
xx <- "val"

> subst( Ex, x = as.name("xx"), z = as.name("val"))
xx <- val
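
The three kinds of replacement can also be told apart by inspecting the pieces of the resulting call. This sketch repeats the definitions of 'subst' and 'Ex' so that it runs on its own; the names 'by.value', 'by.string' and 'by.name' are introduced only for the illustration.

```r
subst <- function(Command, ...) do.call("substitute", list(Command, list(...)))
Ex <- quote( x <- z )
val <- 44

by.value  <- subst(Ex, z = val)              # x <- 44
by.string <- subst(Ex, z = "val")            # x <- "val"
by.name   <- subst(Ex, z = as.name("val"))   # x <- val

# Element 3 of the call 'x <- z' is its right-hand side.
class(by.value[[3]])     # "numeric"
class(by.string[[3]])    # "character"
class(by.name[[3]])      # "name"

# Evaluating each call assigns to 'x' and returns the assigned value.
r.value  <- eval(by.value)     # 44, the value was stored in the call
r.string <- eval(by.string)    # "val", a character string
r.name   <- eval(by.name)      # 44, 'val' is looked up when the call is evaluated
```

The name version defers the lookup of 'val' until evaluation, whereas the value version fixes 44 into the call at substitution time.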

Venables also offers a convenient tool for generating names as a concatenation of characters and values:

> abut <- 
+  function(...)  ## jam things tightly together
+   do.call("paste", c(lapply(list(...), as.character), sep = ""))

> Name <-
+  function(...) as.name(do.call("abut", list(...)))

Using 'Name' we can generate names in a loop, e.g.

> Ex <- quote( x <- z^i)
> subst( Ex, x = Name("x.", 2))
x.2 <- z^i

Thus:

> z <- 1:10
> for ( ii in 1:4 ) eval( subst( Ex, x = Name("x.", ii), i = ii))

will store powers of 'z' in 'x.1', 'x.2', etc.
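
The loop can be checked with a self-contained sketch that repeats the definitions given earlier and then inspects one of the generated objects.

```r
subst <- function(Command, ...) do.call("substitute", list(Command, list(...)))
abut  <- function(...) do.call("paste", c(lapply(list(...), as.character), sep = ""))
Name  <- function(...) as.name(do.call("abut", list(...)))

Ex <- quote( x <- z^i )
z <- 1:10
for (ii in 1:4) eval(subst(Ex, x = Name("x.", ii), i = ii))

# The generated expression for ii = 3, shown without evaluating it.
ex3 <- subst(Ex, x = Name("x.", 3), i = 3)
print(ex3)

# The new objects exist under the generated names.
print(x.3)
all.equal(x.3, z^3)
```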

Questions

  1. We've seen how to substitute a new name or a value. Is it possible to substitute a general expression?
  2. How can these techniques be applied to manipulating formulas in an 'lm' call?
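
One possible direction for both questions is sketched below. It is an illustration rather than Venables' own answer, and it assumes the built-in 'mtcars' data and a made-up placeholder 'RESP'.

```r
subst <- function(Command, ...) do.call("substitute", list(Command, list(...)))

# Question 1: the replacement may itself be a quoted expression.
Ex <- quote( x <- z )
ex.general <- subst(Ex, z = quote(a + b))
print(ex.general)                       # x <- a + b

# Question 2: treat an 'lm' call as a template with a placeholder 'RESP'.
# Assumption: 'mtcars' and the variable names are illustrative only.
template <- quote( lm(RESP ~ wt, data = mtcars) )
fits <- list()
for (nn in c("mpg", "hp")) {
    fits[[nn]] <- eval(subst(template, RESP = as.name(nn)))
}
print(fits[["hp"]]$call)                # lm(formula = hp ~ wt, data = mtcars)
```

Since the substitution happens before 'lm' is called, each fitted object records the real variable name in its call.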