Model Formulae

Overview
poly(x, ..., degree = 1, raw = FALSE)
Create a formula from a string

Overview

The operator ~ is used to define a model formula in R. For an ordinary linear model, the form is:

response ~ op_1 term_1 op_2 term_2 op_3 term_3 …

response: a vector or matrix (or an expression evaluating to a vector or matrix) defining the response variable(s).

op_i: an operator, either + or -, implying the inclusion or exclusion of a term in the model (the first is optional).

term_i: one of

a vector or matrix expression, or 1,
a factor, or
a formula expression consisting of factors, vectors or matrices connected by formula operators.

In all cases each term defines a collection of columns either to be added to or removed from the model matrix.

Notations:

Y ~ M: Y is modeled as M.
M_1 + M_2: Include M_1 and M_2.
M_1 - M_2: Include M_1 leaving out terms of M_2.
M_1 : M_2: The tensor product of M_1 and M_2. If both terms are factors, this is the “subclasses” factor.
M_1 %in% M_2: Similar to M_1:M_2, but with a different coding.
M_1 * M_2: M_1 + M_2 + M_1:M_2.
M_1 / M_2: M_1 + M_2 %in% M_1.
M^n: All terms in M together with “interactions” up to order n.
I(M): Insulate M. Inside M all operators have their normal arithmetic meaning, and that term appears in the model matrix.

https://cran.r-project.org/doc/manuals/r-release/R-intro.html#Defining-statistical-models_003b-formulae
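The operator rules in the notation list can be checked by asking R how it expands a formula. The following is a minimal sketch, not part of the R manual: term_labels is a hypothetical helper name, and y, a, b, c are placeholder variable names that do not need to exist, because terms() only parses the formula.

```r
# Hypothetical helper: list the terms R derives from a formula.
term_labels = function(f) attr(terms(f), "term.labels")

term_labels(y ~ a * b)           # "a" "b" "a:b"
term_labels(y ~ a * b - a:b)     # "a" "b"
term_labels(y ~ a / b)           # "a" "a:b"
term_labels(y ~ (a + b + c)^2)   # "a" "b" "c" "a:b" "a:c" "b:c"
term_labels(y ~ I(a^2))          # "I(a^2)"
```

Inside I(), ^ is ordinary exponentiation, so the whole expression stays a single term.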

`poly(x, ..., degree = 1, raw = FALSE)` reference

sim = function(sample_size = 250) {
  x = runif(n = sample_size, min = -1, max = 1) * 2
  y = 3 + -6 * x ^ 2 + 1 * x ^ 4 + rnorm(n = sample_size, mean = 0, sd = 3)
  data.frame(x, y)
}
data = sim()
unname(coef(lm(y ~ x + I(x^2) + I(x^3) + I(x^4), data)))
unname(coef(lm(y ~ poly(x, degree = 4, raw = TRUE), data)))
unname(coef(lm(y ~ poly(x, degree = 4), data)))

[1]  2.7807038 -0.3471052 -5.8677261  0.1627440  0.9384000
[1]  2.7807038 -0.3471052 -5.8677261  0.1627440  0.9384000
[1]  -2.4457333   0.5099472 -50.5814792   4.0435950  18.0532313

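The first two calls return identical coefficients: x + I(x^2) + ... + I(x^n) is equivalent to poly(x, degree = n, raw = TRUE).
poly() computes orthogonal polynomials by default, so the third call returns different coefficients even though it fits the same model (the fitted values are identical).
sim() does not set a seed, so the exact numbers vary between runs.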
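To see that the raw and orthogonal bases describe the same fit, compare the fitted values rather than the coefficients. This is a small sketch under its own assumptions: the seed, the sample size of 100, and the names fit_raw, fit_orth and P are illustrative choices, not taken from the example above.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
x = runif(100, min = -2, max = 2)
y = 3 - 6 * x^2 + x^4 + rnorm(100, mean = 0, sd = 3)

fit_raw  = lm(y ~ poly(x, degree = 4, raw = TRUE))
fit_orth = lm(y ~ poly(x, degree = 4))

# Different coefficients, same fitted values (up to numerical tolerance).
all.equal(fitted(fit_raw), fitted(fit_orth))

# The default basis has orthonormal columns, so this should be
# close to the 4 x 4 identity matrix.
P = poly(x, degree = 4)
round(crossprod(P), 10)
```

The orthogonal basis is usually better conditioned numerically, while raw = TRUE gives coefficients that read directly as the coefficients of x, x^2, and so on.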

Create a formula from a string howto

as.formula("y ~ x1 + x2")

y ~ x1 + x2

You can also build the formula string from variables:

npred = 10
preds = paste("x", 1:npred, sep="", collapse = " + ")
as.formula(sprintf("y ~ %s", preds))

y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10

http://www.cookbook-r.com/Formulas/Creating_a_formula_from_a_string/
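As an alternative to pasting a string together, base R's stats package provides reformulate(), which builds the formula from a character vector of term labels. A minimal sketch, assuming predictors named x1 to x3 (the names preds and f are illustrative):

```r
preds = paste0("x", 1:3)   # "x1" "x2" "x3"
f = reformulate(termlabels = preds, response = "y")
f                          # y ~ x1 + x2 + x3
class(f)                   # "formula"
```

This avoids assembling the " + " separators by hand; the as.formula() approach remains the more flexible option when the string contains operators other than +.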
