Data Frame

Overview

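A data frame is a kind of list, but it carries its own class: class() reports "data.frame", while mode() still reports "list".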

class(data.frame(1, 2, 3))
[1] "data.frame"
mode(data.frame(1, 2, 3))
[1] "list"
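
Since the underlying mode is "list", the usual list operations work on a data frame, with each column acting as one list element. A minimal sketch, where the variable df and the column names a, b, c are made up for illustration:

```r
# Hypothetical example data: one row, three named columns.
df <- data.frame(a = 1, b = 2, c = 3)

is.list(df)                 # TRUE: a data frame is still a list
inherits(df, "data.frame")  # TRUE: but with its own class on top
length(df)                  # 3: list length is the number of columns
df[["b"]]                   # list-style extraction returns the column vector
sapply(df, class)           # iterate over columns like list elements
```

The class attribute is what makes functions such as print() and subset() treat the object as a table rather than as a plain list.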

Reference

subset(x, subset, select, drop = FALSE, ...) reference

For ordinary vectors
  • the result is simply x[subset & !is.na(subset)], so elements where the condition is NA are dropped.
For data frames
  • the subset argument works on the rows.
  • subset is evaluated in the data frame, so columns can be referred to (by name) as variables in the expression.
select
  • an expression indicating which columns to select from a data frame.
drop
  • passed on to the [ indexing operator, as in x[r, vars, drop = drop].
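# rows with Temp above 80, keeping only the Ozone and Temp columns
subset(airquality, Temp > 80, select = c(Ozone, Temp))
# rows where Day is 1, with every column except Temp
subset(airquality, Day == 1, select = -Temp)
# all rows, with the columns from Ozone through Wind
subset(airquality, select = Ozone:Wind)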
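
The NA handling and the drop argument described above are easy to miss, so here is a small sketch of both. The vector x and the threshold 100 are arbitrary choices for illustration; airquality is the built-in dataset used above, whose Ozone column contains missing values.

```r
# Vector case: subset() drops elements whose condition is NA,
# whereas plain logical indexing keeps them as NA.
x <- c(5, NA, 12, 8)
subset(x, x > 6)   # 12 8
x[x > 6]           # NA 12 8

# Data frame case: rows whose Ozone is NA are excluded by subset(),
# but come back as all-NA rows with plain [ indexing.
by_subset <- subset(airquality, Ozone > 100)
by_index  <- airquality[airquality$Ozone > 100, ]
nrow(by_subset)
nrow(by_index)     # larger by the number of NA values in Ozone

# drop = TRUE is handed to [, so a single selected column
# collapses to a plain vector instead of a one-column data frame.
ozone_df  <- subset(airquality, select = Ozone)
ozone_vec <- subset(airquality, select = Ozone, drop = TRUE)
is.data.frame(ozone_df)   # TRUE
is.data.frame(ozone_vec)  # FALSE
```

This is the main practical reason to prefer subset() over bare logical indexing when a column may contain NA.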

Split a dataset into training and test sets howto

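# draw 1000 row indices at random, without replacement, for the training set
spam_idx = sample(nrow(spam), 1000)
spam_trn = spam[spam_idx, ]
# negative indices exclude the sampled rows, so the rest form the test set
spam_tst = spam[-spam_idx, ]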
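
The same pattern can be made reproducible and proportional rather than fixed at 1000 rows. The following is only a sketch under assumptions that are not in the notes above: it uses the built-in iris dataset in place of spam, an arbitrary seed of 42, and an 80/20 split.

```r
# Assumed for illustration: seed value, dataset, and split ratio.
set.seed(42)                       # makes the random split repeatable

n       <- nrow(iris)
trn_idx <- sample(n, size = floor(0.8 * n))  # 80% of the row indices

iris_trn <- iris[trn_idx, ]        # sampled rows: training set
iris_tst <- iris[-trn_idx, ]       # everything else: test set

nrow(iris_trn)
nrow(iris_tst)
```

Because sample() draws without replacement by default, no row can land in both sets, and the two sets together cover every row exactly once.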