= information object_name
Objects and functions
R revolves around objects and functions. Generally speaking, an object is a container for information, and a function is a piece of code that performs a specific task. We often use functions to manipulate or create objects.
R objects
In R, an object is information that we store in R’s memory. In order to define an R object, we need to choose a name and tell R what information the object should contain. The general syntax for defining objects looks like this:
We write the name of our new object to the left of an equal sign and define the information the object should contain to its right. Instead of using an equal sign to define an object, we can also draw an arrow using the less than sign and a hyphen, which looks like this:
<- information object_name
Using the arrow instead of an equal sign dates back to an old programming language that R is derived from (APL). Whether you define objects using arrows or equal signs has no effect on what R does. It is simply a matter of preference.
As soon as we have defined an object (and assuming that we did everything correctly), our new object will appear in the Environment tab of RStudio’s Memory section (top right panel).
We will now cover the most important of the basic R objects, namely:
- single values
- vectors
- matrices
We will later turn toward two of the more complex R objects that we are very likely to encounter frequently when working with R:
- data frames
- lists
Functions in R
R functions are predefined pieces of code that we can call by writing the function name and telling R a number of function arguments. Function arguments are pieces of information a function needs to know in order to perform its task (very rarely, a function may not need any argument to perform its task). The general syntax for calling a function looks like this:
function_name(argument1, argument2, ...)
That is, we first write the function’s name, and then define all required arguments in parentheses. If we do that correctly, the function will perform its task. Depending on the function, we will see an output in the console.
We will be using two functions below:
The function
c()
, which we will use to create vectors. The arguments the functionc()
requires are the elements we want to combine into a vector.The function
matrix()
, which creates a matrix from a vector. This functions requires a vector, the number of rows and the number of columns as arguments.
Core fact 1: Functions can have a lot of arguments, but it is not always necessary to specify all of them. Most functions have default values for some of their arguments. If we do not specify these arguments, R will just run the function as if we had entered the default value of the argument. If we do specify arguments with default values, we simply override the default.
Core fact 2: It is impossible (for most people at least) to remember all functions and which arguments they require. Therefore, R contains a lifesaver in the form of a special function called help()
. The function help()
requires the name of a topic
(including but not limited to functions) as its main argument. Calling help()
with the name of a function will show the documentation for that function in the help tab of the Utility & Help section of RStudio’s interface (bottom right).
Object types
Single values, vectors, and matrices can vary by type. The type of these objects refers to the nature of its content, for example whether it contains numbers, character strings, logical arguments etc. The most common types of types are:
- double: this means numbers with decimal points (although the decimals must not necessarily be displayed)
- integer: whole numbers that can be negative positive or zero (rarely used because double type numbers are more practical)
- character: a character string, which is essentially a sequence of text symbols (these symbols can include numbers, but these are interpreted as being part of the text and not as numbers)
- logical: a logical TRUE or FALSE, also called a boolean value
Single values
Single values are the simplest type of objects we can define in R. As their name suggests, they represent a single value, for example, the number 7. We could tell R to create a single value called \(a\) that consists of the number 7. We do this as follows:
= 7 a
Alternatively, we can use the arrow notation to define the object.
<- 7 a
Once we run either line of code, we should notice that our Environment now contains the object a. There, we can also see that a is the number 7.
We can make R show us an object by running its name as code (either via the console or a script). If we enter the name of our single value a as code, we will see the following output.
[1] 7
R tells us “7”, which is exactly the value that we specified as a. The “[1]” in front of the value “7” is not important at this point. It merely tells us that the “7” is the first (and only) element of our single value.
Vectors
Vectors are objects that contain multiple values of the same type. Vectors cannot contain elements of different types. We can define vectors using the function c()
with the function’s arguments being the values we want to combine into the respective vector. The syntax looks as follow:
= c(1,2,3) # This line of code creates a double vector
v1 # called "v1" containing the numbers 1 to 3.
= c('hello', 'purple', '11!') # This creates a character vector
v2 # called "v2" containing three strings.
After executing this code, our Environment contains two more objects, v1 and v2. The vector v1 is denoted as a numeric vector (num) whereas v2 is labeled as a character vector (chr).
Just as with the single value a, we can have a look at our vectors by entering their names as code. For example, we can enter v2 as code.
[1] "hello" "purple" "11!"
We now see the three character strings that v2 consists of in the console. Again, there is a [1] at the front of the line. It indicates that “hello” is the first element of v2. If the vector was long enough to occupy multiple rows in the console, each line would start with a number in brackets, indicating which element of the vector is displayed at the beginning of that line.
Fun fact: We can think of single values as vectors of length 1.
Caveat: If we try to create a vector containing different types of elements, R will not complain (i.e., we do not get an error message)! Instead, R will simply transform some of the elements to ensure that all elements share the same type.
For example, if at least one of the elements we want to include in the vector is a character string, R will transform all other element types to character strings and create a character vector.
If we try to combine numeric (double or integer) and logical values into one vector, R will interpret “FALSE” as 0 and “TRUE” as 1.
Matrices
Matrices are the two-dimensional cousins of vectors. They have \(r\) rows and \(c\) columns. Per convention, the number of rows is stated before the number of columns, that is, a 4x3 matrix has 4 rows and 3 columns (think roman-catholic as a mnemonic aid).
As with vectors, all elements of a matrix must share the same type (if we try to combine different types of elements, R will just transform some of the elements to create a homogeneous matrix). In fact, matrices in R are vectors that we break down into rows and columns. To create a matrix, we need to call the function matrix()
and tell R three arguments: the vector we want to make into a matrix, the number of its rows, and the number of its columns. The syntax looks as follows:
= matrix( # this tells R that we want to create a matrix
matrix1 c(1,2,3,4), # defines the vector that will turn into a matrix
nrow = 2, # this tells R that our matrix should have 2 rows
ncol = 2) # this tells R that our matrix should have 2 columns
Executing the code above creates a numerical matrix in our Environment. Other than with single values and vectors, we can click on the name of our matrix in the Environment to view how it looks like. Alternatively, we can enter the name of our matrix as code and have R print it in the console.
[,1] [,2]
[1,] 1 3
[2,] 2 4
R is convenient in the sense that it is technically sufficient to specify either the number of rows or the number of columns. R will just infer the missing piece of information. For example, if we want to turn a vector of length 6 into a matrix with 2 columns, R will know that the matrix must have 3 rows.
Naming objects in R
In order to define an object in R, we need to give it a name and then tell R how the object should look like. Generally speaking, we can get very creative when naming our objects, but there are a few rules we need to keep in mind.
Object names must not start with numbers. The name “item2” is valid, but “2nd” is not (it will return a error message complaining about an “unexpected symbol”).
Object names may not contain special characters with the exception of underscores and periods. For example, the names “item_2” or “item.2” are valid. Using other special character will often yield error messages because R interprets these characters as operators (e.g., “*” indicates multiplication, “&” is a logical conjunction, etc.).
Object names must not contain spaces. While “item2” is a valid name, “item 2” is not.
Not being allowed to use spaces in object names might prevent us from assigning meaningful and intelligible object names. To deal with the issue, there are three common types of notation:
- period notation: place periods where you would have placed a space, e.g. trial.number
- underscore notation: replace spaces with underscores, e.g. trial_number
- camelBack notation: instead of spaces write everything as one word, but write the beginning of new words with a capital letter, e.g. trialNumber
Which notation you use is purely a matter of preference. You will likely encounter all of them when using R because the people who created base R or additional packages each have their own preferences regarding notation.