# Obtain the current working directory
getwd()
Saving and loading data
At some point, we either want to load data into R or save a data frame that we have created or modified. Arguably, the best format to store data in tabular form (rows representing observations and columns representing variables) is the CSV (comma-separated variables) format.
CSV files are ideal for Open Data as most software can read them, including generic text editors.
In order to load and save data in CSV format, we can use two functions:
write.csv()
will save tabular data (usually a data frame, but can also be a matrix) as a CSV fileread.csv()
will load data from a CSV file
Saving data frames
If we save a data frame using the function write.csv()
, we should specify three function arguments:
x
: the name of the object we want to save (a data frame, but works with matrices, too)file
: the directory where we want to save the file plus the file name as a character stringrow.names
: whether we want R to assign row names (default is TRUE, but we want to turn that to FALSE)
If we do not specify a directory, R will save the data frame in the current working directory.
We can ask R to tell us the current working directory using the function getwd()
. This function is special in that it does not have any function arguments. Here is what the syntax looks like:
The output in the console will look slightly different for each person due to differences in how their folders are named:
[1] "C:/Users/Thomas/Documents/R stuff"
Here is how the code for saving a data frame could look like.
# save the object called my_data_frame
write.csv(
x = my_data_frame,
file = "C:/Users/Thomas/Documents/data.csv",
row.names = FALSE
)
Provided the directory “C:/Users/Thomas/Documents” already exists on the computer, there will now be a file named “data.csv” in that directory.
Loading data frames
We can load a data frame (or matrix) saved in CSV format using the function read.csv()
.
We usually need to specify only two function arguments:
file
: the directory and file name of the CSV fileheader
: a Boolean variable indicating whether the first row of file contains the variable names (default is TRUE)
Important: If we simply call the function, R will display the data frame we loaded in the console. If we want it to appear in the Environment we need to define it as a new object.
Here is what the code would look like if we were to load the data frame saved in the previous example.
# load data
= read.csv(
my_data_frame file = "C:/Users/Thomas/Documents/data.csv",
header = T)
Assuming that the requested file exists in the specified directory, the Environment will now show the object called “my_data_frame” in the Environment.