Introduction to Dataframes in R

By: Justin Millar

September 06, 2017 Meetup Presentation

Reading CSV datafiles into R

We often store our data in comma-separated value (CSV) files, which can be read into R using the read.csv() function:

# Download example .csv file
download.file("https://ndownloader.figshare.com/files/2292169",
              "data/portal_data_joined.csv")

# Save into variable 
surveys <- read.csv('data/portal_data_joined.csv')

Note: this code requires a data/ folder in your project directory; download.file() will fail if the folder does not exist.
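The folder requirement in the note can also be handled from R itself. The following is a minimal sketch, not part of the original presentation; the variable names data_dir and csv_path are introduced here for illustration only.

```r
# Create the data/ folder only if it is missing (data_dir is an illustrative name)
data_dir <- "data"
if (!dir.exists(data_dir)) {
  dir.create(data_dir)
}

# Build the destination path once so it can be reused for download and reading
csv_path <- file.path(data_dir, "portal_data_joined.csv")
csv_path
```

csv_path can then be passed as the destination to download.file() and as the input to read.csv().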

Functions for characterizing dataframe

We can run the name of the variable to view the dataframe, but there will often be too much information to display in the console.

Here are some useful functions for characterizing a dataframe:

head(surveys)     # Top of dataframe
tail(surveys)     # Bottom of dataframe
dim(surveys)      # Dimensions
ncol(surveys)     # Number of columns
nrow(surveys)     # Number of rows
names(surveys)    # Column names
rownames(surveys) # Row names
str(surveys)      # Structure, with class, length, and content
summary(surveys)  # Summary statistics for each column
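To see what these functions return without downloading anything, here is a minimal sketch on a small made-up data frame. The name toy and its columns and values are illustrative assumptions, not taken from the surveys data.

```r
# Toy data frame; names and values are made up for illustration
toy <- data.frame(
  id     = 1:4,
  sex    = c("M", "F", "F", "M"),
  weight = c(40.5, 38.0, NA, 42.1)
)

dim(toy)       # rows then columns: 4 3
nrow(toy)      # 4
ncol(toy)      # 3
names(toy)     # "id" "sex" "weight"
rownames(toy)  # "1" "2" "3" "4" (default row names come back as character)
str(toy)       # one line per column with its type and first values
summary(toy)   # per-column summary; numeric columns also report NA counts
```

The same calls work unchanged on surveys; only the size of the output differs.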

Challenge Exercise

What type of vector is each column in the surveys dataframe?
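As a hint, one possible approach is to ask for the class of every column at once. This is a sketch on a made-up data frame (toy and its columns are illustrative); the same sapply() call should work on surveys.

```r
# Toy data frame; names and values are made up for illustration
toy <- data.frame(
  id     = 1:3,
  weight = c(40.5, 38.0, 42.1),
  sex    = c("M", "F", "F"),
  stringsAsFactors = FALSE
)

# class() of each column, returned as a named character vector
col_types <- sapply(toy, class)
col_types
```

str(surveys) reports the same information. Whether text columns in surveys appear as factor or character depends on the R version, because the default of stringsAsFactors in read.csv() changed in R 4.0.0.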

Indexing and subsetting dataframes

Dataframes are also subsetted or indexed with square brackets, except that we must specify rows and then columns, as [row, column]:

surveys[1, 1]   # first element in the first column of the data frame (as a vector)
surveys[1, 6]   # first element in the 6th column (as a vector)
surveys[, 1]    # first column in the data frame (as a vector)
surveys[1]      # first column in the data frame (as a data.frame)
surveys[1:3, 7] # first three elements in the 7th column (as a vector)
surveys[3, ]    # the 3rd row, with all columns (as a data.frame)
head_surveys <- surveys[1:6, ] # equivalent to head(surveys)
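The difference between results returned as a vector and results returned as a data.frame can be checked directly. Below is a minimal sketch on a made-up data frame; toy and the variable names are illustrative assumptions.

```r
# Toy data frame; names and values are made up for illustration
toy <- data.frame(
  id     = 1:4,
  sex    = c("M", "F", "F", "M"),
  weight = c(40.5, 38.0, 39.2, 42.1)
)

one_cell <- toy[2, 3]               # single value from row 2, column 3: 38
col_vec  <- toy[, 3]                # third column as a numeric vector
col_df   <- toy[3]                  # third column, still a data frame
col_keep <- toy[, 3, drop = FALSE]  # drop = FALSE also keeps the data frame
row_df   <- toy[2, ]                # second row, all columns, as a data frame

is.data.frame(col_vec)  # FALSE
is.data.frame(col_df)   # TRUE
is.data.frame(row_df)   # TRUE
```

Using drop = FALSE is a common way to guarantee a data frame result when the number of selected columns is not known in advance.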

Use the - sign to exclude certain sections:

surveys[,-1]          # The whole data frame, except the first column
surveys[-(7:nrow(surveys)), ] # Equivalent to head(surveys); drops rows 7 through the last row (34786 in this file)
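Negative indexing can be tried on a small made-up data frame without hard-coding the number of rows. This is a sketch only; toy and the variable names are illustrative.

```r
# Toy data frame with 10 rows; names and values are made up for illustration
toy <- data.frame(id = 1:10, weight = seq(30, 48, by = 2))

no_first_col <- toy[, -1]                # only one column is left, so R returns a vector
no_first_df  <- toy[, -1, drop = FALSE]  # keep the data frame structure instead
first_six    <- toy[-(7:nrow(toy)), ]    # drop rows 7 through the last row

all.equal(first_six, head(toy))          # TRUE
```

With surveys, removing one column still leaves several columns, so surveys[, -1] remains a data frame; the vector result only appears when a single column is left.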

Subsetting columns by name

R-Gators

Activity hub for R programming at the University of Florida
