Installing and Using R Packages
Understanding R Packages
R packages are collections of R functions, data, and compiled code that extend the capabilities of base R. They are fundamental to the R ecosystem for several reasons:
- They allow you to access specialized functions and tools without writing them from scratch
- They provide access to cutting-edge statistical methods and visualizations
- They make it easier to perform complex analyses with well-tested, maintained code
- They help maintain reproducibility in your analysis
The R Package Ecosystem
R has two main types of packages:
- Base Packages: Come pre-installed with R (like
stats,graphics,utils) - Contributed Packages: Created by the R community and available on CRAN (Comprehensive R Archive Network)
Installing Packages
There are several ways to install R packages:
Using install.packages()
The most common way to install packages is using the install.packages() function:
# Install a single package
install.packages("tidyverse")
# Install multiple packages
install.packages(c("ggplot2", "dplyr", "readr"))
Essential Packages for this Workshop
Here are the key packages we’ll be using throughout this workshop:
# Install all essential packages
install.packages(c(
"tidyverse", # Collection of data science packages
"ggplot2", # For data visualization
"dplyr", # For data manipulation
"readr", # For reading data
"tidyr", # For tidying data
"rmarkdown", # For creating documents
"knitr" # For knitting Rmd files
))
Loading Packages
Once installed, packages need to be loaded into your R session before you can use them:
# Load individual packages
library(tidyverse)
# You can also load specific packages from tidyverse
library(ggplot2)
library(dplyr)
Checking Package Version
You can check which version of a package you have installed:
# Check package version
packageVersion("tidyverse")
Managing Packages
Updating Packages
It’s good practice to keep your packages up to date:
# Update all packages
update.packages()
# Update specific packages
install.packages("ggplot2") # reinstalling will update to the latest version
Viewing Installed Packages
You can see what packages are installed on your system:
# List installed packages
installed.packages()[, c("Package", "Version")]
Best Practices
- Document Dependencies: Always list the packages your analysis requires at the beginning of your script
- Version Control: Note the versions of key packages used in your analysis
- Load Packages Early: Load all required packages at the start of your script
- Use Specific Functions: When using functions from specific packages, you can use the
package::function()notation
Example Usage
Here’s a simple example using some common packages:
# Load required packages
library(ggplot2)
library(dplyr)
# Create some sample data
data <- data.frame(
x = 1:10,
y = rnorm(10)
)
# Use dplyr to manipulate data
data <- data %>%
mutate(z = y * 2)
# Create a plot with ggplot2
ggplot(data, aes(x = x, y = y)) +
geom_point() +
geom_line() +
theme_minimal() +
labs(title = "Sample Plot",
x = "X axis",
y = "Y axis")
Troubleshooting
Common issues when working with packages:
- Package not found: Make sure the package is installed
- Loading conflicts: Some functions might have the same name in different packages
- Dependencies: Some packages require other packages to be installed first
# Check if a package is installed
if (!require("ggplot2")) {
install.packages("ggplot2")
}
# See which packages are currently loaded
search()
# Get help with a package
help(package = "ggplot2")
Finding Help and Documentation
R packages come with extensive documentation that can help you understand how to use them effectively:
Package Documentation
# Get overview of a package
help(package = "dplyr")
# Get help for a specific function
?dplyr::filter
help("filter", package = "dplyr")
# See examples for a function
example("filter", package = "dplyr")
Package Vignettes
Vignettes are detailed guides that show you how to use the package:
# List all vignettes for installed packages
vignette()
# List vignettes for a specific package
vignette(package = "dplyr")
# Open a specific vignette
vignette("dplyr")
# If you know the specific vignette name
vignette("programming", package = "dplyr")
Online Resources
Many packages also have additional resources:
- Package websites: Most tidyverse packages have dedicated websites (e.g., dplyr.tidyverse.org)
- GitHub repositories: Source code and issue tracking
- Cheatsheets: Quick reference guides (Help -> Cheatsheets in RStudio)
# Open package documentation in browser (if available)
browseVignettes("dplyr")
# Get package news/changelog
news(package = "dplyr")
Next Steps
After understanding how to install and manage packages, you can:
- Explore package documentation using
?orhelp() - Look for packages that suit your needs on CRAN
- Learn about package development if you want to create your own packages