In R, we work with objects to store and manipulate data. Understanding how to create and work with objects is fundamental to using R effectively. This lesson will cover the basics of working with objects and variables in R, using examples relevant to medical and health sciences, including RNA sequencing analysis.
# Store a gene expression value as a numeric object
# Typical RNA-seq count data might be in this range
expression_level <- 1543.7 # Numeric value representing expression
# Store a gene name as a character (text) object
gene_name <- "BRCA1" # String value for breast cancer gene 1
# Store a boolean flag for differential expression
is_differentially_expressed <- TRUE # Logical value (TRUE/FALSE)
# Display all our created objects with labels
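print("Expression level:") # Print label
print(expression_level) # Print the numeric value
print("Gene name:") # Print label
print(gene_name) # Print the character value
print("Is differentially expressed?") # Print label
print(is_differentially_expressed) # Print the logical value
## [1] "Expression level:"
## [1] 1543.7
## [1] "Gene name:"
## [1] "BRCA1"
## [1] "Is differentially expressed?"
## [1] TRUE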
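To confirm what kind of object each assignment created, base R's class() function can be used. The sketch below re-creates the three objects from the example above so that it runs on its own; it assumes nothing beyond base R.

```r
# Re-create the three example objects
expression_level <- 1543.7
gene_name <- "BRCA1"
is_differentially_expressed <- TRUE

# class() reports the type of each object
class(expression_level)             # "numeric"
class(gene_name)                    # "character"
class(is_differentially_expressed)  # "logical"

# The is.*() functions test for one specific type and return TRUE or FALSE
is.numeric(expression_level)        # TRUE
is.character(gene_name)             # TRUE
is.logical(gene_name)               # FALSE: gene_name holds text
```

Checking the class first is a quick way to catch surprises, such as a number that was accidentally stored as text.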
Tip: In RStudio, press Alt + - (Windows/Linux) or Option + - (Mac) to insert the assignment operator <- with a single shortcut.
When naming objects in R, follow these guidelines:
✅ Good names:
- Use descriptive names: read_count, expression_level
- Start with a letter: gene_1, sample_id
- Use underscores in place of spaces: fold_change
❌ Avoid:
- Starting with a number: 2nd_replicate (invalid)
- Using spaces: gene expression (invalid)
- Using special characters: gene@1 (invalid)
- Using R reserved words: if, else, function
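One way to test a candidate name before using it is base R's make.names(), which rewrites any string into a syntactically valid name. A string that comes back unchanged was already valid. This is only a sketch: the vector candidate_names and the result is_valid_name are illustrative names, not part of the lesson.

```r
# Candidate object names, including the invalid examples from the guidelines
candidate_names <- c("read_count", "gene_1", "2nd_replicate",
                     "gene expression", "gene@1", "if")

# make.names() repairs invalid names (for example by adding a prefix or
# replacing illegal characters); unchanged names were valid to begin with
is_valid_name <- candidate_names == make.names(candidate_names)

# Label each result with the name it belongs to
setNames(is_valid_name, candidate_names)
# Only "read_count" and "gene_1" are valid
```

Note that this checks syntax only. It does not tell you whether a name is descriptive, which remains a judgment call.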
# RNA-seq example: Calculate fold change between treated and control samples
control_expression <- 100 # Control sample expression level
treated_expression <- 200 # Treated sample expression level
fold_change <- treated_expression / control_expression # Calculate ratio
print("Fold change:") # Print label
print(fold_change) # Print the ratio
## [1] "Fold change:"
## [1] 2
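Fold changes in expression data are often reported on a log2 scale, where a doubling becomes +1 and a halving becomes -1. The sketch below extends the calculation above with base R's log2(); the name log2_fold_change is an assumption made for illustration.

```r
# Same inputs as the fold change example
control_expression <- 100
treated_expression <- 200

# Ratio of treated to control
fold_change <- treated_expression / control_expression   # 2

# On the log2 scale a doubling is +1
log2_fold_change <- log2(fold_change)                    # 1

# Swapping the samples gives a halving, which is -1 on the log2 scale
log2(control_expression / treated_expression)            # -1
```

The log2 scale is symmetric around zero, so increases and decreases of the same magnitude are equally easy to compare.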
# Medical example: Calculate Body Mass Index (BMI)
weight_kg <- 70.5 # Patient weight in kilograms
height_m <- 1.75 # Patient height in meters
bmi <- weight_kg / (height_m^2) # BMI formula: weight/(height^2)
print("BMI calculation:") # Print label
print(bmi) # Print the calculated BMI
## [1] 23.02041
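The parentheses in the BMI formula make the intent explicit, but R already evaluates ^ before /. The following sketch compares the variants to show when parentheses change the result; bmi_without_parens and wrong_bmi are illustrative names only.

```r
weight_kg <- 70.5   # Patient weight in kilograms
height_m <- 1.75    # Patient height in meters

# Explicit parentheses, as in the example above
bmi_with_parens <- weight_kg / (height_m^2)

# Same result: the exponent is evaluated before the division
bmi_without_parens <- weight_kg / height_m^2
identical(bmi_with_parens, bmi_without_parens)   # TRUE

# Misplaced parentheses square the whole ratio and give a different number
wrong_bmi <- (weight_kg / height_m)^2

# round() trims the value to two decimal places for reporting
round(bmi_with_parens, 2)                        # 23.02
```

Keeping the parentheses costs nothing and spares the reader from having to recall the precedence rules.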
# Initialize read count from sequencing data
read_count <- 1000 # Initial number of reads
scaling_factor <- 1.5 # Factor to adjust for library size
read_count <- read_count * scaling_factor # Apply scaling
normalized_count <- read_count / 10 # Further normalization step
# Display results of calculations
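print("Final read count:") # Print label
print(read_count) # Show scaled count
print("Normalized count:") # Print label
print(normalized_count) # Show normalized count
## [1] "Final read count:"
## [1] 1500
## [1] "Normalized count:"
## [1] 150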
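Because the example above assigns the scaled value back to read_count, the original 1000 is no longer available afterwards. If the raw value is still needed, one option is to give each step its own name. This is a sketch of that alternative; raw_read_count and scaled_read_count are hypothetical names.

```r
# Keep the raw value under its own name
raw_read_count <- 1000
scaling_factor <- 1.5

# Store each step in a new object instead of overwriting
scaled_read_count <- raw_read_count * scaling_factor   # 1500
normalized_count <- scaled_read_count / 10             # 150

# The raw value is unchanged and can still be inspected
raw_read_count                                         # 1000
```

Overwriting keeps the environment small, while separate names make it easier to check intermediate results. Which one fits better depends on the analysis.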
# Convert patient temperature from Celsius to Fahrenheit
temp_celsius <- 37.5 # Patient body temperature in Celsius
# Formula: (°C × 9/5) + 32
temp_fahrenheit <- (temp_celsius * 9/5) + 32 # Convert to Fahrenheit
# Display the conversion result with units
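print("Temperature conversion:") # Print label
print(paste(temp_celsius, "°C =", temp_fahrenheit, "°F")) # Combine values and units
## [1] "Temperature conversion:"
## [1] "37.5 °C = 99.5 °F"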
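A simple way to check a conversion formula is to apply its inverse and confirm that the starting value comes back. The sketch below does this for the temperature example; temp_back_to_celsius is an illustrative name.

```r
temp_celsius <- 37.5

# Forward conversion: (°C × 9/5) + 32
temp_fahrenheit <- (temp_celsius * 9/5) + 32          # 99.5

# Inverse conversion: (°F − 32) × 5/9
temp_back_to_celsius <- (temp_fahrenheit - 32) * 5/9

# all.equal() compares numbers with a small tolerance, which is safer
# than == for values produced by floating-point arithmetic
all.equal(temp_celsius, temp_back_to_celsius)         # TRUE
```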
# Create variables for sample metadata
sample_id <- "RNA_01" # Unique sample identifier
gene_name <- "TP53" # Target gene (tumor protein p53)
expression_value <- 2456 # Expression count for this gene
# Combine information into a readable message using paste()
message <- paste("Sample", sample_id, "shows", expression_value,
                 "counts for gene", gene_name)
print("Combined message:") # Print label
print(message) # Print formatted message
## [1] "Combined message:"
## [1] "Sample RNA_01 shows 2456 counts for gene TP53"
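paste() puts a single space between its arguments by default. When a different separator, or none at all, is wanted, the sep argument and paste0() are available in base R. The sketch below reuses the sample metadata; label_with_sep and file_name are hypothetical names chosen for illustration.

```r
sample_id <- "RNA_01"
gene_name <- "TP53"

# Default: arguments are joined with a space
paste("Sample", sample_id)                                # "Sample RNA_01"

# sep sets the separator explicitly
label_with_sep <- paste(sample_id, gene_name, sep = "_")  # "RNA_01_TP53"

# paste0() joins with no separator, handy for building file names
file_name <- paste0(sample_id, ".csv")                    # "RNA_01.csv"
```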
# R is case-sensitive - these are three different variables
expression <- 5000 # lowercase
Expression <- 6000 # Title case
EXPRESSION <- 7000 # uppercase
# Show how all three variables are distinct
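print("Three different variables:") # Print label
print(expression) # lowercase
print(Expression) # Title case
print(EXPRESSION) # uppercase
## [1] "Three different variables:"
## [1] 5000
## [1] 6000
## [1] 7000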
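A common consequence of case sensitivity is an "object not found" error when a name is typed with the wrong capitalization. The sketch below triggers that error on purpose and captures its message with tryCatch() so the script keeps running; sample_depth and lookup_result are hypothetical names, and the example assumes no object called Sample_Depth exists in your session.

```r
# Define an object with a lowercase name
sample_depth <- 5000

# exists() looks up an object by its exact name
exists("sample_depth")    # TRUE
exists("Sample_Depth")    # FALSE: different capitalization

# Using the wrong capitalization raises an error; tryCatch() returns
# the error message as text instead of stopping the script
lookup_result <- tryCatch(
  Sample_Depth,
  error = function(e) conditionMessage(e)
)
print(lookup_result)      # An "object ... not found" message
```

When this error appears, checking the spelling and capitalization of the name is usually the quickest fix.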
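# Store the initial sequencing read depth
read_depth <- 1000000 # 1 million reads
print("Original read depth:") # Print label
print(read_depth) # R displays 1000000 in scientific notation
## [1] "Original read depth:"
## [1] 1e+06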
# Overwriting with new value
read_depth <- 1500000 # Changed to 1.5 million reads
print("New read depth:") # Print label
print(read_depth) # Print the updated value
## [1] "New read depth:"
## [1] 1500000
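The two read depths above are displayed differently (1e+06 versus 1500000) because, with default options, R prints a number in scientific notation when that form is shorter. The stored value is the same either way. The sketch below uses base R's format() to control the display.

```r
read_depth <- 1000000

# Scientific notation is only a display choice; the value is unchanged
read_depth == 1e6                                        # TRUE

# format() returns text, here with scientific notation switched off
format(read_depth, scientific = FALSE)                   # "1000000"

# big.mark adds thousands separators for readability
format(read_depth, big.mark = ",", scientific = FALSE)   # "1,000,000"
```

Because format() returns a character string, it is best used for display only, with the numeric object kept for calculations.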
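- Use descriptive object names (e.g., gene_name, sample_id)
- Use the ls() function to see all objects in your environment
- Use rm(object_name) to remove objects you no longer need
# List all objects in the current environment
print("Objects in environment:") # Print label
print(ls()) # Show object names in alphabetical order
## [1] "Objects in environment:"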
## [1] "adult_heights" "analysis"
## [3] "analyze_expression" "analyze_numbers"
## [5] "avg" "base_plot"
## [7] "bmi" "calculate_average"
## [9] "calculate_fold_change" "calculate_rectangle_area"
## [11] "calculate_sum_product" "calculate_tip"
## [13] "child_heights" "clinical_data"
## [15] "control" "control_expression"
## [17] "count" "counter"
## [19] "create_message" "create_sequence"
## [21] "create_student_report" "data"
## [23] "determine_grade" "df"
## [25] "df_from_csv" "double_it"
## [27] "element" "ends"
## [29] "experiment_data" "expression"
## [31] "Expression" "EXPRESSION"
## [33] "expression_level" "expression_means"
## [35] "expression_value" "expression_values"
## [37] "fahrenheit_to_celsius" "fibonacci"
## [39] "file" "first_row"
## [41] "fold_change" "fold_changes"
## [43] "fruit" "fruits"
## [45] "function_name" "gene_annotations"
## [47] "gene_data" "gene_data_info"
## [49] "gene_data_long" "gene_data_na"
## [51] "gene_data_split" "gene_data_united"
## [53] "gene_expression_df" "gene_name"
## [55] "gene_names" "gene_summary"
## [57] "grades" "greet_person"
## [59] "grouped_data" "heatmap_data"
## [61] "heatmap_data_scaled" "height_m"
## [63] "heights" "high_expression"
## [65] "high_scorers" "i"
## [67] "important_genes" "interesting_genes"
## [69] "is_differentially_expressed" "is_significant"
## [71] "j" "mat"
## [73] "matrix_data" "mean_expr"
## [75] "message" "new_gene"
## [77] "normalize_expression" "normalized_count"
## [79] "number_pairs" "numbers"
## [81] "p_val" "p_values"
## [83] "read_count" "read_depth"
## [85] "responders_over_50" "result"
## [87] "rmd_files" "row_1"
## [89] "row_maxes" "rows_2_4"
## [91] "sample_id" "scaling_factor"
## [93] "sd_expr" "selected_cols"
## [95] "sentence" "set1"
## [97] "set2" "sig_genes"
## [99] "sig_indices" "significant_genes"
## [101] "single_col_df" "square_number"
## [103] "squares" "squares_loop"
## [105] "squares_sapply" "standard_analysis"
## [107] "starts" "student_averages"
## [109] "student_scores" "students"
## [111] "subject_averages" "subset"
## [113] "sum" "temp_celsius"
## [115] "temp_fahrenheit" "test_result"
## [117] "test_scores" "treated_expression"
## [119] "treatment" "variable_heights"
## [121] "weight_kg" "word_lengths"
## [123] "words"
# Remove a specific object from environment
rm(expression) # Delete 'expression' variable
print("Objects after removing 'expression':") # Print label
print(ls()) # List the remaining objects
## [1] "Objects after removing 'expression':"
## [1] "adult_heights" "analysis"
## [3] "analyze_expression" "analyze_numbers"
## [5] "avg" "base_plot"
## [7] "bmi" "calculate_average"
## [9] "calculate_fold_change" "calculate_rectangle_area"
## [11] "calculate_sum_product" "calculate_tip"
## [13] "child_heights" "clinical_data"
## [15] "control" "control_expression"
## [17] "count" "counter"
## [19] "create_message" "create_sequence"
## [21] "create_student_report" "data"
## [23] "determine_grade" "df"
## [25] "df_from_csv" "double_it"
## [27] "element" "ends"
## [29] "experiment_data" "Expression"
## [31] "EXPRESSION" "expression_level"
## [33] "expression_means" "expression_value"
## [35] "expression_values" "fahrenheit_to_celsius"
## [37] "fibonacci" "file"
## [39] "first_row" "fold_change"
## [41] "fold_changes" "fruit"
## [43] "fruits" "function_name"
## [45] "gene_annotations" "gene_data"
## [47] "gene_data_info" "gene_data_long"
## [49] "gene_data_na" "gene_data_split"
## [51] "gene_data_united" "gene_expression_df"
## [53] "gene_name" "gene_names"
## [55] "gene_summary" "grades"
## [57] "greet_person" "grouped_data"
## [59] "heatmap_data" "heatmap_data_scaled"
## [61] "height_m" "heights"
## [63] "high_expression" "high_scorers"
## [65] "i" "important_genes"
## [67] "interesting_genes" "is_differentially_expressed"
## [69] "is_significant" "j"
## [71] "mat" "matrix_data"
## [73] "mean_expr" "message"
## [75] "new_gene" "normalize_expression"
## [77] "normalized_count" "number_pairs"
## [79] "numbers" "p_val"
## [81] "p_values" "read_count"
## [83] "read_depth" "responders_over_50"
## [85] "result" "rmd_files"
## [87] "row_1" "row_maxes"
## [89] "rows_2_4" "sample_id"
## [91] "scaling_factor" "sd_expr"
## [93] "selected_cols" "sentence"
## [95] "set1" "set2"
## [97] "sig_genes" "sig_indices"
## [99] "significant_genes" "single_col_df"
## [101] "square_number" "squares"
## [103] "squares_loop" "squares_sapply"
## [105] "standard_analysis" "starts"
## [107] "student_averages" "student_scores"
## [109] "students" "subject_averages"
## [111] "subset" "sum"
## [113] "temp_celsius" "temp_fahrenheit"
## [115] "test_result" "test_scores"
## [117] "treated_expression" "treatment"
## [119] "variable_heights" "weight_kg"
## [121] "word_lengths" "words"
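With an environment as full as the one listed above, it can help to filter the output of ls() or to practise removal somewhere safe. The sketch below works in a separate scratch environment so that nothing in your own workspace is deleted. It uses new.env() and assign(), which go slightly beyond this lesson, and demo_env and temp_value are hypothetical names.

```r
# Create a scratch environment and add three objects to it
demo_env <- new.env()
assign("gene_name", "TP53", envir = demo_env)
assign("sample_id", "RNA_01", envir = demo_env)
assign("temp_value", 42, envir = demo_env)

# ls() lists object names in alphabetical order
objects_before <- ls(envir = demo_env)
# "gene_name" "sample_id" "temp_value"

# pattern keeps only names matching a regular expression
gene_objects <- ls(envir = demo_env, pattern = "^gene")   # "gene_name"

# rm() removes an object; the others are left untouched
rm("temp_value", envir = demo_env)
objects_after <- ls(envir = demo_env)
# "gene_name" "sample_id"
```

Without the envir argument, ls() and rm() act on your current workspace, exactly as in the example above.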
After mastering basic objects and variables, you can move on to:
- Working with data vectors and matrices
- Understanding data types and structures
- Learning how to create and use functions
- Working with data frames