Introduction to R and RStudio for Data Analysis in Scientific Research
Course convenor: Shaun Nielsen, MS (Biostatistics), PhD (Bioscience)
Course objective
The objective of this course is to equip participants with the fundamental skills and knowledge to confidently use R and RStudio for data analysis in scientific research. Participants will gain practical experience in using R for data manipulation, visualisation and analysis, empowering them to extract meaningful insights from their datasets. Throughout the course, learners will explore essential concepts such as importing data, data types and structures, transforming data, scripts and projects, R packages, help documentation and troubleshooting, and basic statistics. By the end of the course, participants will have a solid foundation in R, enabling them to continue their own R journey efficiently and effectively.
Who is this course for?
Students and academics using R for research
Prerequisites
No previous R knowledge is required
Material required
- Computer with R and RStudio installed (further instructions will be given prior to the start of the course)
Material provided
- Lecture notes covering core concepts
- Coding exercises and example datasets
Course outline
The course will consist of 4 modules to be completed over 12 hours (3 hours per day over 4 days). The course will use the books ‘R for Data Science’ by Wickham, Çetinkaya-Rundel and Grolemund (https://r4ds.hadley.nz/) and ‘Introductory Statistics with R’ by Peter Dalgaard as reference material.
Structure
- Short lectures and coding demonstrations
- Coding exercises
- Discussion
Before the course begins
- Instructions will be provided for installing R, RStudio and various R packages, and for verifying that everything is working correctly.
Module 1: Introduction to R
- Overview of R and its applications
- RStudio layout and basic operations
- R packages
- Using ggplot2 to create a data visualisation
- Coding basics: variables, basic arithmetic operations, calling functions
- Scripts and RStudio projects
Module 2: Data Manipulation with tidyverse
- Importing data into R: reading CSV, Excel, and other common file formats
- Exporting data from R: writing data to CSV, Excel, and other formats
- Manipulating data: selecting or creating columns, filtering rows, arranging data
- Summarising data: grouping and summarising functions
- Reshaping data: wide and long transformations
Module 3: Data Visualisation with ggplot2 and others
- Basic plots: histograms, scatter plots, line plots, bar plots, boxplots
- Customising plots: adding titles, labels, colours, and themes
- Faceting: creating multiple plots based on subsets of data
- Dendrograms and heatmaps
- Exporting plots
Module 4: Introduction to Statistical Analysis with R
- Descriptive statistics: mean, median, variance, standard deviation
- Introduction to t-tests, linear regression, ANOVA
- Introduction to multivariate analysis: distances, dendrograms, ordination and PERMANOVA
Date
- Tuesday, 21 May 2024
Schedule
21, 22, 28 and 29 May
9:30 to 13:00 (12 hours)
Location
Museo Nacional de Ciencias Naturales
José Gutierrez Abascal, 2
Places
50 in-person places, 20 remote places
Reservations mail
Technical information
Fees
150 € (SAMNCN members and students: 140 €)