Introduction to R and RStudio for Data Analysis in Scientific Research

Course convenor: Shaun Nielsen, MS (Biostatistics), PhD (Bioscience)

Course objective

The objective of this course is to equip participants with the fundamental skills and knowledge to confidently use R and RStudio for data analysis in scientific research. Participants will gain practical experience in using R for data manipulation, visualisation and analysis, so that they can extract meaningful insights from their own datasets. Throughout the course, learners will explore essential concepts: importing data, data types and structures, transforming data, scripts and projects, R packages, using help documentation, troubleshooting, and basic statistics. By the end of the course, participants will have a solid foundation in R that allows them to continue learning R on their own efficiently and effectively.

Who is this course for?

Students and academics using R for research

Prerequisites

No previous R knowledge is required

Material required

  • Computer with R and RStudio installed (further instructions will be given prior to the start of the course)

Material provided

  • Lecture notes covering core concepts
  • Coding exercises and example datasets

Course outline

The course consists of 4 modules to be completed over 12 hours (3 hours per day over 4 days). The course will use the books ‘R for Data Science’ by Wickham and Grolemund (https://r4ds.hadley.nz/) and ‘Introductory Statistics with R’ by Peter Dalgaard as reference material.

Structure

  • Short lectures and coding demonstrations
  • Coding exercises
  • Discussion

Before the course begins

  • Instructions will be provided for installing R, RStudio and the required R packages, and for verifying that everything is working correctly.

Module 1: Introduction to R

  • Overview of R and its applications 
  • RStudio layout and basic operations 
  • R packages 
  • Using ggplot2 to create a data visualisation 
  • Coding basics: variables, basic arithmetic operations, calling functions 
  • Scripts and RStudio projects 

Module 2: Data Manipulation with tidyverse

  • Importing data into R: reading CSV, Excel, and other common file formats 
  • Exporting data from R: writing data to CSV, Excel, and other formats
  • Manipulating data: selecting or creating columns, filtering rows, arranging data
  • Summarising data: grouping and summarising functions
  • Reshaping data: wide and long transformations

Module 3: Data Visualisation with ggplot2 and other packages

  • Basic plots: histograms, scatter plots, line plots, bar plots, boxplots 
  • Customising plots: adding titles, labels, colours, and themes
  • Faceting: creating multiple plots based on subsets of data
  • Dendrograms and heatmaps
  • Exporting plots

Module 4: Introduction to Statistical Analysis with R

  • Descriptive statistics: mean, median, variance, standard deviation
  • Introduction to t-tests, linear regression and ANOVA
  • Introduction to multivariate analysis: distances, dendrograms, ordination and PERMANOVA

Date

  • Tuesday, 21 May 2024

Schedule

21, 22, 28 and 29 May

9:30 to 13:00 (12 hours)

Location

Museo Nacional de Ciencias Naturales

José Gutierrez Abascal, 2

Places

50 in-person places, 20 remote places

Reservations email

mcnc104@mncn.csic.es

Technical information

snielsen@shaunnielsen.com

Fees

150 € (SAMNCN members and students: 140 €)

Registration