IT News & Events

News about IT at Indiana University and the world


Supercomputing for Everyone Series: Intro to R for Biologists

This is a three-part workshop that will cover the basics of R, offered at IUPUI and IUB.

Event details

  • Date & time
    Monday, February 11, 2019
    1pm-4pm
  • Location
    Sciences Library Room 002, Chemistry Building @IUB and Main Library, Room UL 0110 @IUPUI
    IU - Bloomington and IUPUI
  • Date & time
    Monday, February 18, 2019
    1pm-4pm
  • Date & time
    Monday, February 25, 2019
    1pm-4pm

Register here

About this event

Note: There are three workshops in this series, and you must register for the whole series. If you are unable to attend a specific day, please contact the instructor for the material.

This is a three-part workshop that will cover the basics of R—the general syntax of the language, the basic data types and how to manipulate them, as well as how to find more information when doing novel analyses. The course does not focus on any particular analysis, but uses DNA sequences as a case study to apply the material covered. We will also cover using Jetstream (the research cloud) to power analyses in RStudio, but use of personal installations on laptops is fine for the workshop. The goal of this course is to get you started in R, so you’ll be able to read and write code, and figure out where to get help when needed.


Objectives:
By the end of this workshop, users will be able to:

  • Navigate and use RStudio (on and off Jetstream)—load files, export graphs, etc.
  • Understand how to install, load, and use new libraries.
  • Become familiar with the Bioconductor project.
  • Understand basic data types, functions, objects, and classes in R.
  • Write and use a function.

Prerequisites:

  • Unix familiarity is a plus, but not required.
  • Workstations are provided in the classroom for this workshop, so no laptop is required.

Agenda:

  • Day 1: Introduction
    The goal of this section is to get you acquainted with R, both the environment and the language. We’ll discuss data types, manipulation, the structure of commands, how to get help and more information, how to load packages, and how to use the environment. The hope is to make using R more intuitive.

    This section does not focus on any individual analysis or demonstration. It focuses on reading and making sense of the language (this is very helpful for new users or anyone currently copying, pasting, and hoping).

    Requirements: There are no requirements for this section. Basic Unix skills (how variables work, cat, pwd, etc.) are helpful, but we won’t be using the command line at all, just referring to these concepts throughout.

    Lab: There will be a self-guided activity to practice your skills after the initial workshop.  This will give you practice using R and working with sequence data/vectors.
     
  • Day 2: Introduction to Visualization
    We will build on the basic data types and syntax of R to explore visualization of geographical data. The two main families of plotting will be introduced (plot style and ggplot style) with examples of how to plot various types of data on geographical maps. This is a useful skill for ecologists and geneticists alike.

    Requirements: This is a lab based on the material covered in day 1, so familiarity with that material is very, very useful (day 1 material will be available online).

    Lab: After walking through geographical mapping together, a self-guided activity will extend the same plotting syntax to a different kind of data: ordination plots (PCA, PCoA, NMDS) for exploring various data you may have. Microbiome, ecological, and population genetics data are common examples.
     
  • Day 3: Making your own scripts and functions
    The goal of this section is to get a bit more in depth on how to read, understand, and troubleshoot R code—by introducing classes and functions. Classes and functions are a large part of R, and therefore a large part of understanding the syntax and function of the language. We will walk through creating your own function for summarizing tables of data (both ecological and genetic data sets are available for use).

    Requirements: This material assumes basic usage of R covered in the previous two days, or a moderate familiarity with R basics.

    Lab: A self-guided walkthrough building on Day 1's Lab, where you will create a function to graph a sliding window plot for GC content. This activity is meant to practice building functions, but this particular example can easily be applied to visualize the variation across any continuous data, such as an ecological measure through time, population variation over a genome, etc.
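
To give a flavor of the Day 3 lab, here is a minimal sketch of a sliding window GC content function in base R. It is not the workshop's own material: the function names (gc_content, sliding_gc), the default window settings, and the example sequence are all made up for illustration.

```r
# Fraction of G and C bases in a DNA string (case-insensitive).
gc_content <- function(dna) {
  bases <- strsplit(toupper(dna), "")[[1]]
  sum(bases %in% c("G", "C")) / length(bases)
}

# GC fraction in each window of `width` bases, moving `step` bases at a time.
# Returns a data frame with one row per window.
sliding_gc <- function(dna, width = 10, step = 1) {
  n <- nchar(dna)
  stopifnot(width >= 1, step >= 1, n >= width)
  starts <- seq(1, n - width + 1, by = step)
  gc <- vapply(
    starts,
    function(s) gc_content(substr(dna, s, s + width - 1)),
    numeric(1)
  )
  data.frame(start = starts, gc = gc)
}

# Hypothetical example sequence (an assumption, not workshop data).
dna <- "ATATATATGCGCGCGCGGCCATATATTTAAGCGC"
windows <- sliding_gc(dna, width = 8, step = 2)

# Line plot of GC fraction along the sequence.
plot(windows$start, windows$gc, type = "l",
     xlab = "Window start (bp)", ylab = "GC fraction", ylim = c(0, 1))
```

Replacing gc_content with any other per-window summary (a mean, for instance) would adapt the same pattern to other continuous data, which is the kind of reuse the lab description mentions.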

This workshop is taught by Research Technologies, a division of University Information Technology Services, in conjunction with the National Center for Genome Analysis Support. Both are affiliates of the IU Pervasive Technology Institute.


View all the workshops in this series by visiting http://go.iu.edu/24xc .

