NCGAS hosts its first national workshop on de novo transcriptome assembly using HPC resources

The National Center for Genome Analysis Support (NCGAS) held a workshop entitled "de Novo Assembly of Transcriptomes using HPC Resources" from April 30 through May 1, 2018. The workshop serves NCGAS's mission of enabling the biological research community to analyze, understand, and make use of the genomic information now available. It packaged our seven years of experience assisting with de novo transcriptome assemblies and running High Performance Computing (HPC) resources into a documented, easily approachable workflow for our users. The workshop covered common questions and problems that our users have had with HPC (such as job handling, resource availability, data management, and troubleshooting) and with the construction of transcriptomes (such as software choices, combination of assemblies, and downstream analyses). The two-day workshop also highlighted the resources available to US scientists, concentrating on Extreme Science and Engineering Discovery Environment (XSEDE) resources for analysis, visualization, and archiving of data.

Workshop participants worked in groups to plan which resources to use for assigned projects, which were based on the participants' own projects. Mock-ups of IU/XSEDE Jetstream and Wrangler; IU Carbonate, Karst, and SDA; PSC/XSEDE Bridges; TACC Stampede2; and various web resources were set up around the room, with information on each. Many participants listed this as one of their favorite activities.

A total of 30 people attended the workshop from 26 institutions (four of them Minority Serving Institutions) in 13 states and 1 territory (including 4 EPSCoR states and 1 EPSCoR territory). The participants comprised 3 Master’s students, 11 PhD students, 7 post-docs, 6 faculty, 2 staff, and 1 “other”. The workshop served to advance many projects led by these participants, including:

  • several projects on crustaceans, examining variable eyelessness in crabs to investigate large-scale physical changes within populations, the formation of biominerals in shells, and even the genetics behind the visual complexity of mantis shrimp (they have 12 types of photoreceptors, compared to our three!).
  • population genetics projects on the diversity of ocean biofilms, coral ecosystems, and population-based variation in the gene expression of single cells in humans.
  • investigating the genetics of diseases in domestic dogs and resistance to disease in crop plants.
  • determining genes related to stress tolerance in rice, using birds and fish to measure chemical contamination levels in the Great Lakes, and studying the effect of large-scale nuclear and chemical events, such as the Chernobyl meltdown and the Deepwater Horizon oil spill, on gene expression in bird populations.

The workshop was well received, with participants commenting in the post-workshop survey that this was “one of the most useful workshops in my doctoral degree” and that they “learned how to do in 2 days what [they] have been trying to do on [their] own for more than 5 months.” The activities “gave [them] the confidence to tackle and use published bioinformatics pipelines that [they] would otherwise have been too intimidated to try and run.” The introduction to nationally available HPC resources, such as XSEDE Bridges and Jetstream, was also a huge benefit for the participants, and these machines will likely power their future analyses. Only one of the students had heard of XSEDE before the workshop! Since the workshop, we’ve had a total of six new allocations, in addition to the five workshop participants who already had them. Many applications sought to use PSC/XSEDE Bridges for assemblies and IU/XSEDE Jetstream for collaborations.



Participants self-reported a 20% increase in confidence in each individual task and an overall 31% increase in comfort level with general transcriptome assembly after the two days of lectures, activities, and demos. This increase in confidence and skill will push forward the wide range of interesting NSF projects led by the participants. Additionally, with several faculty and post-docs attending the workshop, the information will hopefully propagate through the 26 institutions represented at the event.