2018 SUMMER SCHOOL FOR BIG DATA IN BIOLOGY


REGISTRATION IS CLOSED. Stay tuned for 2019 summer classes.

The Center for Computational Biology and Bioinformatics at The University of Texas at Austin is proud to host the 4th Annual Summer School for Big Data in Biology May 21–24, 2018. The summer school provides a unique hands-on opportunity to acquire valuable skills directly from experts in the field, with courses tailored to novice, intermediate, and advanced users.

This year we are offering 6 courses. Each will meet for four half-days (either mornings or afternoons) for a total of twelve hours. Instructors will post lectures, datasets, exercises, and course information on a website accessible to enrolled participants. There will be no examinations, but participants may request certificates of completion. Academic credit will not be issued. Please carefully check the specified prerequisite knowledge before enrolling in a course. Payment information for courses is at the bottom of this page.

The archive of 2017 courses can be found here

  1. UTEID: To obtain a UTEID, go here.
  2. TACC: To sign up for a TACC account, go here.

Make sure your Unix and TACC skills are up to date for the Summer School!

Many courses assume familiarity with Unix and TACC. If you need a refresher on these topics, check out the following 3-hour short courses.

Introduction to Unix: Wednesday, May 2. For course details and to register, visit: this site

Introduction to TACC: Wednesday, May 9. For course details and to register, visit: this site

TOPIC | MORNING COURSES (Mon - Thur, May 21-24, 9 a.m.-12 p.m.) | AFTERNOON COURSES (Mon - Thur, May 21-24, 1:30 p.m.-4:30 p.m.)
Programming | Intro to Python | Intro to Biocomputing
DNA and RNA sequencing methods and analyses | Introduction to Core NGS Concepts and Tools | Introduction to RNA-Seq
DNA and RNA sequencing methods and analyses | Genome Variant Analysis | Machine Learning Methods in Gene Expression Analysis

Course Descriptions

REGISTRATION FOR COURSES IS CLOSED

TOPIC: PROGRAMMING

Introduction to Biocomputing (THIS COURSE IS CLOSED)

Day and Time: Mon-Thur 1:30 p.m. – 4:30 p.m. Location: PAR 204
Description: This course is a quick introduction to the basics of different programming languages and paradigms useful in biology. It will touch on the Unix command line, the basics of using TACC, and Python, along with some tidbits of each that are useful for working with bioinformatics data files. No previous programming experience is assumed.
Instructor: Benjamin (Benni) Goetz, M.S., Bioinformatics Consultant
Instructor Bio: Benni is a bioinformatics consultant in CCBB. Python, Bash, and huge computing clusters are some of his favorite things. In a previous life, Benni studied pure math: differential geometry in particular.
Computer requirement: Students must bring their own laptops. Windows users should have PuttySSH installed (available at http://bfy.tw/HjQA); Mac and Linux users do not need to install anything extra.

 Please read this disclaimer if you are using the UT ProCard for payment!


Introduction to Python (THIS COURSE IS CLOSED)

Day and Time: Mon-Thur 9:00 a.m. – 12:00 p.m. Location: PAR 208
Description: This course will introduce students to the fundamentals of scientific computing using the Python language, one of the most popular languages in computational biology and bioinformatics. This course assumes *no prior knowledge* of either Python or programming in general. Students will master the fundamental skills including control flow, functions, and some command line interaction, and finally learn some basic Python skills for working with sequence data.
Instructor: Stephanie Spielman, Ph.D.
Instructor Bio: Dr. Spielman is a Research Assistant Professor in the Institute for Genomics and Evolutionary Medicine at Temple University. She obtained her Ph.D. from UT's EEB program, through Claus Wilke's lab. Her research interests broadly encompass bioinformatics, phylogenetics, and comparative genomics.
Preferred or prerequisite skills: This course is intended for beginners interested in learning fundamental skills in computer programming in Python, with an application to biological and sequence data analysis. The course is geared towards novices and assumes no prior knowledge of computer programming.
Computer requirement: Students should have their own laptop computer with UNIX capability (Mac, Linux, or Windows 10 are all acceptable).

   Please read this disclaimer if you are using the UT ProCard for payment!


TOPIC: DNA AND RNA SEQUENCING ANALYSIS

Introduction to Core NGS Concepts and Tools (THIS COURSE IS CLOSED)

Day and Time: Mon-Thur 9:00 a.m. - 12:00 p.m. Location: PAR 206
Description: This course provides an introduction to the concepts and vocabulary of Next Generation Sequencing (NGS) with an emphasis on common protocols, tools, and file formats used in NGS data analysis. Subjects covered include quality assessment and manipulation of raw NGS sequences (FastQC, cutadapt), read mapping (bwa, bowtie2), the Sequence Alignment/Map (SAM) format, and tools for manipulating BAM files (samtools, bedtools). Participants will gain hands-on experience using these and other NGS tools in the Linux command line environment at TACC, as well as exposure to the many bioinformatics resources TACC makes available.
Instructor: Anna Battenhouse (Research Engineering/Scientist Associate I)
Teaching Assistants: Claire McWhite (Graduate student in the Marcotte lab) and Haridha Shivram (Graduate student in the Iyer lab)
Computer requirement: Students must use their own laptops.
UTEID and TACC Account required: Attendees must have UT EIDs for access to our course wiki, as well as accounts on TACC. Please be sure you know both your UT EID and your TACC username when you come to class. To obtain a UTEID, go here. To sign up for a TACC account, go here.

Please read this disclaimer if you are using the UT ProCard for payment!


Genome Variant Analysis (THIS COURSE IS CANCELLED)

Day and Time: Mon-Thur 9:00 a.m. – 12:00 p.m. Location: PAR 204
Description: This four-day course is designed to teach you the computational steps necessary to identify variant DNA sequences from next generation sequencing data via a series of interactive tutorials that provide hands-on familiarity with a variety of analysis tools (such as Trimmomatic, FastQC, SAMtools, Bowtie2, BWA, breseq, IGV, GATK, and more). Major data analysis topics covered will include read pre-processing, analyzing read quality, genome assembly, read alignment, detection of single nucleotide variants, detection of structural variants, visual representation of such variants, rare variant detection within populations, target enrichment strategies, and more. Factors to consider when designing sequencing experiments (such as type of sequencing, sequencing platform, how much sequencing is needed, and alternative library preparation methods) will also be discussed. Initially the class will focus on prokaryotic samples, as many of the same principles of analysis apply to eukaryotes. Later portions of the class will give each participant the option to choose between more in-depth prokaryotic analysis or eukaryotic analysis, depending on personal relevance.
Instructor: Daniel E. Deatherage, Ph.D., Postdoctoral Researcher
Instructor Bio: Daniel Deatherage earned his doctorate at The Ohio State University studying epigenetic effects of ovarian cancer. His postdoctoral work in Dr. Jeffrey Barrick’s lab has focused on using next generation sequencing to identify ultra-rare mutations within evolving populations. He has been teaching or assistant teaching this class for 6 years.
Preferred or prerequisite skills: The interactive tutorials allow self-paced progress, so no background is required; however, familiarity with the command line is helpful and will allow you to complete more content during the course.
Computer requirement: Students must provide laptops able to connect to TACC and transfer files to and from TACC.
TACC Account required: Attendees will need a TACC account. Please be sure you know your TACC username when you come to class. To sign up for a TACC account, go here.

  Please read this disclaimer if you are using the UT ProCard for payment!


Machine Learning Methods in Gene Expression Analysis (THIS COURSE IS CLOSED)

Day and Time: Mon-Thur 1:30 p.m. - 4:30 p.m. Location: PAR 208
Description: This four-day course will introduce a selection of machine learning methods used in bioinformatic analyses of RNA-seq and other types of gene expression data (such as RT-qPCR). We will cover normalization, unsupervised learning and clustering, feature selection and extraction, and supervised learning methods for classification (e.g., random forests, SVM, LDA, kNN) and regression (with an emphasis on regularization methods appropriate for high-dimensional problems). Participants will have the opportunity to apply these methods as implemented in R and Python to publicly available data.
Instructor: Dennis Wylie, Ph.D., Bioinformatics Consultant
Instructor Bio: Dennis Wylie joined the CBRS Bioinformatics group in 2015. He has experience in NGS data analysis including variant calling and RNA-Seq-based biomarker discovery and predictive modeling (classification, regression, etc.). Prior to UT, he earned a PhD in Biophysics from UC Berkeley applying stochastic simulation methods to problems in immunology, did postdoctoral work modeling the transmission of infectious disease, and spent six years as a bioinformatician in industry.
Preferred or prerequisite skills: This course is recommended for students with some prior knowledge of either R or Python. Discussion will include some references to mathematical concepts from probability, statistics, and linear algebra, but the course will emphasize intuitive understanding of what these methods do and how they may be applied, regardless of mathematical background.
Computer requirements: Students will need laptops, and should have R or Python installed prior to the class if possible. A list of R packages or Python libraries will be sent shortly before the class. It is strongly recommended to install these prior to class.
TACC Account required: Attendees will need a TACC account. Please be sure you know your TACC username when you come to class. To sign up for a TACC account, go here.

  Please read this disclaimer if you are using the UT ProCard for payment!


Introduction to RNA-Seq (THIS COURSE IS CLOSED)

Day and Time: Mon-Thur 1:30 p.m. - 4:30 p.m. Location: PAR 206
Description: This four-day course provides an introduction to methods for analysis of RNA-seq data. It assumes familiarity and comfort with the Linux command line and TACC. A typical RNA-seq workflow will be featured, starting from quality assessment of raw data and proceeding through mapping (bwa, HISAT2), differential expression analysis (DESeq2, ballgown), splice variant analysis (StringTie), and downstream analyses and visualization. Participants will gain hands-on experience using these tools in a Linux command line environment at TACC.
Instructor: Dhivya Arasappan, M.S., Clinical Assistant Professor
Instructor Bio: Dhivya Arasappan joined UT's Genome Sequencing and Analysis Facility (GSAF) as a Bioinformatician in 2009. She has over 8 years of experience analyzing NGS data from multiple platforms including Illumina, PacBio, and SOLiD. Her areas of expertise include de novo genome assembly, particularly using hybrid sequencing data, RNA-Seq analysis, exome analysis, and benchmarking of bioinformatics tools. She also teaches the Big Data in Biology Freshman Research Initiative stream.
Preferred or prerequisite skills: This course is intended for students who are familiar with Unix, TACC, and R programming. It is highly recommended to take these short courses: Intro to TACC, as well as Intro to Unix.
Computer requirement: Students will need laptops, which will be used to ssh into TACC clusters. Laptops must have an ssh client installed. Macs come with an ssh client, so nothing further needs to be installed. On Windows laptops, PuTTY and WinSCP will need to be installed.
TACC Account required: Attendees will need a TACC account. Please be sure you know your TACC username when you come to class. To sign up for a TACC account, go here.

  Please read this disclaimer if you are using the UT ProCard for payment!


REGISTRATION AND PAYMENT

We will accept personal credit cards (American Express, MasterCard, Visa, Discover), UT ProCards (but please read this for important information regarding the use of the ProCard that could affect your registration), and IDT (interdepartmental transfer). Registration dates and fees are as follows:

Registration dates: Feb 15, 2018 - May 13, 2018

Registration fees by category:

UT Austin or BEACON

  • Students* $175/course
  • Faculty or Staff* $275/course

UT-System

  • Students* $275/course
  • Faculty or Staff* $275/course

Non-UT Other

  • Students** $275/course
  • Participants $550/course
  • Groups of 5 or more from same agency or institution: $440 per person/course (Call 512-471-5261 for group registration)

* Our staff will confirm affiliations with UT.
** Non-UT students must send us a copy of their current student identification. Send PDF scan to this email address.

Contact our office at 512-471-5261 for more information.

Refund and Cancellation Policy
A full refund of registration fees, less a $25 cancellation fee, will be available if requested in writing and received by May 16, 2018. No refunds will be made after that date. Please note that course substitutions cannot be made. If you fail to cancel by the deadline and do not attend, you are still responsible for full payment. UT-Austin reserves the right to cancel courses and to return all fees in the event of insufficient registration.

Miscellaneous

Location: All workshops will take place in the PAR building.

Food: Beverages and snacks will be served during 15-minute morning and afternoon breaks. There are also soda and snack machines located in the MEZ building across the way.

Parking: Parking on campus is at a premium. Since the Summer School occurs during the break between the spring and summer semesters, the UT Shuttles will not be operating. The nearest parking garages are the Brazos parking garage (BRG) located at 210 E. MLK, the San Antonio Garage (SAG) located at 2420 San Antonio Street, and the AT&T Executive Education & Conference Center parking garage (CCG) located at 1900 University Avenue. Parking in any of the "A," "F," "D," "OV" or "O" spaces on campus might result in the issuance of a citation. Rates for all campus garages.

Visitor Information:

  • UT Campus Visitor Center information
  • UT Austin maps
  • Austin and the UT Drag Travel Guide