Question: Best resources for understanding GATK best practices pipeline (or similar)
Tails (New Zealand) wrote, 17 days ago:

I've been trying to find a resource that explains the individual steps in the GATK pipeline or, more generally, read mapping, alignment, and variant calling. The learning curve seems huge, and every resource I've come across assumes that you have already worked with the data for quite a while, or that you are smart enough to work out what is left unsaid.

Even simple concepts, such as how forward and reverse reads are generated or what exactly strand bias is, are buried under mountains of technical documents.

I've checked out the BroadE videos, and they are quite useful. But is anyone aware of any other good introductory resources that give a broad overview of "how we do genomics", and possibly highlighting the difficulties we might encounter with determining what "truth" is, and the hurdles we may encounter? A book or real-life examples would be nice.

Kevin Blighe wrote, 16 days ago:

Generally, if you want to get into genomics, you should have access to a Linux shell (Bash, sh, csh, zsh, etc.; all major operating systems now support shells) and a machine with at least 8 GB of RAM (for full genome alignment). You should also know how to download and install programs like SAMtools, BWA, and BCFtools.
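
To give a feel for how these tools fit together, here is a minimal sketch in Python that only builds (and prints) the shell commands for a bare-bones align-and-call workflow. It is not the GATK Best Practices pipeline: it swaps in BWA, SAMtools, and BCFtools because those are the tools named above. The function name `build_pipeline`, the file names, and the thread count are all hypothetical placeholders.

```python
import shlex


def build_pipeline(ref, fq1, fq2, sample, threads=4):
    """Return shell command strings for a minimal align-and-call workflow.

    Nothing is executed here; the commands are only assembled so that the
    order of the steps is easy to read. All paths are placeholders.
    """
    q = shlex.quote                      # protect paths that contain spaces
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf.gz"
    return [
        # 1. Index the reference genome so that BWA can align against it.
        f"bwa index {q(ref)}",
        # 2. Align the paired-end reads (forward and reverse FASTQ files)
        #    and sort the alignments by coordinate into a BAM file.
        f"bwa mem -t {threads} {q(ref)} {q(fq1)} {q(fq2)}"
        f" | samtools sort -o {q(bam)} -",
        # 3. Index the sorted BAM so that tools can jump to any region.
        f"samtools index {q(bam)}",
        # 4. Pile up the reads against the reference and call variants,
        #    writing a compressed VCF.
        f"bcftools mpileup -f {q(ref)} {q(bam)}"
        f" | bcftools call -mv -Oz -o {q(vcf)}",
    ]


if __name__ == "__main__":
    for cmd in build_pipeline("ref.fa", "reads_1.fq.gz", "reads_2.fq.gz", "sample1"):
        print(cmd)
```

Printing the commands rather than running them keeps the sketch safe to try on a machine where the tools are not yet installed; once they are, each line can be pasted into the shell in order.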

-------------------------------

To learn a bit more about sequencing, you should know that the most widespread method is SBS (sequencing by synthesis), a technology that Illumina obtained many years ago by acquiring Solexa (Illumina 'never' invents anything on its own). Here is a video that goes over the SBS process:

-------------------------------------

But is anyone aware of any other good introductory resources that give a broad overview of "how we do genomics"

We do genomics by going into work and reading emails, prepping samples in the lab, processing data, attending meetings, etc.

----------------------------------------

and possibly highlighting the difficulties we might encounter with determining what "truth" is, and the hurdles we may encounter?

The 'truth' about which you speak is hidden in the cells, tissues, etc. that we study. How precisely we can measure this truth is limited by the very instruments that we use, each of which has its own detection sensitivity and associated error. The SBS method (mentioned earlier), for example, is fraught with error and, for good reason (from Illumina's perspective), the level of error is buried in documents that are not available to the public.
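
One place where the instrument's own error estimate is visible to you is the quality line of every FASTQ record. Each character encodes a Phred score Q, and the estimated probability that the base call is wrong is 10^(-Q/10). The sketch below assumes the common Phred+33 encoding; the function names and the example quality string are made up for illustration.

```python
def decode_fastq_qualities(quality_string, offset=33):
    """Convert a FASTQ quality string into a list of Phred scores.

    Assumes Phred+33 encoding (ASCII code minus 33) unless another
    offset is given.
    """
    return [ord(ch) - offset for ch in quality_string]


def phred_to_error_prob(q):
    """Estimated probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def expected_errors(quality_string, offset=33):
    """Sum of per-base error probabilities, i.e. the expected number of
    miscalled bases in the read according to the instrument's own estimate."""
    scores = decode_fastq_qualities(quality_string, offset)
    return sum(phred_to_error_prob(q) for q in scores)


if __name__ == "__main__":
    quals = "IIII5+"                      # hypothetical quality line
    print(decode_fastq_qualities(quals))
    print(expected_errors(quals))
```

Keep in mind that these scores are the sequencer's estimate of its own error, not an independent measurement, which is part of why "truth" is hard to pin down.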

----------------------------------------

For GATK, specifically, start here: Introduction to the GATK Best Practices

Kevin
