Whole Exome Sequencing analysis pipeline
4.4 years ago
Rick P ▴ 20

Hi everyone!

I have recently started my adventure in the bioinformatics world. I have done some RNA-Seq analysis, such as differential expression and Gene Set Enrichment Analysis, with the help of several pipelines available out there.

Right now I'm starting to analyse WES raw data from tumor and normal-adjacent tissue in order to find somatic mutations, but I'm having difficulty finding the most appropriate pipeline to do so. I know there's the GATK pipeline, as well as the GDC's. However, I'm wondering whether there are other complete and comprehensive pipelines, with information about software and command-line guidelines.

Thanks in advance.

sequencing • 1.2k views
4.4 years ago

There are, technically speaking, countless pipeline configurations out there. The standard GATK workflow that most people are aware of is for germline variant calling, whereas you are interested in somatic variant calling.

Most (or all) programs will assume that you are starting from aligned BAM files that you have QCd yourself. Programs that will then do the somatic variant calling include:

More can be found via Awesome Bioinformatics Benchmarks: Somatic SNV/Indel callers

MuTect should work fairly well for you, as it was developed at Broad Institute (where GATK was also developed).

I used to work a lot in this area, and I now use only Lancet for somatic SNV and indel calling. It was developed at New York Genome Center. For somatic copy number calling, I use FREEC; however, I would be inclined to try out German.M.Demidov's program too.

Kevin


Thanks for your reply.

Yes, I'm interested in somatic mutations. Isn't this GATK Best Practices: Somatic short variant discovery (SNVs + Indels) workflow adequate for my task?

In fact, I already have VCF files for my BAM files (the files were provided to me already in both BAM and VCF format).

Thank you for your help. I will check out those tools!


Oh yes, but, as you'll see, they use MuTect2 for the actual variant calling in that pipeline. MuTect2 comes from the same developers as GATK.
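
For reference, the calling step in that workflow can be sketched roughly as below. This is a minimal sketch, not the full Best Practices workflow: it only builds the command lines for `gatk Mutect2` and `gatk FilterMutectCalls` in tumor-normal mode. The helper function names are hypothetical, all file names and the normal sample name are placeholders, and optional inputs such as a germline resource or a panel of normals are left out.

```python
import shlex


def mutect2_command(reference, tumor_bam, normal_bam, normal_sample, out_vcf):
    """Build a GATK4 Mutect2 command line for one tumor-normal pair.

    normal_sample should match the SM tag in the normal BAM's read group.
    """
    return [
        "gatk", "Mutect2",
        "-R", reference,
        "-I", tumor_bam,
        "-I", normal_bam,
        "-normal", normal_sample,
        "-O", out_vcf,
    ]


def filter_command(reference, unfiltered_vcf, out_vcf):
    """Build the FilterMutectCalls command that fills in the FILTER column."""
    return [
        "gatk", "FilterMutectCalls",
        "-R", reference,
        "-V", unfiltered_vcf,
        "-O", out_vcf,
    ]


if __name__ == "__main__":
    # Placeholder paths and sample name; replace with your own.
    call = mutect2_command(
        "reference.fasta", "tumor.bam", "normal.bam",
        "NORMAL_SAMPLE", "unfiltered.vcf.gz",
    )
    flt = filter_command("reference.fasta", "unfiltered.vcf.gz", "filtered.vcf.gz")
    # Print the commands so they can be reviewed before running them.
    print(shlex.join(call))
    print(shlex.join(flt))
```

The commands are only printed here, so you can inspect them first; passing each list to `subprocess.run(..., check=True)` would execute them if GATK4 is on your PATH.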


Yes, I know that. Thanks again.

One more question, as no pipeline has made this clear to me: after variant calling, what is the next step in the analysis? Annotation? If so, which programs would allow me to perform that step?


Try Cancer Genome Interpreter from the Núria López-Bigas lab :)
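
Before handing a VCF to any annotation tool, it is common to keep only the records that passed the caller's filters. A minimal Python sketch of that step follows, assuming a standard tab-delimited VCF in which the seventh column (FILTER) is `PASS` for passing calls. The function names are hypothetical. Records whose FILTER is `.` (no filtering applied yet) are dropped too, so this is meant to be run on an already filtered VCF.

```python
import gzip


def open_vcf(path):
    """Open a plain-text or gzip-compressed VCF for reading as text."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def passing_records(lines):
    """Yield all header lines, plus records whose FILTER column is PASS."""
    for line in lines:
        if line.startswith("#"):
            # Keep meta-information and the column header unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # FILTER is the 7th VCF column (index 6).
        if len(fields) > 6 and fields[6] == "PASS":
            yield line


def write_pass_only(in_path, out_path):
    """Write a PASS-only, uncompressed copy of the VCF at in_path."""
    with open_vcf(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(passing_records(src))
```

For example, `write_pass_only("filtered.vcf.gz", "pass_only.vcf")` (placeholder file names) would produce a smaller file containing only the passing calls.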
