Question: Analysing Multiple Exome Datasets
User 9126 wrote, 7.9 years ago:

Dear All, we have whole exome sequencing data for ten samples from Illumina. While I was looking for a pipeline to analyse this data, I found a nice thread: http://biostar.stackexchange.com/questions/1269/what-is-the-best-pipeline-for-human-whole-exome-sequencing

But I am a little confused about how to use this pipeline for all 10 samples together.

Should I first align each file to the genome independently, combine the results into a single BAM file, and then proceed?

Or should I process every file independently through all of these steps? If so, what should I do at the end to interpret the results across all 10 samples?

Thanks

Santhosh

Sean Davis (National Institutes of Health, Bethesda, MD) wrote, 7.9 years ago:

There are (at least) two places in processing and analyzing exome data that benefit from borrowing information from other samples. The first is realignment around indels, where reads from all samples can be searched together for support for an indel. The second is variant calling: several variant callers accept multiple BAMs at once, so that the call at each site draws on all of the available data. That said, in our group we process all samples pretty much independently and only combine them for variant calling, where having one large BAM file is not necessary.
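As a rough illustration of that layout (each sample aligned on its own, then all BAMs handed to a single calling step), here is a minimal Python sketch. The tool choice (`bwa mem`, `samtools sort`, `bcftools mpileup` piped into `bcftools call`) is my assumption and not necessarily what our group or the linked thread uses. The function names, sample names, and file paths are hypothetical. The sketch only builds the command lines and does not run anything.

```python
from pathlib import Path


def per_sample_commands(sample, fastq1, fastq2, reference, outdir):
    """Build the align-and-sort pipeline for ONE sample (assumed tools: bwa, samtools).

    Returns the path of the BAM that would be written and the two commands,
    which are meant to be connected with a pipe (align | sort).
    """
    bam = Path(outdir) / f"{sample}.sorted.bam"
    # The SM tag records the sample name; multi-sample callers use it to tell
    # samples apart, so every sample gets its own read group.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}"
    align = ["bwa", "mem", "-R", read_group, reference, fastq1, fastq2]
    sort = ["samtools", "sort", "-o", str(bam), "-"]
    return bam, [align, sort]


def joint_call_commands(bams, reference, out_vcf):
    """Build ONE calling pipeline that takes all per-sample BAMs together.

    No merged BAM is needed: the BAMs are simply listed on the command line.
    The two commands are meant to be connected with a pipe (mpileup | call).
    """
    pileup = ["bcftools", "mpileup", "-f", reference] + [str(b) for b in bams]
    call = ["bcftools", "call", "-m", "-v", "-o", out_vcf]
    return [pileup, call]


# Hypothetical sample sheet: sample name -> paired FASTQ files.
samples = {
    "s1": ("s1_R1.fastq.gz", "s1_R2.fastq.gz"),
    "s2": ("s2_R1.fastq.gz", "s2_R2.fastq.gz"),
}

bams = []
for name, (fq1, fq2) in samples.items():
    bam, cmds = per_sample_commands(name, fq1, fq2, "ref.fa", "bams")
    bams.append(bam)
    print(" | ".join(" ".join(c) for c in cmds))

print(" | ".join(" ".join(c) for c in joint_call_commands(bams, "ref.fa", "all_samples.vcf")))
```

The per-sample steps are independent of one another, so they can run in parallel; only the final step sees all samples. Steps such as duplicate marking or indel realignment are left out of the sketch and would sit between alignment and calling.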

At the end of the day, though, it pays to really think about your study and study design when deciding how to proceed with an analysis. While most people talk about a "pipeline" for exome sequencing, it is easy to define situations (related individuals, for example) where a "general pipeline" is not optimal.
