Question: Marking Duplicates And Comparing 2 Samples?
Angel (210, United States) wrote 8.7 years ago:


Two follow-up questions for BioStar:

1) I used Picard "MarkDuplicates" to mark duplicates and named the output BAM "marked.bam". I then used snpEff to annotate SNPs, indels, etc. The file produced from marked.bam is smaller than the one produced from the unmarked BAM (about 5000 fewer rows). So I am assuming snpEff is not taking PCR duplicates into account. Is this correct?

2) Now I want to compare the snpEff results between two samples. Any recommendations for software to do this? I will obviously compare SNPs, indels, etc.

Thanks very much again. *[Edited]

exome statistics • 1.9k views
modified 7.3 years ago by Biostar ♦♦ (20) • written 8.7 years ago by Angel (210)

snpEff does not call SNPs; it annotates them. Are you missing a step (variant calling) in your explanation?

written 8.7 years ago by brentp (23k)

Sorry, and thanks for correcting me. I am using vcftools to call variants after marking duplicates, and snpEff to annotate the variants.

written 8.7 years ago by Angel (210)


Since no one replied, I tried converting the snpEff output to BED format and using "subtractBed" to find the differences between the two samples. Is this the only way, or the right way, to do it?

The differences are many, so how would I find something that is biologically meaningful?
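
One possible way to frame the comparison, sketched here under stated assumptions rather than as a recommended tool: key each variant on (CHROM, POS, REF, ALT) taken from the VCF records, then use set operations to split variants into shared and sample-specific groups. Unlike an interval-only comparison, this also distinguishes different alleles at the same position. The function names and the toy records below are hypothetical and exist only for illustration; the sketch does not handle genotype columns, indel normalization, or compressed files.

```python
from typing import Dict, Iterable, Set, Tuple

# A variant is identified by chromosome, position, reference allele, alternate allele.
VariantKey = Tuple[str, int, str, str]


def variant_keys(vcf_lines: Iterable[str]) -> Set[VariantKey]:
    """Collect one key per ALT allele from VCF data lines (header lines are skipped)."""
    keys: Set[VariantKey] = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        # Multi-allelic records list ALT alleles separated by commas.
        for alt in alts.split(","):
            keys.add((chrom, pos, ref, alt))
    return keys


def compare_samples(
    vcf_a: Iterable[str], vcf_b: Iterable[str]
) -> Dict[str, Set[VariantKey]]:
    """Split variants into those shared by both samples and those unique to each."""
    keys_a, keys_b = variant_keys(vcf_a), variant_keys(vcf_b)
    return {
        "shared": keys_a & keys_b,
        "only_a": keys_a - keys_b,
        "only_b": keys_b - keys_a,
    }


# Hypothetical toy records, not taken from any real sample.
sample_a = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t200\t.\tC\tT,G\t60\tPASS\t.",
    "chr2\t300\t.\tGA\tG\t40\tPASS\t.",
]
sample_b = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t55\tPASS\t.",
    "chr1\t200\t.\tC\tT\t61\tPASS\t.",
    "chr3\t400\t.\tT\tTA\t30\tPASS\t.",
]

result = compare_samples(sample_a, sample_b)
for group in ("shared", "only_a", "only_b"):
    print(group, sorted(result[group]))
```

With real files, the same functions could be fed open file handles, for example `compare_samples(open("a.vcf"), open("b.vcf"))`, where the file names are placeholders. Narrowing the sample-specific sets to something biologically meaningful would still need the annotation fields (for example predicted effect) and quality filters, which this sketch leaves out.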

written 8.7 years ago by Angel (210)

You need to rewrite your question and explain each step; I couldn't understand it. In the comments you mention that you used vcftools to call SNPs. Do you mean bcftools from the samtools package, which takes the mpileup output from samtools and does the variant calling? Also, snpEff's input doesn't carry PCR duplicate information, as that information is lost by that stage.
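
To illustrate where the duplicate information lives: in the SAM/BAM format, a marked duplicate is recorded as bit 0x400 of the FLAG column on each alignment, so it exists only at the read level. The minimal sketch below counts flagged reads in SAM-formatted text. The helper names and the toy records are hypothetical, and whether a given variant caller skips flagged reads depends on that caller and its settings, which is an assumption to check rather than something shown here.

```python
from typing import Iterable, Tuple

# SAM FLAG bit for "PCR or optical duplicate" as defined in the SAM specification.
DUPLICATE_FLAG = 0x400


def is_duplicate(sam_line: str) -> bool:
    """Return True if the alignment's FLAG field (column 2) has the duplicate bit set."""
    flag = int(sam_line.split("\t")[1])
    return bool(flag & DUPLICATE_FLAG)


def count_duplicates(sam_lines: Iterable[str]) -> Tuple[int, int]:
    """Return (total alignments, alignments marked as duplicates), skipping @ header lines."""
    total = 0
    duplicates = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        total += 1
        if is_duplicate(line):
            duplicates += 1
    return total, duplicates


# Hypothetical toy alignments; FLAG 1024 is 0x400, and 1040 is 0x400 plus 0x10 (reverse strand).
toy_sam = [
    "@HD\tVN:1.4\tSO:coordinate",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t1024\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read3\t1040\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
]

total, duplicates = count_duplicates(toy_sam)
print(f"{duplicates} of {total} alignments carry the duplicate flag")
```

A VCF record has no per-read FLAG column, so any effect of duplicate marking has to come from how the caller treats flagged reads before the annotation step.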

written 7.3 years ago by Ashutosh Pandey (12k)