Question: How To Filter The SNP/Indel Results From Mpileup?
8.2 years ago by
Haiping110
Haiping110 wrote:

Hi. I tried to use mpileup to identify SNPs/indels. The commands I used were:

samtools fillmd -bAr sample.sorted.bam ref.fa > sample.sorted.baq.bam

samtools mpileup -uf ref.fa aln.bam | bcftools view -bvcg - > var.raw.bcf

bcftools view var.raw.bcf | vcfutils.pl varFilter -D 100 > var.flt.vcf

I saw that some of the SNPs have a QUAL value lower than 20 and some of the indels have a QUAL value lower than 50. The problem is that I don't know whether I can trust these output results as they are, or whether I still need to filter them to make them reliable. If I need to filter, what kind of rule should I use? Thanks a lot.

Here are some of the results:

   1. 1102 . C T 3.01 . DP=10;AF1=0.4997;AC1=1;DP4=1,7,2,0;MQ=60;FQ=4.77;PV4=0.067,1,1,1 GT:PL:GQ 0/1:30,0,147:28
   2. 14689 . A G 18.1 . DP=10;AF1=0.5;AC1=1;DP4=0,6,0,4;MQ=60;FQ=21;PV4=1,1.1e-05,1,1 GT:PL:GQ 0/1:48,0,116:51
   3. 9373 . C A 44 . DP=10;AF1=0.5;AC1=1;DP4=0,6,4,0;MQ=60;FQ=47;PV4=0.0048,0.014,1,1 GT:PL:GQ 0/1:74,0,115:77
   4. 6427 . T TT 14.6 . INDEL;DP=9;AF1=0.5025;AC1=1;DP4=0,1,0,2;MQ=56;FQ=-14.7;PV4=1,1,0.12,1 GT:PL:GQ 0/1:52,0,20:23
   5. 314 . AA A 18.5 . INDEL;DP=9;AF1=0.5;AC1=1;DP4=4,0,4,0;MQ=60;FQ=18.5;PV4=1,0.0011,1,1 GT:PL:GQ 0/1:56,0,56:56
   6. 6068 . GATTAG G 214 . INDEL;DP=9;AF1=1;AC1=2;DP4=0,0,2,7;MQ=60;FQ=-61.5 GT:PL:GQ 1/1:255,27,0:51
8.2 years ago by
Swbarnes21.5k
Swbarnes21.5k wrote:

Your depth of coverage is kind of low. Also, most of the SNPs displayed are mixed. So it's possible they are real, but I'd be skeptical. (I don't have much empirical Sanger data to back that claim up, though.) The last one has a decent quality score because it has 9 reads that all agree it's an indel.


8.1 years ago Haiping wrote:

Thanks for your response. The average depth of my data was near 30; the records above are just part of the results. I wonder whether I need to filter out results like 1, 2, 4 and 5, since their QUAL is less than 20 for SNPs and less than 50 for indels.
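A minimal sketch of the QUAL rule raised in this reply, not a recommended cutoff: it keeps SNP records with QUAL of at least 20 and indel records with QUAL of at least 50. The function name `filter_by_qual` and its default thresholds are assumptions for illustration. It relies only on the `INDEL` flag that the mpileup output above places in the INFO column, and it expects tab-separated VCF on standard input.

```shell
# Hypothetical helper: keep VCF records whose QUAL meets a per-type threshold.
# Usage: filter_by_qual [snp_min] [indel_min] < in.vcf > out.vcf
filter_by_qual() {
    awk -F '\t' -v snp_min="${1:-20}" -v indel_min="${2:-50}" '
        /^#/ { print; next }    # pass header lines through unchanged
        {
            # Indel records carry an INDEL flag in the INFO field (column 8).
            is_indel = (index(";" $8 ";", ";INDEL;") > 0)
            min_qual = is_indel ? indel_min : snp_min
            # Column 6 is QUAL; a missing value (".") counts as 0 and is dropped.
            if ($6 + 0 >= min_qual + 0) print
        }
    '
}
```

It could be used as `bcftools view var.raw.bcf | filter_by_qual 20 50 > var.qual.vcf` (the output file name is made up). Applied to the six records shown in the question, these thresholds would keep only records 3 and 6.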
8.0 years ago by
Travis2.8k
USA
Travis2.8k wrote:

The Broad Institute would probably recommend filtering based on recalibrated variant scores for a relatively low coverage experiment like this. Have a look here.

7.5 years ago by
Leszek4.0k
IIMCB, Poland
Leszek4.0k wrote:

Besides the filtering already mentioned, I often discard calls that are supported by alignments from one strand only, as these are likely due to sequencing errors. Have a look at this discussion.
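The strand rule above can be sketched with the `DP4` tag that appears in the INFO column of the records in the question. `DP4` holds four counts of high-quality bases: ref-forward, ref-reverse, alt-forward and alt-reverse. The function name `filter_strand` and the default of at least one alternate read per strand are assumptions for illustration, not a tested recommendation. Input is tab-separated VCF on standard input.

```shell
# Hypothetical helper: drop calls whose alternate allele is seen on one strand only.
# Usage: filter_strand [min_alt_per_strand] < in.vcf > out.vcf
filter_strand() {
    awk -F '\t' -v min_alt="${1:-1}" '
        /^#/ { print; next }    # pass header lines through unchanged
        {
            keep = 0
            n = split($8, info, ";")    # INFO is column 8
            for (i = 1; i <= n; i++) {
                if (info[i] ~ /^DP4=/) {
                    # DP4 = ref-forward, ref-reverse, alt-forward, alt-reverse
                    split(substr(info[i], 5), dp4, ",")
                    if (dp4[3] + 0 >= min_alt + 0 && dp4[4] + 0 >= min_alt + 0)
                        keep = 1
                }
            }
            if (keep) print    # records without a DP4 tag are dropped
        }
    '
}
```

For example, `bcftools view var.raw.bcf | filter_strand > var.strand.vcf` (the output file name is made up). Of the six records shown in the question, only record 6 (`DP4=0,0,2,7`) has alternate-allele reads on both strands, so it is the only one this sketch would keep.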
