Question: How To Set Filter For Frequency Of Reads And Hapmap Exome Sample Results:
0
gravatar for Angel
7.3 years ago by
Angel210
United States
Angel210 wrote:

Hey All,

I only used so far three filters for my whole exome pipeline (aligning to hg19) for a HapMap sample. I tried it on the NA19240 Hapmap sample from paper below (Table 3) which shows ~196 variants (SNPs and INDELs). http://www.nature.com/nmeth/journal/v9/n2/extref/nmeth.1810-S1.pdf

However, using my filters as below I get = ~15000 (just NONSYNONYMOUSCODING alterations). If you add INDELS, it's going to be much higher number. What am I doing wrong?

My list of filters are:

1) vcfutils varFilter -D1000 2) snpEff -minQ 20 -minCoverage 30

Could they have different filters like frequency of variants etc.? If so, how do I set these up? Any help? What are the default parameters for # of reads (minimum) and frequency in bwa,samtools?

Below is my pipeline:

  • bwa aln hg19.fa S375R1.fastq > S3751.sai
  • bwa aln hg19.fa S375R2.fastq > S3752.sai
  • bwa sampe hg19.fa S3751.sai S3752.sai S375R1.fastq S375R2.fastq > S375NoIndexL007.sam
  • samtools view -bS S375NoIndexL007.sam > S375NoIndexL007.bam
  • samtools sort S375NoIndexL007.bam S375NoIndexL007.sorted
  • Marked duplicates using picard
  • samtools index S375NoIndexL007.marked.bam
  • samtools mpileup -uf hg19.fa S375NoIndexL007.marked.bam | bcftools view -bvcg - > S375NoIndexL007.raw.bcf
  • bcftools view S375NoIndexL007.raw.bcf | vcfutils.pl varFilter -D1000 > S375NoIndexL007vard200.flt.vcf
exome filter indel hapmap snp • 2.6k views
ADD COMMENTlink modified 7.3 years ago by Zev.Kronenberg11k • written 7.3 years ago by Angel210
1
gravatar for Zev.Kronenberg
7.3 years ago by
United States
Zev.Kronenberg11k wrote:

Filter your bam on mapping quality -q 20. Filter your variant quality using varfilter via samtools. Filter low depth sites using varfilter via samtools.

Realigning INDELS is very important! You should consider doing it if you care about calling INDELS correctly.

ADD COMMENTlink written 7.3 years ago by Zev.Kronenberg11k

HI Zev, thanks very much for your reply.

Q1 Does it matter if I re-align after marking duplicates. Just to avoid repeating the duplication?

Q2 How do I filter on maping quality. I found this link and I didn't understand th filter in the example. What filter on mapping quality do you recommend?

ADD REPLYlink written 7.3 years ago by Angel210

I would remove the duplicates and then do the realignment; it will go faster. Samtools view: '-q INT Skip alignments with MAPQ smaller than INT [0]'. 20 is what we use. This mean those reads that map with a quality below twenty are eliminated.

ADD REPLYlink written 7.3 years ago by Zev.Kronenberg11k

Thanks very much Zevm will try it. Currently, I am getting a couple of errors with re-alignment but will keep trying.

ADD REPLYlink written 7.3 years ago by Angel210

email me your error zev (dot) kronenberg (at) gmail.com

ADD REPLYlink written 7.3 years ago by Zev.Kronenberg11k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1620 users visited in the last hour