Question: How to detect false-positive variants in a repeat region?
6 months ago, istdasklar wrote:

My variant caller returns variants like the following one in a repeat region (PGM / amplicon).
How can I tell whether they are true or false, i.e. caused by sequencing error?

[Images: IGV, IGV2]

bam calling vcf • 375 views
6 months ago, Kevin Blighe (USA / Europe / Brazil) wrote:

Hey, didn't you have another account? Your profile photo looks familiar.

The only way to confirm a variant is with an ancillary method, like Sanger sequencing. NGS always struggles to call variants correctly in repeat regions whose length approaches the average read length that you're using. Why? In part, it is due to misalignment in these regions. Even prior to in silico alignment, homopolymers like AAAAAAA, GGGGGGGG, etc., can be difficult to sequence faithfully during the sequencing run itself.
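To see whether a call sits inside a homopolymer like the ones mentioned above, you can measure the run of identical bases around the variant position in the reference. This is only a minimal sketch: the function name `homopolymer_run_length` and the example window are made up for illustration, and it assumes you have already extracted the reference sequence around the variant as a plain string.

```python
def homopolymer_run_length(seq: str, pos: int) -> int:
    """Return the length of the run of identical bases containing 0-based index pos."""
    seq = seq.upper()
    base = seq[pos]
    # Walk left while the previous base is the same as the one at pos
    start = pos
    while start > 0 and seq[start - 1] == base:
        start -= 1
    # Walk right while the next base is the same as the one at pos
    end = pos
    while end < len(seq) - 1 and seq[end + 1] == base:
        end += 1
    return end - start + 1


# Hypothetical reference window: ACGT, then seven A's, then TCG
window = "ACGT" + "A" * 7 + "TCG"
print(homopolymer_run_length(window, 6))
```

A long run at the variant position does not prove the call is false, but it is a reason to look harder at it and to confirm it with another method.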

To guard against errors in repeat regions, you can apply some basic QC thresholds:

  • Prior to alignment, trim bases at read ends whose average base quality falls below 30
  • Prior to alignment, eliminate short reads
  • Prior to variant calling, eliminate reads with MAPQ below a cutoff such as 40, 50, or 60
  • Require that variants are called at a minimum read depth of 18
  • Require that variants have 'high' genotype qualities (at least 30)
  • Only look at variants that pass a threshold for strand bias (given by the PV4 tag)
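
The post-calling thresholds in this list (depth, genotype quality, strand bias) could be checked per VCF record roughly as follows. This is a sketch under assumptions, not a definitive filter: it assumes a single-sample VCF with `DP` and `PV4` in the INFO column and `GQ` in the FORMAT column, as written by samtools/bcftools-style callers; the names `parse_info` and `passes_qc` are hypothetical; and the 0.05 strand-bias p-value cutoff is an illustrative choice that is not taken from the list. The MAPQ and read-trimming steps happen before calling and are not covered here.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO column into a dict; flag-only entries map to True."""
    out = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out


def passes_qc(vcf_line: str, min_depth: int = 18, min_gq: float = 30,
              min_strand_p: float = 0.05) -> bool:
    """Check one single-sample VCF data line against depth, GQ and strand-bias cutoffs."""
    fields = vcf_line.rstrip("\n").split("\t")
    info = parse_info(fields[7])
    # Pair FORMAT keys with the first sample's values, e.g. GT:GQ with 0/1:45
    sample = dict(zip(fields[8].split(":"), fields[9].split(":")))

    depth = int(info.get("DP", 0))
    gq_raw = sample.get("GQ", ".")
    gq = float(gq_raw) if gq_raw != "." else 0.0

    # PV4 holds four p-values (strand, baseQ, mapQ, tail-distance bias);
    # the first is strand bias. A missing PV4 is treated as "no evidence
    # of bias", which is an assumption of this sketch.
    pv4 = info.get("PV4")
    strand_p = float(pv4.split(",")[0]) if isinstance(pv4, str) else 1.0

    return depth >= min_depth and gq >= min_gq and strand_p >= min_strand_p


# Hypothetical record, built with explicit tabs
record = "\t".join(["chr1", "12345", ".", "A", "G", "50", ".",
                    "DP=25;PV4=0.3,1,0.5,1", "GT:GQ", "0/1:45"])
print(passes_qc(record))
```

The cutoffs are keyword arguments so they can be tuned per panel; amplicon data in particular can show strong strand bias by design, so that threshold may need relaxing.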

One more item for the list above:

  • Use a haplotype-based variant caller such as freebayes, or one that performs local de novo assembly such as GATK's HaplotypeCaller
written 6 months ago by finswimmer

Thanks for your reply. I will check all of these!

PS: Yes, I logged in to the wrong account to post my question. Sorry!

written 6 months ago by sacha