Question: How do I identify rare variants from VCF files produced by UnifiedGenotyper?
thejustpark (United States) wrote, 16 months ago:

Hello,

I am new to exome sequencing data analysis and would like advice on how to proceed. I have spent quite some time trying to figure this out on my own, but with no one around to guide me I have not gotten far. I have been given VCF files, suspected to come from GATK UnifiedGenotyper, for case and control samples (A1 and A2 are our cases and B1 is our control): case[or control].indel.raw.vcf, case[or control].snp.raw.vcf and case[or control].var.raw.vcf. I need to 1) identify rare variants (SNPs or indels with a frequency below 0.01% in ExAC or gnomAD) that are present only in the two cases and not in the control, and 2) obtain PolyPhen/SIFT or other scores for the identified rare variants.

My questions are: 1) The GATK manual says that, because UnifiedGenotyper produces many false positives, these files need to go through a lot of filtering. However, I can't find in the manual which filters I need to apply or which tools to use. 2) Can you please suggest a pipeline for identifying rare variants and obtaining PolyPhen/SIFT scores from the VCF files?

Thank you very much for your time.

next-gen • 394 views
modified 16 months ago by rse • written 16 months ago by thejustpark
finswimmer (Germany) wrote, 16 months ago:

Hello thejustpark,

Regarding your question about filtering out false-positive variants, I recommend reading this blog post first. After that you can try your first steps with this tutorial. Keep in mind, though, that there is no gold standard for hard filtering, as it depends on many factors.
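To make the idea of hard filtering concrete, here is a minimal Python sketch that flags SNP records whose INFO annotations fall outside fixed cutoffs. The annotation names (QD, FS, MQ) and the cutoffs are assumptions used for illustration: they follow commonly cited starting points, but they need tuning for each data set, and in practice a dedicated tool would normally do this job. The function names are hypothetical.

```python
# Illustrative cutoffs only (assumed starting points, not a gold standard).
# Each entry reads: fail the record if the annotation is </> the cutoff.
SNP_LIMITS = {
    "QD": ("<", 2.0),   # quality by depth
    "FS": (">", 60.0),  # strand bias (Fisher test, Phred-scaled)
    "MQ": ("<", 40.0),  # RMS mapping quality
}


def parse_info(info):
    """Turn 'QD=1.5;FS=3.2;DB' into {'QD': '1.5', 'FS': '3.2', 'DB': None}."""
    out = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        out[key] = value if sep else None  # flags have no value
    return out


def failed_filters(info, limits=SNP_LIMITS):
    """Return the names of the annotations that fail their cutoff."""
    values = parse_info(info)
    failed = []
    for name, (op, cutoff) in limits.items():
        try:
            x = float(values[name])
        except (KeyError, TypeError, ValueError):
            continue  # annotation missing or not numeric: not evaluated
        if (op == "<" and x < cutoff) or (op == ">" and x > cutoff):
            failed.append(name)
    return failed


def hard_filter(lines, limits=SNP_LIMITS):
    """Yield VCF lines with FILTER set to PASS or to the failed names."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#") or not line:
            yield line
            continue
        cols = line.split("\t")
        failed = failed_filters(cols[7], limits)  # column 8 is INFO
        cols[6] = ";".join(failed) if failed else "PASS"  # column 7 is FILTER
        yield "\t".join(cols)
```

Marking records in the FILTER column instead of deleting them keeps the raw calls available, which helps when the cutoffs are revisited later. Indels usually need their own set of cutoffs.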

The easiest way for you to get population frequencies is to use Ensembl's VEP. For missense variants it also reports PolyPhen/SIFT scores. There are other ways to do this; the term you are looking for is "variant annotation". Other tools that can do this include SnpEff and bcftools.
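Once a VCF has been annotated with VEP, the annotations sit in a pipe-separated CSQ entry in the INFO column, and the sub-field names are listed in the CSQ header line. The Python sketch below reads those names and applies the 0.01% (0.0001) frequency cutoff from the question. The sub-field name `gnomAD_AF` is an assumption: the actual name depends on the VEP version and on the options used, so check the header of your own file. Treating a missing frequency as rare is also a choice made here for illustration.

```python
import re


def csq_fields(header_lines):
    """Read the CSQ sub-field names from the VEP INFO header line."""
    for line in header_lines:
        if line.startswith("##INFO=<ID=CSQ"):
            match = re.search(r'Format: ([^">]+)', line)
            if match:
                return match.group(1).strip().split("|")
    raise ValueError("no CSQ header found; was the file annotated with VEP?")


def csq_records(info, fields):
    """Yield one dict per CSQ entry (one per allele/transcript pair)."""
    for item in info.split(";"):
        if item.startswith("CSQ="):
            for entry in item[4:].split(","):
                yield dict(zip(fields, entry.split("|")))


def is_rare(record, af_field="gnomAD_AF", max_af=0.0001):
    """True if the population frequency is missing or below max_af.

    af_field is an assumed name; adjust it to match your CSQ header.
    """
    raw = record.get(af_field, "")
    # Several values may be joined with '&'; use the largest one.
    values = [float(x) for x in raw.split("&") if x]
    if not values:
        return True  # not in the database: treated as rare in this sketch
    return max(values) < max_af
```

For each record that passes `is_rare`, the `SIFT` and `PolyPhen` entries of the same dict (if those sub-fields are present in the header) hold the predictions for that transcript.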

fin swimmer

rse (Singapore) wrote, 16 months ago:

Hi, you can annotate the variants against population databases and then filter on the annotated frequencies.
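The filtering step also has to cover the other requirement in the question: variants present in both cases and not in the control. A minimal Python sketch of that comparison is shown here, assuming the genotypes are in standard VCF sample columns and that the columns are named A1, A2 and B1 (the real column names in your files may differ). The function names are hypothetical.

```python
def carriers(vcf_lines):
    """Map each sample name to the set of (chrom, pos, ref, alt) it carries."""
    samples, found = [], {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue
        cols = line.split("\t")
        if line.startswith("#CHROM"):
            samples = cols[9:]  # sample names follow the FORMAT column
            found = {name: set() for name in samples}
            continue
        chrom, pos, ref = cols[0], int(cols[1]), cols[3]
        alts = cols[4].split(",")
        gt_index = cols[8].split(":").index("GT")
        for name, sample_col in zip(samples, cols[9:]):
            gt = sample_col.split(":")[gt_index]
            for allele in gt.replace("|", "/").split("/"):
                # '0' is the reference allele, '.' is a missing call
                if allele.isdigit() and int(allele) > 0:
                    found[name].add((chrom, pos, ref, alts[int(allele) - 1]))
    return found


def case_only(calls, cases, controls):
    """Variants carried by every case and by none of the controls."""
    shared = set.intersection(*(calls[c] for c in cases))
    for c in controls:
        shared -= calls[c]
    return shared
```

With the case and control files read separately, the two dictionaries can be merged, for example `calls = {**carriers(case_lines), **carriers(control_lines)}`, and then passed to `case_only(calls, ["A1", "A2"], ["B1"])`. Note that a variant missing from the control file may reflect low coverage at that site and not a true reference genotype, so the result is worth checking against the control's read depth.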
