Question: How to identify rare variants from VCF files produced by UnifiedGenotyper?
2.6 years ago by
thejustpark80
United States
thejustpark80 wrote:

Hello,

I am new to exome sequencing data analysis and would like advice on how to proceed. I have spent quite some time trying to figure this out on my own, but with no one around to guide me I have not gotten far. I have been given VCF files, suspected to come from GATK UnifiedGenotyper, for case and control samples (A1 and A2 are our cases and B1 is our control), namely case[or control].indel.raw.vcf, case[or control].snp.raw.vcf, and case[or control].var.raw.vcf. I need to identify 1) rare variants (SNPs or indels with a frequency below 0.01% in ExAC or gnomAD) that are present only in the two cases and not in the control, and 2) PolyPhen/SIFT or other scores for the identified rare variants.

My questions are: 1) The GATK manual says that, because UnifiedGenotyper produces many false positives, these files need to go through extensive filtering. However, I can't find in the manual which filters to apply or which tools to use. 2) Can you please suggest a pipeline for identifying rare variants and obtaining PolyPhen/SIFT scores from the VCF files?

Thank you very much for your time.

next-gen • 624 views
2.6 years ago by
finswimmer14k
Germany
finswimmer14k wrote:

Hello thejustpark,

Regarding your question about filtering false positive variants, I recommend reading this blog post first. After that, you can take your first steps with this tutorial. But keep in mind that there is no gold standard for hard filtering, as it depends on many factors.
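To make "hard filtering" concrete, here is a minimal sketch in plain Python that flags a VCF record when its INFO annotations cross fixed cutoffs. The QD, FS and MQ cutoffs are commonly quoted starting values for SNPs and are assumptions here, not a recommendation for your data. The helper names (`parse_info`, `failed_filters`) are made up for illustration. In practice a dedicated tool would do this step; the sketch only shows the logic.

```python
# Illustrative hard filter on VCF INFO annotations.
# Thresholds are assumed starting points for SNPs; tune them per dataset.
SNP_FAIL_RULES = {
    "QD": lambda v: v < 2.0,   # low quality normalised by depth
    "FS": lambda v: v > 60.0,  # strong strand bias
    "MQ": lambda v: v < 40.0,  # low mapping quality
}


def parse_info(info_field):
    """Turn 'QD=12.3;FS=1.2;DB' into a dict; flag-only keys map to True."""
    parsed = {}
    for item in info_field.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True
    return parsed


def failed_filters(vcf_line, rules=SNP_FAIL_RULES):
    """Return the annotation names that fail; missing annotations are skipped."""
    columns = vcf_line.rstrip("\n").split("\t")
    info = parse_info(columns[7])  # INFO is the 8th VCF column
    failed = []
    for name, is_bad in rules.items():
        raw = info.get(name)
        if raw is None or raw is True:
            continue  # annotation absent or present only as a flag
        try:
            value = float(raw)
        except ValueError:
            continue  # non-numeric value, cannot evaluate
        if is_bad(value):
            failed.append(name)
    return failed


if __name__ == "__main__":
    line = "chr1\t100\t.\tA\tG\t50\t.\tQD=1.5;FS=70.0;MQ=60.0"
    print(failed_filters(line))
```

A record with an empty result passes all rules. Indels usually get their own, different cutoffs, so the SNP and indel files should not share one rule set.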

The easiest way for you to get population frequencies is to use Ensembl's VEP. For missense variants it also reports PolyPhen/SIFT scores. There are other ways to do this; the term you are looking for is "variant annotation". Other tools that can do this include SnpEff and bcftools.

fin swimmer

2.6 years ago by
rse90
Singapore
rse90 wrote:

Hi, you can annotate against population databases and then filter.
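As a rough sketch of the "then filter" step, the plain-Python snippet below keeps a variant only if it is rare in the population database, carried by both cases (A1, A2) and not carried by the control (B1). The record layout (a dict with `pop_af` and `genotypes`) and the function names are hypothetical; they stand in for whatever your annotation tool outputs. Treating a variant with no database frequency as rare is also an assumption you may want to change.

```python
RARE_AF = 0.0001  # 0.01% expressed as a fraction


def carries_alt(gt):
    """True if a GT string such as '0/1' or '1|1' has a non-reference allele."""
    alleles = gt.replace("|", "/").split("/")
    return any(allele not in ("0", ".") for allele in alleles)


def is_candidate(record, cases=("A1", "A2"), controls=("B1",), max_af=RARE_AF):
    """Rare in the population, present in every case, absent from every control."""
    af = record.get("pop_af")  # None means: not found in the database
    if af is not None and af >= max_af:
        return False
    genotypes = record["genotypes"]
    # A missing sample is treated like an uncalled genotype ('./.').
    in_all_cases = all(carries_alt(genotypes.get(s, "./.")) for s in cases)
    in_any_control = any(carries_alt(genotypes.get(s, "./.")) for s in controls)
    return in_all_cases and not in_any_control


if __name__ == "__main__":
    example = {
        "pop_af": 0.00005,
        "genotypes": {"A1": "0/1", "A2": "1/1", "B1": "0/0"},
    }
    print(is_candidate(example))
```

Note that an uncalled control genotype ('./.') counts as "not carrying" here, so low coverage in B1 can let a variant through; checking depth in the control is worth doing before trusting a hit.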
