Question: Stats on WGS variant calls
abbysue wrote (20 months ago):

What's the best way to collect some statistics on 4 million variants identified with the GATK pipeline (and annotated with Funcotator)? I'm having trouble parsing the 'Funcotator' annotation, since it packs several pieces of info that I want into one column, separated by pipe characters. I want to know the number of intronic vs. exonic variants, and the number of missense and nonsense variants.

Here's what one row looks like:

chrY    56844194        G       GAT     2390.06 PASS    2       1.00    2       NA      NA      57      3.0103  0.000  [Unknown|hg38|chrY|56844194|56844195|IGR||INS|-|-|AT|g.chrY:56844194_56844195insAT|no_transcript|||||||0.4675|GTATTGTGAGATCTCTGCAC|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||hg38|OREG1947633|Type_%3D_TRANSCRIPTION_%20_FACTOR_%20_BINDING_%20_SITE_%7C_Gene_Symbol_%3D_CTBP2P1_%7C_Gene_ID_%3D_ENSG00000235857_%7C_Gene_Source_%3D_ENSEMBL_%7C_Regulatory_Element_Symbol_%3D_ZNF263_%7C_Regulatory_Element_ID_%3D_ENST00000219069_%7C_Regulatory_Element_Source_%3D_ENSEMBL_%7C_PMID_%3D_18971253_%7C_Dataset_%3D_PAZAR||||||||||||||||||||||||||false|false||false|false||false|false|false||false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|||false|false||false||false||false|false|false||false|||false|||] 21.00    59.11   NA      true    NA      36.53   199154.00       NA      0.808

Here's a link that explains Funcotator annotations

I'm new to using the terminal, but I imagine this can be done using grep -w to look for a specific string (intron, exon, ...).

EDIT - I solved this with grep -w string file.table > file.txt

reza.jabal wrote (20 months ago):

Welcome to the world of bash scripting! Let's assume your Funcotator results are in Funcotator.txt:

For missense count: grep -w 'MISSENSE' Funcotator.txt | wc -l

For nonsense count: grep -w 'NONSENSE' Funcotator.txt | wc -l

For intronic count: grep -w 'INTRON' Funcotator.txt | wc -l

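For exonic count: echo $(( $(grep -vc '^#' Funcotator.txt) - $(grep -cw 'no_transcript' Funcotator.txt) ))

Note that this subtracts only the variants with no transcript, so the result still includes intronic (and UTR) variants; subtract the intronic count as well if you need exonic variants only.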
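
Since the classification sits at a fixed position inside the pipe-delimited annotation, an alternative to running one grep per term is to tally that sub-field in a single pass. Below is a minimal sketch, not a definitive recipe. It assumes the annotation is the bracketed column shown in the question, that the variant classification is its 6th pipe-separated sub-field (IGR in the example row), and that no earlier column contains a "[". The name count_classes is a hypothetical helper, and only the first annotation on each row is counted.

```shell
# count_classes FILE
# Hypothetical helper: tally the 6th pipe-separated sub-field of the
# bracketed Funcotator annotation (assumed to be the variant classification).
count_classes() {
    awk '
        /^#/ { next }                          # skip comment/header lines
        {
            start = index($0, "[")             # first "[" opens the annotation
            if (start == 0) next               # no annotation on this row
            n = split(substr($0, start + 1), parts, "|")
            if (n >= 6 && parts[6] != "") counts[parts[6]]++
        }
        END {
            # one line per classification: name, tab, count
            for (c in counts) printf "%s\t%d\n", c, counts[c]
        }
    ' "$1" | sort
}
```

Call it as count_classes Funcotator.txt; each output line is a classification followed by a tab and its count, so intronic, missense, and nonsense totals come out of the same run.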
