Question: Stats on WGS variant calls
10 months ago by
abbysue10
abbysue10 wrote:

What's the best way to collect some statistics on 4 million variants identified with the GATK pipeline (and annotated with Funcotator)? I'm having trouble parsing the 'Funcotator' annotation because it packs several pieces of information I want into one field, separated by pipe characters. I want to know the number of intronic vs. exonic variants, and the number of missense and nonsense variants.

Sorry, here's what one row looks like:

chrY    56844194        G       GAT     2390.06 PASS    2       1.00    2       NA      NA      57      3.0103  0.000  [Unknown|hg38|chrY|56844194|56844195|IGR||INS|-|-|AT|g.chrY:56844194_56844195insAT|no_transcript|||||||0.4675|GTATTGTGAGATCTCTGCAC|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||hg38|OREG1947633|Type_%3D_TRANSCRIPTION_%20_FACTOR_%20_BINDING_%20_SITE_%7C_Gene_Symbol_%3D_CTBP2P1_%7C_Gene_ID_%3D_ENSG00000235857_%7C_Gene_Source_%3D_ENSEMBL_%7C_Regulatory_Element_Symbol_%3D_ZNF263_%7C_Regulatory_Element_ID_%3D_ENST00000219069_%7C_Regulatory_Element_Source_%3D_ENSEMBL_%7C_PMID_%3D_18971253_%7C_Dataset_%3D_PAZAR||||||||||||||||||||||||||false|false||false|false||false|false|false||false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|||false|false||false||false||false|false|false||false|||false|||] 21.00    59.11   NA      true    NA      36.53   199154.00       NA      0.808

Here's a link that explains Funcotator annotations

I'm new to using the terminal, but I imagine this can be done using grep -w to look for a specific string (intron, exon, ...).

EDIT - I solved this with grep -w string file.table > file.txt

variants wgs • 269 views
modified 10 months ago by reza.jabal370 • written 10 months ago by abbysue10
10 months ago by
reza.jabal370
New York, USA
reza.jabal370 wrote:

Welcome to the world of bash scripting! Let's say your Funcotator results are in Funcotator.txt:

For missense count: grep -w 'MISSENSE' Funcotator.txt | wc -l

For nonsense count: grep -w 'NONSENSE' Funcotator.txt | wc -l

For intronic count: grep -w 'INTRON' Funcotator.txt | wc -l

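For exonic count: echo $(( $(grep -vc '^#' Funcotator.txt) - $(grep -wc 'no_transcript' Funcotator.txt) ))

Note that this is the number of non-comment lines minus the lines with no transcript, so it is only a rough upper bound: it still includes intronic and other transcribed but non-coding variants.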
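If you'd rather not run one grep per category, here is a rough sketch that tallies every classification in a single pass. It assumes the layout of the example row in the question: the columns before the bracketed Funcotator annotation contain no pipe characters, so the classification (IGR in that row) is the 6th pipe-separated field of the line. It also assumes one annotation block per row; if a row carries several, only the first is counted. The function name count_classes is made up for illustration.

```shell
# Hypothetical helper: print "count<TAB>classification" for a Funcotator table.
# Assumption: the classification is the 6th pipe-separated field of each row,
# as in the example row from the question.
count_classes() {
  grep -v '^#' "$1" |
    awk -F'|' '
      NF >= 6 { counts[$6]++ }    # skip rows without an annotation (e.g. a header line)
      END { for (c in counts) print counts[c] "\t" c }
    ' |
    sort -k1,1nr                  # most frequent classification first
}
```

Run it as count_classes Funcotator.txt. Since every classification present in the file is listed, it is also a quick way to check which labels your Funcotator version actually emits before relying on a specific grep pattern.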

written 10 months ago by reza.jabal370