Stats on WGS variant calls
3.7 years ago
abbysue ▴ 10

What's the best way to collect some statistics on 4 million variants identified with the GATK pipeline (and annotated with Funcotator)? I'm having trouble parsing the 'Funcotator' annotation because it packs several pieces of information that I want into one field, separated by pipe characters. I want to know the number of intronic vs. exonic variants, and the number of missense and nonsense variants.

Sorry, here's what one row looks like:

chrY    56844194        G       GAT     2390.06 PASS    2       1.00    2       NA      NA      57      3.0103  0.000  [Unknown|hg38|chrY|56844194|56844195|IGR||INS|-|-|AT|g.chrY:56844194_56844195insAT|no_transcript|||||||0.4675|GTATTGTGAGATCTCTGCAC|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||hg38|OREG1947633|Type_%3D_TRANSCRIPTION_%20_FACTOR_%20_BINDING_%20_SITE_%7C_Gene_Symbol_%3D_CTBP2P1_%7C_Gene_ID_%3D_ENSG00000235857_%7C_Gene_Source_%3D_ENSEMBL_%7C_Regulatory_Element_Symbol_%3D_ZNF263_%7C_Regulatory_Element_ID_%3D_ENST00000219069_%7C_Regulatory_Element_Source_%3D_ENSEMBL_%7C_PMID_%3D_18971253_%7C_Dataset_%3D_PAZAR||||||||||||||||||||||||||false|false||false|false||false|false|false||false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|||false|false||false||false||false|false|false||false|||false|||] 21.00    59.11   NA      true    NA      36.53   199154.00       NA      0.808

Here's a link that explains Funcotator annotations

I'm new to using the terminal, but I imagine this can be done using grep -w to look for a specific string (intron, exon, ...).

EDIT - I solved this with grep -w string file.table > file.txt

WGS variants • 737 views
3.7 years ago
reza.jabal ▴ 570

Welcome to the world of bash scripting! Let's say your Funcotator results are in Funcotator.txt:

For missense count: grep -w 'MISSENSE' Funcotator.txt | wc -l

For nonsense count: grep -w 'NONSENSE' Funcotator.txt | wc -l

For intronic count: grep -w 'INTRON' Funcotator.txt | wc -l

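For exonic count: echo $(( $(grep -vc '^#' Funcotator.txt) - $(grep -cw 'no_transcript' Funcotator.txt) ))

Note that this subtracts the no_transcript rows from all non-header rows, so it counts every variant that falls on a transcript, introns included. Subtract the intronic count as well for a closer estimate of exonic variants.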
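An alternative to running one grep per class is to tally the classification field directly. This is only a minimal sketch, assuming the annotation is the bracketed, pipe-separated block shown in the question and that its 6th field holds the variant classification (IGR in the sample row). count_classes is a hypothetical helper name, and a row with more than one bracketed block is counted once per block:

```shell
# Hypothetical helper: tally the 6th pipe-separated field of each bracketed
# Funcotator annotation. Assumption: field 6 is the variant classification,
# as in the sample row from the question (where it is IGR).
count_classes() {
  grep -v '^#' "$1" |            # drop header lines that start with '#'
    grep -o '\[[^]]*\]' |        # keep only the bracketed annotation blocks
    awk -F'|' '{ counts[$6]++ }  # count each value seen in field 6
         END { for (c in counts) print counts[c], c }' |
    sort -k1,1nr -k2,2           # largest count first, then by class name
}
```

Running count_classes Funcotator.txt should print one line per class with its count in front, so the MISSENSE, NONSENSE and INTRON totals can be read off a single pass over the file.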

