Question: Stats on WGS variant calls
10 days ago, abbysue wrote:

What's the best way to collect statistics on about 4 million variants identified with the GATK pipeline and annotated with Funcotator? I'm having trouble parsing the 'Funcotator' annotation, since it packs several pieces of information that I want into one field, separated by pipe characters. I want to know the number of intronic vs. exonic variants, and the number of missense and nonsense variants.

Sorry, here's what one row looks like:

chrY    56844194        G       GAT     2390.06 PASS    2       1.00    2       NA      NA      57      3.0103  0.000  [Unknown|hg38|chrY|56844194|56844195|IGR||INS|-|-|AT|g.chrY:56844194_56844195insAT|no_transcript|||||||0.4675|GTATTGTGAGATCTCTGCAC|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||hg38|OREG1947633|Type_%3D_TRANSCRIPTION_%20_FACTOR_%20_BINDING_%20_SITE_%7C_Gene_Symbol_%3D_CTBP2P1_%7C_Gene_ID_%3D_ENSG00000235857_%7C_Gene_Source_%3D_ENSEMBL_%7C_Regulatory_Element_Symbol_%3D_ZNF263_%7C_Regulatory_Element_ID_%3D_ENST00000219069_%7C_Regulatory_Element_Source_%3D_ENSEMBL_%7C_PMID_%3D_18971253_%7C_Dataset_%3D_PAZAR||||||||||||||||||||||||||false|false||false|false||false|false|false||false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|false|||false|false||false||false||false|false|false||false|||false|||] 21.00    59.11   NA      true    NA      36.53   199154.00       NA      0.808

Here's a link that explains Funcotator annotations

I'm new to using the terminal, but I imagine this can be done using grep -w to look for a specific string (intron, exon, ...).

EDIT - I solved this with grep -w string file.table > file.txt
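
A quick way to see what `grep -w` does on pipe-delimited text is to try it on a few made-up rows. The three rows below are toy stand-ins, not real Funcotator output. Since `|` is not a word character, `-w` matches `INTRON` between pipes but not inside a longer token such as `NOT_INTRON`; adding `-c` prints the number of matching lines instead of the lines themselves.

```shell
# Toy rows standing in for Funcotator annotations (hypothetical data).
rows='[GENE1|hg38|chr1|100|100|INTRON||SNP]
[GENE2|hg38|chr1|200|200|MISSENSE||SNP]
[GENE3|hg38|chr1|300|300|NOT_INTRON||SNP]'

# -w: match whole words only; -c: print the count of matching lines.
# Only the first row counts: "_" is a word character, so NOT_INTRON is skipped.
intron_count=$(printf '%s\n' "$rows" | grep -wc 'INTRON')
echo "$intron_count"
```

Note that this counts lines, not occurrences, so a row that mentions the same term twice is still counted once.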

Tags: variants, wgs
10 days ago, reza.jabal (New York, USA) wrote:

Welcome to the world of bash scripting! Let's say your Funcotator results are in Funcotator.txt:

For missense count: grep -w 'MISSENSE' Funcotator.txt | wc -l

For nonsense count: grep -w 'NONSENSE' Funcotator.txt | wc -l

For intronic count: grep -w 'INTRON' Funcotator.txt | wc -l

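For exonic count: echo $(( $(grep -vc '^#' Funcotator.txt) - $(grep -wc 'no_transcript' Funcotator.txt) ))

Note that this last number is every variant that was assigned to a transcript, so it still includes intronic variants; subtract the intronic count as well if you want a stricter estimate.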
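
The separate greps can also be collapsed into one pass that tallies the classification field directly. This is only a sketch. It assumes, based on the sample row in the question, that the annotation is the whitespace-separated column starting with `[`, and that the variant classification is the sixth pipe-delimited item inside it (`IGR` in that row). Check both against your own header before relying on it. The function name `tally_classes` is made up for illustration.

```shell
# Tally Funcotator variant classifications in one pass.
# Assumptions (taken from the sample row, not verified against your file):
#   - the annotation column is the one that starts with '['
#   - the classification is the 6th pipe-delimited item inside it
tally_classes() {
  awk '
    /^#/ { next }                          # skip header lines
    {
      for (i = 1; i <= NF; i++) {
        if (substr($i, 1, 1) == "[") {     # locate the annotation column
          split($i, f, "|")                # break it on pipe characters
          count[f[6]]++                    # tally the classification
          break
        }
      }
    }
    END { for (c in count) print count[c] "\t" c }
  ' "$1" | sort -k1,1nr                    # most frequent class first
}

# Usage: tally_classes Funcotator.txt
```

Unlike `grep -w`, this looks at one specific field, so a term that happens to appear elsewhere in the row does not inflate the count.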
