Most frequently mutated genes in VCF
9.1 years ago
Ron ★ 1.2k

Hi all,

I am doing mutation analysis on RNA-seq data and I have a VCF file. I want to find the most frequently mutated genes (the genes with the highest number of mutations) in the VCF file. I used snpEff, as suggested in:

count number of variants per gene from .vcf annovar&snpsift

However, I don't think this is the output I need. Please see the output below:

chr start   end type    IDs Reads:ALLMUT_FILTERED_Annotated
1   1   11873   Intergenic;DDX11L1  1   1
1   1   11873   Intergenic;DDX11L1  1   1
1   1   249250621   Chromosome;1    638541  638541
1   6874    11873   Upstream;NR_046018.2;;DDX11L1;Non-coding_transcript 1   1
1   6874    11873   Upstream;NR_046018.2;;DDX11L1;Non-coding_transcript 1   1
1   9362    14361   Downstream;NR_024540.1;;WASH7P;Non-coding_transcript    7   7
1   9362    14361   Downstream;NR_024540.1;;WASH7P;Non-coding_transcript    7   7
1   11874   14409   Gene;DDX11L1;Non-coding_transcript  6   6
1   11874   14409   Transcript;NR_046018.2;;DDX11L1;Non-coding_transcript   6   6
1   12369   17368   Downstream;NR_106918.1.3;;MIR6859-1.3;Non-coding_transcript 170 170
1   12369   17368   Downstream;NR_106918.1.3;;MIR6859-1.3;Non-coding_transcript 170 170
1   12369   17368   Downstream;NR_107062.1.2;;MIR6859-2.2;Non-coding_transcript 170 170
1   12369   17368   Downstream;NR_107062.1.2;;MIR6859-2.2;Non-coding_transcript 170 170
1   12722   13220   Intron;intron_2_RETAINED-RETAINED;NR_046018.2;;DDX11L1;Non-coding_transcript    1   1
1   13221   14409   Exon;exon_3_3_RETAINED;NR_046018.2;;DDX11L1;Non-coding_transcript   5   5
1   14362   14829   Exon;exon_11_11_RETAINED;NR_024540.1;;WASH7P;Non-coding_transcript  74  74
1   14362   29370   Transcript;NR_024540.1;;WASH7P;Non-coding_transcript    361 361
1   14362   29370   Gene;WASH7P;Non-coding_transcript   361 361
1   14410   19409   Downstream;NR_046018.2;;DDX11L1;Non-coding_transcript   233 233
1   14410   19409   Downstream;NR_046018.2;;DDX11L1;Non-coding_transcript   233 233

Is there a way to do that with an existing tool, or should I just count the mutations per gene by writing a script?

Thanks,

Ron

RNA-Seq mutation VCF • 2.0k views
9.1 years ago
Ram 45k

Check out the --freq option in vcftools.
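
If a small script turns out to be easier, here is a minimal sketch of counting variants per gene directly from the annotated VCF. It assumes the file was annotated by a recent snpEff that writes an ANN entry in the INFO column, with the gene name as the fourth pipe-separated subfield (older versions write EFF instead, which this does not handle). The function name and the example records are made up for illustration, and the annotation strings are shortened.

```python
import collections


def count_variants_per_gene(vcf_lines):
    """Count variant records per gene, using the snpEff ANN INFO entry.

    Assumption: ANN subfields are pipe-separated and the gene name is
    the fourth one (index 3). Each record is counted at most once per
    gene, even if it is annotated against several transcripts.
    """
    counts = collections.Counter()
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        genes = set()
        for entry in fields[7].split(";"):
            if not entry.startswith("ANN="):
                continue
            for annotation in entry[len("ANN="):].split(","):
                parts = annotation.split("|")
                if len(parts) > 3 and parts[3]:
                    genes.add(parts[3])
        counts.update(genes)
    return counts


# Hypothetical, shortened records for illustration only.
example_vcf = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t13300\t.\tA\tG\t50\tPASS\t"
    "ANN=G|non_coding_transcript_exon_variant|MODIFIER|DDX11L1|DDX11L1,"
    "G|downstream_gene_variant|MODIFIER|WASH7P|WASH7P\n",
    "1\t14500\t.\tC\tT\t60\tPASS\t"
    "ANN=T|non_coding_transcript_exon_variant|MODIFIER|WASH7P|WASH7P\n",
    "1\t14700\t.\tG\tC\t40\tPASS\t"
    "ANN=C|intron_variant|MODIFIER|WASH7P|WASH7P,"
    "C|non_coding_transcript_exon_variant|MODIFIER|WASH7P|WASH7P\n",
]

counts = count_variants_per_gene(example_vcf)

# Genes sorted from most to least frequently mutated.
for gene, n in counts.most_common():
    print(gene, n)
```

To run this on a real file, pass an open file handle instead of the example list. Note that a variant annotated as upstream or downstream of a neighbouring gene is counted for that gene as well, so you may want to filter on the annotation type (the second subfield) first.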
