Question: Convert VCF to MAF from RNASeq mutation analysis
2.7 years ago by
Ron970
United States
Ron970 wrote:

Hi all,

I am using this software for RNAseq mutation analysis: https://github.com/davidliwei/rnaseqmut

My final output file is a VCF file and I want to convert it to a MAF file.

I have come across the posts below; however, the VCF output from this RNA-seq tool is a bit different from a standard VCF.

Converting Vcf File To Maf

Vcf To Maf (Mutation Annotation Format) Conversion ?

chr1    877831  rs6672356;COSM4144217   T   C   1.0 Sample_6385_11.DP4=0,0,1,2;Sample_G1.DP4=0,0,1,0;Sample_G2.DP4=0,0,0,0;Sample_G3.DP4=0,0,0,0;Sample_G4.DP4=0,0,0,0;Sample_G5.DP4=0,0,0,1;Sample_NY_D_TA23_PDX_T1.DP4=0,0,0,0;Sample_NY_D_TA23_PR.DP4=0,0,0,0;Sample_Pa_PDX.DP4=0,0,0,0;Sample_PA_primary.DP4=0,0,1,0;Sample_TE_PDX.DP4=0,0,0,0;Sample_TE_primary.DP4=0,0,0,0;   ASP=true;GNO=true;HD=true;INT=true;KGPROD=true;KGPhase1=true;NSM=true;OTHERKG=true;REF=true;RS=6672356;RSPOS=877831;SAO=0;SLO=true;SSR=0;VC=SNV;VP=0x050100080a05000516000100;WGT=1;dbSNPBuildID=116;AA=p.W343R;CDS=c.1027T>C;CNT=23;GENE=SAMD11;SNP=true;STRAND=+;EFF=missense_variant(MODERATE|MISSENSE|Tgg/Cgg|p.Trp343Arg/c.1027T>C|681|SAMD11|protein_coding|CODING|ENST00000342066|10|1),missense_variant(MODERATE|MISSENSE|Tgg/Cgg|p.Trp250Arg/c.748T>C|588|SAMD11|protein_coding|CODING|ENST00000341065|8|1|WARNING_TRANSCRIPT_NO_START_CODON),missense_variant(MODERATE|MISSENSE|Tgg/Cgg|p.Trp169Arg/c.505T>C|540|SAMD11|protein_coding|CODING|ENST00000455979|4|1|WARNING_TRANSCRIPT_NO_START_CODON),downstream_gene_variant(MODIFIER||1753||749|NOC2L|protein_coding|CODING|ENST00000327044||1),downstream_gene_variant(MODIFIER||1753|||NOC2L|retained_intron|CODING|ENST00000483767||1),downstream_gene_variant(MODIFIER||1754|||NOC2L|retained_intron|CODING|ENST00000477976||1),downstream_gene_variant(MODIFIER||3160||178|SAMD11|protein_coding|CODING|ENST00000420190||1),downstream_gene_variant(MODIFIER||2868|||NOC2L|processed_transcript|CODING|ENST00000496938||1),downstream_gene_variant(MODIFIER||278|||SAMD11|processed_transcript|CODING|ENST00000478729||1),non_coding_exon_variant(MODIFIER|||n.286T>C||SAMD11|retained_intron|CODING|ENST00000464948|1|1),non_coding_exon_variant(MODIFIER|||n.389T>C||SAMD11|retained_intron|CODING|ENST0000474461|3|1),non_coding_exon_variant(MODIFIER|||n.191T>C||SAMD11|retained_intron|CODING|ENST00000466827|2|1)

chr1    878314  rs142558220;COSM426784  G   C   1.0 Sample_6385_11.DP4=5,3,5,4;Sample_G1.DP4=0,0,0,0;Sample_G2.DP4=0,0,0,0;Sample_G3.DP4=1,0,0,0;Sample_G4.DP4=1,0,0,0;Sample_G5.DP4=2,1,0,0;Sample_NY_D_TA23_PDX_T1.DP4=2,3,0,0;Sample_NY_D_TA23_PR.DP4=0,1,0,0;Sample_Pa_PDX.DP4=2,1,0,0;Sample_PA_primary.DP4=1,0,0,0;Sample_TE_PDX.DP4=0,0,0,0;Sample_TE_primary.DP4=0,0,0,0;ASP=true;INT=true;KGPROD=true;KGPhase1=true;OTHERKG=true;REF=true;RS=142558220;RSPOS=878314;SAO=0;SSR=0;SYN=true;VC=SNV;VP=0x050000080305100016000100;WGT=1;dbSNPBuildID=134;AA=p.G480G;CDS=c.1440G>C;CNT=2;GENE=SAMD11;SNP=true;STRAND=+;EFF=synonymous_variant(LOW|SILENT|ggG/ggC|p.Gly480Gly/c.1440G>C|681|SAMD11|protein_coding|CODING|ENST00000342066|11|1),synonymous_variant(LOW|SILENT|ggG/ggC|p.Gly387Gly/c.1161G>C|588|SAMD11|protein_coding|CODING|ENST00000341065|9|1|WARNING_TRANSCRIPT_NO_START_CODON),synonymous_variant(LOW|SILENT|ggG/ggC|p.Gly306Gly/c.918G>C|540|SAMD11|protein_coding|CODING|ENST00000455979|5|1|WARNING_TRANSCRIPT_NO_START_CODON),downstream_gene_variant(MODIFIER||1270||749|NOC2L|protein_coding|CODING|ENST00000327044||1),downstream_gene_variant(MODIFIER||1270|||NOC2L|retained_intron|CODING|ENST00000483767||1),downstream_gene_variant(MODIFIER||1271|||NOC2L|retained_intron|CODING|ENST00000477976||1),downstream_gene_variant(MODIFIER||3643||178|SAMD11|protein_coding|CODING|ENST00000420190||1),downstream_gene_variant(MODIFIER||2385|||NOC2L|processed_transcript|CODING|ENST00000496938||1),downstream_gene_variant(MODIFIER||761|||SAMD11|processed_transcript|CODING|ENST00000478729||1),downstream_gene_variant(MODIFIER||132|||SAMD11|retained_intron|CODING|ENST00000466827||1),downstream_gene_variant(MODIFIER||42|||SAMD11|retained_intron|CODING|ENST00000464948||1),non_coding_exon_variant(MODIFIER|||n.802G>C||SAMD11|retained_intron|CODING|ENST00000474461|4|1)

For example, the annotation column gives four values (DP4) for each sample, namely the read counts for the reference allele and the alternate allele, e.g. Sample_6385_11.DP4=0,0,1,2
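To make the per-sample layout concrete, here is a minimal Python sketch that pulls the read counts out of that column. It assumes the four DP4 numbers follow the usual samtools convention (reference forward, reference reverse, alternate forward, alternate reverse); that ordering is an assumption about this tool's output, and `parse_dp4` is a hypothetical helper name, not part of rnaseqmut.

```python
import re

# Assumed layout: "<sample>.DP4=<ref_fwd>,<ref_rev>,<alt_fwd>,<alt_rev>;" repeated per sample.
DP4_PATTERN = re.compile(r"([^;=\s]+)\.DP4=(\d+),(\d+),(\d+),(\d+)")


def parse_dp4(field):
    """Return {sample: (ref_reads, alt_reads)}, summing both strands."""
    counts = {}
    for sample, ref_fwd, ref_rev, alt_fwd, alt_rev in DP4_PATTERN.findall(field):
        counts[sample] = (int(ref_fwd) + int(ref_rev), int(alt_fwd) + int(alt_rev))
    return counts


example = "Sample_6385_11.DP4=0,0,1,2;Sample_G1.DP4=0,0,1,0;Sample_G2.DP4=0,0,0,0;"
print(parse_dp4(example))
# {'Sample_6385_11': (0, 3), 'Sample_G1': (0, 1), 'Sample_G2': (0, 0)}
```

Summing the strands gives one reference count and one alternate count per sample, which is the shape the MAF read-count columns expect.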

Any suggestions on how to get this data to a format like this?

 Chromosome Start_position  End_position    Strand  Variant_Classification  Variant_Type    Reference_Allele    Tumor_Seq_Allele1   Tumor_Seq_Allele2   dbSNP_RS    Tumor_Sample_Barcode    Matched_Norm_Sample_Barcode n_alt_count n_ref_count t_alt_count t_ref_count amino_acid_change_WU
X   47044502    47044502    +   Nonsense_Mutation   SNP G   G   T   novel   UTUC123_1       0   96  28  48  p.E667*
2   192701329   192701329   +   Missense_Mutation   SNP C   C   T   novel   UTUC123_1       0   81  18  49  p.V200M
5   112824048   112824048   +   In_Frame_Ins    INS -   #NAME?  #NAME?  novel   UTUC123_1       0   17  7   14  p.S22_nofs
11  62286810    62286810    +   Missense_Mutation   SNP T   T   C   novel   UTUC123_1       0   111 25  72  p.K5027E

Thanks, Ron

rna-seq mutation next-gen • 873 views
2.7 years ago by
Karachi, PK
Abdul Rafay Khan1.0k wrote:

Try this Python script: https://github.com/cbare/vcf2maf
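If a ready-made converter does not accept the per-sample DP4 layout shown in the question, a small custom script is one fallback. The sketch below is only an illustration under several assumptions: fields are whitespace-separated in the order shown in the question, DP4 is ordered ref-forward, ref-reverse, alt-forward, alt-reverse, only the first snpEff effect is used, and every sample with at least one alternate read gets its own row. `EFFECT_TO_MAF`, `info_value`, and `line_to_maf_rows` are hypothetical names, and the effect mapping is deliberately incomplete.

```python
import re

MAF_COLUMNS = [
    "Chromosome", "Start_position", "End_position", "Strand",
    "Variant_Classification", "Variant_Type", "Reference_Allele",
    "Tumor_Seq_Allele1", "Tumor_Seq_Allele2", "dbSNP_RS",
    "Tumor_Sample_Barcode", "Matched_Norm_Sample_Barcode",
    "n_alt_count", "n_ref_count", "t_alt_count", "t_ref_count",
    "amino_acid_change_WU",
]

# Illustrative, incomplete mapping from snpEff effect terms to MAF classes.
EFFECT_TO_MAF = {
    "missense_variant": "Missense_Mutation",
    "synonymous_variant": "Silent",
    "stop_gained": "Nonsense_Mutation",
}

DP4_PATTERN = re.compile(r"([^;=\s]+)\.DP4=(\d+),(\d+),(\d+),(\d+)")


def info_value(info, key):
    """Return the value of key=value in a ';'-separated string, or ''."""
    match = re.search(r"(?:^|;)" + re.escape(key) + r"=([^;(]+)", info)
    return match.group(1) if match else ""


def line_to_maf_rows(line):
    """Return one MAF row (dict) per sample that has alternate reads."""
    cols = line.split()
    chrom, pos, ids, ref, alt = cols[:5]
    if len(ref) != 1 or len(alt) != 1:
        return []  # indels need MAF-specific coordinates; not handled here
    rest = ";".join(cols[6:])  # per-sample DP4 values plus annotations
    rs_ids = [i for i in ids.split(";") if i.startswith("rs")]
    effect = info_value(rest, "EFF")  # first listed effect only
    shared = dict.fromkeys(MAF_COLUMNS, "")  # normal-sample fields stay empty
    shared.update({
        "Chromosome": chrom[3:] if chrom.startswith("chr") else chrom,
        "Start_position": pos,
        "End_position": pos,
        "Strand": info_value(rest, "STRAND") or "+",
        # Unmapped effects are passed through unchanged.
        "Variant_Classification": EFFECT_TO_MAF.get(effect, effect),
        "Variant_Type": "SNP",
        "Reference_Allele": ref,
        "Tumor_Seq_Allele1": ref,
        "Tumor_Seq_Allele2": alt,
        "dbSNP_RS": rs_ids[0] if rs_ids else "novel",
        "amino_acid_change_WU": info_value(rest, "AA"),
    })
    rows = []
    for sample, ref_fwd, ref_rev, alt_fwd, alt_rev in DP4_PATTERN.findall(rest):
        alt_count = int(alt_fwd) + int(alt_rev)
        if alt_count == 0:
            continue  # this sample shows no evidence for the variant
        row = dict(shared)
        row.update({
            "Tumor_Sample_Barcode": sample,
            "t_alt_count": str(alt_count),
            "t_ref_count": str(int(ref_fwd) + int(ref_rev)),
        })
        rows.append(row)
    return rows


# Shortened version of the first record in the question.
record = (
    "chr1\t877831\trs6672356;COSM4144217\tT\tC\t1.0\t"
    "Sample_6385_11.DP4=0,0,1,2;Sample_G2.DP4=0,0,0,0;\t"
    "RS=6672356;AA=p.W343R;GENE=SAMD11;STRAND=+;"
    "EFF=missense_variant(MODERATE|MISSENSE|Tgg/Cgg|p.Trp343Arg/c.1027T>C)"
)
print("\t".join(MAF_COLUMNS))
for row in line_to_maf_rows(record):
    print("\t".join(row[col] for col in MAF_COLUMNS))
```

Which samples count as tumor or matched normal is not encoded in the file, so the normal columns are left blank here and would need to be filled from your own sample sheet.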
