I've just performed exome sequencing and obtained the VCF file. To continue with my experiment, I need to extract the wild-type (wt) and mutant (mut) flanking regions for the variants in my dataset, because I need to synthesize them for immunotherapy research. Specifically, my VCF file has a column like this:
AAChange.refGene
A2M:NM_000014:exon30:c.C3797A:p.A1266E
ABCC12:NM_033226:exon12:c.G1738T:p.G580C
ABL1:NM_005157:exon11:c.C2972T:p.A991V,ABL1:NM_007313:exon11:c.C3029T:p.A1010V
And the desired output is like this:
Wt Epitope                 Mut Epitope
TVVALHALSKYGAATFTRTGKAAQV  TVVALHALSKYGEATFTRTGKAAQV
DHQRYQHTVRVCGLQKDLSNLPYGD  DHQRYQHTVRVCCLQKDLSNLPYGD
APVPSTLPSASSALAGDQPSSTAFI  APVPSTLPSASSVLAGDQPSSTAFI
If a variant has more than one transcript, I only need the first one. I know how to obtain the flanking regions for nucleotides, but I have not found anything like a refGene.txt for amino acids. I used hg19 as the reference genome.
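To make the request concrete, here is a minimal Python sketch of the transformation I am after. It assumes the protein sequence of the transcript is already available as a string, which is exactly the part I am missing. The function names `parse_first_annotation` and `epitope_pair` are hypothetical, the 12-residue flank is inferred from the 25-mers in the desired output, and the toy "protein" is just the first Wt epitope above, so the mutated residue sits at position 13 rather than 1266.

```python
import re

# Matches a simple single-residue substitution such as "p.A1266E".
AA_CHANGE = re.compile(r"p\.([A-Z])(\d+)([A-Z])$")


def parse_first_annotation(aachange):
    """Parse the first transcript of an AAChange.refGene value.

    Returns (gene, transcript, ref_aa, position, alt_aa).
    """
    first = aachange.split(",")[0]  # keep only the first transcript
    gene, transcript, _exon, _cdna, protein = first.split(":")
    match = AA_CHANGE.match(protein)
    if match is None:
        raise ValueError(f"not a simple substitution: {protein}")
    ref_aa, pos, alt_aa = match.groups()
    return gene, transcript, ref_aa, int(pos), alt_aa


def epitope_pair(protein_seq, ref_aa, pos, alt_aa, flank=12):
    """Return (wt, mut) peptides with `flank` residues on each side.

    `pos` is 1-based, as in the p. notation. Peptides are shorter than
    2 * flank + 1 when the change is close to either end of the protein.
    """
    idx = pos - 1
    if not 0 <= idx < len(protein_seq):
        raise ValueError(f"position {pos} is outside the protein")
    if protein_seq[idx] != ref_aa:
        raise ValueError(
            f"expected {ref_aa} at position {pos}, found {protein_seq[idx]}"
        )
    start = max(0, idx - flank)
    end = idx + flank + 1
    wt = protein_seq[start:end]
    mut = protein_seq[start:idx] + alt_aa + protein_seq[idx + 1:end]
    return wt, mut


# Toy input: the first Wt epitope stands in for a full protein sequence.
toy_protein = "TVVALHALSKYGAATFTRTGKAAQV"
wt, mut = epitope_pair(toy_protein, "A", 13, "E")
print(wt, mut)
```

The check that the reference residue matches the sequence is there because the annotation and the protein sequence have to come from the same transcript version for the coordinates to line up.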
Any help is welcome!