I have a reference genome in FASTA format, a gene model generated by Cufflinks in GTF format, and genomic variants in VCF format. I want to extract the FASTA sequences of my transcripts from the reference, but modified according to the VCF. Each approach I have tried runs into a problem:
- If I first change the reference to a new consensus according to the VCF, the coordinates in the GTF no longer match, because indels shift every downstream position.
- If I first extract the transcripts from the reference with gffread or getfasta, I can no longer use the VCF to edit them, because its positions are in genomic coordinates, not transcript coordinates.
- I tried to lift the VCF over from genome to transcriptome coordinates. I used the UCSC Kent tools (genePredToFakePsl, then pslToChain) to make the chain file, then GATK to perform the liftover. That step fails, most probably because FilterLiftedVariants cannot handle transcripts on the negative strand.
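
To make the goal concrete, here is a minimal Python sketch of the result I am after: edit each exon while still in genomic coordinates, then splice and reverse-complement, so no liftover is needed. It is only an illustration under strong assumptions: the chromosome is already loaded as a plain string, variants are `(pos, ref, alt)` tuples with 1-based positions and a single ALT allele, variants do not overlap each other, and any variant whose REF allele crosses an exon boundary is simply skipped. The function names (`revcomp`, `patched_exon`, `patched_transcript`) are hypothetical, not part of any existing tool.

```python
def revcomp(seq):
    """Reverse-complement a DNA string (A/C/G/T/N, either case)."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def patched_exon(chrom_seq, start, end, variants):
    """Return the exon [start, end] (1-based, inclusive) with variants applied.

    variants: iterable of (pos, ref, alt), pos 1-based on the chromosome.
    Assumption: variants are non-overlapping and have one ALT allele.
    """
    exon = chrom_seq[start - 1:end]
    # Keep only variants whose REF allele lies entirely inside this exon.
    inside = [v for v in variants
              if v[0] >= start and v[0] + len(v[1]) - 1 <= end]
    # Apply from right to left so an indel never shifts the offsets
    # of the variants that are still waiting to be applied.
    for pos, ref, alt in sorted(inside, reverse=True):
        off = pos - start
        if exon[off:off + len(ref)].upper() != ref.upper():
            raise ValueError(f"REF mismatch at position {pos}")
        exon = exon[:off] + alt + exon[off + len(ref):]
    return exon


def patched_transcript(chrom_seq, exons, strand, variants):
    """Splice the patched exons in genomic order; flip for the minus strand."""
    seq = "".join(patched_exon(chrom_seq, s, e, variants)
                  for s, e in sorted(exons))
    return revcomp(seq) if strand == "-" else seq


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    chrom = "AAACCCGGGTTTACGT"
    exons = [(2, 5), (10, 14)]            # as in the GTF: 1-based, inclusive
    variants = [(4, "C", "T"),            # SNP
                (11, "TT", "T"),          # deletion
                (13, "A", "AGG")]         # insertion
    print(patched_transcript(chrom, exons, "+", variants))
    print(patched_transcript(chrom, exons, "-", variants))
```

Because the edits happen before splicing and strand flipping, the negative strand needs no special handling beyond the final reverse complement. A real implementation would still have to deal with multi-allelic sites, genotypes, and variants that span splice junctions.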