3 months ago
bioinfo ▴ 60
I was asked to make a new reference in which I remove one of the genes from an existing reference and add some isoforms. I have found several entries for this gene in the GTF file and I will remove them using a regex. Do I need to remove anything from the FASTA file too?
Thank you very much for replying. For the FASTA file, then, is it OK to just add the isoform sequences?
I think you need some background on the basics. A genome is what is encoded as DNA on the chromosomes, as plain sequence, so any isoform is already there. It is the GTF that tells the aligner and the quantification step at which positions we have features such as exons, intron gaps etc. I am pretty sure 10x has a guide on creating custom Cell Ranger references.
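Since the gene only has to come out of the GTF, here is a minimal sketch of that filtering step. It assumes a standard 9-column, tab-separated GTF whose attribute column contains `gene_id "..."`; the function name `drop_gene` and the sample records are made up for illustration. It compares the gene_id exactly, because a loose regex for `GENE1` would also hit `GENE10`.

```python
import re

# Matches the gene_id attribute in column 9 of a GTF record.
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')

def drop_gene(gtf_lines, gene_id):
    """Yield GTF lines, skipping every record whose gene_id equals gene_id."""
    for line in gtf_lines:
        # Keep header/comment lines untouched.
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 9:
            match = GENE_ID_RE.search(fields[8])
            # Exact comparison, so GENE1 does not also remove GENE10.
            if match and match.group(1) == gene_id:
                continue
        yield line

# Hypothetical records, only to show the behaviour.
sample = [
    '#!genome-build example\n',
    'chr1\tsrc\tgene\t100\t500\t.\t+\t.\tgene_id "GENE1"; gene_name "A";\n',
    'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "GENE1"; transcript_id "T1";\n',
    'chr1\tsrc\tgene\t900\t1500\t.\t-\t.\tgene_id "GENE10"; gene_name "B";\n',
]
kept = list(drop_gene(sample, "GENE1"))
```

On a real file you would pass the open file handle instead of `sample` and write the kept lines to a new GTF. The new isoforms would then go in as additional transcript and exon records in that GTF, while the FASTA stays as it is.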
Why would you add isoform sequences to a reference genome file? A reference genome contains the sequence for all possible isoforms in all cells.