Question

Ways to map a transcriptome against a single gene

2.8 years ago

arturo.marin ▴ 20

Hi,

I am posting this question because I could only find old information on the forum. I am working with a transcriptome in which a gene that is important for my work is not annotated, although the gene is in GenBank. So my question is: what is the best way to map reads against this gene?

I am currently testing HISAT2, but it is slow and it seems to be mapping some sequences incorrectly. The output files are also much larger than the ones produced when mapping against the whole genome. I have read on this forum that many mappers can give false positives when the mapping is not done against the entire genome, because they report any read that aligns above a certain score, even if the alignment is imperfect. Is there a mapper that can control this?
I suppose another alternative would be to locate this gene in the assembled genome file, modify the annotation file, and then do the mapping and the rest of the process with the modified files. Which program or approach would be better for locating the gene in the genome: minimap2, blastn, ...?

annotation RNAseq • 557 views

updated 2.8 years ago by dsull ★ 5.8k • written 2.8 years ago by arturo.marin ▴ 20

score 1 · Answer 1 · 2021-07-01

I would map against the entire genome, modified to include your gene of interest. If you only map against a single gene, you won't gain much information (e.g. you won't know how strongly that gene is expressed relative to other genes).
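If the gene is added to the genome FASTA as its own contig, downstream counting tools will usually also need an annotation entry for it. Below is a minimal sketch of building GTF lines for that case. It assumes the GenBank sequence is absent from the assembly and is treated as a single exon spanning the whole added contig; the function name, the `custom` source label and the `.t1` transcript suffix are all hypothetical. If the gene is already present in the assembly, appending a second copy would make reads multimap, so editing the annotation at the existing locus would be the safer route.

```python
def single_exon_gtf(contig, length, gene_id, strand="+", source="custom"):
    """Return GTF lines (gene, transcript, exon) covering a whole contig.

    Assumption: the added sequence is one exon from position 1 to `length`.
    """
    if length < 1:
        raise ValueError("length must be a positive integer")
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")

    transcript_id = f"{gene_id}.t1"  # hypothetical naming scheme
    gene_attrs = f'gene_id "{gene_id}";'
    tx_attrs = f'gene_id "{gene_id}"; transcript_id "{transcript_id}";'

    rows = [("gene", gene_attrs), ("transcript", tx_attrs), ("exon", tx_attrs)]
    # GTF columns: seqname, source, feature, start, end, score, strand, frame, attributes
    return [
        "\t".join([contig, source, feature, "1", str(length), ".", strand, ".", attrs])
        for feature, attrs in rows
    ]
```

The returned lines could be appended to a copy of the existing GTF, with `contig` matching the header of the record added to the genome FASTA.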

If you just want to check whether the transcript's sequence pops up a few times in the FASTQ file, you can simply grep for a portion of that sequence in your FASTQ file.
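A plain grep only finds the probe in the orientation you typed it, while reads can come from either strand. As a rough sketch of the same check that also looks for the reverse complement, something like the following could be used. The function names are made up for illustration, and it assumes standard four-line FASTQ records and exact (mismatch-free) matching.

```python
import gzip

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_reads_with_probe(lines, probe):
    """Count reads containing `probe` or its reverse complement.

    `lines` is any iterable of FASTQ lines; the sequence is assumed to be
    the second line of every four-line record.
    """
    probe = probe.upper()
    probe_rc = reverse_complement(probe)
    hits = 0
    for i, line in enumerate(lines):
        if i % 4 != 1:  # skip header, '+' and quality lines
            continue
        read = line.strip().upper()
        if probe in read or probe_rc in read:
            hits += 1
    return hits


def count_in_file(path, probe):
    """Same count, reading from a plain or gzipped FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return count_reads_with_probe(handle, probe)
```

A probe of around 20 to 30 bases taken from a region unique to the transcript is a reasonable starting point; this is only a presence check, not a substitute for alignment.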

If alignment is too slow, you can use pseudoalignment methods (like kallisto) and modify the reference transcripts to include your transcript(s) of interest.
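Adding a transcript to the reference amounts to appending one FASTA record before building the index. Here is a minimal sketch, assuming a plain-text (uncompressed) transcript FASTA; the function names and the example ID are hypothetical. It refuses to add a record whose ID already exists, since duplicate names tend to cause trouble at the indexing step.

```python
def fasta_ids(path):
    """Collect the record IDs (first word after '>') in a FASTA file."""
    ids = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                parts = line[1:].split()
                if parts:
                    ids.add(parts[0])
    return ids


def wrap(seq, width=60):
    """Break a sequence into fixed-width lines."""
    return "\n".join(seq[i:i + width] for i in range(0, len(seq), width))


def add_transcript(reference_fasta, out_fasta, transcript_id, sequence):
    """Write a copy of `reference_fasta` with one extra record appended."""
    if transcript_id in fasta_ids(reference_fasta):
        raise ValueError(f"ID already present in reference: {transcript_id}")

    with open(reference_fasta) as src, open(out_fasta, "w") as dst:
        last = ""
        for line in src:  # stream the original records unchanged
            dst.write(line)
            last = line
        if last and not last.endswith("\n"):
            dst.write("\n")  # make sure the new header starts on its own line
        dst.write(f">{transcript_id}\n{wrap(sequence.upper())}\n")
```

The combined file would then be what you pass to the indexing step, e.g. `kallisto index -i combined.idx combined.fa`, followed by `kallisto quant` as usual (the file names here are placeholders).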