How to filter nanopore transcriptome alignments to trust 3' ends?

3.9 years ago

josiegleeson1 • 0

I have direct RNA data mapped to the GENCODE transcriptome with minimap2. Finding the 'true' transcript of origin for a read is nontrivial, as there are many secondary alignments with alignment scores very close to the primary. After visualising the alignments, I can see that some are to transcripts which start further 3' than my alignment. However, because of how direct RNA sequencing works, the 3' ends of reads are the true end sites.

I want to discard alignments to transcripts that have a 3' start site more than 100 nt before my read's start site.

I've thought about simply extracting the TES from the GENCODE GTF, but those are genomic coordinates and I need to work with the transcriptome mapping. Another option I've been considering is to discard an alignment if the query end site is more than 100 nt from my read end site. But I am not sure how to do this. Any ideas? Thanks.

nanopore direct RNA minimap2 • 1.2k views

updated 2.6 years ago by c0tton • 0 • written 3.9 years ago by josiegleeson1 • 0

Did you end up solving this issue? I am facing it now... direct RNA sequencing is tough!

2.6 years ago by c0tton • 0

The problem looks simple, but I would need an example BAM with a few reads to test.
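
In the meantime, here is a rough, untested-on-real-data sketch of the idea in Python. It assumes the alignments are in minimap2's PAF format (the default output when `-a` is not given), where column 7 is the target (transcript) length and column 9 is the alignment end on the target. The gap between those two values is the number of transcript bases left unaligned 3' of the read, and alignments where that gap exceeds 100 nt are dropped. The function names, the `MAX_3P_GAP` constant, the "+ strand only" rule, and the two example records are all made up for illustration. Soft-clipped read bases and poly(A) tails are not considered.

```python
MAX_3P_GAP = 100  # nt; threshold taken from the question


def three_prime_gap(paf_line):
    """Return (fields, gap) for one PAF record.

    gap = transcript length - alignment end on the transcript,
    i.e. transcript bases lying 3' of where the alignment stops.
    """
    fields = paf_line.rstrip("\n").split("\t")
    target_len = int(fields[6])  # PAF column 7: target sequence length
    target_end = int(fields[8])  # PAF column 9: target end coordinate
    return fields, target_len - target_end


def filter_paf(lines, max_gap=MAX_3P_GAP):
    """Yield PAF lines whose alignment ends within max_gap nt of the transcript 3' end."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        fields, gap = three_prime_gap(line)
        # Assumption: direct RNA reads are sense-strand, so only '+' hits
        # against transcript sequences are kept (PAF column 5 is the strand).
        if fields[4] == "+" and gap <= max_gap:
            yield line


# Hypothetical records: the same read aligned to two transcripts.
example = [
    "read1\t900\t5\t895\t+\tTX_A\t1000\t100\t990\t850\t890\t10",
    "read1\t900\t5\t895\t+\tTX_B\t1500\t100\t990\t848\t890\t0",
]
kept = list(filter_paf(example))
for record in kept:
    print(record.split("\t")[5])
```

Here the alignment to TX_A stops 10 nt short of the transcript end and is kept, while the one to TX_B stops 510 nt short and is discarded. For a BAM, the same two numbers should be obtainable per alignment from the reference length in the header and the alignment's reference end (for example via pysam's `get_reference_length` and `reference_end`), but I have not tried that on your data.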