Matching read IDs to mapped transcripts/genes
2.3 years ago
joshcylee • 0

Hi all,

Recently I have been analysing direct RNA-seq data, and I started looking at polyA tail length. Using nanopolish (with fast5 files and previously aligned bam files) I have lists of reads (read name/ID), the corresponding genomic location (chr and position) where the polyA tail starts, and the estimated tail length for each read. Now I would like to see which gene/transcript each individual read was actually mapped to, which unfortunately isn't provided in the analysis. I am trying to turn the list of chr/position into a bed file and match it against a GTF file using bedtools closest (bedtools closest -t first -D a -id -s). However, it struggles to distinguish genes/transcripts that are close together. I'm wondering whether there is a way to extract the mapping information from the bam file to generate a complete list of every single read (read ID/name) and the transcript/gene it is mapped to? If there's a way to do this, I can quickly match up polyA length for each gene/transcript.

Any help or comments would be much appreciated!

Josh
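
For reference, a minimal sketch of the kind of lookup asked about above, assuming the alignments are available as SAM text (for example from `samtools view -h` on the bam file). It only yields transcript names if the reads were aligned to a transcriptome reference; for a genome alignment the reference name is just the chromosome and an annotation overlap is still needed. The column names `readname` and `polya_length` are assumed to match the nanopolish polya output, and the read and transcript IDs in the demo are made up.

```python
import csv
import io

# SAM flag bits used to keep only primary, mapped alignments
UNMAPPED = 0x4
REVERSE = 0x10
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def read_to_reference(sam_lines):
    """Return {read ID: (reference name, 1-based start, strand)} for primary alignments."""
    mapping = {}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        read_id, flag, ref, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
            continue  # keep one record per read: the primary alignment
        strand = "-" if flag & REVERSE else "+"
        mapping[read_id] = (ref, pos, strand)
    return mapping


def attach_polya(mapping, polya_tsv):
    """Join a polyA table (assumed columns: readname, polya_length) to the mapping."""
    rows = []
    for row in csv.DictReader(polya_tsv, delimiter="\t"):
        hit = mapping.get(row["readname"])
        if hit is not None:
            rows.append((row["readname"], hit[0], float(row["polya_length"])))
    return rows


# Hypothetical demo data: IDs and values are invented for illustration
sam_demo = [
    "@SQ\tSN:tx_A\tLN:1000\n",
    "read1\t0\ttx_A\t10\t60\t50M\t*\t0\t0\t*\t*\n",
    "read2\t16\ttx_B\t5\t60\t50M\t*\t0\t0\t*\t*\n",
    "read2\t272\ttx_C\t7\t0\t50M\t*\t0\t0\t*\t*\n",  # secondary, ignored
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*\n",  # unmapped, ignored
]
polya_demo = io.StringIO(
    "readname\tcontig\tposition\tpolya_length\tqc_tag\n"
    "read1\ttx_A\t10\t85.5\tPASS\n"
    "read2\ttx_B\t5\t42.0\tPASS\n"
)

mapping = read_to_reference(sam_demo)
joined = attach_polya(mapping, polya_demo)
for read_id, ref, tail in joined:
    print(read_id, ref, tail, sep="\t")
```

On real data the same functions could be fed from standard input, e.g. `samtools view -h aligned.bam` piped into a script that passes `sys.stdin` to `read_to_reference` (the file name here is hypothetical).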

Transcriptomics RNAseq polya • 693 views

Thanks Pierre. The problem is that the genomic locations are often just outside the annotated gene boundaries, so I don't know if bedtools intersect would work?

When I ran bedtools closest, for some reason I got a lot more features/rows in the output file than in the input bed file (for example, 4425747 vs 2582308). I was thinking that with the option -t first it would give me just the first, closest feature, so there shouldn't be more rows in the output file than in the input? Again, any help would be amazing. Thanks again!
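
One way to narrow down where the extra rows come from is to count how many output rows each read received. This is only a diagnostic sketch: it assumes the read ID sits in the 4th (name) column of the bed file passed as `-a`, so it is also the 4th column of the bedtools closest output, and the demo lines are invented. It does not by itself explain the cause.

```python
from collections import Counter


def rows_per_read(closest_lines, name_col=3):
    """Count output rows per read ID (0-based column index, assumed to be the BED name field)."""
    counts = Counter()
    for line in closest_lines:
        if not line.strip():
            continue  # skip blank lines
        fields = line.rstrip("\n").split("\t")
        counts[fields[name_col]] += 1
    return counts


# Hypothetical bedtools closest output: 6 BED columns from -a, then feature columns
closest_demo = [
    "chr1\t100\t101\tread1\t0\t+\tchr1\t50\t90\tgeneA\t.\t+\t-10\n",
    "chr1\t100\t101\tread1\t0\t+\tchr1\t60\t90\tgeneB\t.\t+\t-10\n",
    "chr2\t500\t501\tread2\t0\t-\tchr2\t510\t800\tgeneC\t.\t-\t-9\n",
]

counts = rows_per_read(closest_demo)
repeated = {read_id: n for read_id, n in counts.items() if n > 1}
print(len(counts), "distinct reads;", len(repeated), "with more than one row")
```

If `repeated` is empty on the real output, the extra rows are more likely to come from duplicated read IDs in the input bed file itself; checking the input with the same function would show that.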


Ah sorry I know what you meant now - thanks!
