Separating novel from known junctions in TopHat junctions.bed file

7.9 years ago

damljanovic.a • 0

I was trying to extract novel junctions from the TopHat junctions.bed file, so I wrote a Python script that works in two steps. First, I parsed the GTF file, grouped all the exons (after filtering out CDS and start/stop codon records) by gene_id and transcript_id, and built a list of all possible junctions for each transcript by taking the end and start coordinates of adjacent exons. Second, I built a list of all true junction coordinates from the TopHat BED file, taking the overhangs into account (refer to this post). When I finally intersected the two lists, I got an empty list. Does anyone have any idea what was wrong with my approach?

Thanks in advance, Ana
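
For reference, here is a minimal sketch of the two steps described above. It is not the original script; the function names `gtf_junctions` and `tophat_junctions` are hypothetical. It assumes GTF coordinates are 1-based and inclusive, BED coordinates are 0-based and half-open, and that the two blocks in each junctions.bed record are the left and right overhangs, so the intron runs from chromStart + blockSizes[0] to chromEnd - blockSizes[1]. Both lists are expressed as 0-based half-open intron coordinates before comparing. A mismatch between these conventions is one thing that could produce an empty intersection, though the post does not show enough to confirm it.

```python
from collections import defaultdict


def gtf_junctions(gtf_lines):
    """Known junctions as (chrom, intron_start, intron_end), 0-based half-open.

    Assumes GTF exon coordinates are 1-based and inclusive.
    """
    exons = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Keep exon records only (drops CDS, start_codon, stop_codon).
        if len(fields) < 9 or fields[2] != "exon":
            continue
        attrs = {}
        for item in fields[8].split(";"):
            item = item.strip()
            if item:
                key, _, value = item.partition(" ")
                attrs[key] = value.strip('"')
        key = (fields[0], attrs.get("gene_id"), attrs.get("transcript_id"))
        exons[key].append((int(fields[3]), int(fields[4])))

    junctions = set()
    for (chrom, _gene, _transcript), coords in exons.items():
        coords.sort()
        for (_, left_end), (right_start, _) in zip(coords, coords[1:]):
            # The 1-based inclusive exon end equals the 0-based intron start;
            # the 0-based exclusive intron end is the next exon start minus 1.
            junctions.add((chrom, left_end, right_start - 1))
    return junctions


def tophat_junctions(bed_lines):
    """Observed junctions from junctions.bed, 0-based half-open.

    Assumes column 11 (blockSizes) holds the left and right overhang lengths.
    """
    junctions = set()
    for line in bed_lines:
        if not line.strip() or line.startswith("track"):
            continue
        fields = line.rstrip("\n").split("\t")
        sizes = [int(x) for x in fields[10].rstrip(",").split(",")]
        intron_start = int(fields[1]) + sizes[0]   # skip the left overhang
        intron_end = int(fields[2]) - sizes[-1]    # stop before the right overhang
        junctions.add((fields[0], intron_start, intron_end))
    return junctions
```

With both sets in the same coordinate system, `tophat_junctions(bed) & gtf_junctions(gtf)` gives the known junctions and `tophat_junctions(bed) - gtf_junctions(gtf)` gives the novel candidates. Chromosome names also have to match between the two files (for example `chr1` versus `1`) for any tuple to compare equal.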

RNA-Seq junctions splicing TopHat annotations • 1.7k views
