I'm looking for an existing tool that reports the read distribution over genes in an RNA-seq sample. To be clearer, I want something like the output of the
read_distribution.py script in the RSeQC toolset, but for every gene rather than a summary of the whole alignment file. Something like:
GENE_ID, 3'UTR/CDS
gene_1, n
gene_2, n
What I've tried so far is to count reads separately for the 3' UTR and the CDS features using
htseq-count. I then normalized each count by the feature length (giving a
count/kb value) and took the ratio of the UTR value to the CDS value for each gene.
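For reference, here is a minimal sketch of the calculation described above. It assumes the usual htseq-count output (tab-separated feature ID and count, with summary rows whose names start with `__`) and that both runs were keyed on the same gene identifier. The function names and the two length dictionaries (total 3' UTR and CDS length per gene, in bases) are hypothetical and would have to be built from the annotation.

```python
import csv
import io


def read_htseq_counts(handle):
    """Parse htseq-count output into {feature_id: count}.

    Assumes tab-separated 'feature_id<TAB>count' rows; summary rows
    such as '__no_feature' are skipped.
    """
    counts = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2 or row[0].startswith("__"):
            continue
        counts[row[0]] = int(row[1])
    return counts


def utr_cds_ratio(utr_counts, cds_counts, utr_len, cds_len):
    """Return {gene_id: (3'UTR count/kb) / (CDS count/kb)}.

    utr_len and cds_len are hypothetical {gene_id: length_in_bases} dicts.
    The ratio is None when a length is missing or the CDS density is zero.
    """
    ratios = {}
    for gene in sorted(set(utr_counts) & set(cds_counts)):
        u_len = utr_len.get(gene, 0)
        c_len = cds_len.get(gene, 0)
        if u_len <= 0 or c_len <= 0:
            ratios[gene] = None
            continue
        utr_density = utr_counts[gene] / (u_len / 1000)  # reads per kb of 3' UTR
        cds_density = cds_counts[gene] / (c_len / 1000)  # reads per kb of CDS
        ratios[gene] = utr_density / cds_density if cds_density > 0 else None
    return ratios


# Toy data standing in for two htseq-count output files.
utr_file = io.StringIO("gene_1\t50\ngene_2\t30\n__no_feature\t10\n")
cds_file = io.StringIO("gene_1\t200\ngene_2\t0\n__no_feature\t5\n")

ratios = utr_cds_ratio(
    read_htseq_counts(utr_file),
    read_htseq_counts(cds_file),
    utr_len={"gene_1": 500, "gene_2": 300},
    cds_len={"gene_1": 1000, "gene_2": 900},
)

print("GENE_ID,3'UTR/CDS")
for gene, ratio in ratios.items():
    print(f"{gene},{ratio}")
```

Genes with no annotated 3' UTR or with zero CDS coverage get no ratio here, which is likely to affect many genes in a poorly annotated genome.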
Can this approach be considered legitimate for obtaining similar information, given that I'm working with a poorly annotated genome? If not, is there a tool out there that does this? Coding it from scratch in Python takes time and I don't want to reinvent the wheel, so that's my last resort.