To identify the expression levels of repeats in my RNA-Seq samples, I'm trying to count reads over repeats. The RNA-seq data (strand specific, dUTP ...) has been mapped and non-uniquely mapped reads (NH:i:X>1) were separated for further analysis. But which program should I use to count reads over gtf file (RepeatMasker) ?
Htseq-count (-s reverse) returns me only zero values and report all as __alignment_not_unique: (reads with more than one reported alignment)
But I used default, which is 0
The default for the latest version is 10. So it seems you are using an older version. I looked more into it and here is what I found. From version 0.4.6 onwards, HTSeq count skips reads that are non-uniquely mapped according to the 'NH' field. Source: http://www-huber.embl.de/users/anders/HTSeq/doc/history.html#version-0-5-4
I just
grep
'd for "NH" incount.py
script that comes with HTSeq and found the following lines:You need to modify "1" to some number suitable for your analysis and HTSeq-count should work for you. Though highly unlikely but please make sure that there are no restrictions with modifying the code.
Thank you so much :)