Hi,
I am working on pareid end RNA-Seq data (151 bases). As I recived the files I decided to process it and trim the low quality reads and adapter sequences. I used bbduk and I left the minimum length to be considered of a read after low quality reads and adapter trimming to 23. Leaving the folowing Sequence lenght distribution in all my files which was something I would expect, my question is the folowing.
Giving my sequence distribution and that the low proportion of reads being less than 145 bases long are less than 15% of the files. Is this something that could generate a big bias in my results ?
It is standard paried-end RNA-Seq I will be using for differential expression analysis.
Should be fine to move on with your analysis then. Multi-mapped reads will not be counted by default by counting programs. Keep an eye on alignment percentages.
Thank you ! I have a last question, giving that this are cell-line derived samples from human, what would be a good alignment percentage for this kind of samples. I am finding difficult to find "standard" go-to measures for this kind of metrics.
Are these total RNAseq or mRNAseq? Something north of 70-75% should be expected for mRNAseq as long as the quality of RNA/libraries was good.
I suggest to make answers out of some of these comments so the threads have a chance to be closed as completed. For example, in this thread it is probably OK that the first response was a comment, but the one I am replying to seems like an answer. When a thread starts with a comment, it seems unlikely that it will ever be reverted to an answer by the same responder. That not only makes long responses to a comment that are difficult to read, but also removes any chance that a comment would be accepted.
Without going back very far, I think that responses in at least two other threads should have been answers rather than comments:
My default MO is to answer rather than comment, unless I have to ask the OP many questions before answering. Not saying that is the best approach, but it feels like many moderators' responses are comments by default when they could easily be answers. I didn't even want to go back very far and look through Pierre's posts, for whom I think for sure many comments could be answers. Pointing to existing posts is a legitimate answer in my book, assuming it addresses the original question.