I have a standard RNA-seq dataset (125 bp paired-end Illumina) from a model organism. I am doing adapter trimming only, with no quality trimming, since the base quality is excellent along the full read length. The trimming software has an option to set the minimum read length to keep. What would be a good value for it, and why?
My thoughts are along these lines:
Set min length around 10-12: Would this help retain short non-coding RNAs, if at all? I use ribosome depletion, not polyA capture.
Set min length around 60: This might reduce mapping time and potentially reduce multi-mapping caused by very short reads.
Set min length close to the max length, i.e. around 100 to 120: Depending on the read length distribution after trimming, I could lose a lot of reads. Would keeping the read length distribution in a tighter range help with downstream DGE analysis?
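To put numbers on the trade-off between these cutoffs, I could tabulate what fraction of reads each one would keep. This is only a minimal sketch: it assumes plain 4-line FASTQ records (no wrapped sequence lines), the function names `read_lengths` and `fraction_retained` are my own, and the file name in the usage comment is hypothetical.

```python
from collections import Counter
import gzip


def read_lengths(fastq_path):
    """Yield the sequence length of each record in a FASTQ file.

    Assumes 4 lines per record; handles gzipped files by extension.
    """
    opener = gzip.open if fastq_path.endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # second line of each record is the sequence
                yield len(line.rstrip("\r\n"))


def fraction_retained(lengths, cutoffs):
    """Return {cutoff: fraction of reads with length >= cutoff}."""
    counts = Counter(lengths)
    total = sum(counts.values())
    if total == 0:
        return {c: 0.0 for c in cutoffs}
    return {
        c: sum(n for length, n in counts.items() if length >= c) / total
        for c in cutoffs
    }


# Usage (hypothetical file name, adapter-trimmed with no length filter):
# print(fraction_retained(read_lengths("trimmed_R1.fastq.gz"), [12, 60, 100]))
```

For paired-end data this would have to be run on both mates, since trimmers typically drop a pair when either mate falls below the cutoff.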
I could be wrong about all of these, so feel free to correct me. Suggestions for a sensible min length are also welcome.