Hi there, I know this might be a naïve question regarding read alignment. I have small RNA-seq data as FASTQ files (enriched specifically for small RNAs during library prep, Illumina NovaSeq) that I want to align to the reference genome. The problem is that the data are paired-end with an average read length of 100 bp. I am aware that people typically do single-end sequencing with a much shorter read length to capture small RNAs.
My questions are as follows: (1) Do I need to perform adapter trimming before quality trimming? (2) I tried adapter trimming with a minimum read length of 15 and then trimmed off the bases beyond 22. I also did the same with a cutoff of 60. However, the problem I am encountering is that the alignment rate to the reference genome is only ~50%, which seems too low. What is the best way to do the trimming and alignment so that I can capture small RNAs of varying lengths?
I would really appreciate it if you could provide any insights on this. Thanks, Bhumi
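To make the trimming step in the question concrete, here is a minimal Python sketch of 3' adapter removal followed by a length filter. It is an illustration under stated assumptions, not the poster's actual pipeline: the adapter shown is the Illumina TruSeq Small RNA 3' adapter, which may not match the kit used here, and `trim_adapter`, `keep_insert`, `min_overlap` and `max_len` are hypothetical names. Only the minimum length of 15 comes from the post. Matching is exact, whereas dedicated trimmers also tolerate sequencing errors.

```python
from typing import Optional

# Assumption: TruSeq Small RNA 3' adapter. Replace with the adapter of your kit.
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"


def trim_adapter(seq: str, adapter: str = ADAPTER, min_overlap: int = 8) -> str:
    """Remove the 3' adapter (exact match only) and everything after it."""
    # Case 1: the full adapter is present somewhere in the read.
    idx = seq.find(adapter)
    if idx != -1:
        return seq[:idx]
    # Case 2: only the start of the adapter fits before the read ends.
    # Try the longest possible partial match first.
    for k in range(min(len(adapter), len(seq)) - 1, min_overlap - 1, -1):
        if seq.endswith(adapter[:k]):
            return seq[: len(seq) - k]
    # No adapter found: return the read unchanged.
    return seq


def keep_insert(seq: str, min_len: int = 15,
                max_len: Optional[int] = None) -> Optional[str]:
    """Trim the adapter, then drop inserts outside [min_len, max_len]."""
    trimmed = trim_adapter(seq)
    if len(trimmed) < min_len:
        return None
    if max_len is not None and len(trimmed) > max_len:
        return None
    return trimmed


if __name__ == "__main__":
    insert = "TGAGGTAGTAGGTTGTATAGTT"  # a 22 nt example insert
    read = insert + ADAPTER + "A" * 10  # long read runs through the adapter
    print(keep_insert(read, min_len=15, max_len=30))
```

With 100 bp reads and inserts of roughly 20 to 30 nt, almost every read should run into the adapter, so a read that comes out of this step still at full length is a hint that the wrong adapter sequence was supplied.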
Did you use a commercial kit? If so, follow the instructions for that kit. You may need to look for a kit-specific adapter sequence. Depending on the quality of your libraries, 50% may not be a bad result.
Interestingly, when I align the reads using Bowtie 2 I get ~97% aligned reads, whereas with Bowtie 1 it is ~14% after adapter and quality trimming.
Are these 97% unique or multimappers?
Hi @ATpoint, yes, the unique paired reads are about 97%.
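For checking the unique versus multimapping split asked about above, one option is to read it from the alignment summary that Bowtie 2 prints to stderr for paired-end runs. The sketch below parses that summary with regular expressions. The function name `parse_bowtie2_summary` is hypothetical, and the numbers in `EXAMPLE` are illustrative only, not the poster's results.

```python
import re

# Illustrative summary text in the layout Bowtie 2 prints for paired-end data.
# The counts are made up for the example.
EXAMPLE = """\
10000 reads; of these:
  10000 (100.00%) were paired; of these:
    650 (6.50%) aligned concordantly 0 times
    8823 (88.23%) aligned concordantly exactly 1 time
    527 (5.27%) aligned concordantly >1 times
"""


def parse_bowtie2_summary(text: str) -> dict:
    """Extract pair counts for the concordant alignment categories."""
    patterns = {
        "total_pairs": r"(\d+) \([\d.]+%\) were paired",
        "concordant_none": r"(\d+) \([\d.]+%\) aligned concordantly 0 times",
        "concordant_unique": r"(\d+) \([\d.]+%\) aligned concordantly exactly 1 time",
        "concordant_multi": r"(\d+) \([\d.]+%\) aligned concordantly >1 times",
    }
    counts = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, text)
        if match:
            counts[key] = int(match.group(1))
    return counts


if __name__ == "__main__":
    counts = parse_bowtie2_summary(EXAMPLE)
    total = counts["total_pairs"]
    print(f"unique: {counts['concordant_unique'] / total:.2%}")
    print(f"multi:  {counts['concordant_multi'] / total:.2%}")
```

For small RNA data the multimapping fraction is worth watching, because many small RNA loci occur in several copies in the genome.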