I know this topic has been discussed before, but I still have some doubts about it.
I did paired-end RNA-seq (2x150 bp). Before I received my data, the company ran a QC check and reported a library size of ~300 bp. When I received the data, I checked it with FastQC and the quality was quite good: there were no poor-quality reads and the read length was 150 bp, but adapters were present.
So I used BBMerge to identify the adapters, trimmed them, and then merged the reads. However, for one sample the average insert size was 148 bp, and for the other samples it was between 151 bp and 173 bp. The percentage of joined reads was also high (the lowest was 80%, but for most samples it was ~90%).
This is one example:
Adapters counted: 32566620
Total time: 710.930 seconds.
Pairs: 62324086
Joined: 56271221 90.288%
Ambiguous: 5211149 8.361%
No Solution: 841716 1.351%
Too Short: 0 0.000%
Avg Insert: 148.1
Standard Deviation: 47.2
Mode: 122
Insert range: 35 - 291
90th percentile: 217
75th percentile: 176
50th percentile: 140
25th percentile: 113
10th percentile: 95
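To double-check that I am reading this summary correctly, here is a minimal Python sketch that parses a few of the numbers above. The helper name `parse_bbmerge_summary` is hypothetical (it is not part of BBTools), and it assumes one statistic per line. It checks that the pair categories add up to the total and compares the median insert size with my 150 bp read length.

```python
# A few lines copied from the BBMerge summary above.
SUMMARY = """\
Pairs: 62324086
Joined: 56271221 90.288%
Ambiguous: 5211149 8.361%
No Solution: 841716 1.351%
Too Short: 0 0.000%
Avg Insert: 148.1
50th percentile: 140
"""

READ_LENGTH = 150  # 2x150 bp run


def parse_bbmerge_summary(text):
    """Hypothetical helper: map each 'label: value' line to its first number."""
    stats = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        tokens = rest.split()
        if not sep or not tokens:
            continue  # skip lines that are not 'label: value'
        stats[label.strip()] = float(tokens[0].rstrip("%"))
    return stats


stats = parse_bbmerge_summary(SUMMARY)

# Every pair should fall into exactly one category.
accounted = (stats["Joined"] + stats["Ambiguous"]
             + stats["No Solution"] + stats["Too Short"])
categories_add_up = accounted == stats["Pairs"]

# Recompute the joined percentage from the raw counts.
joined_pct = 100 * stats["Joined"] / stats["Pairs"]

# If the median insert is shorter than the read length, more than half of
# the joined pairs contain reads that run past the insert into the adapter.
median_below_read_length = stats["50th percentile"] < READ_LENGTH

print(f"categories add up: {categories_add_up}")
print(f"joined: {joined_pct:.3f}%")
print(f"median insert < read length: {median_below_read_length}")
```

If the last check is true, that would be consistent with FastQC reporting adapter sequence in the raw reads.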
So my question is: should I be worried about the lower-than-expected insert size? At this stage there is nothing I can do about it, right? Also, should I trim the adapters and then merge the reads, or do it the other way around? I think the first option should be better, since the quality at the ends of the reads is usually low, right?
Thank you for your help!