I am studying a protein that binds short DNA fragments (<30 nt), and I have a sequencing library of these fragments. I know that the adapters and indexes are from this site (the 5' adapter has T instead of U). [To reach the page I mean, click the second "view online help" button, then click "TruSeq kits", and then click "TruSeq small RNA". You should see a page like this]
At the start I had this level of adapter contamination, and the base quality was OK.
To remove the adapters I ran this command:
java -jar trimmomatic-0.39.jar SE inp_file out_file ILLUMINACLIP:adapters.fa:4:30:7 MINLEN:14 MAXLEN:21
I have a question about which adapter sequences I should use. The site provides a special sequence for trimming. I also tried using the 3' and 5' adapter sequences together, and it had some effect on the output (e.g. I got fewer overrepresented sequences), but the adapter content was already zero with only the single suggested trimming sequence.
I used MAXLEN:20 because of this. It seems to me that the information is contained only in the first 20 bases (this is consistent with the length of the nucleic acids that my protein binds). Is it OK to just clip these bad bases?
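To make clear what I mean by "clipping", here is a minimal Python sketch on made-up reads. The function names are hypothetical and are not part of Trimmomatic. As far as I understand the Trimmomatic manual, CROP behaves like the first function (it cuts each read to a fixed length), while MAXLEN behaves like the second (it discards reads longer than the limit), so please correct me if I have this wrong.

```python
def crop_reads(reads, length):
    """Keep every read, but cut it down to its first `length` bases."""
    return [read[:length] for read in reads]


def drop_long_reads(reads, max_length):
    """Keep only reads of at most `max_length` bases, leaving them unmodified."""
    return [read for read in reads if len(read) <= max_length]


# Toy reads (made up for illustration): one 18-nt read and one 25-nt read.
reads = ["ACGTACGTACGTACGTAC", "ACGTACGTACGTACGTACGTACGTA"]

cropped = crop_reads(reads, 20)        # both reads survive, the long one is cut to 20 nt
filtered = drop_long_reads(reads, 20)  # only the 18-nt read survives

print([len(r) for r in cropped])
print([len(r) for r in filtered])
```

What I want is the first behaviour: keep the read and throw away only the bases after position 20.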
Additionally, I know that the reads have 'TruSeq index 1'. I tried using the index sequence from this site, but it did not have a big effect. Does the index just get removed naturally during adapter trimming, or do I have to do something about it?
Also, after all of this, I still have a small number of overrepresented sequences.
Thanks in advance and best regards!