16 months ago

I am investigating a protein that binds small DNA fragments (<30 nt), and I have a sequencing library of these fragments. I know that the adapters and indexes are the ones listed on this site (the 5' adapter has T instead of U). [To reach the page I mean, click the second "view online help" button, then "TruSeq kits", then "TruSeq small RNA". You should see a page like this]

At the start I had this level of adapter contamination, and the base quality was OK.

To remove the adapters, I ran this command:

java -jar trimmomatic-0.39.jar SE inp_file out_file ILLUMINACLIP:adapters.fa:4:30:7 MINLEN:14 MAXLEN:21

I have a question about which adapter sequences I should use. The site provides a dedicated sequence for trimming. I also tried using the 3' and 5' adapter sequences together, and it had some effect on the output (e.g. I got fewer overrepresented sequences), but the adapter content was already zero with only the one suggested trimming sequence.

I used MAXLEN:20 because of this. It seems to me that the information is contained only in the first 20 bases (which is consistent with the length of the nucleic acids my protein binds). Is it OK to just clip these bad bases?

Additionally, I know that the reads have 'TruSeq index 1'. I tried using the index sequence from this site, but it didn't have a big effect. Does the index just get removed naturally during adapter trimming, or do I have to do something about it?

Also, after all of this I still have a small number of overrepresented sequences.

Thanks in advance and best regards!

NGS small-DNA trimming trimmomatic sequencing

Your first link is not working right. Was that a link out to some website or were you trying to show a screenshot? Can you fix that?

I don't know about trimmomatic, but you could use bbduk.sh (in trim mode, or in filter mode to separate out reads that contain the expected adapter). Something like:

bbduk.sh -Xmx4g in=input.fq.gz out=clean.fq.gz literal=adapter_sequence1,adapter_sequence2 .. k=8 ktrim=r


Yeah, it was a link to a site; I added a description. Thanks for your suggestion! But my question isn't about programs...

I want to know whether it is enough to trim with the sequence the company suggests for trimming, or whether I should use both adapter sequences (the sequence they suggest for trimming is identical to the 3' adapter). My second question is about indexes: do I need to do something special to remove them? Provide some sequence to the program, maybe?

16 months ago
GenoMax 123k

With smallRNA data it is prudent to follow the data handling recommendations that are specific to the kit that was used. Since some smallRNA kits attach a special adapter to the 3' end of the smallRNA, looking for the presence of that adapter (to confirm that the molecule is a valid smallRNA) and then trimming it is adequate. Note: You can almost see that adapter appearing on your FastQC plots (those artifacts that you see on the tile plot) after 21-22 bp (the size of a small RNA), where the reads start going into the adapters.
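As a rough illustration of "look for the 3' adapter, then trim it", here is a minimal sketch using only standard shell tools. The adapter string is assumed to be the TruSeq small RNA 3' adapter and should be checked against the kit documentation; the reads, the file names `example.fq` and `trimmed.fq`, and the helper `make_record` are made up for the example. It only finds exact, full-length adapter matches, whereas dedicated trimmers also tolerate mismatches and partial adapters at the read end, so treat it as a way to see the idea rather than a replacement for them.

```shell
# Assumed 3' adapter sequence; verify it against your kit's documentation.
ADAPTER="TGGAATTCTCGGGTGCCAAGG"

# Hypothetical helper: write one FASTQ record with a constant quality of 'I'.
make_record() {
  printf '@%s\n%s\n+\n%s\n' "$1" "$2" "$(printf '%s' "$2" | tr 'ACGTN' 'IIIII')"
}

# Made-up reads: read1 and read2 run into the adapter, read3 does not.
{
  make_record read1 "ACGTACGTACGTACGTACGTA${ADAPTER}AACT"
  make_record read2 "TTGACCTTGACCTTGACC${ADAPTER}"
  make_record read3 "GGGGCCCCAAAATTTTGGGGCCCCAAAATTTT"
} > example.fq

# How many reads contain the adapter at all?
grep -c "$ADAPTER" example.fq

# Keep only reads that contain the adapter, cutting the sequence and the
# quality string at the position where the adapter starts.
awk -v ad="$ADAPTER" '
  NR % 4 == 1 { name = $0 }
  NR % 4 == 2 { seq = $0; pos = index(seq, ad) }
  NR % 4 == 0 {
    if (pos > 1) {
      print name
      print substr(seq, 1, pos - 1)
      print "+"
      print substr($0, 1, pos - 1)
    }
  }
' example.fq > trimmed.fq

cat trimmed.fq
```

With these made-up reads, read1 and read2 are kept with their inserts only (21 and 18 nt), and read3 is dropped because no adapter was found.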

In Illumina sequencing, index reads are never part of the actual sequence (R1/R2), and you don't need to do anything special for them. They are automatically transferred to the fastq headers of the data files during demultiplexing and are only used for sample identification.
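To see this in your own files, you can tally the index recorded in the read headers. The sketch below assumes the common Illumina header layout in which the index is the last colon-separated field of the header line; some pipelines write a sample number or nothing there, so check a few headers by eye first. The records and the file name `demux_example.fq` are made up for the example.

```shell
# Made-up FASTQ records with headers in the assumed Illumina layout:
# @instrument:run:flowcell:lane:tile:x:y read:filtered:control:index
cat > demux_example.fq <<'EOF'
@SIM:1:FCX:1:15:6329:1045 1:N:0:ATCACG
ACGTACGTACGTACGTACGT
+
IIIIIIIIIIIIIIIIIIII
@SIM:1:FCX:1:15:6330:1046 1:N:0:ATCACG
TTGACCTTGACCTTGACCTT
+
IIIIIIIIIIIIIIIIIIII
EOF

# Header lines are every 4th line starting at line 1; count each index value.
awk 'NR % 4 == 1 { n = split($0, f, ":"); count[f[n]]++ }
     END { for (idx in count) print idx, count[idx] }' demux_example.fq
```

For a real gzipped file, the same awk command can read from `zcat file.fq.gz | head -n 4000` to sample the first thousand reads.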


Thanks a lot for the explanation!