Length of the amplicon after trimming
3.0 years ago
life99945 ▴ 20

Hi!

I have Illumina MiSeq 16S rRNA amplicon reads, 300 bp long, with a 17 bp primer. After trimming, the reads should be at most 283 bp long. BBDuk reports that 298k out of 300k sequences were trimmed, yet some sequences are longer than 283 bp after trimming, for example 290 bp. What are these sequences? Is it possible to amplify a sequence starting from only part of the primer?

Thank you.

Trimming • 948 views
2.2 years ago

Nothing very abnormal here.

The key is to think from the "other direction": reads are usually trimmed at their 3' end. This is due to something called read-through, meaning that the sequencing reaction runs for more cycles than there are bases in your fragment, so the read ends up in the primer/adapter on the other end (not the primer the sequencing reaction started from; that one you indeed cannot have in your read data).

This happens especially often in amplicon sequencing, because the fragments are rather short (or of a well-defined length), so there is a high chance that a read runs into the primer/adapter on the other side.

Combine this with the fact that when only a few bases of the adapter are present, trimming/cleaning tools typically will not recognise them, so they remain in the read. The result is cleaned/trimmed reads of a variety of lengths.
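To make the read-through arithmetic concrete, here is a minimal Python sketch. The function names and the fragment lengths are hypothetical illustration values, not measurements from the data in the question; `fragment_len` is assumed to be the number of template bases between the start of the read and the adapter on the far side.

```python
def readthrough_bases(read_len: int, fragment_len: int) -> int:
    """Number of 3' bases of a read that come from the opposite adapter.

    Assumes the read starts at the first base of the fragment and that the
    adapter begins immediately after the last fragment base.
    """
    return max(0, read_len - fragment_len)


def trimmed_length(read_len: int, fragment_len: int) -> int:
    """Read length left after all read-through bases are removed."""
    return read_len - readthrough_bases(read_len, fragment_len)


# Hypothetical fragment lengths sequenced with 300 cycles.
for fragment_len in (253, 290, 450):
    extra = readthrough_bases(300, fragment_len)
    kept = trimmed_length(300, fragment_len)
    print(f"fragment {fragment_len} bp: {extra} adapter bases, {kept} bp kept")
```

Under these assumptions, a fragment shorter than the read length always yields a trimmed read as long as the fragment itself, which is why trimmed read lengths track the fragment lengths rather than a single fixed value.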

6 weeks ago
Jiacheng ▴ 10

Trimming accuracy varies between trimmers. Illumina 300 bp reads can have low quality in their tails, which makes it hard for some programs to detect adapters. Also, the amplicon length is not necessarily 300 - 17 = 283 bp. The amplicon length is the length of the region you captured plus the custom primers, without the adapter sequences.

Here, I'd recommend Atria for trimming the adapter sequences. It is a recently published trimmer with exceptional precision and speed.

If you wish, you can also hard-clip the 3' end (--clip-after AMPLICON_LENGTH):

atria -r sample.fastq [-R sample_R2.fastq] -a ADAPTER1 [-A ADAPTER2] --clip-after AMPLICON_LENGTH
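Whichever trimmer is used, it can help to look at the read-length distribution of the trimmed file to see how many reads exceed the expected length. Below is a small Python sketch for that; it assumes a plain four-line-per-record FASTQ file (optionally gzip-compressed), and the function name and file name are hypothetical.

```python
import gzip
from collections import Counter
from itertools import islice


def read_length_histogram(path: str) -> Counter:
    """Count reads per sequence length in a 4-line-per-record FASTQ file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    lengths = Counter()
    with opener(path, "rt") as handle:
        # The sequence is the 2nd line of every 4-line record.
        for seq in islice(handle, 1, None, 4):
            lengths[len(seq.rstrip("\r\n"))] += 1
    return lengths


# Example usage (the file name is a placeholder):
# hist = read_length_histogram("sample.trimmed.fastq")
# for length, count in sorted(hist.items()):
#     print(length, count)
```

Reads that cluster just above the expected length are candidates for short, unrecognised adapter remnants at the 3' end.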