Question: BBDuk does not remove all adapters
f.dinesh wrote:

Hey all,

I'm just getting into RNA-seq analysis and have run into a problem that is very confusing. I ran multiplexed RNA-seq samples on a HiSeq (paired-end, 100 bp reads). The data were demultiplexed and, I thought, trimmed (I used Illumina TruSeq index primers).

I ran my R1 and R2 reads through FastQC and found ~20% Illumina universal adapter sequence at the 3' end of my reads. I ran BBDuk and it removed about 10%, so ~10% is still contaminating the reads. I'm wondering what I should do next.

Initially, the adapter content rose linearly from position 55 to 85 bp. After BBDuk, it rises from position 72 to 85 bp and reaches 10%. [Links: adapter content graphs]

I'm not sure how to get rid of the rest of the adapters. The FastQC flag has also changed from an X (fail) to an ! (warning), so might that be OK?

$ ./bbduk.sh in1=S42ND_L002_R1_001.fastq.gz in2=S42ND_L002_R2_001.fastq.gz out1=S42ND_L002_R1_001_TRIMMED.fastq.gz out2=S42ND_L002_R2_001_TRIMMED.fastq.gz ref=adapters.fa ktrim=r k=23 mink=11 hdist=1 tpe tbo
java -ea -Xmx1205m -Xms1205m -cp /media/sf_ubuntushared/bbmap/current/ jgi.BBDuk in1=S42ND_L002_R1_001.fastq.gz in2=S42ND_L002_R2_001.fastq.gz out1=S42ND_L002_R1_001_TRIMMED.fastq.gz out2=S42ND_L002_R2_001_TRIMMED.fastq.gz ref=adapters.fa ktrim=r k=23 mink=11 hdist=1 tpe tbo
Executing jgi.BBDuk [in1=S42ND_L002_R1_001.fastq.gz, in2=S42ND_L002_R2_001.fastq.gz, out1=S42ND_L002_R1_001_TRIMMED.fastq.gz, out2=S42ND_L002_R2_001_TRIMMED.fastq.gz, ref=adapters.fa, ktrim=r, k=23, mink=11, hdist=1, tpe, tbo]

Version 38.68

Thanks for any help

modified 29 days ago by genomax • written 29 days ago by f.dinesh

Generally, when a trimming program finds the core sequence at the start of an adapter, it should remove that sequence and everything 3' of it. Your command is a standard one and should have worked.
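
To make the idea of right-trimming concrete, here is a toy sketch using plain shell string handling. It is not how BBDuk works internally (BBDuk matches k-mers and can tolerate mismatches); it only shows "find the start of the adapter, drop it and everything 3' of it" on a single made-up read. The insert sequence is hypothetical, and the 13-base prefix is the start shared by the TruSeq adapters.

```shell
#!/bin/sh
# Toy illustration only: exact-match right-trimming of one made-up read.
adapter_start="AGATCGGAAGAGC"    # first 13 bases shared by the TruSeq adapters
insert="ACGTTGCAACGTTGCAGGCT"    # hypothetical 20 bp insert (invented for this example)

# A read whose insert is shorter than the read length runs into the adapter.
read_seq="${insert}${adapter_start}ACACGTCTGAACT"

# Remove the first occurrence of the adapter start and everything 3' of it.
trimmed="${read_seq%%"${adapter_start}"*}"

echo "before: ${#read_seq} bp"
echo "after:  ${#trimmed} bp ${trimmed}"
```

An exact match like this misses adapters that contain a sequencing error or that are only partially present at the very end of a read, which is where options such as hdist and mink come in.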

Can you try trimming with the ktrim=rl option (that is a lowercase R and L)? Let us know what that does.

Also try bbmerge.sh on the original reads to see how many of them merge (a guide is available here). I have a feeling that your inserts are short relative to the read length.
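
A minimal sketch of that check, assuming BBTools is on your PATH (the original post ran it from the bbmap directory with ./) and that the untrimmed files are in the current directory. The histogram file name is a placeholder I made up. The script only prints the command if bbmerge.sh or the inputs are not found.

```shell
#!/bin/sh
# Sketch: estimate insert sizes by merging the ORIGINAL (untrimmed) read pairs.
r1="S42ND_L002_R1_001.fastq.gz"
r2="S42ND_L002_R2_001.fastq.gz"
ihist="S42ND_ihist.txt"    # hypothetical name for the insert-size histogram

cmd="bbmerge.sh in1=$r1 in2=$r2 ihist=$ihist"

# Run only when the tool and both inputs are available; otherwise do a dry run.
# ($cmd is left unquoted on purpose so the shell splits it into arguments;
# this assumes the file names contain no spaces.)
if command -v bbmerge.sh >/dev/null 2>&1 && [ -f "$r1" ] && [ -f "$r2" ]; then
    $cmd
else
    echo "Would run: $cmd"
fi
```

If a large share of pairs merge with insert sizes below 100 bp, the reads run through the insert into the adapter, which would fit the adapter signal starting around position 55.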

modified 29 days ago • written 29 days ago by genomax

I was running the code above in parallel with writing the post. My previous command (not posted here) did not have hdist, tpe, or tbo, and it left adapters behind. When I added hdist, tpe, and tbo, it removed all the adapter sequences. The adapters are gone.

written 29 days ago by f.dinesh

My comments were based on the command you had posted. tpe and tbo are indeed important parameters for paired-end reads. Good to hear the trimming worked.
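
For anyone finding this later, here is the posted command split up so each option can be annotated. The comments reflect my reading of the BBDuk documentation, so treat them as a summary rather than a specification. As written, the script assumes bbduk.sh is on your PATH and only prints the command if the tool or the inputs are missing.

```shell
#!/bin/sh
# The command from the original post, rebuilt piece by piece with annotations.
r1="S42ND_L002_R1_001.fastq.gz"
r2="S42ND_L002_R2_001.fastq.gz"
o1="S42ND_L002_R1_001_TRIMMED.fastq.gz"
o2="S42ND_L002_R2_001_TRIMMED.fastq.gz"

opts="ref=adapters.fa"       # adapter sequences to search for
opts="$opts ktrim=r"         # trim the matching k-mer and everything to its right (3' side)
opts="$opts k=23 mink=11"    # use 23-mers, allowing k-mers down to 11 at read ends
opts="$opts hdist=1"         # allow one mismatch (Hamming distance 1) per k-mer
opts="$opts tpe"             # trim both mates of a pair to the same length
opts="$opts tbo"             # also trim adapters found by detecting mate overlap

cmd="bbduk.sh in1=$r1 in2=$r2 out1=$o1 out2=$o2 $opts"

# Run only when the tool and both inputs are available; otherwise do a dry run.
# ($cmd is intentionally unquoted; this assumes no spaces in file names.)
if command -v bbduk.sh >/dev/null 2>&1 && [ -f "$r1" ] && [ -f "$r2" ]; then
    $cmd
else
    echo "Would run: $cmd"
fi
```

With short inserts, tbo catches adapter remnants too short for a k-mer match, and tpe keeps the two mates consistent when the adapter is detected in only one of them.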

written 29 days ago by genomax

Appreciate the support and clarity!

written 29 days ago by f.dinesh