Question

BBDuk does not remove all adapters

4.6 years ago

f.dinesh • 0

Hey all,

I'm just getting into RNA-seq analysis and have run into a problem that is very confusing. I ran multiplexed RNA-seq samples on a HiSeq (paired-end, 100 bp reads). The data was demultiplexed and, I thought, trimmed (I used Illumina TruSeq index primers).

So I ran my R1 and R2 reads through FastQC and found ~20% Illumina universal adapter sequence at the 3' end of my reads. I ran BBDuk and it removed about 10%, so I'm left with ~10% still contaminating the reads. I'm wondering what I should do next.

The initial adapter content was a linear slope rising from position 55 to 85 bp; after BBDuk the value dropped to 10%, sloping up from 72 to 85 bp. Links to adapters image graph

I'm not sure how to get rid of the rest of the adapters. FastQC has also changed from an X (fail) to an ! (warning), so might that be OK?

$ ./bbduk.sh in1=S42ND_L002_R1_001.fastq.gz in2=S42ND_L002_R2_001.fastq.gz out1=S42ND_L002_R1_001_TRIMMED.fastq.gz out2=S42ND_L002_R2_001_TRIMMED.fastq.gz ref=adapters.fa ktrim=r k=23 mink=11 hdist=1 tpe tbo
java -ea -Xmx1205m -Xms1205m -cp /media/sf_ubuntushared/bbmap/current/ jgi.BBDuk in1=S42ND_L002_R1_001.fastq.gz in2=S42ND_L002_R2_001.fastq.gz out1=S42ND_L002_R1_001_TRIMMED.fastq.gz out2=S42ND_L002_R2_001_TRIMMED.fastq.gz ref=adapters.fa ktrim=r k=23 mink=11 hdist=1 tpe tbo
Executing jgi.BBDuk [in1=S42ND_L002_R1_001.fastq.gz, in2=S42ND_L002_R2_001.fastq.gz, out1=S42ND_L002_R1_001_TRIMMED.fastq.gz, out2=S42ND_L002_R2_001_TRIMMED.fastq.gz, ref=adapters.fa, ktrim=r, k=23, mink=11, hdist=1, tpe, tbo]

Version 38.68

Thanks for any help

RNA-Seq TruSeq Adapters BBduk trimming • 2.8k views

ADD COMMENT • link updated 4.6 years ago by GenoMax 141k • written 4.6 years ago by f.dinesh • 0

Generally, when a trimming program finds the core sequence at the start of an adapter, it should remove that sequence and everything 3' of it. Your command is a common one and should have worked.

Can you try trimming with the ktrim=rl option (that is a lowercase R and L)? Let us know what that does.

Also try bbmerge.sh on the original reads to see how many of them merge (a guide is available here). I have a feeling that your inserts are short relative to the read length.
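
A merge check along these lines might look like the sketch below. It is only a dry run that prints the command; the input names are the ones from the question, while the output names (merged.fastq.gz, unmerged_R1.fastq.gz, unmerged_R2.fastq.gz, ihist.txt) are hypothetical placeholders, not anything the poster used.

```shell
# Dry-run sketch: prints a BBMerge command instead of running it.
# Input names come from the question; all output names are hypothetical.
r1=S42ND_L002_R1_001.fastq.gz
r2=S42ND_L002_R2_001.fastq.gz

# out=          merged pairs
# outu1/outu2=  pairs that did not merge
# ihist=        insert-size histogram written by BBMerge
cmd="bbmerge.sh in1=$r1 in2=$r2 out=merged.fastq.gz outu1=unmerged_R1.fastq.gz outu2=unmerged_R2.fastq.gz ihist=ihist.txt"

echo "$cmd"
# Once the printed command looks right, run it with: eval "$cmd"
```

If most pairs merge and the histogram peaks below the 100 bp read length, that would support the short-insert explanation, because a read longer than its insert runs into the adapter on the other side.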

ADD REPLY • link 4.6 years ago by GenoMax 141k

So I was running the code above in parallel with writing the post. My previous code (not posted here) did not have hdist, tpe, or tbo, and I was getting the leftover adapters. When I added hdist, tpe, and tbo to the code, it removed all the adapter sequences. The adapters are gone.

ADD REPLY • link 4.6 years ago by f.dinesh • 0

My comments were based on the command you had posted. tpe and tbo are indeed important parameters for paired-end reads. Good to hear the trimming worked.
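
For anyone landing here later, the working command from the question can be read flag by flag as in the sketch below. It is a dry run that only prints the command, and the comments are a brief summary of what each option does, as I understand the BBDuk documentation, not an exhaustive description.

```shell
# Dry-run sketch: prints the BBDuk command from the question instead of running it.
r1=S42ND_L002_R1_001.fastq.gz
r2=S42ND_L002_R2_001.fastq.gz
o1=S42ND_L002_R1_001_TRIMMED.fastq.gz
o2=S42ND_L002_R2_001_TRIMMED.fastq.gz
ref=adapters.fa

# ktrim=r  on a k-mer match, trim the match and everything to its right (3' side)
# k=23     k-mer length used to match reads against the adapter reference
# mink=11  allow shorter k-mers, down to 11, at the ends of reads
# hdist=1  tolerate one mismatch per k-mer
# tpe      trim both reads of a pair to the same length
# tbo      also trim adapters detected from the overlap between the two reads
cmd="bbduk.sh in1=$r1 in2=$r2 out1=$o1 out2=$o2 ref=$ref ktrim=r k=23 mink=11 hdist=1 tpe tbo"

echo "$cmd"
# Once the printed command looks right, run it with: eval "$cmd"
```

The last three options are the ones the earlier run lacked, which fits the leftover adapter signal near the 3' ends that disappeared once they were added.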