Question: cutadapt low read adapters percentage
Morris_Chair170 wrote:

Dear Community, I finally received the fastq files from a sequencing service and I'm doing adapter clipping and quality trimming. I noticed that for all my output files cutadapt reports a low percentage of reads with adapters, so I wonder how this is possible. Maybe I got the wrong barcode?

Thank you

=== Summary ===

Total reads processed:              25,153,241
Reads with adapters:                   234,803 (0.9%)
Reads that were too short:                 472 (0.0%)
Reads written (passing filters):    25,152,769 (100.0%)

Total basepairs processed: 3,221,015,905 bp
Quality-trimmed:                 481,378 bp (0.0%)
Total written (filtered):  3,213,762,494 bp (99.8%)

rna-seq adapter • 435 views
written 11 months ago by Morris_Chair

Typically the read length is smaller than the fragment size, so adapters are not expected to be found frequently. The only exceptions that come to my mind are ATAC-seq and small RNA-seq. What kind of data is this? Did you run fastqc beforehand to check for adapter contamination? Note that a barcode is not the same as an adapter.
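
For readers who want a quick check alongside fastqc, here is a minimal sketch that counts how many reads contain the start of the adapter as an exact substring. The function name `count_adapter_reads` and the 12-base prefix are illustrative choices, not something from this thread, and the code assumes a plain 4-line-per-record FASTQ. Exact matching will miss adapters that contain sequencing errors or are cut off by the end of the read, so treat the number as a rough lower bound.

```shell
# Count reads whose sequence line contains a given adapter prefix.
# Reads uncompressed FASTQ on stdin; assumes 4 lines per record.
# Prints: <reads containing the prefix> <total reads>
count_adapter_reads() {
    awk -v p="$1" '
        NR % 4 == 2 { total++; if (index($0, p) > 0) hits++ }
        END { printf "%d %d\n", hits, total }'
}

# Hypothetical usage on a file named as in this thread:
#   gzip -dc R1.fq.gz | count_adapter_reads GATCGGAAGAGC
```

If the first number is a small fraction of the second, that is consistent with most fragments being longer than the read length.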

written 11 months ago by ATpoint

Barcode/index sequences are not the same thing as adapters, as @ATpoint already noted. Adapters contain the index sequences, which are read as an independent read in Illumina sequencing.

In well-made libraries it is perfectly fine to find very little adapter contamination.

written 11 months ago by genomax

Hi guys, this is RNA sequencing of whole human RNA. I asked for 8 million reads, 1x75 SE (and I'm happy to have more than 8M reads per file). I checked the quality of each fastq file with fastqc, and only a couple of files show an exclamation mark for the adapter content parameter. When I run cutadapt I use the whole sequence that the service gave me, like this:

Index Adapter 5′ GATCGGAAGAGCACACGTCTGAACTCCAGTCAC CTTGTA GATCTCGTATGCCGTCTTCTGCTTGATGCCGTCTTCTGCTTG

 cutadapt -a GATCGGAAGAGCACACGTCTGAACTCCAGTCACCTTGTAGATCTCGTATGCCGTCTTCTGCTTGATGCCGTCTTCTGCTTG -q 20 -m 25 -o folder/R1.fq.gz  R1.fq.gz

The barcode (CTTGTA, in bold above) is the only part that changes between my samples. Am I doing this right?

Thank you

written 11 months ago by Morris_Chair

The barcode is in bold which is the only part that changes for all my samples

If you had more than one sample then yes. Trimming programs will generally look for the core sequence GATCGGAAGAGCACACGTCTGAACTCCAGTCA that is common to all adapters. Once they find it, they remove the core and everything 3' of it.
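
To make the "find the core, remove it and everything 3' of it" behaviour concrete, here is a deliberately simplified sketch using `sed` on bare sequence lines. The name `trim_core` is made up for illustration. This is not how cutadapt is implemented: real trimmers also tolerate mismatches, match adapters that are only partially present at the end of a read, and trim the quality string to the same length.

```shell
# Simplified 3' adapter trimming on bare sequence lines (one per line on stdin):
# delete the first exact occurrence of the core sequence and everything after it.
CORE=GATCGGAAGAGCACACGTCTGAACTCCAGTCA

trim_core() {
    sed "s/${CORE}.*//"
}

# With cutadapt itself, the same idea would be passing only the core to -a,
# keeping the other options from the command earlier in this thread:
#   cutadapt -a GATCGGAAGAGCACACGTCTGAACTCCAGTCA -q 20 -m 25 -o folder/R1.fq.gz R1.fq.gz
```

Because the core sits 5' of the index, the sample-specific barcode never needs to be part of the search sequence.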

written 11 months ago by genomax

Thank you genomax. I don't know if it was important to say that I'm clipping my samples one by one.

written 11 months ago by Morris_Chair

Your samples have been demultiplexed (you have separate files, correct?). Then index sequences have already been taken into account in that process.
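
One way to see that demultiplexing has already used the index is to tally the index recorded in the read headers, as sketched below. This assumes the headers follow the common Illumina layout where the last colon-separated field is the index (for example `... 1:N:0:CTTGTA`); files from some services or pipelines may not carry it. The function name `tally_indexes` is hypothetical.

```shell
# Tally the index sequence stored in each FASTQ header line.
# Assumes 4 lines per record and headers whose last ':'-separated
# field is the index, e.g. "@... 1:N:0:CTTGTA".
# Prints "<count> <index>" lines, most frequent first.
tally_indexes() {
    awk 'NR % 4 == 1 { n = split($0, f, ":"); count[f[n]]++ }
         END { for (i in count) print count[i], i }' | sort -k1,1nr
}

# Hypothetical usage:
#   gzip -dc R1.fq.gz | tally_indexes | head
```

In a demultiplexed file, essentially all reads should report the same index, possibly with a few near-identical variants if mismatches were allowed.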

written 11 months ago by genomax

Yes, I have separate files.

written 11 months ago by Morris_Chair

I would say you have a normal sample and things are as expected. I would proceed with the downstream analysis.

written 11 months ago by ATpoint