Question: Losing most PE reads using Cutadapt on 150bp PE metagenomes
4 weeks ago by
hannahfreund30 wrote:

Hi everyone,

I am using cutadapt to trim adapters from my paired-end metagenome reads (2 x 150 bp, Illumina NovaSeq).

Here is my code:

for i in *_R1_001.fastq; do
    # Strip the R1 suffix to get the sample prefix
    SAMPLE=$(echo "${i}" | sed 's/_R1_001\.fastq$//')
    echo "${SAMPLE}_R1_001.fastq ${SAMPLE}_R2_001.fastq"
    cutadapt --pair-adapters --pair-filter=both --discard-untrimmed -b ACTGCGAA -B TTCGCAGT -o "${SAMPLE}_R1_001_cut.fastq" -p "${SAMPLE}_R2_001_cut.fastq" "${SAMPLE}_R1_001.fastq" "${SAMPLE}_R2_001.fastq"
done


I've done this a few ways (keeping untrimmed sequences, not keeping them, etc) - and for whatever reason, I am keeping only 0.2% of my reads (will attach picture of results).

I ran the sequences through FastQC and every single metagenome is great quality - so I am not sure why I am losing so many reads!

Does anyone know why this could be happening, and how to improve the reads I am keeping?

I was thinking of trying trimmomatic but I am not sure how to go about using custom adapters for trimmomatic.

modified 4 weeks ago by h.mon • written 4 weeks ago by hannahfreund3

You are discarding untrimmed reads; perhaps your reads don't contain the adapters.

written 4 weeks ago by Istvan Albert
4 weeks ago by
h.mon31k wrote:

As Istvan Albert said, you are discarding untrimmed reads, so all reads that do not contain the adapter will be removed - in general, this is useful when one is removing primers, not adapters.
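One quick way to check whether the reads contain a given sequence at all is to count exact matches in the sequence lines of each FASTQ file. The sketch below is only an illustration: `count_reads_with` is a hypothetical helper name, it assumes plain 4-line FASTQ records, and it counts exact matches only (cutadapt also tolerates mismatches and partial matches at read ends, so its numbers will differ).

```shell
# Hypothetical helper: count reads whose sequence line contains a given string.
# Assumes uncompressed FASTQ with exactly 4 lines per record.
count_reads_with() {
    seq=$1
    fastq=$2
    # Line 2 of every 4-line record is the sequence line.
    awk -v s="$seq" 'NR % 4 == 2 { total++; if (index($0, s)) hits++ }
        END { printf "%d of %d reads contain %s\n", hits, total, s }' "$fastq"
}

# Report the match rate for every R1 file in the current directory.
for f in *_R1_001.fastq; do
    [ -e "$f" ] || continue    # skip if the glob matched nothing
    count_reads_with ACTGCGAA "$f"
done
```

Keep in mind that an 8 bp sequence is short enough to occur by chance: under a uniform random-sequence assumption, roughly 0.2% of 150 bp reads (143 positions / 4^8) would contain any particular 8-mer, so a match rate in that range does not by itself indicate real adapter content.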

Besides, it seems to me the sequences -b ACTGCGAA -B TTCGCAGT are barcodes, not adapters. Unless these are inline barcodes, they were sequenced during specific cycles in Illumina machines, are used to demultiplex a sequencing run, and are output to separate fastq files.

written 4 weeks ago by h.mon

Thank you for pointing that out, you're right.

I am trying to use bbduk to remove the adapters w/ their respective indexes (before I was clearly just looking for indexes) - hopefully this works.

Do you have another recommendation for removing the adapters with their indexes? FastQC says there's a universal Illumina adapter attached on the 3' end but otherwise the sequences are good quality. Some samples show that they have some overrepresented sequences (the adapters + indexes) and others don't, so I am not exactly sure how to correctly proceed.

Thanks for your help and guidance!

EDIT: I used BBDuk to get rid of the adapters with embedded indexes, and it was successful! I double-checked my output with FastQC. Thanks again for your input!

modified 24 days ago • written 27 days ago by hannahfreund3

BBDuk is fine and should eliminate all adapters. If this is not happening, you can post your command so we can check what you are doing.
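For reference, a typical paired-end adapter-trimming call with BBDuk might look like the sketch below. This is not the command used in this thread (it was never posted). The parameter values (`ktrim=r k=23 mink=11 hdist=1 tpe tbo`) follow the usual adapter-trimming settings suggested in the BBDuk documentation, while `bbduk_cmd`, the `_trim` output names, and the `adapters.fa` path are assumptions for illustration. The helper only prints each command so it can be reviewed first.

```shell
# Hypothetical helper: print a BBDuk adapter-trimming command for one sample prefix.
# $1 = sample prefix, $2 = FASTA file of adapter sequences (path is an assumption).
bbduk_cmd() {
    sample=$1
    adapters=$2
    printf 'bbduk.sh in1=%s_R1_001.fastq in2=%s_R2_001.fastq out1=%s_R1_001_trim.fastq out2=%s_R2_001_trim.fastq ref=%s ktrim=r k=23 mink=11 hdist=1 tpe tbo\n' \
        "$sample" "$sample" "$sample" "$sample" "$adapters"
}

# Print one command per sample found in the current directory.
for r1 in *_R1_001.fastq; do
    [ -e "$r1" ] || continue           # skip if the glob matched nothing
    sample=${r1%_R1_001.fastq}         # strip the R1 suffix
    bbduk_cmd "$sample" adapters.fa
done
```

Once the printed commands look right, they can be run by piping the loop's output to `sh`. Here `ktrim=r` trims the adapter and everything to its right (3' side), and `tpe`/`tbo` trim both reads of a pair consistently and use read overlap to detect adapters.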

written 26 days ago by h.mon