Losing most PE reads using Cutadapt on 150bp PE metagenomes
12 months ago

Hi everyone,

I am using cutadapt to trim adapters from my paired-end metagenome reads (2 x 150 bp, Illumina NovaSeq).

Here is my code:

for i in *_R1_001.fastq; do
    # Strip the R1 suffix to get the sample prefix shared by both mates
    SAMPLE=$(echo "${i}" | sed 's/_R1_001\.fastq$//')
    echo "${SAMPLE}_R1_001.fastq ${SAMPLE}_R2_001.fastq"
    cutadapt --pair-adapters --pair-filter=both --discard-untrimmed -b ACTGCGAA -B TTCGCAGT -o "${SAMPLE}_R1_001_cut.fastq" -p "${SAMPLE}_R2_001_cut.fastq" "${SAMPLE}_R1_001.fastq" "${SAMPLE}_R2_001.fastq"
done


I've tried this a few ways (keeping untrimmed sequences, discarding them, etc.), and for whatever reason I am keeping only 0.2% of my reads (will attach a picture of the results).

I ran the sequences through FastQC and every single metagenome is great quality - so I am not sure why I am losing so many reads!

Does anyone know why this could be happening, and how to improve the reads I am keeping?

I was thinking of trying trimmomatic but I am not sure how to go about using custom adapters for trimmomatic.

metagenome cutadapt pairedend PE shotgun • 415 views

You are discarding untrimmed reads; perhaps your reads don't contain the adapters.
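
One rough way to test this is to count how many reads contain an exact copy of the sequence. The sketch below assumes plain, uncompressed 4-line FASTQ records; the function name and the file name in the usage line are placeholders, not something from the original post. It only does exact matching, whereas cutadapt also accepts partial and error-tolerant matches, so the two numbers will not agree exactly. For scale, a back-of-envelope estimate under uniform base composition is that any given 8-mer turns up by chance in roughly 143 / 4^8, or about 0.2%, of 150 bp reads.

```shell
# Rough check: how many reads contain an exact copy of a given sequence?
# Assumes uncompressed FASTQ with exactly 4 lines per record.
count_reads_with_seq() {
    # $1 = FASTQ file, $2 = sequence to look for
    # Prints "<reads containing the sequence> <total reads>"
    awk -v seq="$2" '
        NR % 4 == 2 { total++; if (index($0, seq) > 0) hits++ }
        END { printf "%d %d\n", hits + 0, total + 0 }
    ' "$1"
}

# Hypothetical usage (file name is a placeholder):
# count_reads_with_seq sample_R1_001.fastq ACTGCGAA
```

If the first number is a tiny fraction of the second, --discard-untrimmed will throw away nearly everything.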

12 months ago
h.mon 33k

As Istvan Albert said, you are discarding untrimmed reads, so all reads that do not contain the adapter will be removed - in general, this is useful when one is removing primers, not adapters.

Besides, it seems to me the sequences -b ACTGCGAA -B TTCGCAGT are barcodes, not adapters. Unless these are inline barcodes, they were sequenced during specific cycles in Illumina machines, are used to demultiplex a sequencing run, and are output to separate fastq files.
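
A quick way to see whether these are the sample indexes is to look at the FASTQ headers. This is a minimal sketch that assumes CASAVA 1.8+ style headers, where the index is the last colon-separated field (for example `1:N:0:ACTGCGAA+TTCGCAGT`); whether the files in question carry such headers is an assumption, and the function and file names are made up for illustration.

```shell
# Tally the index field of Illumina-style FASTQ headers.
# Assumes headers like:
#   @M01:1:FC:1:1101:1000:2000 1:N:0:ACTGCGAA+TTCGCAGT
tally_header_indexes() {
    # $1 = FASTQ file; prints "<count> <index>", most frequent first
    awk 'NR % 4 == 1 { n = split($0, f, ":"); print f[n] }' "$1" \
        | sort | uniq -c | sort -k1,1nr | awk '{ print $1, $2 }'
}

# Hypothetical usage (file name is a placeholder):
# tally_header_indexes sample_R1_001.fastq | head
```

If the sequences given to -b and -B show up here, they were read as indexes rather than as part of the insert.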


Thank you for pointing that out, you're right.

I am trying to use bbduk to remove the adapters w/ their respective indexes (before I was clearly just looking for indexes) - hopefully this works.

Do you have another recommendation for removing the adapters with their indexes? FastQC says there's a universal Illumina adapter attached on the 3' end but otherwise the sequences are good quality. Some samples show that they have some overrepresented sequences (the adapters + indexes) and others don't, so I am not exactly sure how to correctly proceed.

Thanks for your help and guidance!

EDIT: I used bbduk.sh to get rid of the adapters with embedded indexes and it was successful! I double-checked my output with FastQC. Thanks again for your input!


BBDuk is fine and should eliminate all adapters. If this is not happening, you can post your command so we can check what you are doing.
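
For reference, a typical paired-end adapter-trimming call looks roughly like the sketch below. The command that actually worked was not posted in this thread, so this is an assumption: the parameters follow the commonly suggested BBDuk adapter-trimming settings, and `adapters_with_indexes.fa`, the output names, and the helper function are hypothetical placeholders. The loop only prints the commands so they can be inspected before running anything.

```shell
# Sketch only: file names and the adapter FASTA below are placeholders.
ADAPTERS="adapters_with_indexes.fa"   # assumed: FASTA of adapters incl. sample indexes

build_bbduk_cmd() {
    # $1 = sample prefix shared by the R1 and R2 files
    # ktrim=r trims the matched adapter and everything to its right (3' side);
    # k, mink and hdist control k-mer matching; tpe and tbo are paired-end options.
    echo bbduk.sh \
        "in1=${1}_R1_001.fastq" "in2=${1}_R2_001.fastq" \
        "out1=${1}_R1_001_trim.fastq" "out2=${1}_R2_001_trim.fastq" \
        "ref=${ADAPTERS}" ktrim=r k=23 mink=11 hdist=1 tpe tbo
}

for r1 in *_R1_001.fastq; do
    [ -e "$r1" ] || continue            # skip if the glob matched nothing
    sample="${r1%_R1_001.fastq}"        # strip the R1 suffix
    build_bbduk_cmd "$sample"           # print the command for review
    # To run it instead (requires BBTools on PATH): eval "$(build_bbduk_cmd "$sample")"
done
```

Note there is nothing equivalent to --discard-untrimmed here, so reads without adapters pass through unchanged.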

