Bowtie2: Error, fewer reads in file specified with -1 than in file specified with -2
12 months ago
zdiazmar ▴ 30

Hi all,

This is my first time attempting to align sequences to a reference index. I am using bowtie2 with the -1 and -2 arguments and have gotten the following error message:

Error, fewer reads in file specified with -1 than in file specified with -2
terminate called after throwing an instance of 'int'
(ERR): bowtie2-align died with signal 6 (ABRT) (core dumped)

The bowtie2 manual states that

Sequences specified with this option must correspond file-for-file and read-for-read with those specified in <m2>

I understand that my error message means that one or more of my R1 files have fewer reads than the associated R2 files so that they likely don't correspond read-for-read.

How can I ensure that my R1 and R2 files correspond to one another read-for-read? Currently, the fq.gz files I am working with have been trimmed to remove adapters and trimmed to 150 bp. I am concerned that this trimming step is what has caused the error with bowtie2, but I am not sure I feel comfortable using the untrimmed sequences.

Any relevant thoughts would be welcome - thanks for your help!

alignment bowtie2 • 1.6k views
12 months ago
GenoMax 141k
${sample}.R1.paired.trimmed.fq.gz ${sample}.R2.paired.trimmed.fq.gz \
    ${sample}.R1.unpaired.trimmed.fq.gz ${sample}.R2.unpaired.trimmed.fq.gz \

It looks like the order of your output files is incorrect. The trimmomatic manual gives the following order:

 <paired output 1> <unpaired output 1> <paired output 2> <unpaired output 2>
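
For reference, here is a minimal sketch of the trimming step from this thread with the four output files in the order the manual gives. The `trim_pair` function name and the `TRIMMOMATIC` override variable are made up for illustration; the jar path, file naming and trimming steps are copied from the loop posted in this thread, and `-trimlog` is left out for brevity.

```shell
#!/usr/bin/env bash
# TRIMMOMATIC is a hypothetical override hook; the default is the command
# used in the loop posted in this thread.
TRIMMOMATIC=${TRIMMOMATIC:-"java -jar /software/trimmomatic/0.39/trimmomatic-0.39.jar"}

# Trim one sample in paired-end mode. Output order follows the manual:
# <paired output 1> <unpaired output 1> <paired output 2> <unpaired output 2>
trim_pair() {
    local sample=$1
    # $TRIMMOMATIC is unquoted on purpose so "java -jar ..." splits into words.
    $TRIMMOMATIC PE -phred33 \
        "${sample}_R1_001.fastq.gz" "${sample}_R2_001.fastq.gz" \
        "${sample}.R1.paired.trimmed.fq.gz" "${sample}.R1.unpaired.trimmed.fq.gz" \
        "${sample}.R2.paired.trimmed.fq.gz" "${sample}.R2.unpaired.trimmed.fq.gz" \
        ILLUMINACLIP:NexteraPE-PE.fa:2:30:10 CROP:150
}

# Usage:
#   for sample in $samples; do trim_pair "$sample"; done
```

With this order, the two `paired` files are the ones to pass to bowtie2 with -1 and -2.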

Oh no - how embarrassing to make such a simple mistake! You are correct and that was my issue. I re-ran trimmomatic with the correction and bowtie2 is now working. Thank you for your time!


No problem. Happens to all of us at some point.

That is why I prefer bbduk.sh from the BBMap suite. The options are easy to understand, and you can't go wrong with:

bbduk.sh -Xmx2g in1=R1.fq.gz in2=R2.fq.gz out1=R1_trim.fq.gz out2=R2_trim.fq.gz
12 months ago
GenoMax 141k

I am concerned that this trimming step is what has caused the error with bowtie2

That is more than likely. You should always trim your sequence files together (never process them independently). That said, you can bring them back in sync with repair.sh from the BBMap suite, which will also extract the remaining singletons.

repair.sh -Xmx10g in1=Sample_R1_001.fastq.gz in2=Sample_R2_001.fastq.gz out1=Sample_fixed_R1_001.fastq.gz out2=Sample_fixed_R2_001.fastq.gz outs=Sample_single.fastq.gz repair
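
A quick way to see whether a pair of files is out of sync is to compare read counts. The sketch below assumes standard 4-line FASTQ records and gzip-compressed input; the `count_reads` and `same_read_count` names are made up for illustration. Equal counts are necessary but not sufficient for read-for-read correspondence, so treat this as a first sanity check only.

```shell
#!/usr/bin/env bash
# Count reads in a gzipped FASTQ file, assuming 4 lines per record.
count_reads() {
    echo $(( $(gzip -cd "$1" | wc -l) / 4 ))
}

# Succeed only if both files contain the same number of reads.
same_read_count() {
    [ "$(count_reads "$1")" -eq "$(count_reads "$2")" ]
}

# Usage (file names are placeholders):
#   same_read_count Sample_R1.fq.gz Sample_R2.fq.gz || echo "R1/R2 read counts differ"
```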

I am not sure I feel comfortable using the untrimmed sequences.

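While it is good to be concerned, in most cases aligners will soft-clip bases that do not map, so untrimmed reads will still work (note that bowtie2 only soft-clips in --local mode; its default end-to-end mode does not). You would want to remove all extraneous sequence if you are planning to do any de novo work.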


I would rather repeat the trimming with a proper tool, as this must not happen. You probably trimmed R1 and R2 independently, but for PE data you must use a paired-end-aware trimmer such as fastp, cutadapt, or bbduk from the BBMap suite mentioned above.


Thanks ATpoint and GenoMax! For the trimming step, I used trimmomatic v0.39. I did specify paired-end trimming (the PE argument in trimmomatic) and also gave the program the R1 and R2 sequence files. Additionally, this program outputs paired fq.gz files, which is what I then used in bowtie2. I have seen other folks recommend programs like cutadapt or bbmap over trimmomatic, so maybe I will start there!


Again, it should not happen that the files have different numbers of reads. Did the job get killed during the run? trimmomatic is a very common tool that usually gets this job done.
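
One way to check for a killed job is to test the compressed outputs, since an interrupted run typically leaves a truncated .gz file. A minimal sketch, assuming the outputs are gzip files; the `check_gz` name is made up for illustration, and `gzip -t` is the standard integrity test.

```shell
#!/usr/bin/env bash
# Report every gzip file that fails the integrity test,
# e.g. because the job writing it was killed part-way through.
check_gz() {
    local rc=0 f
    for f in "$@"; do
        if ! gzip -t "$f" 2>/dev/null; then
            echo "corrupt or truncated: $f"
            rc=1
        fi
    done
    return $rc
}

# Usage:
#   check_gz *.paired.trimmed.fq.gz
```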


I just re-ran trimmomatic and am getting the same error with bowtie2. Below is the loop I've written to trim my sequences. I have a couple hundred samples, so I've removed many of them, but kept a few so that you can get an idea of the loop. Am I doing something wrong here?

samples="CB-03-02_S178
CB-99-16_S28
CB-99-22_S73
CB-99-61_S164"

i=1
for sample in $samples
do
    java -jar /software/trimmomatic/0.39/trimmomatic-0.39.jar PE -phred33 -trimlog trim.log \
    ${sample}_R1_001.fastq.gz ${sample}_R2_001.fastq.gz \
    ${sample}.R1.paired.trimmed.fq.gz ${sample}.R2.paired.trimmed.fq.gz \
    ${sample}.R1.unpaired.trimmed.fq.gz ${sample}.R2.unpaired.trimmed.fq.gz \
    ILLUMINACLIP:NexteraPE-PE.fa:2:30:10 CROP:150
    let "i+=1";
done