Question: R1 and R2 matching
21 months ago by
ieie10 wrote:

Dear all,

I have a number of X_R1_001.fastq.gz and X_R2_001.fastq.gz files and I would like to match them. I have seen a pipeline that gives this:

# R1 and R2 matching with for i in Filter-Cutadapt-Demul*.fastq; do perl -f $i -r Filter-Cutadapt-R2.fastq -of Paired-$i -or R2-Paired-$i -os Single-$i; done

but the perl script is not given. Does anybody know how to match the fastq.gz files? I would like to try matching them because when I run bwa mem I get this:

[M::process] read 66226 sequences (10000126 bp)...
[M::process] read 66226 sequences (10000126 bp)..
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 144, 0, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (225, 294, 355)

I am not sure that matching the fastq files is the solution. Does anybody have a better one? Thanks in advance.

bwa next-gen • 964 views

What makes you believe that your two files are not matched? Normally R1 and R2 files are already matched, by which I mean the order of the reads is the same in both files.

written 21 months ago by Benn
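Whether two files are "matched" in this sense can be checked directly by comparing read names record by record. The following is a minimal sketch, assuming standard 4-line FASTQ records and read names that are identical between mates apart from an optional `/1` or `/2` suffix; the function names `read_names` and `first_mismatch` are made up for this illustration.

```python
import gzip
from itertools import zip_longest


def read_names(path):
    """Yield the read name of each 4-line FASTQ record (plain or gzipped)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 != 0:
                continue  # only the first line of each record holds the name
            fields = line[1:].split()
            if not fields:
                continue  # tolerate a trailing blank line
            name = fields[0]
            # Assumption: old-style names mark the mate with a /1 or /2 suffix.
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name


def first_mismatch(r1_path, r2_path):
    """Return (record_index, r1_name, r2_name) at the first disagreement, else None."""
    pairs = zip_longest(read_names(r1_path), read_names(r2_path))
    for index, (name1, name2) in enumerate(pairs):
        if name1 != name2:
            return index, name1, name2
    return None
```

Calling `first_mismatch("X_R1_001.fastq.gz", "X_R2_001.fastq.gz")` returns `None` when both files list the same reads in the same order. A name paired with `None` in the result means one file ran out of records before the other.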

Thanks! Since BWA was not finding the pairs, I thought there was a problem with the matching.

written 21 months ago by ieie

bwa seems to find F-R (forward reverse) orientated reads, which is the most logical orientation for paired end reads.

written 21 months ago by Benn
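The orientation counts can be read straight from the `[M::mem_pestat]` line quoted in the question. Here is a small sketch that parses that line into a dictionary; the regular expression is written against the exact wording shown in this thread, and the helper name `orientation_counts` is hypothetical.

```python
import re

# Pattern matches the wording of the mem_pestat line quoted in the question.
PESTAT = re.compile(
    r"candidate unique pairs for \(FF, FR, RF, RR\): "
    r"\((\d+), (\d+), (\d+), (\d+)\)"
)


def orientation_counts(log_text):
    """Return a dict of candidate pair counts per orientation, or None if absent."""
    match = PESTAT.search(log_text)
    if match is None:
        return None
    return dict(zip(("FF", "FR", "RF", "RR"), map(int, match.groups())))


line = ("[M::mem_pestat] # candidate unique pairs for "
        "(FF, FR, RF, RR): (0, 144, 0, 0)")
counts = orientation_counts(line)
print(counts)                          # {'FF': 0, 'FR': 144, 'RF': 0, 'RR': 0}
print(max(counts, key=counts.get))     # FR
```

In the log from the question, all 144 candidate pairs fall in the FR orientation, which is why the other orientations are skipped.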

Hi There,

Can you please explain what you mean by matching reads?

written 21 months ago by Sej Modha

I mean matching the reads in a way that lets bwa mem find the pairs when it runs. Another thing that I don't understand is that after running:

bwa mem -M -B 4 reference.fas x_R1.fastq.gz x_R2.fastq.gz > x.sam

and then I run bwa mem only for the R1 file:

bwa mem -M -B 4 reference.fas x_R1.fastq.gz > x.sam

when I look at both BAM files, I see a max coverage of 3000 for the R1 alignment and 6000 for the paired-end alignment. Is that correct? Since bwa is matching the pairs, I was expecting the same coverage, 3000.

written 21 months ago by ieie

Hello ieie,

It's not clear to me what your problem is. The message you see from bwa mem is just normal informational output. It's neither an error nor a warning.

fin swimmer

written 21 months ago by finswimmer

My problem was that I thought there was an error in bwa while it was trying to find the pairs. But I understand now that these messages just say that it skips any orientation that does not have enough pairs (FF here) and uses another orientation (FR). Thanks for your answer.

written 21 months ago by ieie
21 months ago by
United States
genomax84k wrote:

Use from BBMap suite. It will allow you to properly pair R1/R2 reads in the files and collect singleton reads in a separate file.

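You should never trim your R1/R2 reads independently, if that is what led to this problem. Trimming each file on its own can discard a read from one file while keeping its mate in the other, which breaks the pairing.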
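To illustrate what re-pairing involves, here is a minimal sketch in Python. It is not the BBMap tool and makes no claim about how that tool works internally: it simply indexes R2 by read name, walks through R1, and sends reads without a mate to a singleton output. It assumes 4-line FASTQ records, mate names that differ at most by a `/1` or `/2` suffix, and an R2 file small enough to hold in memory; `parse_fastq` and `repair_pairs` are hypothetical names.

```python
def parse_fastq(handle):
    """Yield (name, record_text) for each 4-line FASTQ record in an open text handle."""
    while True:
        record = [handle.readline() for _ in range(4)]
        if not record[0].strip():
            return  # end of file (or trailing blank line)
        name = record[0][1:].split()[0]
        # Assumption: old-style names mark the mate with a /1 or /2 suffix.
        if name.endswith(("/1", "/2")):
            name = name[:-2]
        yield name, "".join(record)


def repair_pairs(r1_in, r2_in, r1_out, r2_out, singles_out):
    """Write properly paired reads (in R1 order) and collect the rest as singletons.

    Returns (number_of_pairs, number_of_singletons).
    """
    # Assumption: all of R2 fits in memory; large files need a different strategy.
    r2_records = dict(parse_fastq(r2_in))
    pairs = singles = 0
    for name, record in parse_fastq(r1_in):
        mate = r2_records.pop(name, None)
        if mate is None:
            singles_out.write(record)  # R1 read whose mate is missing
            singles += 1
        else:
            r1_out.write(record)
            r2_out.write(mate)
            pairs += 1
    for record in r2_records.values():
        singles_out.write(record)  # R2 reads never claimed by an R1 read
        singles += 1
    return pairs, singles
```

The function takes open text handles, so gzipped input can be passed in with `gzip.open(path, "rt")` and output written with `gzip.open(path, "wt")`. For real data sets a dedicated, tested tool is the safer choice.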
