BBMap/sh/repair.sh not finding pairs
14 months ago
Eliott • 0

Hi,

I am a beginner bioinformatician. I have paired-end fastq.gz files from Illumina (file1 for R1 and file2 for R2). I want to use these files for downstream analysis, but I cannot because they don't contain the same number of reads. (Within the files, read 1 and read 2 are distinguished by "_1" and "_2".)

I therefore used the repair.sh script from BBMap to remove reads that are present in only one of the two files.

repair.sh ran and found reads, but it could not find any pairs (see output), so the output files are empty.

I tried different parameters but nothing worked...

How can I repair my files easily? Or how can I make repair.sh work?

Thanks!!

[Image: repair.sh output]

BBMap fastq repair paired-end repair.sh • 755 views

At least two things are missing from your post: the exact repair command you used, and some content from your reads.

You may have mismatched files that are not mate pairs. If that is the case, there is nothing that can be done. A simple way of ruling that out is by typing head read1.fastq and head read2.fastq and showing us the output of those commands.
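Because the files in the question are gzipped, plain head would print compressed bytes. A minimal sketch of the same check for .gz input follows; the function names count_reads and show_headers and the file names are hypothetical, and it assumes standard 4-line FASTQ records.

```shell
# Hypothetical helpers; the file names in the usage lines are placeholders.

# Print the number of FASTQ records in a gzipped file
# (assumes exactly 4 lines per record).
count_reads() {
    gzip -cd "$1" | awk 'END { print NR / 4 }'
}

# Print the first n header lines (lines 1, 5, 9, ...); n defaults to 3.
show_headers() {
    gzip -cd "$1" | awk -v n="${2:-3}" 'NR % 4 == 1 { print; if (++c >= n) exit }'
}

# Usage (placeholder file names):
#   count_reads file1.fastq.gz
#   count_reads file2.fastq.gz
#   show_headers file1.fastq.gz
#   show_headers file2.fastq.gz
```

If the headers of the two files share no common read names at all, the files are probably not mates of each other.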

14 months ago

When running repair.sh to fix pairs with

repair.sh in1=broken1.fq in2=broken2.fq out1=fixed1.fq out2=fixed2.fq outs=singletons.fq repair

it will reassign pairs based on the read names if they are:

  • Illumina format (identical prefix followed by 1: and 2:, or by /1 and /2)
  • Completely identical for both reads in a pair

Since your reads are designated with an underscore (_1 and _2), neither condition is met and no pairs are found. You have to preprocess your FASTQ files first, for example with

sed -i '.backup'  "s/_\([12]\)/\/\1/g"  *.fq
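A few caveats about the sed command above: `-i '.backup'` is the BSD/macOS form (GNU sed expects `-i.backup` with no space), it does not read .gz files, and it rewrites every `_1` or `_2` on every line, including quality strings, where `_` is a valid character. As a more conservative alternative, here is a sketch that touches only header lines of gzipped input. The function name fix_names and the file names are hypothetical; it assumes 4-line FASTQ records and that the `_1`/`_2` suffix ends the read name (the part of the header before the first space).

```shell
# Hypothetical helper: fix_names IN.fastq.gz OUT.fastq.gz
# Rewrites only header lines (lines 1, 5, 9, ...) so that a read name
# ending in _1 or _2 ends in /1 or /2 instead. Sequence, "+" and quality
# lines are copied unchanged.
fix_names() {
    gzip -cd "$1" |
        awk '
            NR % 4 == 1 {
                name = $0; rest = ""
                i = index($0, " ")          # split off any header comment
                if (i > 0) { name = substr($0, 1, i - 1); rest = substr($0, i) }
                if (name ~ /_[12]$/) {
                    # drop the trailing "_N" and append "/N"
                    name = substr(name, 1, length(name) - 2) "/" substr(name, length(name))
                }
                print name rest
                next
            }
            { print }
        ' |
        gzip -c > "$2"
}

# Usage (placeholder file names):
#   fix_names file1.fastq.gz renamed1.fastq.gz
#   fix_names file2.fastq.gz renamed2.fastq.gz
# then run the repair.sh command shown above on the renamed files.
```

Writing to new files rather than editing in place keeps the originals intact, so no backup suffix is needed.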