Cannot repair BWA paired reads that have different names
7.1 years ago
Alex ▴ 10

The error that I am receiving is:

[bwa_sai2sam_pe_core] paired reads have different names: "SRR567550.1.1", "SRR567550.1.2"

I think the problem is the '.1' and '.2' suffixes, and I have tried to fix this with repair.sh from BBMap, as suggested in the thread "bwa sampe: paired ends with different names". Unfortunately, my FASTQ files are each ~8 GB, and Java runs out of memory even when I allocate the maximum memory I can and set compression to 9 (max).

One idea I had is to keep only the first column using awk, but this would result in information loss.

I am unsure how to resolve this. Is there a quick/efficient way to strip the '.1' and '.2' while preserving information?

BWA alignment • 1.8k views
7.1 years ago
GenoMax 141k

Download the original fastq format files for this sample from EBI-ENA here and avoid this issue altogether.

If you used fastq-dump to dump the reads, you should have used the -F option to get the original read headers without the extra suffixes that SRA adds.


Thank you, I didn't know about either of these options.
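
For cases where re-downloading is not practical, the question of stripping the '.1' and '.2' without loading the whole file into memory can be sketched as a streaming pass over the FASTQ records. This is only a minimal sketch, not something from the thread: it assumes strict four-line FASTQ records and headers of the form `@SRR567550.1.1 ...`, and the function names (`strip_mate_suffix`, `fix_lines`, `fix_fastq_file`) are made up for illustration. Only the read name is changed; anything after the first whitespace in the header is kept.

```python
import gzip
import re

# Matches "<name>.<spot>.<mate>" at the start of a line, where mate is 1 or 2.
# Names with a single numeric field (e.g. "@SRR567550.1") are left untouched.
MATE_SUFFIX = re.compile(r"^(\S+\.\d+)\.[12](?=\s|$)")


def strip_mate_suffix(line):
    """Drop a trailing .1/.2 from the read name, keeping the rest of the line."""
    return MATE_SUFFIX.sub(r"\1", line, count=1)


def fix_lines(lines):
    """Yield FASTQ lines with the mate suffix removed from header lines.

    Lines are selected by position within each 4-line record (header and
    '+' line), because a quality string may itself start with '@' or '+'.
    """
    for i, line in enumerate(lines):
        if i % 4 in (0, 2):
            line = strip_mate_suffix(line)
        yield line


def open_text(path, mode):
    """Open plain or gzip-compressed text, chosen by file extension."""
    if path.endswith(".gz"):
        return gzip.open(path, mode + "t")
    return open(path, mode)


def fix_fastq_file(in_path, out_path):
    """Stream one FASTQ file to another, one line at a time."""
    with open_text(in_path, "r") as src, open_text(out_path, "w") as dst:
        dst.writelines(fix_lines(src))
```

It would be called once per mate file, for example `fix_fastq_file("SRR567550_1.fastq", "SRR567550_1.fixed.fastq")` (file names here are hypothetical). Because each line is read, rewritten, and written immediately, memory use should stay small regardless of file size.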
