bwa mem paired-end read name errors
3.8 years ago
dp ▴ 50

When using the latest version (0.7.17) of bwa mem, it seems that paired-end reads (split across two files) must have names like read1/1 and read1/2 (with a "/" separator), whereas an older version I used (0.7.5) also accepted names like read1_1 and read1_2, or read1.1 and read1.2.

Why is this the case in the latest version? What is the most up to date version that supports these different naming conventions? Is there a way to prevent the most recent bwa mem from erroring out on reads that use these naming conventions?

bwa bwa mem paired end reads • 1.6k views

As far as I can see, the code that trims the read-name suffix is still there: https://github.com/lh3/bwa/blob/master/bwaseqio.c#L209

    { // trim /[12]$
        int t = strlen(p->name);
        if (t > 2 && p->name[t-2] == '/' && (p->name[t-1] == '1' || p->name[t-1] == '2')) p->name[t-2] = '\0';
    }
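To make the effect of that snippet concrete, here is a minimal standalone sketch of the same trimming rule applied to plain C strings. The helper names `trim_pair_suffix` and `mates_match` are hypothetical and are not part of bwa; the 256-byte buffers are an assumption made only to keep the example short.

```c
#include <stdio.h>
#include <string.h>

/* Same rule as the quoted snippet: drop a trailing "/1" or "/2" in place.
 * trim_pair_suffix is a hypothetical name, not a bwa function. */
static void trim_pair_suffix(char *name)
{
    size_t t = strlen(name);
    if (t > 2 && name[t-2] == '/' && (name[t-1] == '1' || name[t-1] == '2'))
        name[t-2] = '\0';
}

/* Returns 1 if the two mate names are identical after trimming, else 0.
 * Names longer than 255 characters are truncated (assumption for brevity). */
static int mates_match(const char *a, const char *b)
{
    char x[256], y[256];
    snprintf(x, sizeof x, "%s", a);
    snprintf(y, sizeof y, "%s", b);
    trim_pair_suffix(x);
    trim_pair_suffix(y);
    return strcmp(x, y) == 0;
}
```

Under this rule, read1/1 and read1/2 both reduce to read1 and compare equal, while read1_1 vs read1_2 and read1.1 vs read1.2 are left untouched and compare as different names.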

Thanks for the quick response. I'm not sure I completely understood this code, but it looks like it specifically is trimming on the "/" character, so this won't happen on a "." or "_" character.

It seems that some read files use "." or "_" instead of "/", and I am able to run bwa mem v0.7.5 on them. But in v0.7.17 I get the error "paired reads have different names", since these suffixes are not trimmed.

I'm not sure why all the options can't be supported in the latest release.
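One possible workaround, sketched here rather than taken from bwa, is to rewrite the read names before alignment so that a trailing "_1", "_2", ".1" or ".2" becomes "/1" or "/2". The function name `normalize_pair_suffix` is hypothetical, and the sketch assumes it is given only the name token of the FASTQ header (the text after "@" and before any whitespace). It would also rewrite names that legitimately end in "_1" or ".2" for other reasons, so it only makes sense for files known to use this convention.

```c
#include <string.h>

/* Rewrite a trailing "_1", "_2", ".1" or ".2" to "/1" or "/2" in place.
 * Returns 1 if the name was changed, 0 otherwise.
 * normalize_pair_suffix is a hypothetical helper, not a bwa function. */
static int normalize_pair_suffix(char *name)
{
    size_t t = strlen(name);
    if (t > 2 && (name[t-2] == '_' || name[t-2] == '.') &&
        (name[t-1] == '1' || name[t-1] == '2')) {
        name[t-2] = '/';  /* only the separator changes; the mate digit stays */
        return 1;
    }
    return 0;
}
```

After this rewrite, read1_1 becomes read1/1, which the "/" trimming rule quoted earlier in the thread would then reduce to read1 for both mates.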

> Thanks for the quick response. I'm not sure I completely understood this code, but it looks like it specifically is trimming on the "/" character, so this won't happen on a "." or "_" character.

Exactly.

Maybe there is somewhere else in the old version that handles the other formats?

Or rather, maybe the old version doesn't check whether the two mates' names are the same?

The real question is what program or instrument has produced these reads with unusual names like read1_1 or read1.1.

It's an Illumina HiSeq 2000.
