Question: BWA paired reads have different names error
crysis405 (United Kingdom) wrote, 4.6 years ago:
[bwa_sai2sam_pe_core] convert to sequence coordinate...
[infer_isize] (25, 50, 75) percentile: (144, 213, 328)
[infer_isize] low and high boundaries: 100 and 696 for estimating avg and std
[infer_isize] inferred external isize from 111344 pairs: 247.404 +/- 131.705
[infer_isize] skewness: 1.110; kurtosis: 0.633; ap_prior: 2.92e-05
[infer_isize] inferred maximum insert size: 1103 (6.50 sigma)
[bwa_sai2sam_pe_core] time elapses: 10.79 sec
[bwa_sai2sam_pe_core] changing coordinates of 6435 alignments.
[bwa_sai2sam_pe_core] align unmapped mate...
[bwa_paired_sw] 29497 out of 36462 Q17 singletons are mated.
[bwa_paired_sw] 1384 out of 8191 Q17 discordant pairs are fixed.
[bwa_sai2sam_pe_core] time elapses: 4.24 sec
[bwa_sai2sam_pe_core] refine gapped alignments... 1.88 sec
[bwa_sai2sam_pe_core] print alignments... [bwa_sai2sam_pe_core] paired reads have different names: "I@ILLUMINA:381:D1HHHACXX:1:2312:14474:27286", "ILLUMINA:381:D1HHHACXX:1:2312:14474:27286" 

Using Version: 0.7.9a-r786

Can anyone shed some light on what might be causing this error? Stampy had no problem with the exact same files.

EDIT:

grep -n "ILLUMINA:381:D1HHHACXX:1:2312:14474:27286"  forward.fastq

9:@I@ILLUMINA:381:D1HHHACXX:1:2312:14474:27286/1

grep -n "ILLUMINA:381:D1HHHACXX:1:2312:14474:27286"  reverse.fastq

9:@ILLUMINA:381:D1HHHACXX:1:2312:14474:27286/2

grep -B 2 "ILLUMINA:381:D1HHHACXX:1:2312:14474:27286"  forward.fastq

+
CCCFFFFFHHHHHJJJJJJJJJJJJJIIJJJJJJIIJJJJJJJJJIJJJJJJJJJBDHHIJJJJJJHHHHHHFFFFFFEEEEEEDDDDDDDDDDCDEECC
@I@ILLUMINA:381:D1HHHACXX:1:2312:14474:27286/1

grep -B 2 "ILLUMINA:381:D1HHHACXX:1:2312:14474:27286"  reverse.fastq

+
BBCFFFF;FHHHHJHIJGHIJJJJJJGGJJIJ?F?BFGGGHGJJJJJIIIHGHDFF@DDD9@ABBDDBC@ACDDD>AB9@D?BCCDADEEEDDDCCDCC@
@ILLUMINA:381:D1HHHACXX:1:2312:14474:27286/2

Run these commands on your pair of fastq files and paste the output. 

grep -n "ILLUMINA:381:D1HHHACXX:1:2312:14474:27286"  forward.fastq

grep -n "ILLUMINA:381:D1HHHACXX:1:2312:14474:27286"  reverse.fastq

grep -B 2 "ILLUMINA:381:D1HHHACXX:1:2312:14474:27286"  forward.fastq

grep -B 2 "ILLUMINA:381:D1HHHACXX:1:2312:14474:27286"  reverse.fastq

written 4.6 years ago by Ashutosh Pandey

Added the output

written 4.6 years ago by crysis405

There is no issue with the ordering of the read pairs in the two files. The issue is with the read ID itself: the forward read carries an extra "I@" prefix, as I am sure you have figured out by now. Correct the read ID and run the aligner again. Before doing that, I would make sure that the extra "I@" doesn't belong to the quality score string of the previous read.

modified 4.6 years ago • written 4.6 years ago by Ashutosh Pandey
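
To check a whole pair of files for this kind of problem rather than one read at a time, a short script can walk both files in lockstep and report the first pair whose names disagree. This is only a minimal sketch: the function names `read_name` and `first_name_mismatch` are made up for illustration, it assumes plain 4-line FASTQ records (no wrapped sequence lines), and it treats a trailing /1 or /2 as a mate suffix, which is my reading of the error message above rather than BWA's actual code.

```python
import itertools


def read_name(header):
    """Return the read name from a FASTQ header line.

    Drops the leading '@', anything after the first whitespace,
    and a trailing /1 or /2 mate suffix (an assumption, see above).
    """
    name = header.rstrip("\n")
    if name.startswith("@"):
        name = name[1:]
    fields = name.split()
    name = fields[0] if fields else ""
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def first_name_mismatch(forward_lines, reverse_lines):
    """Return (record_number, forward_name, reverse_name) for the first
    pair of records whose names differ, or None if all names agree.

    Assumes 4-line FASTQ records, so headers sit on lines 0, 4, 8, ...
    A file that ends early is reported as a mismatch against an empty name.
    """
    fwd_headers = itertools.islice(forward_lines, 0, None, 4)
    rev_headers = itertools.islice(reverse_lines, 0, None, 4)
    pairs = itertools.zip_longest(fwd_headers, rev_headers, fillvalue="")
    for record_number, (fwd, rev) in enumerate(pairs, start=1):
        fwd_name, rev_name = read_name(fwd), read_name(rev)
        if fwd_name != rev_name:
            return record_number, fwd_name, rev_name
    return None
```

Passing two open file handles, for example `first_name_mismatch(open("forward.fastq"), open("reverse.fastq"))`, streams the files line by line, so it should be fine on large inputs. It stops at the first mismatch, which is usually enough to see what went wrong.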
Istvan Albert (University Park, USA) wrote, 4.6 years ago:

You have an extra I@ in the read name for read 1:

I@ILLUMINA:381:D1HHHACXX:1:2312:14474:27286

That's pretty strange. Investigate why that happened.


Yeah, I don't know why only 1 out of 103 files would suddenly have I@ incorporated. I was thinking of just deleting it and seeing if that works.

written 4.6 years ago by crysis405
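
If deleting the stray prefix is the route taken, it is worth touching header lines only, because quality lines can legitimately start with "@" or contain "I@". The following is a minimal sketch under the same assumption of 4-line FASTQ records; `strip_header_prefix` and `fix_file` are hypothetical names, and the default prefix "I@" is just the one seen in this thread.

```python
def strip_header_prefix(lines, prefix="I@"):
    """Yield FASTQ lines with a stray prefix removed from header lines.

    Only every fourth line (0, 4, 8, ...) is treated as a header, so
    quality strings that happen to start with '@' are left untouched.
    """
    marker = "@" + prefix
    for index, line in enumerate(lines):
        if index % 4 == 0 and line.startswith(marker):
            # Keep the leading '@' and drop just the stray prefix.
            line = "@" + line[len(marker):]
        yield line


def fix_file(source_path, destination_path, prefix="I@"):
    """Write a cleaned copy of source_path to destination_path."""
    with open(source_path) as source, open(destination_path, "w") as destination:
        destination.writelines(strip_header_prefix(source, prefix))
```

Writing to a new file (say `fix_file("forward.fastq", "forward.fixed.fastq")`, both names being placeholders) leaves the original intact, so the two can be compared before rerunning the aligner.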

It looks like the I@ was produced by Picard's SamToFastq when run with INTERLEAVE=TRUE.

modified 4.6 years ago • written 4.6 years ago by crysis405