[mem_sam_pe] paired reads have different names. Without -p
3.8 years ago
windsur ▴ 20

Hello, I am a newbie at processing FASTQ files, and this is the first time this has happened to me. I downloaded the L00N_RN_00N.fastq.gz files and joined them into two FASTQ files (i.e. _R1.fastq.gz and _R2.fastq.gz). Then I load the reference genome into memory and loop over the samples with BWA, but today I got an error message:

> [mem_sam_pe] paired reads have different names

And this is my code that I used.

call('bwa mem -t' + str(args.threads) + ' -R "@RG\tID:' + sample_name + '\tLB:library\tPL:illumina\tPU:library\tSM:' + sample_name + '" ' + genome_ref + ' ' + forward_paths[i] + ' ' + reverse_paths[i] + ' > ' + sample_path + '/' + sample_name + '_bwa.sam',shell = True)

I am quite confused, because I have analysed several samples this way before and I have never seen this. Any help is welcome! :) I downloaded the files again and joined them.
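
For reference, the same bwa mem call can be sketched as an argument list instead of one concatenated shell string, which makes the spacing between options explicit. This is only a sketch: the function names `build_bwa_mem_command` and `run_bwa_mem` are hypothetical, and it assumes bwa is on the PATH and that the read group is passed with escaped `\t` sequences, which the bwa manual says are converted to TABs in the output SAM.

```python
import subprocess


def build_bwa_mem_command(threads, sample_name, genome_ref, fastq_r1, fastq_r2):
    """Return the bwa mem argument list for one paired-end sample."""
    # The raw string keeps each backslash-t literal, so bwa receives "\t"
    # and turns it into a TAB itself when writing the @RG header line.
    read_group = (
        r"@RG\tID:{0}\tLB:library\tPL:illumina\tPU:library\tSM:{0}".format(sample_name)
    )
    return [
        "bwa", "mem",
        "-t", str(threads),
        "-R", read_group,
        genome_ref, fastq_r1, fastq_r2,
    ]


def run_bwa_mem(command, sam_path):
    """Run the command and write the SAM output to sam_path (needs bwa on PATH)."""
    with open(sam_path, "w") as sam_file:
        # check=True raises CalledProcessError if bwa exits with an error.
        subprocess.run(command, stdout=sam_file, check=True)
```

Inside the sample loop this would be called as `run_bwa_mem(build_bwa_mem_command(args.threads, sample_name, genome_ref, forward_paths[i], reverse_paths[i]), sample_path + '/' + sample_name + '_bwa.sam')`.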

bwa alignment next-gen • 3.1k views

Nice! I'm going to check it right now. Thanks! If it is "0", what should I do?

3.8 years ago

It's not a problem with bwa, it's a problem with

 forward_paths[i] + ' ' + reverse_paths[i]

Either your FASTQ files are not the correct R1/R2 pair, or the reads are not in the same order in R1 and R2, or there are some empty reads in your FASTQ files.

test:

gunzip -c name.R1.fastq.gz | paste - - - - | cut -f 1| head
gunzip -c name.R2.fastq.gz | paste - - - - | cut -f 1| head

and

gunzip -c name.R1.fastq.gz name.R2.fastq.gz  | paste - - - - | cut -f 2| grep  -E '^$' | wc -l

must be '0'
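
The `head` test only looks at the first ten reads, so a mismatch further into the files can be missed. A minimal Python sketch that walks both files in parallel and reports the first pair whose names differ, together with the number of empty reads, could look like this. The function names are hypothetical, and it assumes plain four-line FASTQ records and that names are compared after dropping the comment (everything after the first space) and any trailing /1 or /2.

```python
import gzip
from itertools import zip_longest


def read_records(path):
    """Yield (name, sequence) for each 4-line record of a gzipped FASTQ file."""
    with gzip.open(path, "rt") as handle:
        while True:
            header = handle.readline()
            if not header:
                break
            sequence = handle.readline().rstrip("\n")
            handle.readline()  # '+' separator line
            handle.readline()  # quality line
            # Keep only the read name: drop '@' and the comment after the space.
            parts = header[1:].split()
            name = parts[0] if parts else ""
            # Drop an old-style mate suffix so R1 and R2 names are comparable.
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name, sequence


def check_pairs(r1_path, r2_path):
    """Return (first_mismatch, empty_read_count) for two gzipped FASTQ files.

    first_mismatch is None when all names agree, otherwise a tuple
    (record_index, r1_name, r2_name); a name is None if that file ended early.
    """
    first_mismatch = None
    empty_reads = 0
    records = zip_longest(read_records(r1_path), read_records(r2_path))
    for index, (rec1, rec2) in enumerate(records):
        name1, seq1 = rec1 if rec1 is not None else (None, None)
        name2, seq2 = rec2 if rec2 is not None else (None, None)
        if first_mismatch is None and name1 != name2:
            first_mismatch = (index, name1, name2)
        empty_reads += (seq1 == "") + (seq2 == "")
    return first_mismatch, empty_reads
```

For example, `check_pairs("name.R1.fastq.gz", "name.R2.fastq.gz")` should return `(None, 0)` for a correctly paired set of files.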


Thank you! I just checked: For the first one (R1):

@NS500387:143:HVKMFAFXX:1:11101:14111:1058 1:N:0:TAGGCATG+NTCCTTAC
@NS500387:143:HVKMFAFXX:1:11101:6775:1061 1:N:0:TAGGCATG+NTCCTTAC
@NS500387:143:HVKMFAFXX:1:11101:21397:1063 1:N:0:TAGGCATG+NTCCTTAC
@NS500387:143:HVKMFAFXX:1:11101:11426:1064 1:N:0:TAGGCATG+NTCCTTAC
@NS500387:143:HVKMFAFXX:1:11101:10515:1066 1:N:0:TAGGCATG+NTCCTTAC
@NS500387:143:HVKMFAFXX:1:11101:3391:1066 1:N:0:TAGGCATG+NTCCTTAC
@NS500387:143:HVKMFAFXX:1:11101:11290:1066 1:N:0:TAGGCATG+NTCCTTAC
@NS500387:143:HVKMFAFXX:1:11101:5843:1071 1:N:0:TAGGCATG+NTCCTTAC
@NS500387:143:HVKMFAFXX:1:11101:8819:1077 1:N:0:TAGGCATG+NTCCTTAC
@NS500387:143:HVKMFAFXX:1:11101:6162:1078 1:N:0:TAGGCATG+NTCCTTAC

and (R2):

@NS500387:143:HVKMFAFXX:1:11101:14111:1058 2:N:0:TAGGCATG+NTCCTTAC
@NS500387:143:HVKMFAFXX:1:11101:6775:1061 2:N:0:TAGGCATG+NTCCTTAC
@NS500387:143:HVKMFAFXX:1:11101:21397:1063 2:N:0:TAGGCATG+NTCCTTAC
@NS500387:143:HVKMFAFXX:1:11101:11426:1064 2:N:0:TAGGCATG+NTCCTTAC
@NS500387:143:HVKMFAFXX:1:11101:10515:1066 2:N:0:TAGGCATG+NTCCTTAC
@NS500387:143:HVKMFAFXX:1:11101:3391:1066 2:N:0:TAGGCATG+NTCCTTAC
@NS500387:143:HVKMFAFXX:1:11101:11290:1066 2:N:0:TAGGCATG+NTCCTTAC
@NS500387:143:HVKMFAFXX:1:11101:5843:1071 2:N:0:TAGGCATG+NTCCTTAC
@NS500387:143:HVKMFAFXX:1:11101:8819:1077 2:N:0:TAGGCATG+NTCCTTAC
@NS500387:143:HVKMFAFXX:1:11101:6162:1078 2:N:0:TAGGCATG+NTCCTTAC

The last one returned the value "0".


@PierreLindenbaum this is the message that I get:

[mem_sam_pe] paired reads have different names: "NS500387:143:HVKMFAFXX:1:11101:21856:1052", "NS500387:143:HVKMFAFXX:2:11101:12190:1028"

Hi @PierreLindenbaum, sorry to disturb you with this old question. I got some FASTQ files from my PI and ran FastQC on them, which showed no adapters. I am not sure how the libraries were prepared and trimmed, and my PI may not know either. When I run bwa mem without -p, it shows "[mem_sam_pe] paired reads have different names", so I checked the files:

$ gunzip -c E1_input.fq.gz | paste - - - - | cut -f 1| head
@CL100056099L2C001R002_24
@CL100056099L2C001R002_40
@CL100056099L2C001R002_45
@CL100056099L2C001R002_73
@CL100056099L2C001R002_74
@CL100056099L2C001R002_81
@CL100056099L2C001R002_90
@CL100056099L2C001R002_91
@CL100056099L2C001R002_95
@CL100056099L2C001R002_103
$ gunzip -c E1_pulldown.fq.gz | paste - - - - | cut -f 1| head
@CL100056099L2C001R002_61
@CL100056099L2C001R002_115
@CL100056099L2C001R002_154
@CL100056099L2C001R002_218
@CL100056099L2C001R002_228
@CL100056099L2C001R002_253
@CL100056099L2C001R002_255
@CL100056099L2C001R002_269
@CL100056099L2C001R044_184179
@CL100056099L2C001R002_305
$ gunzip -c E1_input.fq.gz E1_pulldown.fq.gz  | paste - - - - | cut -f 2| grep  -E '^$' | wc -l
0

Then I ran bwa mem with -p and it succeeded. I am struggling to understand why these are not a correct R1/R2 pair, and I am not sure whether they are really paired-end FASTQ files. In the following steps I would like to use samtools to produce a sorted BAM and then run ATACseqQC. Because the output becomes a single E1.sam after bwa mem with -p, I am not sure whether this will affect the downstream results. Or should I just use bwa aln for single-end reads? What is the difference between bwa mem with -p for paired-end reads and bwa aln for single-end reads? Thank you so much for this stupid question. Could you please give me some advice about my following analysis? Thank you so much in advance.

3.8 years ago
windsur ▴ 20

Solved! It wasn't a problem with my script. The problem was that when I tried downloading the files through Illumina BaseSpace, for some reason there was a connection problem and the files were "corrupted". What I did was download the files again and then join the (separate) FASTQ files.
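
If a broken download is suspected, one thing that can be checked before joining the files is whether each .fastq.gz decompresses cleanly to the end. The sketch below is only an illustration with a hypothetical function name. It detects truncated files and checksum errors, but it cannot tell whether reads are missing from an otherwise valid gzip file.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the whole gzip file decompresses without error."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read to the end so the gzip trailer (CRC and length) is verified.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError covers bad headers and CRC failures, EOFError a truncated
        # stream, and zlib.error corrupted compressed data.
        return False
    return True
```

Running `gzip_is_intact(path)` over every downloaded file and downloading again any file that returns `False` would catch an interrupted transfer.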

9 months ago
dare_devil ★ 1.5k

I had a similar kind of error. I tried the following to fix it:

gunzip -c sample.fastq.gz | sed -E 's/(^[@+]SRR[0-9]+\.[0-9]+)\.[12]/\1/' | gzip -c > sample.fixed.fastq.gz
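
For anyone who prefers to do this from a script, a rough Python equivalent of the sed command is sketched below. It uses the same regular expression, so it only applies to SRA-style names such as @SRR123456.1.1, and the function names are hypothetical. Like the sed version, it rewrites any line that matches the pattern, not only header lines.

```python
import gzip
import re

# Same pattern as the sed command: keep "@SRRxxx.N" or "+SRRxxx.N" and
# drop the trailing ".1" / ".2" mate suffix.
SRR_MATE_SUFFIX = re.compile(r"^([@+]SRR[0-9]+\.[0-9]+)\.[12]")


def strip_mate_suffix(line):
    """Remove the .1/.2 mate suffix from an SRA-style FASTQ header line."""
    return SRR_MATE_SUFFIX.sub(r"\1", line)


def fix_fastq(in_path, out_path):
    """Copy a gzipped FASTQ file, stripping mate suffixes from the read names."""
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        for line in src:
            dst.write(strip_mate_suffix(line))
```

Usage would be `fix_fastq("sample.fastq.gz", "sample.fixed.fastq.gz")`, once for each mate file.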