Duplicate read identifiers across multiple samples
4.0 years ago
Chris Dean ▴ 410

I have paired-end reads that were sequenced on an Illumina HiSeq 2500. In a subset of samples, I noticed that a small proportion of FASTQ entries appear in more than one sample with identical read identifiers, sequences, quality scores, and barcodes.

sample1:@HISEQ:664:HYGKJBCXY:2:2104:21100:19520 1:N:0:CTGAGCCA
sample2:@HISEQ:664:HYGKJBCXY:2:2104:21100:19520 1:N:0:CTGAGCCA
sample3:@HISEQ:664:HYGKJBCXY:2:2104:21100:19520 1:N:0:CTGAGCCA
sample4:@HISEQ:664:HYGKJBCXY:2:2104:21100:19520 1:N:0:CTGAGCCA
sample5:@HISEQ:664:HYGKJBCXY:2:2104:21100:19520 1:N:0:CTGAGCCA

Libraries were prepared and sequenced by a third-party company. Does anyone have a possible explanation for this and suggestions for moving forward?

Thanks, Chris

sequencing next-gen • 581 views

They messed up the demultiplexing somehow. I'm guessing each sample should have a different barcode. Do you know which sample corresponds to the barcode in the header? I would ask them to rerun bcl2fastq and regenerate the FASTQ files.
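Before going back to the sequencing provider, it may help to measure how widespread the problem is. Below is a minimal Python sketch, not a tested pipeline: it collects the read identifiers from each sample's FASTQ file, reports identifiers that show up in more than one sample, and tallies the barcodes seen per sample. It assumes CASAVA 1.8+ style headers like the ones in the question, exactly four lines per record, and enough memory to hold every read ID. The function names (`open_fastq`, `iter_headers`, `summarize`) are made up for illustration.

```python
import gzip
import sys
from collections import Counter, defaultdict
from itertools import islice


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def iter_headers(lines):
    """Yield (read_id, barcode) from every 4th line of a FASTQ stream.

    Assumes headers such as
    '@HISEQ:664:HYGKJBCXY:2:2104:21100:19520 1:N:0:CTGAGCCA'.
    """
    for header in islice(lines, 0, None, 4):
        header = header.strip()
        if not header:
            continue  # tolerate a trailing blank line
        # Drop the leading '@', then split the ID from the comment field.
        read_id, _, comment = header[1:].partition(" ")
        # The barcode is the last colon-separated field of the comment.
        barcode = comment.rsplit(":", 1)[-1] if comment else ""
        yield read_id, barcode


def summarize(samples):
    """samples: dict mapping sample name -> iterable of FASTQ lines.

    Returns (shared, barcodes), where shared maps each read ID found in
    more than one sample to the sorted list of those samples, and
    barcodes maps each sample name to a Counter of barcodes.
    """
    seen_in = defaultdict(set)
    barcodes = defaultdict(Counter)
    for name, lines in samples.items():
        for read_id, barcode in iter_headers(lines):
            seen_in[read_id].add(name)
            barcodes[name][barcode] += 1
    shared = {rid: sorted(names) for rid, names in seen_in.items() if len(names) > 1}
    return shared, barcodes


if __name__ == "__main__":
    # Usage: python check_shared_reads.py sample1_R1.fastq.gz sample2_R1.fastq.gz ...
    handles = {path: open_fastq(path) for path in sys.argv[1:]}
    try:
        shared, barcodes = summarize(handles)
    finally:
        for handle in handles.values():
            handle.close()
    for name, counts in barcodes.items():
        print(name, counts.most_common(3))
    print(len(shared), "read IDs appear in more than one sample")
```

If every sample is supposed to carry its own index, a sample whose most common barcodes include another sample's index, or a large number of shared read IDs, would both be consistent with a demultiplexing problem. Run it on the R1 files only, since R1 and R2 of the same pair share a read ID by design.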
