How to recognize single-end fastq files
2.6 years ago
Hi, I have FASTQ files that are all R1, but I want to be sure they come from single-end sequencing (in case R2 was misplaced, for example). I'm unsure whether, for single-end data, the 1 in the code 1:N:0:GGAGAA means the read is one of a pair. Here is what an R1 file looks like:

+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE<EEEEEEEEEEEEAEEE
@NS500449:529:HYW33BGX5:1:11101:12403:1046 1:N:0:GGAGAA
CGGAANTTTACGTTCGCTGTGTTTGTGTACGACACAAAGTTTGTGCCAAGTTTTTTAAATCAATTACATTTTTGA

+
AAAAA#6AEE/EEEEEEEEE6EEEEEEEEAEEAEEEEEEEA<<E/AEEAEA/EEAEE//EEEEEEEAAEEEE<E<
@NS500449:529:HYW33BGX5:1:11101:8275:1047 1:N:0:GGAGAA
CGAAGNCCTTGGTTCAAAAGGTTCTGGAGTTCGTGGATGAGCATGGAACCGAGGTGCTTAACCTGGGCAGCTTCA

+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEAEEEEEAE
@NS500449:529:HYW33BGX5:1:11101:19612:1048 1:N:0:GGAGAA
GCAGCNGGCAATAATCTTAATGCCCATGGAATTACTCCGGATAAAGATCTGCCAAAGTCCACGGCAATGCCCAGCA


in case R2 was misplaced

No way to know for sure unless one knew the sample was sequenced as paired-end to begin with.

So there is no way to know just by looking at the file? That cannot be right. Surely this information should be encoded in the file!

That information is in the file. 1:N:0:GGAGAA says this is read 1. If it is on its own, it is single-end data (without 2:N:0:GGAGAA to go with it).
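To check this across a whole file rather than by eye, something like the following Python sketch could tally the read-number field of every header. It assumes Illumina-style headers as in the question (an ID, a space, then read:filtered:control:index) and strict 4-line records with no blank lines; the function names `read_numbers` and `count_in_file` are made up for this example.

```python
import gzip
from collections import Counter


def read_numbers(lines):
    """Tally the read-number field (the '1' in '1:N:0:GGAGAA') of each header.

    Assumes strict 4-line FASTQ records, so the header is every 4th line.
    Position is used instead of a leading '@' test alone, because quality
    strings may also start with '@'.
    """
    counts = Counter()
    for i, line in enumerate(lines):
        if i % 4 != 0:
            continue  # skip sequence, '+', and quality lines
        parts = line.rstrip("\n").split(" ", 1)
        if not line.startswith("@") or len(parts) < 2:
            counts["unknown"] += 1  # header without the expected comment field
            continue
        counts[parts[1].split(":")[0]] += 1
    return counts


def count_in_file(path):
    """Run read_numbers on a plain or gzip-compressed FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return read_numbers(handle)
```

If the result contains only the key "1", the file holds read 1 records only; a mix of "1" and "2" would suggest an interleaved paired-end file. Note that this only describes the file itself. It cannot tell you whether a matching R2 file exists somewhere else.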


I'm unsure if for single-end the 1 from this code 1:N:0:GGAGAA represents one of a pair??

Nope. My single-end sequencing data has the same thing.