Determine where an interleaved FASTQ record starts
11 weeks ago
ole.tange ★ 4.1k

FASTQ files have a record length of 4 lines. But you can also determine where a record starts, even in the middle of a file, by looking for a line starting with '@' and checking the lines around it (see https://stackoverflow.com/a/41707920/363028).

Can we do something similar with interleaved FASTQ-files?

Based on https://stackoverflow.com/a/68707816/363028: is there something that tells us where an interleaved FASTQ-record starts?

@M10991:61:000000000-A7EML:1:1101:14011:1001 1:N:0:28
NGCTCCTAGGTCGGCATGATGGGGGAAGGAGAGCATGGGAAGAAATGAGAGAGTAGCAA
+
#8BCCGGGGGFEFECFGGGGGGGGG@;FFGGGEG@FF<EE<@FFC,CEGCCGGFF<FGF
@M10991:61:000000000-A7EML:1:1101:14011:1001 2:N:0:28
NGCTCCTAGGTCGGCATGACGCTAGCTACGATCGACTACGCTAGCATCGAGAGTAGCAA
+
#8BCCGGGGGFEFECFGGGGGGGGG@;FFGGGEG@FF<EE<@FFC,CEGCCGGFF<FGF
@M10991:61:000000000-A7EML:1:1201:15411:3101 1:N:0:28
NGCTCCTAGGTCGGCATGATGGGGGAAGGAGAGCATGGGAAGAAATGAGAGAGTAGCAA
+
#8BCCGGGGGFEFECFGGGGGGGGG@;FFGGGEG@FF<EE<@FFC,CEGCCGGFF<FGF
@M10991:61:000000000-A7EML:1:1201:15411:3101 2:N:0:28
CGCTAGCTACGACTCGACGACAGCGAACACGCGATCGATCGGAAATGAGAGAGTAGCAA
+
#8BCCGGGGGFEFECFGGGGGGGGG@;FFGGGEG@FF<EE<@FFC,CEGCCGGFF<FGF

In the above example you can use the '@' trick combined with '.* 1:N' to determine that this seqname belongs to an R1 read. But does this always work? And if not, is there something else that can tell us whether a FASTQ record is for R1 or R2?
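To make the question concrete, here is a minimal sketch of the '@' trick in Python. It assumes strict 4-line records with no wrapped sequence lines, and the function names `is_record_start` and `looks_like_r1` are made up for illustration. A quality line may itself start with '@', so the check also requires a '+' two lines further down and a quality string as long as the sequence.

```python
def is_record_start(lines, i):
    """Return True if lines[i] looks like the header of a 4-line FASTQ record.

    `lines` is a list of lines without trailing newlines.
    Assumption: every record is exactly 4 lines (no wrapped sequences).
    """
    if i + 3 >= len(lines):
        return False
    header, seq, plus, qual = lines[i:i + 4]
    # A quality line can start with '@' too, so also require the '+'
    # separator two lines down and matching sequence/quality lengths.
    return (
        header.startswith("@")
        and plus.startswith("+")
        and len(seq) == len(qual)
    )


def looks_like_r1(header):
    """Return True if the header has a comment field starting with '1:'.

    Assumption: headers look like '@name 1:N:0:28', as in the example above.
    """
    fields = header.split(" ", 1)
    return len(fields) == 2 and fields[1].startswith("1:")
```

With these, a position `i` would be treated as the start of an interleaved pair when `is_record_start(lines, i) and looks_like_r1(lines[i])` holds, which only works for headers in this particular style.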

fastq • 143 views

That is correct for recent Illumina output, where R1 is denoted by 1 and R2 by 2. But you can also find Illumina reads where R1 and R2 are marked like this: @K00193:38:H3MYFBBXX:4:1101:10003:44458/1 (for R1) and @K00193:38:H3MYFBBXX:4:1101:10003:44458/2 (for R2). Some developers do not mark 1 or 2 at all. In that case the logic is that records come in blocks of 8 lines: within each 8-line block, the first 4 lines belong to R1 and the second 4 lines belong to R2.

Please refer to the wiki for the FASTQ format definition for a general understanding.
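The header conventions described above could be checked along these lines. This is only a sketch: the regular expressions cover just the two header styles mentioned in this thread, and the names `mate_number`, `base_name` and `is_pair_start` are hypothetical. For headers without any 1/2 marker, the fallback assumes that both mates carry the same read name and that different pairs carry different names, so a record starts a pair when the next record has the same name.

```python
import re

# "@name 1:N:0:28" style (read number in the comment field)
_NEW_STYLE = re.compile(r"^@\S+\s+([12]):[YN]:\d+")
# "@name/1" style (read number as a suffix of the name)
_OLD_STYLE = re.compile(r"^@\S+/([12])(\s|$)")


def mate_number(header):
    """Return 1 or 2 if the header marks the mate, else None."""
    for pattern in (_NEW_STYLE, _OLD_STYLE):
        match = pattern.match(header)
        if match:
            return int(match.group(1))
    return None


def base_name(header):
    """Read name without the leading '@', the comment, or a /1 or /2 suffix."""
    name = header[1:].split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def is_pair_start(header, next_header):
    """Guess whether `header` belongs to R1, given the next record's header.

    Assumption for unmarked headers: mates share a name, pairs do not.
    """
    number = mate_number(header)
    if number is not None:
        return number == 1
    return base_name(header) == base_name(next_header)
```

Here `header` and `next_header` are the header lines of two consecutive 4-line records. If a file has unmarked headers and the mates do not share a name either, nothing in the records themselves identifies R1, and only the 8-line block position counted from the start of the file remains.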
