Question: From SAM data, how do I tell which FASTQ file each read came from?
20 months ago by
United States
bitjunkie40 wrote:

I already know that you can use the first-in-pair and second-in-pair bits of the SAM FLAG, but that only works if you have a single pair of paired-end read files. I need to do this for multiple pairs of files and/or single-end reads. Does anyone know an efficient workaround? Thanks!


The read group ID (the RG tag on each alignment) sometimes encodes which FASTQ file the read came from. Count the distinct read group IDs you have and check whether that number matches the number of FASTQ files you expect.
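A minimal sketch of that check in plain Python, assuming the aligner was run with a separate read group per input file so that each alignment carries an `RG:Z:` tag. The function names and the example records are made up for illustration; reads without the tag are counted under `None`.

```python
from collections import Counter

def read_group_of(sam_line):
    """Return the RG:Z: value of one SAM alignment line, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # Optional tags start at the 12th column (index 11).
    for tag in fields[11:]:
        if tag.startswith("RG:Z:"):
            return tag[5:]
    return None

def count_read_groups(sam_lines):
    """Count alignment records per read group, skipping '@' header lines."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        counts[read_group_of(line)] += 1
    return counts

# Hypothetical records, for illustration only.
example = [
    "@RG\tID:sampleA_L001\tSM:sampleA",
    "@RG\tID:sampleA_L002\tSM:sampleA",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tRG:Z:sampleA_L001",
    "r2\t0\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0\tRG:Z:sampleA_L001",
    "r3\t16\tchr1\t300\t60\t4M\t*\t0\t0\tACGT\tIIII\tRG:Z:sampleA_L002",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]

counts = count_read_groups(example)
for rg, n in counts.items():
    print(rg, n)
```

On a real file you would pass an open file handle instead of the list, for example `count_read_groups(open("aln.sam"))`, where `aln.sam` is a placeholder name. If every read ends up under `None`, the alignments carry no read groups and this approach will not help.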

written 20 months ago by Samuel Brady

You could look at the lane numbers and flowcell IDs encoded in the FASTQ header to identify where a particular read came from. Whether the reads are paired can only be determined from the read name if the aligner kept the part of the FASTQ header after the space, which denotes read 1/2 along with the index sequence.
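A rough sketch of the read-name approach, assuming Illumina-style names of the form `instrument:run:flowcell:lane:tile:x:y` (other naming schemes will need a different split). Because the part of the header after the space is often dropped, this sketch takes read 1/2 from the SAM FLAG bits instead (0x1 paired, 0x40 first in pair, 0x80 second in pair). The function names and example records are hypothetical.

```python
def flowcell_and_lane(qname):
    """Return (flowcell, lane) from an Illumina-style read name, else None."""
    parts = qname.split(":")
    if len(parts) < 7:
        # Not the assumed instrument:run:flowcell:lane:tile:x:y layout.
        return None
    return parts[2], parts[3]

def mate_number(flag):
    """Return 1 or 2 for paired reads, 0 for single-end reads."""
    if not flag & 0x1:
        return 0
    if flag & 0x40:
        return 1
    if flag & 0x80:
        return 2
    return 0

def source_key(sam_line):
    """Return (flowcell, lane, mate) for one SAM alignment line."""
    fields = sam_line.rstrip("\n").split("\t")
    origin = flowcell_and_lane(fields[0])
    if origin is None:
        return None
    return origin[0], origin[1], mate_number(int(fields[1]))

# Hypothetical records, for illustration only.
first = "M01:23:FCX:1:1101:1000:2000\t99\tchr1\t100\t60\t4M\t=\t150\t54\tACGT\tIIII"
second = "M01:23:FCX:1:1101:1000:2000\t147\tchr1\t150\t60\t4M\t=\t100\t-54\tACGT\tIIII"
single = "M01:24:FCY:2:1101:1000:2000\t0\tchr1\t500\t60\t4M\t*\t0\t0\tACGT\tIIII"

for record in (first, second, single):
    print(source_key(record))
```

Each distinct key then maps to at most one FASTQ file only if no two input files share the same flowcell, lane, and mate number, so check that assumption against your actual file list first.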

written 20 months ago by genomax