From SAM data, how do I tell which FASTQ file each read came from?
6.7 years ago
bitjunkie ▴ 40

I already know that you can use the first-in-pair and second-in-pair bits of the SAM FLAG, but that only works if you have a single pair of paired-end read files. I need to do this for multiple pairs and/or single-end reads. Does anyone know an efficient workaround? Thanks!

sam parsing read file origin bowtie2 • 1.6k views

The read group ID sometimes encodes which fastq file the read came from. Check how many read group IDs you have and see if it matches the number of fastq files you expect.
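To do that count without extra tools, here is a minimal sketch in plain Python that tallies alignment records by their `RG:Z:` tag. It assumes the SAM is uncompressed text and that read groups were set at alignment time (with bowtie2 that would mean running each FASTQ file or pair separately with its own `--rg-id`). The function name and the example records are made up for illustration.

```python
from collections import Counter

def count_read_groups(sam_lines):
    """Count alignment records per read group ID (the RG:Z: optional tag)."""
    counts = Counter()
    for line in sam_lines:
        # Header lines start with '@'; blank lines carry no record.
        if line.startswith("@") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        rg = "(no RG tag)"
        # Optional tags start at the 12th column of a SAM record.
        for tag in fields[11:]:
            if tag.startswith("RG:Z:"):
                rg = tag[5:]
                break
        counts[rg] += 1
    return counts

def make_record(name, *tags):
    """Build a toy SAM record (hypothetical values, for the example only)."""
    core = [name, "0", "chr1", "100", "42", "4M", "*", "0", "0", "ACGT", "IIII"]
    return "\t".join(core + list(tags))

example = [
    "@HD\tVN:1.6\tSO:unsorted",
    "@RG\tID:sampleA_L001\tSM:sampleA",
    "@RG\tID:sampleA_L002\tSM:sampleA",
    make_record("r1", "RG:Z:sampleA_L001"),
    make_record("r2", "NM:i:0", "RG:Z:sampleA_L002"),
    make_record("r3", "RG:Z:sampleA_L001"),
    make_record("r4"),  # no read group on this record
]

for rg, n in sorted(count_read_groups(example).items()):
    print(rg, n)
```

For a real file you could pass an open file handle, e.g. `count_read_groups(open("aln.sam"))`, where `aln.sam` is a placeholder name. If everything lands under "(no RG tag)", the read groups were never set and this approach will not help.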

You could look at the lane numbers and flowcell IDs encoded in the FASTQ header to identify where a particular read came from. Whether the reads are paired or not can only be figured out if the aligner kept the part after the space in the FASTQ header, which denotes read 1/2 along with the index sequence.
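A rough sketch of the lane/flowcell idea in plain Python, assuming the read names follow the seven-field Illumina layout `instrument:run:flowcell:lane:tile:x:y` (the Casava 1.8+ style). Older or non-Illumina naming schemes will not match and are returned as `None`. The function names and example read names are hypothetical.

```python
from collections import Counter

def parse_illumina_qname(qname):
    """Split a SAM QNAME into run-level fields, or return None if it does not fit.

    Assumes the layout instrument:run:flowcell:lane:tile:x:y.
    """
    # Drop anything after a space in case the aligner kept the FASTQ comment.
    name = qname.split(" ", 1)[0]
    parts = name.split(":")
    if len(parts) != 7:
        return None
    instrument, run, flowcell, lane = parts[:4]
    return {"instrument": instrument, "run": run, "flowcell": flowcell, "lane": lane}

def count_by_flowcell_lane(qnames):
    """Tally read names by (flowcell, lane); unparsable names go under None."""
    counts = Counter()
    for qname in qnames:
        info = parse_illumina_qname(qname)
        key = (info["flowcell"], info["lane"]) if info else None
        counts[key] += 1
    return counts

# Made-up read names for illustration only.
example_qnames = [
    "INST1:42:FLOWCELLA:3:1101:1000:2000",
    "INST1:42:FLOWCELLA:3:1101:1001:2001",
    "INST1:43:FLOWCELLB:1:2204:5000:6000",
    "read_without_illumina_fields",
]

for key, n in count_by_flowcell_lane(example_qnames).items():
    print(key, n)
```

This only separates files that differ in flowcell or lane. Two FASTQ files from the same lane (for example, different samples demultiplexed from one lane) would look identical here unless the index sequence was kept in the read name.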
