Question: Determining Orientation Of Fastq Files
7.2 years ago by
dan7990
dan7990 wrote:

Hey everyone. I have a BAM file that I converted to SAM (via samtools), and I then used Picard's SamToFastq to split it into two FASTQ files. I am now trying to run the de novo assembler Trinity on these files. I know the reads are strand-specific, but I do not know their orientation. Are they in forward-reverse or reverse-forward orientation? Which strand is the left and which is the right? Does anyone know how to determine this?

trinity fastq bam sam • 5.1k views
7.2 years ago by
Arun2.3k
Germany
Arun2.3k wrote:

I guess it depends on the method used to generate strand specificity. For the dUTP method, I believe it's first-strand reverse (hence, reverse-forward). However, since you have already mapped your reads, you can check the flag of every pair, or of just a couple of pairs, to see whether a read is first or second in its pair and whether it is reverse oriented, can't you?
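To make the flag check concrete, here is a minimal sketch in Python that decodes the pairing and strand bits of a SAM flag. The bit values come from the SAM format specification; the function names `decode_flag` and `describe` are hypothetical helpers made up for this illustration, not part of samtools or Picard.

```python
# SAM flag bits, as defined in the SAM format specification.
PAIRED = 0x1
PROPER_PAIR = 0x2
READ_REVERSE = 0x10
MATE_REVERSE = 0x20
FIRST_IN_PAIR = 0x40
SECOND_IN_PAIR = 0x80


def decode_flag(flag):
    """Return the pairing and strand bits of a SAM flag as booleans."""
    return {
        "paired": bool(flag & PAIRED),
        "proper_pair": bool(flag & PROPER_PAIR),
        "read_reverse": bool(flag & READ_REVERSE),
        "mate_reverse": bool(flag & MATE_REVERSE),
        "first_in_pair": bool(flag & FIRST_IN_PAIR),
        "second_in_pair": bool(flag & SECOND_IN_PAIR),
    }


def describe(flag):
    """Summarize a flag as e.g. 'read1 forward' (hypothetical helper)."""
    bits = decode_flag(flag)
    if bits["first_in_pair"]:
        mate = "read1"
    elif bits["second_in_pair"]:
        mate = "read2"
    else:
        mate = "unpaired"
    strand = "reverse" if bits["read_reverse"] else "forward"
    return f"{mate} {strand}"


if __name__ == "__main__":
    # Four flags commonly seen on properly paired reads.
    for flag in (99, 147, 83, 163):
        print(flag, describe(flag))
```

Note that "forward" and "reverse" here are relative to the reference sequence the reads were mapped to, not to the transcript the read came from.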


Sorry I meant to comment on your answer not create my own. I am not sure what you mean, unless you replace "isn't it?" with "can't you?". Here is an example:

read1:
@HS5_233:5:1103:12424:50065/1
GTGCAGAGGAGAACGCAGCTCCGCCCTCGCGGTGCTCTCCGGGTCTGTGCTGAGGAGAACGCAACTCCGCCGTCGCAAAGGCGCCGCGCCGGCGCAGACG
+
??@DD>;BF?FHFGAFFBHED>A>@;6;1>@?9(8;/383:48>@>>>B#########################

read2:
@HS5_233:5:1103:12424:50065/2
GCCGCCCCACCAACCCCCCCCCGTACGCGGGCGTCTCCGCGGCCGGGCCACCCCGCCCGCCCCCTCGACGCGCCCGCCGGAGTATCTGGTCCTGCGCCCG
+
=11++)0@F###########################################################################################


Are your original FASTQ files strand-specific and paired-end? A FASTQ file converted back from SAM will not carry any information about read orientation. Given that your reads came from a strand-specific protocol, the flag in the 2nd column of your SAM file should tell you which reads belong in the first FASTQ of the pair and which belong in the second. You should split your SAM file by flag into two SAM files and run Picard SamToFastq on each of them separately.

In short, post a properly paired read and its mate from your SAM file.
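Splitting by flag could look roughly like the following Python sketch. It assumes an uncompressed, tab-delimited SAM file read as lines of text; `split_by_mate` is a hypothetical name, and skipping secondary and supplementary alignments is my own choice so that each read is kept only once.

```python
def split_by_mate(sam_lines):
    """Split SAM text lines into (header, first-in-pair, second-in-pair)."""
    header, first, second = [], [], []
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            # Header lines start with '@'; keep them for both outputs.
            header.append(line)
            continue
        flag = int(line.split("\t")[1])  # FLAG is the 2nd column
        if flag & 0x900:
            # Skip secondary (0x100) and supplementary (0x800) alignments.
            continue
        if flag & 0x40:
            first.append(line)
        elif flag & 0x80:
            second.append(line)
    return header, first, second


if __name__ == "__main__":
    demo = [
        "@HD\tVN:1.6",
        "r1\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tIIII",
        "r1\t147\tchr1\t200\t60\t4M\t=\t100\t-104\tACGT\tIIII",
    ]
    header, first, second = split_by_mate(demo)
    print(len(header), len(first), len(second))
```

Each of the two record lists can then be written out with the header lines in front of it to give a SAM file per mate.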

7.2 years ago by
swbarnes26.2k
United States
swbarnes26.2k wrote:

By converting the reads out of .sam format, you have gotten rid of all the mapping information, like orientation.

If you are expecting every read of read 1 to be in one orientation, that's generally not true, unless the sample was prepped in a strand-specific way.

Either way, eyeball a portion of the .sam, and look at the binary flags to learn the orientation of some of your reads. You could split your .bam so that one half has all the forward reads, and one has all the reverse reads, if that's what you want.
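Rather than eyeballing flags one by one, a rough tally over the first few thousand records can show how read 1 and read 2 are distributed across strands. This is only a sketch under assumptions: it reads plain-text SAM lines, `tally_orientation` and its `limit` parameter are hypothetical, and it counts only primary alignments of mapped, properly paired reads.

```python
from collections import Counter
from itertools import islice


def tally_orientation(sam_lines, limit=10000):
    """Count (mate, strand) combinations over the first `limit` records."""
    counts = Counter()
    records = (l for l in sam_lines if l.strip() and not l.startswith("@"))
    for line in islice(records, limit):
        flag = int(line.split("\t")[1])  # FLAG is the 2nd column
        # Skip secondary/supplementary (0x900) and unmapped (0x4) records,
        # and anything not flagged as a proper pair (0x2).
        if (flag & 0x900) or (flag & 0x4) or not (flag & 0x2):
            continue
        mate = "read1" if flag & 0x40 else "read2"
        strand = "-" if flag & 0x10 else "+"
        counts[(mate, strand)] += 1
    return counts


if __name__ == "__main__":
    demo = [
        "r1\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tIIII",
        "r1\t147\tchr1\t200\t60\t4M\t=\t100\t-104\tACGT\tIIII",
    ]
    for key, n in sorted(tally_orientation(demo).items()):
        print(key, n)
```

Keep in mind that the strand in the flag is relative to the reference. In a genome alignment, genes lie on both strands, so read 1 will show up on both "+" and "-" overall even in a strand-specific library; the tally is only informative about forward-reverse versus reverse-forward if you restrict it to reads over a gene whose strand you already know.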
