Question

Check the DNA strands source information of paired RNA-Seq fastq data

6.9 years ago

ddzhangzz ▴ 90

I would like to check which DNA strand my RNA-Seq reads came from. I checked a couple of them using the UCSC browser:

R1 Read:

@HISEQ-WALDORF:249:C92LDANXX:8:1101:3161:1988
NTCGAGACTTCTTATAATTTGCATAATCCTCCAAAATGGAATCCACATTCTTCTTGGCAGGAAGATAAAAAAGCTGCTTCTGTCTGGTGATCAAGTCCCAGTCATCCACAAGCCAGGGTTTCAGC
+
#<<BBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF/FFFFFFFFFFFFFFFFFB

R1 UCSC browser output indicates STRAND +:

   ACTIONS      QUERY           SCORE START  END QSIZE IDENTITY CHRO STRAND  START    END      SPAN
---------------------------------------------------------------------------------------------------
browser details YourSeq          123     1   124   124 100.0%     9   +   90095138  90096554   1417

R2 Read:

@HISEQ-WALDORF:249:C92LDANXX:8:1101:3161:1988
CTACCGTTGAAAATGAAGAAACCTTCATGAACAGAGTTGAAGTTAAAGTGAAGATCCCTGAAGAGCTGAAACCCTGGCTTGTGGATGACTGGGACTTGATCACCAGACAGAAGCAGCTTTTTTAT
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF

R2 UCSC browser output indicates STRAND +:

   ACTIONS      QUERY           SCORE START  END QSIZE IDENTITY CHRO STRAND  START    END      SPAN
---------------------------------------------------------------------------------------------------
browser details YourSeq          123     1   125   125  99.2%    19   +   41525764  41525888    125

My question is whether this is a good way to check strand information. Is there another quick and accurate way to check a whole fastq file? The desired output would be the number of STRAND + reads and the number of STRAND - reads in a fastq file.

RNA-Seq • 1.1k views

ADD COMMENT • link updated 4.1 years ago by Biostar 20 • written 6.9 years ago by ddzhangzz ▴ 90

score 3 · Answer 1 · 2017-05-19

6.9 years ago

Devon Ryan 104k

Align your data with STAR or a similar tool.
samtools view -c -f 32 -F 256 foo.bam will give you one strand and samtools view -c -f 16 -F 256 foo.bam the other. Which is which will depend on how you did library prep.
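
For paired-end libraries the two mates of a fragment align in opposite orientations, so a per-fragment tally needs read 2 flipped before counting. The following is a minimal Python sketch of that idea, not something from the original answer. It assumes the integer FLAG column has already been extracted from the BAM file (for example with samtools view foo.bam | cut -f 2); the function name count_fragment_strands and the example flag values are made up for illustration.

```python
from collections import Counter

# SAM FLAG bits, as defined in the SAM specification
PAIRED = 0x1
UNMAPPED = 0x4
REVERSE = 0x10
READ2 = 0x80
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def count_fragment_strands(flags):
    """Tally '+' and '-' over primary, mapped alignments.

    For paired reads, the orientation of read 2 is flipped so that
    every read is reported in the orientation of read 1 of its fragment.
    Counts are per read, so a fragment with both mates mapped counts twice.
    """
    counts = Counter({"+": 0, "-": 0})
    for flag in flags:
        # Skip unmapped reads and non-primary alignments
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
            continue
        reverse = bool(flag & REVERSE)
        if flag & PAIRED and flag & READ2:
            reverse = not reverse
        counts["-" if reverse else "+"] += 1
    return counts


# Illustrative flag values only, not taken from a real file
example_flags = [99, 147, 83, 163, 0, 16, 355, 77]
counts = count_fragment_strands(example_flags)
print(counts["+"], counts["-"])  # prints: 3 3
```

Whether "+" here corresponds to the sense or the antisense strand of the transcript still depends on the library prep, and an unstranded library should give roughly equal counts.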

ADD COMMENT • link 6.9 years ago by Devon Ryan 104k

Thanks @Devon Ryan! Could you explain the -f and -F options? Also, I got the same number from both commands in my case:

samtools view -c -f 32 -F 256 C92LDANXX_s8_1_B01_0249_SL152940Aligned.out.bam 
19770944
samtools view -c -f 16 -F 256 C92LDANXX_s8_1_B01_0249_SL152940Aligned.out.bam 
19770944

What does this suggest?

ADD REPLY • link 6.9 years ago by ddzhangzz ▴ 90

Do you have paired-end reads? That's the most likely explanation for this.

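Anyway, -F 256 means "ignore entries that are marked as secondary alignments". -f 16 means "only include alignments where the read itself is reverse complemented" and -f 32 means "only include alignments whose mate is reverse complemented". In paired-end data with both mates aligned, every read carrying bit 16 has a mate carrying bit 32, so the two counts come out the same.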
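
To see what a given FLAG value encodes, it can help to decode it bit by bit. Here is a small Python sketch; the bit meanings follow the SAM specification, while the dictionary FLAG_BITS, the short labels, and the function name decode_flag are my own naming for illustration. In the SAM specification, 16 marks the read itself as reverse complemented and 32 marks its mate as reverse complemented.

```python
# Labels for SAM FLAG bits (bit meanings per the SAM specification;
# the short label strings are illustrative)
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


# 99 and 147 are a common FLAG combination for the two mates of a proper pair
print(decode_flag(99))   # ['paired', 'proper_pair', 'mate_reverse', 'read1']
print(decode_flag(147))  # ['paired', 'proper_pair', 'reverse', 'read2']
```

In this example read 1 carries bit 32 and read 2 carries bit 16, so each mate is picked up by exactly one of -f 32 and -f 16, which is consistent with getting identical counts on paired-end data.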

ADD REPLY • link 6.9 years ago by Devon Ryan 104k