Checking the strand of origin for paired-end RNA-Seq FASTQ data
6.9 years ago
ddzhangzz ▴ 90

I would like to check which DNA strand my RNA-Seq reads came from. I checked a couple of them by searching with the UCSC browser:

R1 Read:

@HISEQ-WALDORF:249:C92LDANXX:8:1101:3161:1988
NTCGAGACTTCTTATAATTTGCATAATCCTCCAAAATGGAATCCACATTCTTCTTGGCAGGAAGATAAAAAAGCTGCTTCTGTCTGGTGATCAAGTCCCAGTCATCCACAAGCCAGGGTTTCAGC
+
#<<BBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF/FFFFFFFFFFFFFFFFFB

R1 UCSC Browse output indicates STRAND +:

   ACTIONS      QUERY           SCORE START  END QSIZE IDENTITY CHRO STRAND  START    END      SPAN
---------------------------------------------------------------------------------------------------
browser details YourSeq          123     1   124   124 100.0%     9   +   90095138  90096554   1417

R2 Read:

@HISEQ-WALDORF:249:C92LDANXX:8:1101:3161:1988
CTACCGTTGAAAATGAAGAAACCTTCATGAACAGAGTTGAAGTTAAAGTGAAGATCCCTGAAGAGCTGAAACCCTGGCTTGTGGATGACTGGGACTTGATCACCAGACAGAAGCAGCTTTTTTAT
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF

R2 UCSC Browse output indicates STRAND +:

   ACTIONS      QUERY           SCORE START  END QSIZE IDENTITY CHRO STRAND  START    END      SPAN
---------------------------------------------------------------------------------------------------
browser details YourSeq          123     1   125   125  99.2%    19   +   41525764  41525888    125

My question is whether this is a good way to check strand information. Is there a quick and accurate way to check a whole FASTQ file? The desired output would be the number of STRAND + reads and the number of STRAND - reads in a FASTQ file.

RNA-Seq • 1.1k views
6.9 years ago
  1. Align your data with STAR or a similar tool.
  2. samtools view -c -f 32 -F 256 foo.bam will give you one strand and samtools view -c -f 16 -F 256 foo.bam the other. Which is which will depend on how you did library prep.
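For paired-end data it can help to break the counts down by mate as well as by strand. The following is a minimal sketch, not part of the original answer: it reads SAM text lines (for example the output of `samtools view foo.bam`) and tallies primary, mapped alignments by mate and strand using only the FLAG column. The function name `count_strands` is hypothetical; the bit values come from the SAM specification. Like `-F 256` above, it skips secondary alignments, and it additionally skips unmapped reads because they have no meaningful strand.

```python
from collections import Counter

# FLAG bits as defined in the SAM specification
UNMAPPED = 0x4      # read is unmapped
REVERSE = 0x10      # read aligned to the reverse strand
READ1 = 0x40        # first mate of a pair
READ2 = 0x80        # second mate of a pair
SECONDARY = 0x100   # secondary alignment (what -F 256 excludes)


def count_strands(sam_lines):
    """Count primary mapped alignments by (mate, strand).

    sam_lines: any iterable of SAM-formatted text lines.
    Returns a Counter keyed by tuples such as ("R1", "+").
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # not a valid alignment record
        flag = int(fields[1])
        if flag & (UNMAPPED | SECONDARY):
            continue
        if flag & READ1:
            mate = "R1"
        elif flag & READ2:
            mate = "R2"
        else:
            mate = "SE"  # single-end read
        strand = "-" if flag & REVERSE else "+"
        counts[(mate, strand)] += 1
    return counts
```

To run it on a real file, one option is to pipe `samtools view foo.bam` into a script that calls `count_strands(sys.stdin)` and prints the result. As noted above, which strand corresponds to the transcript still depends on the library prep.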

Thanks @Devon Ryan! Could you explain the -f and -F options? Also, I got the same number from both commands in my case:

samtools view -c -f 32 -F 256 C92LDANXX_s8_1_B01_0249_SL152940Aligned.out.bam 
19770944
samtools view -c -f 16 -F 256 C92LDANXX_s8_1_B01_0249_SL152940Aligned.out.bam 
19770944

What does this suggest?


Do you have paired-end reads? That's the most likely explanation for this.

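Anyway, -F 256 means "ignore entries that are marked as secondary alignments". -f 16 means "only include alignments where the read itself is reverse complemented" and -f 32 means "only include alignments where the read's mate is reverse complemented". In a normally oriented pair, one mate carries flag 16 and the other carries flag 32, so with paired-end data each pair contributes one alignment to each count and the two numbers come out equal.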
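To make the flag arithmetic concrete, here is a small sketch of how `-f` and `-F` act on the FLAG field of a single alignment. The bit meanings are from the SAM specification: 16 (0x10) is "read on the reverse strand", 32 (0x20) is "mate on the reverse strand", and 256 (0x100) is "secondary alignment". The helper name `passes` and the example FLAG values 99 and 147 (a typical properly paired R1/R2) are chosen for illustration and are not taken from the data in this thread.

```python
READ_REVERSE = 0x10   # 16: the read itself is reverse complemented
MATE_REVERSE = 0x20   # 32: the read's mate is reverse complemented
SECONDARY = 0x100     # 256: secondary alignment


def passes(flag, require=0, exclude=0):
    """Mimic `samtools view -f require -F exclude` for one FLAG value.

    -f keeps a record only if ALL required bits are set;
    -F drops a record if ANY excluded bit is set.
    """
    return (flag & require) == require and (flag & exclude) == 0


# A typical proper pair: R1 forward (99 = 1+2+32+64), R2 reverse (147 = 1+2+16+128)
pair = {"R1": 99, "R2": 147}
for name, flag in pair.items():
    print(
        name,
        flag,
        "-f 32 -F 256:", passes(flag, MATE_REVERSE, SECONDARY),
        "-f 16 -F 256:", passes(flag, READ_REVERSE, SECONDARY),
    )
```

In this example R1 passes only the `-f 32` filter and R2 passes only the `-f 16` filter, which is why the two counts match for paired-end data. Adding `-f 64` (first mate) or `-f 128` (second mate) to the required bits restricts the count to one mate and gives a strand count that is actually informative.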
