Question: How to check which DNA strand paired RNA-Seq FASTQ reads come from
ddzhangzz (United States) wrote 11 months ago:

I would like to check which DNA strand my RNA-Seq reads come from. I checked a couple of them using the UCSC browser:

R1 Read:

@HISEQ-WALDORF:249:C92LDANXX:8:1101:3161:1988
NTCGAGACTTCTTATAATTTGCATAATCCTCCAAAATGGAATCCACATTCTTCTTGGCAGGAAGATAAAAAAGCTGCTTCTGTCTGGTGATCAAGTCCCAGTCATCCACAAGCCAGGGTTTCAGC
+
#<<BBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF/FFFFFFFFFFFFFFFFFB

R1 UCSC Browse output indicates STRAND +:

   ACTIONS      QUERY           SCORE START  END QSIZE IDENTITY CHRO STRAND  START    END      SPAN
---------------------------------------------------------------------------------------------------
browser details YourSeq          123     1   124   124 100.0%     9   +   90095138  90096554   1417

R2 Read:

@HISEQ-WALDORF:249:C92LDANXX:8:1101:3161:1988
CTACCGTTGAAAATGAAGAAACCTTCATGAACAGAGTTGAAGTTAAAGTGAAGATCCCTGAAGAGCTGAAACCCTGGCTTGTGGATGACTGGGACTTGATCACCAGACAGAAGCAGCTTTTTTAT
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF

R2 UCSC Browse output indicates STRAND +:

   ACTIONS      QUERY           SCORE START  END QSIZE IDENTITY CHRO STRAND  START    END      SPAN
---------------------------------------------------------------------------------------------------
browser details YourSeq          123     1   125   125  99.2%    19   +   41525764  41525888    125

My question is whether this is a good way to check strand information. Is there a quick and accurate way to check a whole FASTQ file? The desired output would be the number of reads on strand + and the number on strand - in a FASTQ file.

Devon Ryan (Freiburg, Germany) wrote 11 months ago:
  1. Align your data with STAR or a similar tool.
  2. samtools view -c -f 32 -F 256 foo.bam will give you one strand and samtools view -c -f 16 -F 256 foo.bam the other. Which is which will depend on how you did library prep.
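
As a rough illustration of what the -f and -F filters in step 2 select, here is a minimal Python sketch of the same bit test applied to SAM FLAG values. The helper name count_matching and the FLAG values in the list are made up for the example; only the semantics (-f keeps records with all listed bits set, -F drops records with any listed bit set) come from samtools.

```python
# Hypothetical helper mirroring `samtools view -c -f REQUIRE -F EXCLUDE`.
def count_matching(flags, require=0, exclude=0):
    """Count FLAG values with every `require` bit set and no `exclude` bit set."""
    return sum(
        1
        for flag in flags
        if (flag & require) == require and (flag & exclude) == 0
    )


# Made-up FLAG values: one proper pair (99, 147), a secondary copy of the
# first mate (355 = 99 + 256), and one unpaired reverse-strand read (16).
example_flags = [99, 147, 355, 16]

print(count_matching(example_flags, require=32, exclude=256))  # prints 1
print(count_matching(example_flags, require=16, exclude=256))  # prints 2
```

On a real BAM file the counts come from samtools itself; the sketch only shows which records each command would keep.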

Thanks @Devon Ryan! Could you explain the -f and -F options? Also, I got the same number from both commands in my case:

samtools view -c -f 32 -F 256 C92LDANXX_s8_1_B01_0249_SL152940Aligned.out.bam 
19770944
samtools view -c -f 16 -F 256 C92LDANXX_s8_1_B01_0249_SL152940Aligned.out.bam 
19770944

What does this suggest?

Reply written 11 months ago by ddzhangzz

Do you have paired-end reads? That's the most likely explanation for this.

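Anyway, -F 256 means "ignore entries that are marked as secondary alignments". -f 16 means "only include alignments where the read itself is reverse complemented" and -f 32 means "only include alignments where the read's mate is reverse complemented". In a properly oriented pair one mate aligns forward and the other reverse, so every fragment adds one record to each count, which is why the two numbers are identical.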
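
To illustrate the paired-end case, here is a minimal Python sketch that assigns each record to the strand of its fragment's read 1, using FLAG bits 16 (read reverse), 128 (second in pair) and 256 (secondary) from the SAM specification. The function names and the FLAG values are made up for the example, and it assumes mates align in opposite orientations. Whether the read 1 strand matches the transcript strand still depends on the library prep.

```python
from collections import Counter

READ_REVERSE = 16   # this read aligned to the reverse strand
SECOND_IN_PAIR = 128  # this read is read 2 of its pair
SECONDARY = 256     # secondary alignment


def read1_strand(flag):
    """Strand of read 1 for the fragment this record belongs to.

    Assumption: mates align in opposite orientations, so a read 2
    record implies the opposite strand for read 1.
    """
    reverse = bool(flag & READ_REVERSE)
    if flag & SECOND_IN_PAIR:
        reverse = not reverse
    return "-" if reverse else "+"


def count_fragment_strands(flags):
    """Tally read 1 strand over all non-secondary records."""
    counts = Counter()
    for flag in flags:
        if flag & SECONDARY:
            continue
        counts[read1_strand(flag)] += 1
    return counts


# Made-up FLAGs: two fragments with read 1 forward (99/147) and one with
# read 1 reverse (83/163).
example_flags = [99, 147, 83, 163, 99, 147]
counts = count_fragment_strands(example_flags)
print(counts["+"], counts["-"])  # prints 4 2
```

In this toy list three records carry bit 16 and three carry bit 32, so the two samtools counts would tie even though the fragments split 2 to 1 between strands. With samtools alone, the same split could be approximated by adding the read 1 bit (64) to the filters, for example -f 80 for read 1 reverse and -f 64 -F 272 for read 1 forward.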

Reply written 11 months ago by Devon Ryan