Question: Check the DNA strands source information of paired RNA-Seq fastq data
ddzhangzz (United States) wrote, 3 months ago:

I would like to check which DNA strand my RNA-Seq reads came from. I checked a couple of them by searching for them at UCSC:

R1 Read:

@HISEQ-WALDORF:249:C92LDANXX:8:1101:3161:1988
NTCGAGACTTCTTATAATTTGCATAATCCTCCAAAATGGAATCCACATTCTTCTTGGCAGGAAGATAAAAAAGCTGCTTCTGTCTGGTGATCAAGTCCCAGTCATCCACAAGCCAGGGTTTCAGC
+
#<<BBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF/FFFFFFFFFFFFFFFFFB

R1 UCSC Browse output indicates STRAND +:

   ACTIONS      QUERY           SCORE START  END QSIZE IDENTITY CHRO STRAND  START    END      SPAN
---------------------------------------------------------------------------------------------------
browser details YourSeq          123     1   124   124 100.0%     9   +   90095138  90096554   1417

R2 Read:

@HISEQ-WALDORF:249:C92LDANXX:8:1101:3161:1988
CTACCGTTGAAAATGAAGAAACCTTCATGAACAGAGTTGAAGTTAAAGTGAAGATCCCTGAAGAGCTGAAACCCTGGCTTGTGGATGACTGGGACTTGATCACCAGACAGAAGCAGCTTTTTTAT
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF

R2 UCSC Browse output indicates STRAND +:

   ACTIONS      QUERY           SCORE START  END QSIZE IDENTITY CHRO STRAND  START    END      SPAN
---------------------------------------------------------------------------------------------------
browser details YourSeq          123     1   125   125  99.2%    19   +   41525764  41525888    125

My question is whether this is a good way to check strand information. Is there a quick and accurate way to check a whole FASTQ file? The desired output would be how many reads are on STRAND + and how many on STRAND - in a FASTQ file.

Devon Ryan (Freiburg, Germany) wrote, 3 months ago:
  1. Align your data with STAR or a similar tool.
  2. Run samtools view -c -f 32 -F 256 foo.bam to count one strand and samtools view -c -f 16 -F 256 foo.bam to count the other. Which is which depends on how you did the library prep.
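As a rough illustration of what the counting step does (this is not part of the original answer), the sketch below mimics the filtering rules of samtools view -c on plain SAM text: -f keeps a record only if all of the given flag bits are set, and -F drops a record if any of the given bits are set. The function name count_flags is hypothetical, and the input is assumed to be well-formed, tab-separated SAM records on stdin.

```shell
# count_flags REQUIRED EXCLUDED
# Reads SAM records on stdin and prints how many pass the filter,
# following the same rules as: samtools view -c -f REQUIRED -F EXCLUDED
count_flags() {
  required=$1
  excluded=$2
  n=0
  tab=$(printf '\t')
  while IFS="$tab" read -r qname flag rest; do
    case $qname in
      @*) continue ;;   # header lines start with '@'; skip them
    esac
    # keep the record only if every required bit is set and no excluded bit is set
    if [ $(( $flag & $required )) -eq "$required" ] && [ $(( $flag & $excluded )) -eq 0 ]; then
      n=$(( n + 1 ))
    fi
  done
  echo "$n"
}

# Possible usage, assuming samtools is installed and foo.bam exists:
#   samtools view foo.bam | count_flags 16 256
```

For real data samtools view -c itself is far faster; the loop is only meant to make the bit logic visible.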

Thanks @Devon Ryan! Could you explain the -f and -F options? Also, I got the same number from both commands in my case:

samtools view -c -f 32 -F 256 C92LDANXX_s8_1_B01_0249_SL152940Aligned.out.bam 
19770944
samtools view -c -f 16 -F 256 C92LDANXX_s8_1_B01_0249_SL152940Aligned.out.bam 
19770944

What does this suggest?

Reply written 3 months ago by ddzhangzz

Do you have paired-end reads? That's the most likely explanation for this.

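Anyway, -F 256 means "ignore entries that are marked as secondary alignments". -f 16 means "only include alignments where the read itself is reverse complemented" and -f 32 means "only include alignments whose mate is reverse complemented". In properly paired data each pair has one mate on each strand, so the two counts come out equal.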
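To make the paired-end case concrete: in the SAM flag field, bit 16 marks a read that is itself aligned to the reverse strand, bit 32 marks a read whose mate is, and bits 64 and 128 mark read 1 and read 2 of a pair. One way to get a per-fragment orientation is to take read 1 as the reference and flip read 2. That convention is an assumption made for this sketch (which one matches the transcript strand still depends on the library prep), and the function name fragment_orientation is hypothetical.

```shell
# fragment_orientation FLAG
# Prints forward, reverse, or skip for one SAM FLAG value.
# Assumption: read 1 defines the fragment orientation, so read 2 is flipped.
fragment_orientation() {
  flag=$1
  # skip unmapped (4), secondary (256) and supplementary (2048) records
  if [ $(( $flag & 2308 )) -ne 0 ]; then
    echo skip
    return 0
  fi
  is_reverse=$(( ($flag & 16) != 0 ))   # 1 if this read is reverse complemented
  is_read2=$(( ($flag & 128) != 0 ))    # 1 if this read is the second in the pair
  if [ $(( is_reverse ^ is_read2 )) -eq 0 ]; then
    echo forward
  else
    echo reverse
  fi
}

# Possible usage, assuming samtools is installed and foo.bam exists
# (slow on large files, intended only as an illustration):
#   samtools view foo.bam | cut -f 2 |
#     while read -r f; do fragment_orientation "$f"; done | sort | uniq -c
```

With this convention both mates of a proper pair report the same orientation, so the forward and reverse totals are no longer forced to be equal. The same split can be had directly from samtools by also filtering on the read 1 bit, for example samtools view -c -f 64 -F 272 foo.bam for read 1 on the forward strand and samtools view -c -f 80 -F 256 foo.bam for read 1 on the reverse strand.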

Reply written 3 months ago by Devon Ryan