6.9 years ago
ddzhangzz
I would like to check which DNA strand my RNA-seq reads came from. I checked a couple of them by browsing UCSC:
R1 Read:
@HISEQ-WALDORF:249:C92LDANXX:8:1101:3161:1988
NTCGAGACTTCTTATAATTTGCATAATCCTCCAAAATGGAATCCACATTCTTCTTGGCAGGAAGATAAAAAAGCTGCTTCTGTCTGGTGATCAAGTCCCAGTCATCCACAAGCCAGGGTTTCAGC
+
#<<BBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF/FFFFFFFFFFFFFFFFFB
R1 UCSC Browse output indicates STRAND +:
ACTIONS          QUERY    SCORE  START  END  QSIZE  IDENTITY  CHRO  STRAND  START     END       SPAN
----------------------------------------------------------------------------------------------------
browser details  YourSeq  123    1      124  124    100.0%    9     +       90095138  90096554  1417
R2 Read:
@HISEQ-WALDORF:249:C92LDANXX:8:1101:3161:1988
CTACCGTTGAAAATGAAGAAACCTTCATGAACAGAGTTGAAGTTAAAGTGAAGATCCCTGAAGAGCTGAAACCCTGGCTTGTGGATGACTGGGACTTGATCACCAGACAGAAGCAGCTTTTTTAT
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
R2 UCSC Browse output indicates STRAND +:
ACTIONS          QUERY    SCORE  START  END  QSIZE  IDENTITY  CHRO  STRAND  START     END       SPAN
----------------------------------------------------------------------------------------------------
browser details  YourSeq  123    1      125  125    99.2%     19    +       41525764  41525888  125
My question is whether this is a good way to check strand information. Is there a quick and accurate way to check a whole FASTQ file? The desired output would be how many reads are STRAND + and how many are STRAND - in a FASTQ file.
Thanks @Devon Ryan! Could you explain the -f and -F options? I got the same numbers in my case:
What does it suggest?
Do you have paired-end reads? That's the most likely explanation for this.
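Anyway, -F 256 means "ignore entries that are marked as secondary alignments". -f 16 means "only include alignments where the read is reverse complemented" and -f 32 means "only include alignments where the read's mate is reverse complemented". With paired-end data where both mates are mapped, every reverse-complemented read is also the reverse-complemented mate of its partner, so the two counts come out the same. To count alignments that are not reverse complemented, exclude the flag instead with -F 16.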
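To make the flag logic concrete, here is a minimal Python sketch that counts + and - strand alignments from SAM text. It assumes the reads have already been aligned, since the strand comes from the alignment flag. The bit values are those defined in the SAM specification; the function name, the choice to skip unmapped reads, and the example records are hypothetical and only for illustration.

```python
# SAM flag bits (values from the SAM specification).
UNMAPPED = 4        # 0x4:   read is unmapped, so it has no strand
READ_REVERSE = 16   # 0x10:  the read itself is reverse complemented
MATE_REVERSE = 32   # 0x20:  the read's mate is reverse complemented
SECONDARY = 256     # 0x100: secondary alignment (what -F 256 excludes)


def count_strands(sam_lines):
    """Count primary, mapped alignments on the + and - strands."""
    counts = {"+": 0, "-": 0}
    for line in sam_lines:
        # Skip blank lines and header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        # Assumption: ignore secondary alignments and unmapped reads.
        if flag & SECONDARY or flag & UNMAPPED:
            continue
        strand = "-" if flag & READ_REVERSE else "+"
        counts[strand] += 1
    return counts


# Hypothetical records: one properly paired fragment plus a secondary hit.
example = [
    "@HD\tVN:1.6",
    # flag 99 = 1 + 2 + 32 + 64: read forward, mate reverse, first in pair
    "read1\t99\tchr9\t100\t60\t10M\t=\t200\t110\tACGTACGTAC\tFFFFFFFFFF",
    # flag 147 = 1 + 2 + 16 + 128: read reverse, second in pair
    "read1\t147\tchr9\t200\t60\t10M\t=\t100\t-110\tACGTACGTAC\tFFFFFFFFFF",
    # flag 355 = 99 + 256: secondary alignment, skipped
    "read1\t355\tchr19\t500\t0\t10M\t=\t600\t110\tACGTACGTAC\tFFFFFFFFFF",
]

print(count_strands(example))  # {'+': 1, '-': 1}
```

In this example the forward mate carries bit 32 and the reverse mate carries bit 16, which is why filtering on either bit gives the same count for paired-end data. For a real BAM file, the text records could be produced with samtools view and passed to the function line by line.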