Question: How to interpret the difference among these three options in strandedness from HTSeq-count
nalandaatmi wrote (3.2 years ago, United States):

Dear All,

I am interested in calculating the percentage of reads assigned to globin genes and rRNA genes. Right now I am not sure whether my paired-end RNA-seq data was generated with a strand-specific protocol. I have asked the person in charge to let me know.

Meanwhile, I ran htseq-count with all three strandedness options (no, yes, reverse).

How do I get the strand (sense, antisense) information?

How should I interpret the Stranded:Reverse counts?
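As a rough sketch of that percentage calculation, the snippet below reads the two-column, tab-separated output that htseq-count writes (feature name, count) and reports what share of the counted reads falls on a chosen gene list. The function names and the toy input are made up for illustration; whether the special counters (the `__` rows) belong in the denominator is a choice left to the caller.

```python
import io


def parse_htseq_counts(handle):
    """Read htseq-count output (name<TAB>count per line) into a dict."""
    counts = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        name, value = line.split("\t")
        counts[name] = int(value)
    return counts


def percent_of_reads(counts, genes, include_special=True):
    """Percentage of counted reads assigned to the given genes.

    With include_special=True the denominator also contains the special
    counters such as __no_feature; otherwise only reads assigned to a
    feature are used.
    """
    total = sum(
        value
        for name, value in counts.items()
        if include_special or not name.startswith("__")
    )
    if total == 0:
        return 0.0
    hits = sum(counts.get(gene, 0) for gene in genes)
    return 100.0 * hits / total


if __name__ == "__main__":
    # Toy data, not taken from the real run.
    toy = io.StringIO("HBB\t30\nACTB\t50\n__no_feature\t20\n")
    counts = parse_htseq_counts(toy)
    print(percent_of_reads(counts, ["HBB"]))
    print(percent_of_reads(counts, ["HBB"], include_special=False))
```

Genes missing from the file simply contribute zero, so the same gene list can be reused across samples.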

Globin genes  Stranded:No  Stranded:Yes  Stranded:Reverse
HBB                 40204         40197                 7
HBA1                38811         38795                16
HBA2               129847        129770                77
HBG1                 1566          1566                 0
HBG2                 2750          2750                 0
HBD                     3             3                 0
HBE1                    1             0                 1
HBZ                     0             0                 0
HBQ1                    9             3                 6
MB                      4             0                 4
CYGB                  294             2               354
NGB                   289             2               319

How should I interpret the difference among these three options?

Stats from special counters

Special counters        Stranded:No  Stranded:Yes  Stranded:Reverse
__no_feature               56289350      94180089          56914563
__ambiguous                  625347         18161            343824
__too_low_aQual                   0             0                 0
__not_aligned                     0             0                 0
__alignment_not_unique     30631662      30631662          30631662
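
One way to read these numbers, sketched in Python with the counts copied from the question: for a stranded library, nearly all reads of an expressed gene should land in either the `yes` or the `reverse` column, whereas an unstranded library should split roughly evenly between them. The helper names are made up for this illustration, and the sketch only restates the posted counts; it does not settle which protocol was used.

```python
# (no, yes, reverse) counts copied from the globin table in the question.
GENE_COUNTS = {
    "HBB": (40204, 40197, 7),
    "HBA1": (38811, 38795, 16),
    "HBA2": (129847, 129770, 77),
    "HBG1": (1566, 1566, 0),
    "HBG2": (2750, 2750, 0),
    "CYGB": (294, 2, 354),
    "NGB": (289, 2, 319),
}

# __no_feature row from the special counters in the question.
NO_FEATURE = {"no": 56289350, "yes": 94180089, "reverse": 56914563}


def yes_fraction(yes, reverse):
    """Share of strand-resolved reads counted under --stranded=yes."""
    total = yes + reverse
    if total == 0:
        return None  # no reads, nothing to say about this gene
    return yes / total


def main():
    for gene, (_, yes, reverse) in GENE_COUNTS.items():
        frac = yes_fraction(yes, reverse)
        print(f"{gene:<6} yes={yes:>7} reverse={reverse:>5} yes_fraction={frac:.3f}")
    # Reads that find no feature under 'yes' but do under 'reverse'
    # (approximate, since ambiguous reads are ignored here).
    print("extra __no_feature under yes:", NO_FEATURE["yes"] - NO_FEATURE["reverse"])


if __name__ == "__main__":
    main()
```

In the posted data the abundant globin genes sit almost entirely in the `yes` column, while CYGB, NGB and the much larger `__no_feature` total under `yes` point the other way, so a handful of genes is probably not enough to decide; a genome-wide check is safer.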

Alternative wrote (3.2 years ago):

Check the following explanation:

http://onetipperday.blogspot.de/2012/07/how-to-tell-which-library-type-to-use.html

Also, RSeQC has a script that counts the different cases and tells you which type of stranded library you have. Check their infer_experiment.py script (I think I tried it a long time ago; I don't remember how accurate it is, but it should be OK).

http://rseqc.sourceforge.net/#infer-experiment-py
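
The idea behind "counting the different cases" can be sketched in a few lines of Python. This is not RSeQC's implementation, only an illustration of the logic: each read is compared with the strand of the gene it overlaps, and the tally decides which htseq-count setting fits. The function names and the 0.8 cutoff are assumptions chosen for the example, and the input is expected to contain only reads that overlap a single gene.

```python
from collections import Counter


def supported_layout(is_read1, is_reverse, gene_strand):
    """Classify one aligned read against the strand of its gene.

    Returns "yes" when the read fits htseq-count's --stranded=yes
    convention (read 1 on the gene's strand, read 2 on the opposite
    strand) and "reverse" otherwise.
    """
    read_strand = "-" if is_reverse else "+"
    if not is_read1:
        # Read 2 comes from the other end of the fragment, so flip it.
        read_strand = "+" if read_strand == "-" else "-"
    return "yes" if read_strand == gene_strand else "reverse"


def guess_strandedness(reads, cutoff=0.8):
    """Guess the --stranded setting from (is_read1, is_reverse, gene_strand) tuples.

    The cutoff is an arbitrary threshold for this sketch.
    """
    tally = Counter(supported_layout(*read) for read in reads)
    total = tally["yes"] + tally["reverse"]
    if total == 0:
        return "undetermined"
    frac_yes = tally["yes"] / total
    if frac_yes >= cutoff:
        return "yes"
    if frac_yes <= 1 - cutoff:
        return "reverse"
    return "no"  # roughly balanced, consistent with an unstranded library


if __name__ == "__main__":
    # Made-up reads: nine read-1 alignments antisense to a "+" gene, one sense.
    reads = [(True, True, "+")] * 9 + [(True, False, "+")]
    print(guess_strandedness(reads))
```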

 

Biomonika (Noolean) wrote (3.2 years ago, State College, PA, USA):

Take your file with mapped reads (BAM or SAM) and open it in IGV. Then right-click and choose to color alignments by first-of-pair strand. If all reads within a gene share one color (either pink or blue), your protocol is strand-specific. Each gene will be colored according to whether its transcription is sense or antisense relative to the reference. If your protocol is not strand-specific, you will see a mix of both colors.

Alternative wrote (3.2 years ago):

I always like to see how things look in IGV. I recommend loading the tracks and checking as Noolean proposed. Additionally, I would take a couple of small transcripts (with few reads) and check whether the counts in IGV match those from HTSeq.

Also, as Antonio said, the person who generated the library has to provide this information.

 

 

 

Antonio R. Franco wrote (3.2 years ago, Spain, Universidad de Córdoba):

Whoever built the library will know whether it was constructed as stranded or not: a stranded library requires purchasing specific reagents and following a particular protocol.

Don't you have that information?

If you know that your library is stranded, you should give that information to your mapping program and to htseq-count as well.


Dear Noolean/Pierre/Antonio, I have now received the information: the RNA-seq library was constructed with a strand-specific protocol.

Sure, I will check the BAM file in IGV and validate the counts reported by htseq-count.

To get the BAM files, I used TopHat for the alignment step, which produced the accepted_hits.bam file, using the following command:

tophat -p 6 -o $outdir $bowtie_index $fastq_r1 $fastq_r2

I didn't specify any library type, but I now know that my RNA-seq data comes from a strand-specific protocol.

Which of the options below should I select?

I believe I should now rerun TopHat with the strand information, as Antonio mentioned.

The content below is from the TopHat manual (https://ccb.jhu.edu/software/tophat/manual.shtml#toph):

--library-type
    The default is unstranded (fr-unstranded). If either fr-firststrand or fr-secondstrand is specified, every read alignment will have an XS attribute tag as explained below. Consider supplying library type options below to select the correct RNA-seq protocol.

Library Type: fr-unstranded
Examples: Standard Illumina
Description: Reads from the left-most end of the fragment (in transcript coordinates) map to the transcript strand, and the right-most end maps to the opposite strand.

Library Type: fr-firststrand
Examples: dUTP, NSR, NNSR
Description: Same as above except we enforce the rule that the right-most end of the fragment (in transcript coordinates) is the first sequenced (or only sequenced for single-end reads). Equivalently, it is assumed that only the strand generated during first strand synthesis is sequenced.

Library Type: fr-secondstrand
Examples: Ligation, Standard SOLiD
Description: Same as above except we enforce the rule that the left-most end of the fragment (in transcript coordinates) is the first sequenced (or only sequenced for single-end reads). Equivalently, it is assumed that only the strand generated during second strand synthesis is sequenced.

(Reply written 3.2 years ago by nalandaatmi)
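
As a hedged sketch of how the two tools line up: TopHat's fr-firststrand is generally paired with htseq-count's `--stranded=reverse`, and fr-secondstrand with `--stranded=yes`. The helper below builds matching command lines from one library-type value. The helper names and all file paths are placeholders invented for the example, and `-r pos` assumes the BAM file is coordinate-sorted; which library type actually applies depends on the kit used for the library.

```python
import shlex

# TopHat --library-type value -> htseq-count --stranded value.
TOPHAT_TO_HTSEQ = {
    "fr-unstranded": "no",
    "fr-firststrand": "reverse",
    "fr-secondstrand": "yes",
}


def tophat_args(library_type, outdir, bowtie_index, fastq_r1, fastq_r2, threads=6):
    """Argument list for a paired-end TopHat run with an explicit library type."""
    if library_type not in TOPHAT_TO_HTSEQ:
        raise ValueError(f"unknown library type: {library_type}")
    return [
        "tophat", "-p", str(threads), "-o", outdir,
        "--library-type", library_type,
        bowtie_index, fastq_r1, fastq_r2,
    ]


def htseq_args(library_type, bam, gtf):
    """Argument list for htseq-count using the matching --stranded value."""
    stranded = TOPHAT_TO_HTSEQ[library_type]
    # "-r pos" assumes a coordinate-sorted BAM; adjust if sorted by name.
    return ["htseq-count", "-f", "bam", "-r", "pos", "-s", stranded, bam, gtf]


def as_shell(args):
    """Render an argument list as a copy-pastable shell command."""
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    # Placeholder paths, chosen only for illustration.
    lib = "fr-firststrand"
    print(as_shell(tophat_args(lib, "out", "genome_index", "r1.fastq", "r2.fastq")))
    print(as_shell(htseq_args(lib, "out/accepted_hits.bam", "genes.gtf")))
```

Keeping the mapping in one place makes it harder to rerun the aligner with one setting and count with a contradictory one.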