Question: How To Find The Strand-Specific Info In The Public Data From The Illumina Human Body Map 2.0 Project
7.7 years ago by
camelbbs670 wrote:


Can anyone tell me how to find out whether this public RNA-seq data was prepared with a strand-specific assay? Thanks a lot!


rna-seq • 4.1k views
modified 7.7 years ago by matted7.3k • written 7.7 years ago by camelbbs670

I have sent an email to the submitter, hopefully she will respond here.

written 7.7 years ago by Istvan Albert ♦♦ 84k
7.7 years ago by
Boston, United States
matted7.3k wrote:

If you read the PDF at the bottom of the page you linked, you will find the answer to your question (page 15):

"The samples used for the 2X50 and 1X75 bp runs are prepared using the Illumina mRNA-Seq kit.

– They are made with a random priming process and are not stranded."

"The sample prep used for the 1X100 bp run is the Illumina Pre-Released Directional RNA-Seq protocol.

– These data are stranded."

In general, you can usually make a quick guess by loading the aligned reads in a viewer and checking whether coverage on the two strands agrees or differs. You could also do something more sophisticated, such as computing a correlation coefficient between the binned read counts on the two strands: a high correlation would suggest an unstranded library, and a low one would suggest strand-specificity.
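To make the correlation idea concrete, here is a minimal sketch in plain Python. It is only an illustration, not a tested tool: the function names, the 1000 bp bin size, and the input format (a list of `(start_position, strand)` pairs on one chromosome) are all assumptions of mine. With real data you would take the strand from the alignment (BAM FLAG bit 0x10), and for paired-end reads you would need to account for the orientation of the second mate.

```python
import math
from collections import Counter


def binned_strand_counts(reads, bin_size=1000):
    """Count reads per bin separately for each strand.

    reads: iterable of (start_position, strand), strand being '+' or '-'.
    Only bins containing at least one read are returned, because long runs
    of empty bins on both strands would artificially inflate the correlation.
    """
    fwd, rev = Counter(), Counter()
    for pos, strand in reads:
        target = fwd if strand == "+" else rev
        target[pos // bin_size] += 1
    bins = sorted(set(fwd) | set(rev))
    return [fwd[b] for b in bins], [rev[b] for b in bins]


def pearson(x, y):
    """Pearson correlation; returns NaN if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def strand_correlation(reads, bin_size=1000):
    """Correlation between forward- and reverse-strand binned counts."""
    fwd, rev = binned_strand_counts(reads, bin_size)
    return pearson(fwd, rev)


if __name__ == "__main__":
    # Toy unstranded library: each expressed bin has equal reads on both strands.
    unstranded = [
        (b * 1000 + 5, s)
        for b, n in enumerate([10, 2, 7, 1])
        for s in "+-"
        for _ in range(n)
    ]
    # Toy stranded library: each expressed bin has reads on one strand only.
    stranded = (
        [(5, "+")] * 10 + [(1005, "-")] * 5 + [(2005, "+")] * 7 + [(3005, "-")] * 8
    )
    print("unstranded:", strand_correlation(unstranded))
    print("stranded:  ", strand_correlation(stranded))
```

On the toy data the unstranded example gives a correlation close to 1 and the stranded example gives a negative value. Real libraries will be noisier, so treat the number as a rough hint rather than a definitive test.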

written 7.7 years ago by matted7.3k

Thanks a lot! I still don't fully understand the method you described for checking whether the reads are stranded, but the PDF gives clear information. That's very helpful. I also have two other datasets from the ENCODE project, and I can't figure out whether they are stranded:

written 7.7 years ago by camelbbs670

The mind-boggling part is that one needs to go to the 15th page of an associated PDF to find this, whereas the file itself is hosted on a supposedly science-oriented website that is full of seemingly useless information:

Overall design    

Experimental Design: organism_part_comparison_design
Experimental Design: co-expression_design
Experimental Design: optimization_design
Experimental Factor Name: ORGANISMPART
Experimental Factor Name: LIBRARYPREP
Experimental Factor Name: PHENOTYPE
Experimental Factor Type: organism_part
Experimental Factor Type: protocol
Experimental Factor Type: phenotype
modified 7.7 years ago • written 7.7 years ago by Istvan Albert ♦♦ 84k

Thanks. How do I look for agreement or differences between the strands in the IGV browser, using BAM files as input?

written 7.7 years ago by camelbbs670
