Question: Can NanoString nCounter miRNA data be analyzed with RNA-Seq tools?
myguestp wrote:

Hello,

I am trying to analyze miRNA data from the NanoString nCounter platform to identify miRNAs that are differentially expressed (DE) in human serum samples. I think the data from this platform are very similar to RNA-seq count data, because the output is the number of counts for each of approximately 800 miRNAs in each sample. I would like to know whether I can use RNA-Seq bioinformatics tools to identify the DE miRNAs across three conditions, given that some considerations should be taken into account:

Although the panel can detect 800 miRNAs, I have detected an average of only 30 endogenous miRNAs in each serum sample (possibly because of the low amounts of miRNA in blood), most of which have <100 counts (after subtracting the mean of the negative controls plus two standard deviations). Because of that, I think that methods based on global expression would not be appropriate. Also, there is currently no validated endogenous housekeeping miRNA for circulating miRNAs, so I think the best option would be to normalize the data with the 5 miRNA spike-ins (at different concentrations) that I added to the samples prior to extraction. Or maybe you could suggest a better way to normalize this data.

So I would like to know whether current RNA-Seq tools can be adapted to analyze this kind of data. I would also appreciate suggestions for tools that I could use for exploratory analysis (e.g. volcano plots, heatmaps, etc.). I have little experience in bioinformatics, although I have some basic knowledge of programming in R, so I hope I can learn how to use some tools to study this kind of data. Any tutorials or resources that you can recommend would be very helpful.

Thank you in advance,

Miguel

rna-seq R • 147 views
modified 4 weeks ago • written 5 weeks ago by myguestp

What are the negative controls you mentioned, background noise or actual biological samples?

written 5 weeks ago by Asaf

The platform is based on hybridization, so it includes 8 probes for miRNAs that aren't present in the samples (these are measured in each sample). For each sample, I subtract the mean plus two standard deviations (of the counts of these 8 probes) from the counts of all the miRNAs, and I consider as detectable all the miRNAs that still have a positive number of counts.
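
The background step described above can be sketched as follows. This is only an illustration in Python with made-up probe names and counts; it assumes the sample standard deviation is used (the post does not say which), and it is not a NanoString or Bioconductor routine.

```python
from statistics import mean, stdev

def background_threshold(negative_counts):
    """Mean plus two (sample) standard deviations of the negative-control counts."""
    return mean(negative_counts) + 2 * stdev(negative_counts)

def subtract_background(mirna_counts, negative_counts):
    """Subtract the per-sample threshold and keep miRNAs left with positive counts."""
    threshold = background_threshold(negative_counts)
    corrected = {name: count - threshold for name, count in mirna_counts.items()}
    return {name: value for name, value in corrected.items() if value > 0}

# Hypothetical counts for a single sample (8 negative-control probes)
negative_controls = [4, 6, 5, 7, 3, 5, 6, 4]
sample_counts = {"miR-A": 120, "miR-B": 9, "miR-C": 5, "miR-D": 48}

detected = subtract_background(sample_counts, negative_controls)
print(sorted(detected))
```

With these hypothetical numbers the threshold is about 7.6 counts, so "miR-C" is treated as undetected while the other three remain.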

modified 5 weeks ago • written 5 weeks ago by myguestp

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.

SUBMIT ANSWER is for new answers to original question.

written 5 weeks ago by genomax

Thank you, Asaf.

Regarding point 4:

  1. Do these packages require every sample to have counts for every miRNA being analyzed? For example, if a miRNA has no counts in two or three samples per group but does in all the others, will the package skip that miRNA? This happens in my data: the most highly expressed miRNAs are present in almost all samples, but as expression decreases, fewer samples detect the miRNA. I do have a relatively large number of samples, though, with over 10 biological replicates in each condition.

  2. I also have some miRNAs that are present in one condition but not in the others, which I think are among the most interesting. Do these packages detect when counts are present in only one condition and absent in the others?
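
To make the two situations above concrete, the detection pattern can be tabulated per condition before any modelling. The following is a minimal Python sketch with hypothetical condition names, miRNA names, and counts; "detected" is assumed to mean a count above zero.

```python
def detection_rate(counts_by_condition):
    """counts_by_condition: {condition: {mirna: [count per sample]}}.
    Returns {mirna: {condition: fraction of samples with a count > 0}}."""
    rates = {}
    for condition, mirnas in counts_by_condition.items():
        for mirna, counts in mirnas.items():
            n_detected = sum(1 for c in counts if c > 0)
            rates.setdefault(mirna, {})[condition] = n_detected / len(counts)
    return rates

# Hypothetical data: two conditions, four samples each
data = {
    "control": {"miR-A": [30, 25, 0, 41], "miR-B": [0, 0, 0, 0]},
    "disease": {"miR-A": [28, 0, 35, 39], "miR-B": [12, 9, 0, 15]},
}

rates = detection_rate(data)

# miRNAs never detected in at least one condition but detected in another
condition_specific = [
    mirna for mirna, by_cond in rates.items()
    if min(by_cond.values()) == 0 and max(by_cond.values()) > 0
]
print(rates)
print(condition_specific)
```

Here "miR-A" is partially detected in both conditions, while "miR-B" is flagged as present in only one of them.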

written 5 weeks ago by myguestp

No. However, you need some miRNAs that are present in all samples for normalization.

written 5 weeks ago by Asaf

I would normalize with the spike-ins, which are present in all samples. Or do you mean something else?

written 5 weeks ago by myguestp
Asaf wrote:

Some thoughts:

  1. You could probably use packages like edgeR or DESeq2 the same way they are used with genes; they can work with any count matrix.
  2. Since the data is sparse and small, you might get some intuition from just looking at it.
  3. You can definitely use the spike-ins to normalize the counts: add them to the count table and pass them to DESeq2 as controlGenes in estimateSizeFactors().
  4. If the miRNAs you see don't overlap between the samples, it won't work.
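
The idea behind point 3 can be illustrated with a small sketch. It computes median-of-ratios size factors from the spike-in rows only, which is my understanding of what restricting estimateSizeFactors() to controlGenes does; it is written in Python purely for illustration, uses hypothetical spike-in names and counts, assumes every spike-in count is above zero, and is not DESeq2's code.

```python
from math import exp, log
from statistics import median

def spike_in_size_factors(spike_counts):
    """spike_counts: {spike_name: [count in sample 0, sample 1, ...]}.
    Returns one size factor per sample, computed from spike-in rows only."""
    n_samples = len(next(iter(spike_counts.values())))
    # Log geometric mean of each spike-in across samples (needs counts > 0)
    log_geo_means = {
        name: sum(log(c) for c in counts) / n_samples
        for name, counts in spike_counts.items()
    }
    factors = []
    for j in range(n_samples):
        # Log ratio of each spike-in in sample j to its geometric mean
        log_ratios = [
            log(counts[j]) - log_geo_means[name]
            for name, counts in spike_counts.items()
        ]
        factors.append(exp(median(log_ratios)))
    return factors

def normalize(counts, factors):
    """Divide each sample's count by that sample's size factor."""
    return [c / f for c, f in zip(counts, factors)]

# Hypothetical spike-ins: sample 1 recovered twice as much as sample 0
spikes = {
    "spike1": [10, 20],
    "spike2": [100, 200],
    "spike3": [1000, 2000],
}
factors = spike_in_size_factors(spikes)

# A hypothetical endogenous miRNA whose raw counts differ only by recovery
normalized = normalize([50, 100], factors)
print(factors, normalized)
```

In this toy case the second sample's size factor is twice the first, so the endogenous miRNA ends up with the same normalized value in both samples.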
written 5 weeks ago by Asaf
myguestp wrote:

For anyone reading this post, I have found a very interesting resource:

https://www.bioconductor.org/help/course-materials/2017/CSAMA/labs/2-tuesday/lab-03-rnaseq/rnaseqGene_CSAMA2017.html#introduction

It explains how to construct the object that DESeq2 and edgeR need when you start from a count matrix. It also explains some exploratory analyses that you can apply to your data.

Thank you, Asaf, for the help. I am currently trying to learn how to use R and apply it to analyze my data.

Any help would be welcome.

written 4 weeks ago by myguestp