Question: Bioconductor packages for comparing different species data (particularly RNA-seq and DNA methylation)
Saad Khan (400), United States, wrote 5.9 years ago:

Are there any Bioconductor packages for comparing RNA-seq and/or DNA methylation data between two species?


modified 5.9 years ago by Devon Ryan (96k) • written 5.9 years ago by Saad Khan (400)

Define "comparing". I can think of a few different ways of comparing such datasets and it's quite possible that none of them are what you have in mind. Try telling us what your actual biological goal is and then you'll probably get some more useful advice.

written 5.9 years ago by Devon Ryan (96k)

What I meant was comparing orthologous regions with each other. The actual biological goal is to compare a cancer in canines with the same cancer in humans for a particular tissue and find similar patterns.

What other ways of comparing did you have in mind for going about it?

written 5.9 years ago by Saad Khan (400)

Without biological context, you could have just wanted general comparisons between methylation levels in the promoters of various gene classes and a comparison of TPM distributions (or something similar). That's why we usually ask for the experimental context within which you want to do something. I'll give some actual suggestions in an answer below.

written 5.9 years ago by Devon Ryan (96k)
Devon Ryan (96k), Freiburg, Germany, wrote 5.9 years ago:

There are a few different things that could be looked at. Assuming you ran control samples from the dogs in addition to the cancer samples, the first thing to do would be a standard differential expression/methylation analysis. For DE, the edgeR, DESeq2 and limma packages are very good and are what you'll find everyone recommending. Note that I'm not sure how good the annotations are for the dog genome (I don't work on it), so you might need to use something like RSEM (or Trinity followed by RSEM) to get decent metrics, which means you'd be stuck with limma downstream (not that that's a bad thing; limma is an extremely powerful tool). For methylation, it depends on how you generated the data. For RRBS or similar datasets, BiSeq is OK. For methylation arrays, you can use packages like minfi or COHCAP.

One interesting thing I would do is use GSEA to compare the enrichment of groups of differentially expressed/methylated genes between the canine model and patients. You'll obviously need control patient data for this to be worthwhile. If you find any highly relevant pathways (there are a few Bioconductor packages for pathway analysis, though I think the commercial Ingenuity Pathway Analysis package is still better in this regard), then I'd pay particular attention to how the key players in them are affected in patients.
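To make the gene-set comparison above a bit more concrete, here is a minimal sketch in Python. It is not GSEA proper (which works on a ranked gene list); it swaps in a simpler over-representation test based on the hypergeometric distribution, applied to each species separately after mapping dog genes to their human orthologs. The gene names, the one-to-one ortholog table and the pathway are all made-up placeholders, and `enrichment_p` is a hypothetical helper, not a function from any package.

```python
from math import comb

def enrichment_p(de_genes, gene_set, universe):
    """Return (overlap, P(X >= overlap)) under a hypergeometric null."""
    de = de_genes & universe
    gs = gene_set & universe
    N, K, n = len(universe), len(gs), len(de)
    k = len(de & gs)
    # Upper tail: probability of seeing k or more pathway genes among n DE genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return k, tail / comb(N, n)

# Hypothetical universe of genes with a one-to-one ortholog in both species,
# named by the human symbol (placeholders G1..G10).
universe = {f"G{i}" for i in range(1, 11)}

# Hypothetical ortholog table: dog gene ID -> human symbol.
dog_to_human = {f"d{i}": f"G{i}" for i in range(1, 11)}

# Hypothetical pathway and per-species DE calls (cancer vs. control).
pathway = {"G1", "G2", "G3", "G4"}
human_de = {"G1", "G2", "G7"}
dog_de = {"d1", "d3", "d4"}

# Translate the dog DE genes into human symbols before testing.
dog_de_as_human = {dog_to_human[g] for g in dog_de if g in dog_to_human}

human_overlap, human_p = enrichment_p(human_de, pathway, universe)
dog_overlap, dog_p = enrichment_p(dog_de_as_human, pathway, universe)
```

Pathways that come out enriched in both species are the ones whose key players deserve a closer look. A real analysis would use a curated ortholog table and correct for multiple testing across pathways.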

That's a quick idea and a handful of Bioconductor packages to get you started. I could probably come up with things to look at all day, you have a really target-rich project :)


When people compare methylation between two species, they usually use the liftOver tool to convert one species' coordinates to the other's and then compare. Using that approach, I could just compute a Spearman rank correlation of those DMRs. Is there a better way to do something similar? It was suggested below to get Phast conservation scores. How do people generally use the Phast conservation scores?

written 5.9 years ago by Saad Khan (400)

A rank correlation could work too, though I suspect you'll get more informative results by looking at subsets. This method would also only allow looking at two samples at a time, which will get annoying quickly. The benefit of looking at conservation scores is that changes in highly conserved regions are much more likely to be biologically significant (the ENCODE consortium got rightly criticized for not doing this).
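As a rough illustration of the subset idea, the sketch below computes a Spearman rank correlation over all lifted-over regions and then again over only the highly conserved ones. It is plain Python with toy numbers; the region IDs, the methylation differences, the conservation scores and the 0.8 cutoff are all assumptions made up for the example.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical orthologous regions after liftOver:
# (region ID, human methylation difference, dog methylation difference, conservation score)
regions = [
    ("r1", 0.30, 0.25, 0.95),
    ("r2", -0.10, 0.05, 0.20),
    ("r3", 0.45, 0.40, 0.90),
    ("r4", -0.20, -0.30, 0.85),
    ("r5", 0.05, -0.15, 0.10),
    ("r6", 0.15, 0.35, 0.30),
]

def correlation_for(rows):
    return spearman([r[1] for r in rows], [r[2] for r in rows])

# Correlation across every region, then only across highly conserved ones.
all_rho = correlation_for(regions)
conserved = [r for r in regions if r[3] >= 0.8]  # assumed cutoff
conserved_rho = correlation_for(conserved)
```

With real data you would want far more regions per subset than this before reading anything into the difference between the two correlations.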

written 5.9 years ago by Devon Ryan (96k)

Do you have a paper or link describing the exact procedure for how to go about it?

written 5.9 years ago by Saad Khan (400)
Manvendra Singh (2.1k), Berlin, Germany, wrote 5.9 years ago:

I'm not sure about a Bioconductor package, but this is how I would do it:

1. I would map the RNA-seq reads to the genomes of both species and fetch the common regions where reads map uniquely. If these regions are supported by other reads, then they are orthologous regions that are being transcribed.

2. I would get Phast conservation scores for the methylated DNA regions.

3. Once I have this information, I can explore it in R.
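Step 2 could look something like the following sketch, which averages per-base conservation scores over each methylated region. It is written in Python rather than R purely for illustration, and it assumes the scores have already been extracted into a position-to-score mapping for one chromosome; the coordinates, the scores and the helper name `mean_conservation` are hypothetical.

```python
def mean_conservation(region, base_scores):
    """Mean score over a half-open region [start, end); None if no base is scored."""
    start, end = region
    covered = [base_scores[p] for p in range(start, end) if p in base_scores]
    if not covered:
        return None
    return sum(covered) / len(covered)

# Hypothetical per-base conservation scores for one chromosome (position -> score).
base_scores = {100: 0.9, 101: 0.8, 102: 1.0, 103: 0.7, 200: 0.1, 201: 0.3}

# Hypothetical methylated regions as half-open (start, end) intervals.
methylated_regions = [(100, 104), (200, 202), (300, 305)]

region_scores = {r: mean_conservation(r, base_scores) for r in methylated_regions}

# Keep only regions that have scores, ordered from most to least conserved.
ranked = sorted(
    (r for r, s in region_scores.items() if s is not None),
    key=lambda r: region_scores[r],
    reverse=True,
)
```

The ranked list can then be used to focus on methylation changes in the most conserved regions first.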


