Question: Cross-species RNA-seq analysis in bacteria
8 months ago by
devikaparvathy wrote:


I am working on microbial genomics, and there are several datasets in SRA/ENA that I can use for my work. I want to combine these datasets in a single study, but the problem is that they were all generated from different subspecies of S. aureus.

I tried creating a common reference genome annotation following a methodology by LoVerso and Cui, but it mapped only around 200 genes in common between the two species.

Is there any other method by which I can map the homologs of a second genome onto the primary reference annotation and carry out an integrated analysis in a single pass? (That would give more replicates under the same condition and increase the reliability of the study.)
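One common way to map homologs between two annotated genomes is reciprocal best hits (RBH): a gene pair is kept only if each gene is the other's top-scoring hit. This is not the LoVerso and Cui method mentioned above, just a minimal sketch of a different technique. It assumes the all-vs-all search has already been run and reduced to (query, subject, bitscore) rows; the gene IDs and scores below are invented for illustration.

```python
def best_hits(rows):
    """Return {query: subject}, keeping the highest-bitscore hit per query."""
    best = {}
    for query, subject, bitscore in rows:
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Keep pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return {a: b for a, b in forward.items() if reverse.get(b) == a}


# Hypothetical hits: reference strain genes (refA_*) vs. second strain genes (strB_*).
a_vs_b = [
    ("refA_0001", "strB_0101", 950.0),
    ("refA_0001", "strB_0555", 120.0),
    ("refA_0002", "strB_0102", 700.0),
    ("refA_0003", "strB_0102", 650.0),  # paralog-like hit, not reciprocal
]
b_vs_a = [
    ("strB_0101", "refA_0001", 948.0),
    ("strB_0102", "refA_0002", 702.0),
    ("strB_0555", "refA_0001", 118.0),
]

orthologs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(orthologs)
```

RBH is deliberately strict, so genes with close paralogs or strain-specific genes drop out; the resulting one-to-one table is what a shared annotation would be built from.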

Or should I run a separate differential expression analysis for each dataset and then compare the resulting gene lists?
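For the second option, the per-dataset results still need a common gene namespace before they can be compared. A minimal sketch, assuming each differential expression run has been reduced to a {gene: log2 fold change} dictionary of significant genes and that a one-to-one ID map to the reference strain is already available (all IDs and values here are made up):

```python
def to_reference_ids(de_result, id_map):
    """Relabel a {gene: log2FC} result with reference IDs, dropping unmapped genes."""
    return {id_map[gene]: lfc for gene, lfc in de_result.items() if gene in id_map}


def concordant_genes(result_a, result_b):
    """Genes significant in both results that change in the same direction."""
    shared = result_a.keys() & result_b.keys()
    return sorted(g for g in shared if (result_a[g] > 0) == (result_b[g] > 0))


# Hypothetical significant genes from two independent analyses.
de_reference = {"refA_0001": 2.1, "refA_0002": -1.4, "refA_0003": 1.0}
de_strain_b = {"strB_0101": 1.7, "strB_0102": 0.9, "strB_0999": -2.0}

# Hypothetical one-to-one ortholog table (second strain -> reference strain).
strain_b_to_ref = {"strB_0101": "refA_0001", "strB_0102": "refA_0002"}

de_strain_b_mapped = to_reference_ids(de_strain_b, strain_b_to_ref)
agreeing = concordant_genes(de_reference, de_strain_b_mapped)
print(agreeing)
```

Comparing direction as well as membership avoids counting a gene as "shared" when it is up in one strain and down in the other.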


Hi, thank you for your reply. I have one doubt, though: if I create a new consensus reference genome and carry out the analysis against it, will all the corresponding genes be mapped correctly?

Reply written 8 months ago by devikaparvathy

My aim is to do an integrative analysis of certain public RNA-seq data available for a particular bacterial species. But each experiment was done in a different strain/subspecies.

What I plan to do is align the reads to their respective reference genomes and then, for further analysis, create an annotation file (GFF/GTF) based on one selected subspecies (the chosen "target" for lift over), combined with the mapped annotation of the other subspecies (the "source" for lift over).
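The counting side of this plan can be sketched as follows: reads are counted per gene against each strain's own annotation, source gene IDs are relabelled with target IDs, and only genes present in every sample are kept. This is an illustrative sketch, not a full lift-over; it assumes a one-to-one source-to-target gene map already exists, and every ID and count below is hypothetical.

```python
def lift_counts(source_counts, source_to_target):
    """Relabel source-strain counts with target gene IDs; unmapped genes are dropped."""
    return {
        source_to_target[gene]: count
        for gene, count in source_counts.items()
        if gene in source_to_target
    }


def merge_samples(samples):
    """Build a count matrix over the genes shared by all samples.

    samples: {sample_name: {target_gene_id: count}}
    Returns (column names, {gene: [count per column]}).
    """
    columns = list(samples)
    shared = set.intersection(*(set(counts) for counts in samples.values()))
    matrix = {gene: [samples[col][gene] for col in columns] for gene in sorted(shared)}
    return columns, matrix


# Hypothetical per-gene counts, each taken from a strain-specific alignment.
target_sample = {"refA_0001": 120, "refA_0002": 30, "refA_0003": 8}
source_sample = {"strB_0101": 95, "strB_0102": 41, "strB_0999": 12}

# Hypothetical one-to-one map from source gene IDs to target gene IDs.
source_to_target = {"strB_0101": "refA_0001", "strB_0102": "refA_0002"}

lifted = lift_counts(source_sample, source_to_target)
columns, matrix = merge_samples({"target_rep1": target_sample, "source_rep1": lifted})
print(columns)
print(matrix)
```

If a merged matrix like this goes into a single differential expression model, strain (or study) would probably need to be included as a covariate, since strain differences are otherwise indistinguishable from the condition effect.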

Is this procedure right? Or are there any alternatives? I do not wish to run every RNA-seq analysis separately and then simply compare the resulting lists of differentially expressed genes.

Reply written 8 months ago by devikaparvathy
8 months ago by
Kevin Blighe (University College London Cancer Institute) wrote:


I had a similar design in a recent study on a different bacterial species.

One program that works quite efficiently and produces good results is Rockhopper.

Rockhopper will allow you to de novo assemble a consensus genome from whatever data you provide, and it then also performs differential expression analysis. Its output is actually just a FASTA sequence, together with expression levels and various statistical parameters. You can then run BLASTX on these FASTA sequences in order to infer functionality.
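The BLASTX step could look roughly like the sketch below, which only assembles a BLAST+ `blastx` command line and prints it. The file names and the protein database name are placeholders, and the e-value and hit-count settings are arbitrary starting points rather than recommendations.

```python
import shlex


def build_blastx_command(query_fasta, protein_db, out_tsv, evalue=1e-5, max_targets=5):
    """Assemble a BLAST+ blastx call that writes tabular (outfmt 6) hits."""
    return [
        "blastx",
        "-query", query_fasta,              # nucleotide FASTA to annotate
        "-db", protein_db,                  # pre-built protein database (placeholder name)
        "-out", out_tsv,
        "-outfmt", "6",                     # tab-separated, one hit per line
        "-evalue", str(evalue),
        "-max_target_seqs", str(max_targets),
    ]


# Placeholder paths; replace with the real assembled FASTA and database.
cmd = build_blastx_command("assembled_sequences.fasta", "my_protein_db", "blastx_hits.tsv")
print(shlex.join(cmd))
```

With BLAST+ installed and a database built, the list can be passed to `subprocess.run(cmd, check=True)`; the top hit per query in the tabular output then gives a first guess at function.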

Trust that this helps, Kevin
