Question: RNASeq reads from two bacterial species
2.3 years ago, alyamahmoud wrote:

Dear All

I am trying to analyze co-transcriptome data from two enteric pathogens. These are new clinical isolates (X and Y). I have RNASeq reads from each species grown individually (X or Y) and from the co-culture (X+Y). In in vitro cultures, X suppresses the growth rate of Y, and we are trying to find a mechanistic explanation for this pattern.

For this, I used Trinity for de novo transcriptome assembly and then quantified expression with RSEM (as an example of an alignment-based method) or Kallisto (as an example of an alignment-free method). I then ran DESeq2 on the read counts from both RSEM and Kallisto and compared the differentially expressed genes from each.

The two methods give contradictory results. With RSEM, more than 2000 genes are significantly upregulated in the X+Y co-culture relative to individually grown X, and fewer than 100 are downregulated. With Kallisto, more than 1500 genes are significantly downregulated in the X+Y co-culture relative to individually grown X, and about 300 are upregulated.

Also, the skew in the number of DEGs towards up- or downregulation is a bit suspicious.
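
One quick sanity check on contradictory calls like these is to compare the direction of change for genes that both pipelines report as significant. The sketch below assumes each result has been reduced to a mapping from gene ID to log2 fold change; the gene names, values, and helper names are all made up for illustration. A concordance near zero would be consistent with the contrast being reversed in one of the two DESeq2 runs, which is only one possible explanation and is not established in this thread.

```python
from typing import Dict


def direction_counts(lfc: Dict[str, float]) -> Dict[str, int]:
    """Count significant genes with positive and negative log2 fold change."""
    up = sum(1 for value in lfc.values() if value > 0)
    down = sum(1 for value in lfc.values() if value < 0)
    return {"up": up, "down": down}


def concordance(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Fraction of shared genes whose fold change has the same sign in both results."""
    shared = set(a) & set(b)
    if not shared:
        return float("nan")
    same = sum(1 for gene in shared if (a[gene] > 0) == (b[gene] > 0))
    return same / len(shared)


# Hypothetical significant genes (gene ID -> log2 fold change), not real data.
rsem = {"geneA": 2.1, "geneB": 1.4, "geneC": -0.9, "geneD": 3.0}
kallisto = {"geneA": -1.9, "geneB": -1.2, "geneC": 1.1, "geneE": -2.5}

print(direction_counts(rsem))
print(direction_counts(kallisto))
print(concordance(rsem, kallisto))
```

If the shared genes mostly agree in sign, the disagreement is more likely to come from which genes pass the significance threshold than from the direction of the comparison.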

My question is: which method should I follow in this case? What is a good approach for analyzing RNASeq data from such an experimental setup?

Any insights or help will be highly appreciated. Thanks.

modified 2.3 years ago by h.mon • written 2.3 years ago by alyamahmoud

You do not say how many biological replicates you have per condition. Different (correct) methods will arrive at different results, and even more so if the experimental design is insufficient.

Anyway, read the literature, decide on the method, and go ahead. The danger of trying too many methods is later cherry-picking the one with results suiting your expectations about the outcome.

written 2.3 years ago by h.mon

Good advice. Kallisto (and similar tools) has its applications, but where the reference itself is not very solid it may not be the right tool.

modified 2.3 years ago • written 2.3 years ago by genomax

That's basically why I thought it would be better to rely on a de novo transcriptome assembly rather than a fragmented genome assembly (I also have Illumina DNA sequences for the same isolates). Is this assumption correct?

modified 2.3 years ago • written 2.3 years ago by alyamahmoud

Also, neither method seems to make sense at this point, since the number of differentially expressed genes is unrealistic.

written 2.3 years ago by alyamahmoud

If the organisms are very similar, then this experimental approach is unlikely to answer the question being posed.

The explanation in this case may turn out to be mundane, e.g. organism X simply grows faster than Y in those culture conditions and out-competes it for nutrients. Have you looked at their growth rates independently in the same conditions? Can you provide some additional details about what the organisms in question are and what experimental conditions are being used?

modified 2.3 years ago • written 2.3 years ago by genomax

The organisms are clinical isolates of Vibrio cholerae (VCH) and Enterotoxigenic E. coli (ETEC). ETEC grows faster than VCH on M9+glucose or LB.

written 2.3 years ago by alyamahmoud

If that is the case then it is going to be difficult to use RNAseq data to find an explanation. Perhaps that is being reflected in the results you are seeing.

Since you have already done the experiment, you could try the solution suggested by @h.mon below and see if that produces any useful results.

modified 2.3 years ago • written 2.3 years ago by genomax

Thanks. I have three biological replicates from each culture and co-culture. Unfortunately, most of the literature is about handling co-transcriptomes from two different domains (eukaryote/prokaryote), which is a bit easier to handle since you can treat the reads from one organism as contaminants when quantifying the other.

written 2.3 years ago by alyamahmoud

Have you checked the Trinity assembly to see if it looks reasonable? Since you are working with bacteria, you don't expect splicing (Trinity is designed for eukaryotic transcriptomes). I wonder if you may be better off doing a normal assembly with SPAdes (or rnaSPAdes) instead.
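
As a first look at whether an assembly is reasonable, basic contiguity statistics can be computed directly from the FASTA output. This is a minimal sketch that assumes the contigs are available as lines of FASTA text; the function names and the toy sequences are hypothetical, and N50 alone says nothing about correctness of the contigs.

```python
from typing import Iterable, List


def contig_lengths(fasta_lines: Iterable[str]) -> List[int]:
    """Return the length of each record in a FASTA file given as lines of text."""
    lengths: List[int] = []
    current = None  # length of the record being read, None before the first header
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths: List[int]) -> int:
    """Smallest contig length such that contigs at least this long cover half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


# Toy FASTA records for illustration only (c3 is wrapped over two lines).
toy_fasta = [">c1", "ACGT" * 25, ">c2", "A" * 200, ">c3", "A" * 150, "A" * 150, ">c4", "A" * 400]
toy_lengths = contig_lengths(toy_fasta)
print(len(toy_lengths), sum(toy_lengths), n50(toy_lengths))
```

With a real file, the same functions could be called on an open file handle, e.g. `contig_lengths(open(path))`, where `path` is whatever the assembler wrote.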

modified 2.3 years ago • written 2.3 years ago by genomax
2.3 years ago, h.mon (Brazil) wrote:

What I would do:

  1. Assemble the X and Y genomes separately, using both RNAseq and DNAseq reads - probably run diginorm first to reduce the RNAseq coverage bias

  2. Annotate using Prokka

  3. Use BBSplit to find reads that map uniquely to X or to Y in all three RNAseq datasets (X, Y and X+Y)

  4. Use only this subset for the DGE analysis.
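
The logic of steps 3 and 4 can be sketched as follows. This assumes the mapping results have already been reduced to, for each read, the set of references it aligned to; that representation, the function names, and the read IDs are hypothetical and are not BBSplit's actual output format. The ambiguous fraction gives a rough idea of how much data is lost when the two genomes share similar sequence.

```python
from typing import Dict, List, Set


def partition_reads(hits: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Bin reads as unique to X, unique to Y, ambiguous (both), or unmapped."""
    bins: Dict[str, List[str]] = {"X": [], "Y": [], "ambiguous": [], "unmapped": []}
    for read_id, refs in hits.items():
        if not refs:
            bins["unmapped"].append(read_id)
        elif refs == {"X"}:
            bins["X"].append(read_id)
        elif refs == {"Y"}:
            bins["Y"].append(read_id)
        else:
            bins["ambiguous"].append(read_id)
    return bins


def ambiguous_fraction(bins: Dict[str, List[str]]) -> float:
    """Fraction of mapped reads that cannot be assigned to a single species."""
    mapped = len(bins["X"]) + len(bins["Y"]) + len(bins["ambiguous"])
    return len(bins["ambiguous"]) / mapped if mapped else 0.0


# Hypothetical per-read mapping results for a co-culture sample.
example_hits = {
    "r1": {"X"},
    "r2": {"Y"},
    "r3": {"X", "Y"},  # e.g. a read from a conserved gene
    "r4": set(),
    "r5": {"X"},
}
example_bins = partition_reads(example_hits)
print(example_bins)
print(ambiguous_fraction(example_bins))
```

Only the reads in the `X` and `Y` bins would then be counted for the DGE analysis.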

Moreover, keep in mind the difference in phenotype may be due to presence / absence of some gene(s), or some genetic variant.

written 2.3 years ago by h.mon

#3 is going to be difficult, especially if the isolates are organisms that are very similar.

modified 2.3 years ago • written 2.3 years ago by genomax