Advice on DE analysis of mouse and human merged RNA-seq data
2.8 years ago
Adrian Pelin ★ 2.5k

Hey everyone,

I would like some advice on whether the way I am planning to analyze my samples is appropriate and statistically sound.

My biological question is: What is common between human and mouse cancer cell lines in how they deal with Vaccinia infection in terms of transcriptional changes?

If I were interested in a human cancer cell specific response, I could use any number of tools such as cuffdiff or ballgown. But I want what is common, and that is difficult because mouse and human have different genomes and annotations. I could do separate comparisons (human uninfected vs infected and mouse uninfected vs infected) and then take what is common, but I want to try to do one comparison.

My samples are 5 different human cancer cell lines, each uninfected and infected with virus, no replicates, 10 samples total. I also have 5 different mouse cancer cell lines, each uninfected and infected with virus, no replicates, 10 samples total. I did not do replicates per condition because I am not interested in cell line specific differences; rather, I want an overall response to viral infection that is common to all cell line models.

I am planning to:

  1. Map human samples to the human genome and mouse samples to the mouse genome, and get BAM files.
  2. Use featureCounts on both human and mouse samples to get read counts per protein-coding gene for both species.
  3. Merge human and mouse counts and only keep genes that have the same name between mouse and human.
  4. Feed those counts to DESeq2 and do my comparison with that package, all Uninfected vs all Infected.

Thoughts?

RNA-Seq DEseq Human Mouse • 993 views
2.7 years ago

(1.) and (2.) look reasonable to me. I have some suggestions concerning the next steps though:

  3. Merge human and mouse counts and only keep genes that have the same name between mouse and human species

Clear homologs sometimes have different names, so you might miss a lot of data doing it that way. As an alternative, there are homology tables (for instance here) that you could use to match mouse and human genes.
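To illustrate the homology-table idea, here is a minimal sketch in plain Python. It assumes you have already parsed a homology table into (human symbol, mouse symbol) pairs; the function names, the placeholder gene GENE_A, and all counts are made up for the example. It keeps only one-to-one pairs, which is one possible choice and not the only reasonable one.

```python
from collections import Counter

def one_to_one_orthologs(pairs):
    """Return {mouse_gene: human_gene} for pairs where each gene occurs exactly once."""
    human_n = Counter(h for h, _ in pairs)
    mouse_n = Counter(m for _, m in pairs)
    return {m: h for h, m in pairs if human_n[h] == 1 and mouse_n[m] == 1}

def merge_counts(human_counts, mouse_counts, mouse_to_human):
    """Concatenate human and mouse sample counts for genes present in both species."""
    merged = {}
    for mouse_gene, human_gene in mouse_to_human.items():
        if mouse_gene in mouse_counts and human_gene in human_counts:
            merged[human_gene] = human_counts[human_gene] + mouse_counts[mouse_gene]
    return merged

# Toy input (hypothetical values, two samples per species).
homology_pairs = [
    ("TP53", "Trp53"),       # orthologs with different names
    ("ISG15", "Isg15"),
    ("GENE_A", "Gene_a1"),   # placeholder one-to-many family, dropped below
    ("GENE_A", "Gene_a2"),
]
human_counts = {"TP53": [120, 98], "ISG15": [10, 450], "GENE_A": [5, 7]}
mouse_counts = {"Trp53": [80, 75], "Isg15": [4, 300], "Gene_a1": [1, 2]}

mouse_to_human = one_to_one_orthologs(homology_pairs)
merged = merge_counts(human_counts, mouse_counts, mouse_to_human)
print(merged)
```

Note that matching on names alone would have missed TP53/Trp53 here, since the symbols differ by more than capitalization.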

  4. Feed those counts to DESeq2 and do my comparison with that package, all Uninfected vs all Infected.

Another possible issue with your method is that if you keep only the genes that are common between mouse and human, DESeq2 might not have enough data to properly normalize the samples and estimate dispersion. Instead, I would run DESeq2 separately on the full human and mouse datasets (considering all genes). Only then would I filter for the genes with homology between mouse and human and assess whether the response is similar between the two species.
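The "analyze separately, then compare" step could look roughly like the sketch below. It assumes each species' results have been exported as gene -> (log2 fold change, adjusted p-value), and that a mouse-to-human ortholog map is available. The function name, the 0.05 cutoff, and all numbers are illustrative assumptions, not DESeq2 output.

```python
def concordant_genes(human_res, mouse_res, mouse_to_human, alpha=0.05):
    """List ortholog pairs that are significant in both species with the same sign."""
    shared = []
    for mouse_gene, human_gene in mouse_to_human.items():
        if human_gene not in human_res or mouse_gene not in mouse_res:
            continue
        h_lfc, h_padj = human_res[human_gene]
        m_lfc, m_padj = mouse_res[mouse_gene]
        # Adjusted p-values can be missing (NA) for filtered genes; skip those.
        if h_padj is None or m_padj is None:
            continue
        significant = h_padj < alpha and m_padj < alpha
        same_direction = h_lfc * m_lfc > 0
        if significant and same_direction:
            shared.append((human_gene, mouse_gene, h_lfc, m_lfc))
    return sorted(shared)

# Toy results (made-up numbers).
mouse_to_human = {"Isg15": "ISG15", "Trp53": "TP53", "Gapdh": "GAPDH"}
human_res = {"ISG15": (3.1, 0.001), "TP53": (-1.2, 0.01), "GAPDH": (0.1, 0.9)}
mouse_res = {"Isg15": (2.4, 0.002), "Trp53": (0.8, 0.03), "Gapdh": (0.0, None)}

shared = concordant_genes(human_res, mouse_res, mouse_to_human)
print(shared)
```

In this toy example only ISG15/Isg15 survives: the TP53 pair is significant in both species but changes in opposite directions, and Gapdh has no adjusted p-value.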

2.7 years ago

The other way to do this is to stay at a higher level: do pathway analysis and look at pathway enrichment of the DE genes in each species. Those pathways should be pretty comparable between species.
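As a rough illustration of pathway enrichment, the sketch below computes a one-sided hypergeometric p-value (the test behind many over-representation tools) using only the standard library. The function name and all numbers are assumptions for the example; in practice you would use an established enrichment tool and correct for multiple testing.

```python
from math import comb

def hypergeom_enrichment_p(n_universe, n_pathway, n_de, n_overlap):
    """P(overlap >= n_overlap) when n_de genes are drawn at random from the universe."""
    total = comb(n_universe, n_de)
    upper = min(n_pathway, n_de)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical numbers: 1000 tested genes, a 50-gene pathway,
# 100 DE genes, 15 of which fall in the pathway.
p_value = hypergeom_enrichment_p(1000, 50, 100, 15)
print(p_value)

# Pathways enriched in both species can then be intersected by name.
human_enriched = {"interferon signaling", "apoptosis"}
mouse_enriched = {"interferon signaling", "cell cycle"}
common_pathways = human_enriched & mouse_enriched
print(common_pathways)
```

Running the test per species and intersecting the enriched pathway names sidesteps gene-level ortholog matching, as long as both species are annotated against comparable pathway definitions.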

2.7 years ago
Asaf 8.9k

In addition to what Carlo wrote, I doubt DESeq2 would be able to normalize the samples correctly, but if it can, great. I actually think it will have trouble normalizing even the five different cell lines. If you do end up using all the samples together, I would generate a list of genes that are expected to have the same expression across samples and use them for normalization. In addition, don't forget to add a mouse/human covariate to the DESeq2 design formula.
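To make the normalization suggestion concrete, here is a small sketch of median-of-ratios size factors computed only on a chosen set of control genes. This is a simplified re-implementation for illustration, not DESeq2's code; the gene names and counts are made up. In DESeq2 itself, estimateSizeFactors takes a controlGenes argument for this purpose, and the species covariate would go into the design as something like ~ species + condition.

```python
from math import exp, log
from statistics import median

def size_factors(counts, control_genes=None):
    """Median-of-ratios size factors; counts maps gene -> list of per-sample counts."""
    genes = control_genes if control_genes is not None else list(counts)
    # Keep genes with no zero counts so the geometric mean is defined.
    usable = [g for g in genes if g in counts and all(c > 0 for c in counts[g])]
    if not usable:
        raise ValueError("no usable genes for size factor estimation")
    n_samples = len(counts[usable[0]])
    log_geo_mean = {g: sum(log(c) for c in counts[g]) / n_samples for g in usable}
    factors = []
    for j in range(n_samples):
        log_ratios = [log(counts[g][j]) - log_geo_mean[g] for g in usable]
        factors.append(exp(median(log_ratios)))
    return factors

# Toy data: sample 2 was sequenced twice as deeply as sample 1.
counts = {
    "CTRL_1": [10, 20],
    "CTRL_2": [100, 200],
    "CTRL_3": [5, 10],
    "VIRAL_RESPONSE": [3, 900],  # strongly induced, excluded from the control set
}
factors = size_factors(counts, control_genes=["CTRL_1", "CTRL_2", "CTRL_3"])
print(factors)
```

Restricting the calculation to genes believed to be stable keeps strongly induced genes from distorting the size factors, but the result is only as good as the choice of control genes.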

Oh, and use RSEM or similar instead of featureCounts; it's more accurate.
