Question: statistical analyses for transcriptomes


luzglongoria wrote:

Hi there,

I need to do some statistical analyses of my transcriptomes.

I have a data set with 4 columns (gene ID (categorical), expression level (numeric), individual, species). I have 2 species and 5 individuals per species. For each individual I have more than 20,000 genes (some of them are more highly expressed than others).

What I want to know is whether there are differences in expression level between the species. My data do not follow a Gaussian distribution.

To analyse my data I ran:

```
wilcox.test(Exp ~ Species, data = data)
```

and got:

```
Wilcoxon rank sum test with continuity correction
data: Exp by Species
W = 8573700000, p-value < 2.2e-16
alternative hypothesis: true location shift is not equal to 0
```

According to this result there should be a significant difference in expression level between the species. BUT:

- I am not sure if this analysis is appropriate for this dataset.
- Is there any way to take the gene ID into account (as a random factor)?

Thank you so much in advance

modified 12 months ago by ATpoint • written 12 months ago by luzglongoria

The data probably follow a negative binomial distribution. This is normal and expected for RNA-seq counts. Check out the common differential expression analysis pipelines, such as DESeq2, edgeR or limma/voom. All are well documented.
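For anyone landing here later, this is a rough sketch of what the DESeq2 route could look like starting from a long table like the one in the question. It assumes the expression column holds raw, un-normalised integer counts (DESeq2 needs those), and the column names `GeneID` and `Individual` are my guesses; only `Exp` and `Species` appear in the original `wilcox.test()` call. The toy table is simulated purely so the reshaping step can be run, and `run_deseq2()` is a hypothetical helper name, not part of any package.

```r
# Toy long-format table standing in for the real one (simulated values,
# assumed column names GeneID and Individual).
set.seed(1)
toy <- expand.grid(GeneID = paste0("gene", 1:50),
                   Individual = paste0("ind", 1:10),
                   stringsAsFactors = FALSE)
toy$Species <- ifelse(toy$Individual %in% paste0("ind", 1:5), "spA", "spB")
toy$Exp <- rnbinom(nrow(toy), mu = 100, size = 5)

# Reshape to a genes x individuals matrix of integer counts.
counts <- tapply(toy$Exp, list(toy$GeneID, toy$Individual), sum)
counts[is.na(counts)] <- 0
storage.mode(counts) <- "integer"

# One row of sample information per individual, ordered like the columns.
samples <- unique(toy[, c("Individual", "Species")])
samples <- samples[match(colnames(counts), samples$Individual), ]
rownames(samples) <- samples$Individual
samples$Species <- factor(samples$Species)

# Hypothetical wrapper around the standard DESeq2 calls; it is only
# defined here, so DESeq2 is not needed until it is actually called.
run_deseq2 <- function(counts, samples) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = counts,
                                        colData = samples,
                                        design = ~ Species)
  dds <- DESeq2::DESeq(dds)
  DESeq2::results(dds)  # per-gene log2 fold change, p-value, adjusted p-value
}
```

With DESeq2 installed and the real counts in place of the toy ones, something like `res <- run_deseq2(counts, samples)` followed by `head(res[order(res$padj), ])` would list the genes with the strongest evidence of a species difference. The model is fitted gene by gene, which is how these pipelines take gene identity into account.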

Which type of expression data do you have (RNA-seq, microarray, etc.)?

Sorry, I didn't say. It is RNA-seq.

I think you are looking for differential expression analysis. Check chapters 6 and 7 of the Bioconductor 2018 Workshop for more details.
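To make "differential expression analysis" a bit more concrete: instead of one test over all genes pooled together, each gene gets its own test across the 10 individuals, and the resulting p-values are corrected for multiple testing. Here is a minimal base R illustration on simulated counts (the numbers are made up and no gene truly differs between species). It uses a per-gene Wilcoxon test only to show the structure of the analysis; it is not a substitute for the count-model methods mentioned above.

```r
set.seed(42)
n_genes <- 1000
species <- rep(c("spA", "spB"), each = 5)

# Simulated negative binomial counts with gene-specific means.
mu <- rexp(n_genes, rate = 1 / 200) + 20
expr <- matrix(rnbinom(n_genes * 10, mu = rep(mu, times = 10), size = 5),
               nrow = n_genes)

# One test per gene (rows), comparing the 5 vs 5 individuals.
p_gene <- apply(expr, 1, function(x) {
  wilcox.test(x[species == "spA"], x[species == "spB"], exact = FALSE)$p.value
})

# Benjamini-Hochberg correction across all genes.
p_adj <- p.adjust(p_gene, method = "BH")

# Number of genes called significant at a 5% false discovery rate.
sum(p_adj < 0.05, na.rm = TRUE)
```

Note that with only 5 individuals per species an exact rank-sum test cannot give a two-sided p-value below 2/252 (about 0.008), so after correcting over 20,000 genes it has very little power. That is one reason the dedicated RNA-seq packages, which model the counts directly, are usually preferred.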
