I have been dealing with an interesting issue regarding 9 green algae RNA-seq samples. The samples come from three conditions, each with 3 biological replicates. rRNA was depleted from the samples, with varying success, in order to keep chloroplast and mitochondrial transcripts.
When estimating the proportion of rRNA per sample, 4 samples turned out to have a high proportion of rRNA left over, ranging from 28.5% to 35.5%.
Estimated rRNA proportions per sample:
Chlo_auto_A    28.51%
Chlo_auto_B     9.54%
Chlo_auto_C     2.73%
Chlo_mixo_A     9.02%
Chlo_mixo_B     3.56%
Chlo_mixo_C    35.51%
Chlo_hetero_A  28.69%
Chlo_hetero_B   2.43%
Chlo_hetero_C  29.86%
Moreover, when performing DE analysis, very few genes come up as differentially expressed because of the high heterogeneity between the biological replicates. Because of this, I created a PCA plot using CPM values. The samples cluster as expected, except for auto_A, hetero_A, hetero_C and mixo_C, which are the four samples that had a high rRNA proportion prior to cleaning.
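For anyone who wants to try the same kind of check on their own data, here is a minimal sketch of a CPM-based PCA, assuming a genes-by-samples matrix of raw counts. The function names, the log2 transform with a prior count of 1, and the toy Poisson counts are all illustrative assumptions, not the exact code or data behind the plot described above.

```python
import numpy as np

def cpm(counts):
    """Counts per million: scale each sample (column) to a library size of 1e6."""
    counts = np.asarray(counts, dtype=float)
    lib_sizes = counts.sum(axis=0)  # total counts per sample
    return counts / lib_sizes * 1e6

def pca_scores(counts, n_components=2, prior=1.0):
    """PCA of samples on log2(CPM + prior).

    Returns (scores, explained_variance_ratio) for the first n_components.
    The prior count of 1 is an assumption to avoid log2(0).
    """
    log_cpm = np.log2(cpm(counts) + prior)
    x = log_cpm.T                 # samples x genes
    x = x - x.mean(axis=0)        # center each gene across samples
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    var_ratio = (s ** 2) / np.sum(s ** 2)
    return scores, var_ratio[:n_components]

# Toy data only: 200 hypothetical genes x 9 samples of Poisson counts.
rng = np.random.default_rng(0)
toy_counts = rng.poisson(lam=50, size=(200, 9))

scores, var_ratio = pca_scores(toy_counts)
print(scores.shape)  # one row per sample, one column per component
```

Plotting the first column of `scores` against the second, colored by condition, gives the usual PC1 vs PC2 view in which outlying replicates show up.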
Has anyone experienced anything similar, or does anyone have an idea of how I could still use this data to perform a DE analysis?
Any and all advice would be appreciated!