Question: How to integrate miRDeep2 results from different samples
18 months ago


Hello, I'm using miRDeep2 to analyze my small RNA-seq data (50 bp). Following their tutorial, I analyzed each sample separately and got a result output for each sample.

for f in *clp.fa; do echo "${f}"; miRDeep2.pl "${f}" genome_main.fna "${f%_*}_to_genome.arf" mature.fasta Ath_miRBase.fasta premature.fasta -z "${f%.*}" 2> "${f%_*}_report.log"; done

I tried to gather the read counts of each novel and known miRNA into one table. However, I find it very hard to integrate the results from different samples because the predicted novel miRNAs seem to differ between samples. For example, I have 6 samples; most novel miRNAs are detected in 4 of them, but none of these are detected in sample2 and sample3. Instead, a completely different set of miRNAs is detected in sample2 and sample3, and these are not reported in the other samples. Therefore, I cannot do an expression analysis with these data. I checked the QC and I don't believe there is an issue or contamination in these samples.

Here is part of the count summary data (different samples seem to have their own unique sets of miRNAs):

mirna   S622    S6H2    S6122   S6H3    S623    S6121   S621    S6H1
cccuugucgcuucgauucgu    756891  590559.33   321420  779381.75   522973.75   708258  NA  767231
ath-miR166e-3p  164621.09   97016.81    59226   139494.7    146672.63   160038.63   NA  139948.09
ucaguauccgauaucagcgcaugu    NA  17  NA  NA  NA  NA  NA  25
uccucguguugcaccucu  NA  NA  69844   63423   NA  121998  NA  NA
cguuuggagcagucauaaaug   NA  NA  29871   12421   NA  NA  NA  NA
uaaaagacccguauguauagcaac    NA  NA  NA  120 NA  NA  NA  NA
ath-miR5659 NA  NA  NA  11  NA  NA  NA  NA
aaauccgaggcaccgauuugu   NA  NA  NA  NA  NA  NA  243 NA
guuggacacaccaaacaugcaugu    NA  NA  NA  NA  NA  NA  959 NA
cugaaguugcacucugggacuc  NA  NA  NA  NA  NA  NA  72  NA
gcggcuuggauuggauuugaucgg    NA  NA  NA  NA  NA  NA  27  NA
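
For illustration, a table like the one above could be assembled with something like the following sketch. It assumes each sample has already been reduced to a hypothetical two-column, tab-separated file (miRNA name or mature sequence, then count) named after the sample, e.g. S622.tsv; this is not miRDeep2's native output format, and the function name merge_counts is made up for this example. Entries missing from a sample are written as NA.

```shell
#!/usr/bin/env bash
# merge_counts (hypothetical helper): join per-sample "id<TAB>count" files
# into one matrix with one row per miRNA and one column per sample.
# Assumes each id appears at most once per file; column names are taken
# from the file names with the directory and ".tsv" suffix removed.
merge_counts() {
  awk -F'\t' -v OFS='\t' '
    # First line of each file: register a new sample column.
    FNR == 1 {
      n++
      name[n] = FILENAME
      sub(/.*\//, "", name[n])
      sub(/\.tsv$/, "", name[n])
    }
    # Every line: remember first-seen order of ids and store the count.
    {
      if (!($1 in seen)) { seen[$1] = 1; order[++m] = $1 }
      count[$1, n] = $2
    }
    END {
      line = "mirna"
      for (j = 1; j <= n; j++) line = line OFS name[j]
      print line
      for (i = 1; i <= m; i++) {
        line = order[i]
        for (j = 1; j <= n; j++)
          line = line OFS (((order[i], j) in count) ? count[order[i], j] : "NA")
        print line
      }
    }' "$@"
}

# Example call (file names are placeholders):
#   merge_counts S622.tsv S6H2.tsv S6122.tsv > merged_counts.tsv
```

Because rows are keyed on the exact name or sequence string, two samples only share a row when miRDeep2 reported an identical mature sequence in both.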

PS. I already mapped reads to the known miRNAs using the quantifier module. But my species has only 30 known miRNAs, and there is no significant expression difference between samples using the total mature read counts. That's why I need to analyze the novel miRNAs derived from different precursors. For each mature miRNA, I used the average count as its expression.
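
As a rough sketch of that averaging step, and assuming the average is taken over the precursors that yield the same mature sequence (the post does not spell this out), the calculation could look like the following. The three-column input layout (mature sequence, precursor ID, count) and the function name mean_by_mature are assumptions made for this example only.

```shell
#!/usr/bin/env bash
# mean_by_mature (hypothetical helper): average the counts of rows that
# share the same mature sequence.
# Input: tab-separated lines "mature_seq<TAB>precursor_id<TAB>count",
# read from the given files or from stdin when no file is given.
# Output: "mature_seq<TAB>mean_count" (two decimals), in first-seen order.
mean_by_mature() {
  awk -F'\t' '
    {
      if (!($1 in n)) order[++m] = $1   # remember first-seen order
      sum[$1] += $3                     # running total per mature sequence
      n[$1]++                           # number of precursor rows seen
    }
    END {
      for (i = 1; i <= m; i++)
        printf "%s\t%.2f\n", order[i], sum[order[i]] / n[order[i]]
    }' "$@"
}

# Example call (file name is a placeholder):
#   mean_by_mature S6H2_precursor_counts.tsv > S6H2.tsv
```

Averaging over precursors would explain fractional values such as 590559.33 in the table above.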

