Question

DE on problematic miRNA data set

7.9 years ago

ofonov ▴ 20

I have a miRNA data set that looks unusual to me, and I would like the community's opinion on it.

The data set consists of 90 miRNA samples from cancer tissues. After trimming adapters and filtering by size (18-25 bp) with cutadapt, the percentage of reads retained varies widely across samples: the lowest is 3.5% and the highest 50%, with a median of 19% (see details below, or the complete descriptive file here: File with statistics). The lowest absolute number of reads after trimming and filtering is 343,321 and the highest is 34,482,005.

I want to run a differential expression analysis between groups of samples (across tissues). Can the high variability in the number of reads per sample cause problems? If so, what can be done about it?

Sample #   Total reads      Trimmed and filtered reads      Trimmed and        Reads mapped      Reads mapped
                                                            filtered (%)       to miRNAs         to miRNAs (%)
3          20264513         1011838                         5.0                855474            84.5
6          22279183         1517941                         6.8                1331150           87.7
11         41346575         12452439                        30.1               10484237          84.2
12         13421631         825000                          6.1                660365            80.0
25         17442351         5609984                         32.2               4629579           82.5
29         22323963         3018897                         13.5               2756814           91.3
32         22097225         1050964                         4.8                887537            84.4
34         32368666         9933623                         30.7               6039261           60.8
55         24319059         5289647                         21.8               4383139           82.9
57         28842256         3291841                         11.4               2850177           86.6
60         15407714         1253426                         8.1                1103150           88.0
61         21409705         9814410                         45.8               8642218           88.1
62         28707347         12635163                        44.0               10764864          85.2
65         21955057         7394967                         33.7               6109353           82.6
66         26624176         11839221                        44.5               10535026          89.0
68         27987570         7319352                         26.2               6290405           85.9
69         9638790          2136508                         22.2               1859750           87.0
82         30422344         3207930                         10.5               2867819           89.4
83         30297304         2402661                         7.9                2137548           89.0
85         41137933         11554100                        28.1               9066972           78.5
87         27224594         8826819                         32.4               7536977           85.4
88         30989860         14273861                        46.1               12847943          90.0
91         50343291         3862888                         7.7                3257365           84.3
92         21730109         1894990                         8.7                1552786           81.9
93         36191161         1431614                         4.0                901351            63.0
94         51992345         1805736                         3.5                1556177           86.2
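
For anyone who wants to check the percentage columns or see how far apart the library sizes are, here is a minimal Python sketch. It uses only four rows copied from the table above; the names `percent`, `summarize` and `library_size_fold_range` are made up for this example and do not come from any pipeline. It assumes the "mapped (%)" column is relative to the trimmed and filtered reads, which matches the numbers in the table.

```python
# Values copied from four rows of the table above:
# sample id -> (total reads, trimmed and filtered reads, reads mapped to miRNAs)
samples = {
    3: (20264513, 1011838, 855474),
    34: (32368666, 9933623, 6039261),
    88: (30989860, 14273861, 12847943),
    94: (51992345, 1805736, 1556177),
}


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def summarize(samples):
    """Recompute the two percentage columns for each sample."""
    summary = {}
    for sample_id, (total, kept, mapped) in samples.items():
        summary[sample_id] = {
            # share of raw reads that survive trimming and size filtering
            "kept_pct": percent(kept, total),
            # share of the surviving reads that map to miRNAs (assumed denominator)
            "mapped_pct": percent(mapped, kept),
        }
    return summary


def library_size_fold_range(samples):
    """Ratio between the largest and smallest miRNA-mapped read count."""
    mapped_counts = [mapped for (_, _, mapped) in samples.values()]
    return max(mapped_counts) / min(mapped_counts)


summary = summarize(samples)
for sample_id, row in summary.items():
    print(sample_id, row)
print("fold range in mapped reads:", round(library_size_fold_range(samples), 1))
```

Even within this small subset, the number of miRNA-mapped reads differs by more than an order of magnitude between samples, which is the variability I am asking about.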

miRNA differential-expression • 1.1k views

updated 12 months ago by Ram 43k • written 7.9 years ago by ofonov ▴ 20