I'm trying to make a box plot of gene expression from bulk RNA-seq data. This is the pipeline I followed: STAR -> StringTie. The raw counts were normalized with DESeq2 (disease vs. controls), and the normalized counts are used for the plot.
I'm plotting the expression of a single gene, gene A, in disease and control samples. There are 150 disease samples and 30 control samples; the normalized counts range from 0 to 3000 in disease and from 0 to 30 in controls.
The distribution is not normal: most samples have values between 0 and 10, and very few have values above 10. How can I make a better box plot, and what should be considered an outlier?
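
For reference, here is a minimal sketch of one common way to handle skewed counts like these: plot log2(count + 1) instead of the raw normalized counts, and flag outliers with Tukey's 1.5 x IQR rule, which is the convention most box plots use for drawing whiskers. The pseudocount of 1, the function names, and the example counts are all assumptions made up for illustration; they are not my real data, and this may not be the right choice for every gene.

```python
import math
import statistics


def log2_pseudocount(counts, pseudocount=1.0):
    """Return log2(count + pseudocount) so that zero counts stay defined."""
    return [math.log2(c + pseudocount) for c in counts]


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences at Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k=1.5):
    """Return the values that fall outside the Tukey fences."""
    low, high = tukey_fences(values, k)
    return [v for v in values if v < low or v > high]


# Hypothetical normalized counts for one gene (illustrative only).
disease_counts = [0, 1, 2, 3, 5, 8, 10, 40, 3000]

# Outliers on the raw scale versus on the log2(count + 1) scale.
raw_outliers = flag_outliers(disease_counts)
log_outliers = flag_outliers(log2_pseudocount(disease_counts))

print("outliers on raw scale:", raw_outliers)
print("outliers on log2 scale:", log_outliers)
```

On a log scale the box is no longer squashed near zero, and fewer points tend to fall beyond the whiskers, so which samples count as outliers depends on the scale chosen for the plot.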