I'm working with RNA-seq data from breast cancer samples. There are 40 samples in total: 26 are Subtype A and 14 are Subtype B.
I ran a differential expression analysis between Subtype A and Subtype B with edgeR. Genes were called differentially expressed at FDR < 0.05.
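For context, here is a minimal sketch of what the FDR < 0.05 cutoff means, assuming the default Benjamini-Hochberg adjustment used by edgeR's `topTags`. It is written in Python with NumPy purely for illustration; the function name `bh_adjust` and the p-values are hypothetical and are not taken from my data.

```python
import numpy as np

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (FDR), returned in input order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)                        # indices sorting p ascending
    scaled = p[order] * n / np.arange(1, n + 1)  # p_(i) * n / i
    # Enforce monotonicity, working down from the largest p-value
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.minimum(scaled, 1.0)    # cap at 1, restore input order
    return adjusted

# Hypothetical raw p-values for four genes (illustration only)
raw_p = [0.01, 0.04, 0.03, 0.005]
fdr = bh_adjust(raw_p)
significant = fdr < 0.05  # same rule as the FDR < 0.05 cutoff above
```

A gene is kept when its adjusted value falls below 0.05, so the size of the gene list depends on the whole p-value distribution, not on each gene alone.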
The heatmap of the differentially expressed genes looks like this:
Column annotation colors: orange is Subtype A, dark green is Subtype B.
Among the 26 Subtype A samples, 9 cluster together but sit apart from the other 17. This is clearly visible in the heatmap.
I also made an MDS plot. In the MDS plot below, I circled the region where those 9 Subtype A samples lie close to the Subtype B samples.
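The circle was drawn by eye. A rough way to make the same observation numerically would be to compare each sample's distance to the two subtype centroids in the MDS coordinates. The sketch below is in Python with NumPy and uses made-up toy coordinates; the function name `closer_to_other_group` is hypothetical, and none of the numbers come from my plot.

```python
import numpy as np

def closer_to_other_group(coords, labels, group, other):
    """Indices of samples in `group` that lie nearer the centroid of `other`."""
    coords = np.asarray(coords, dtype=float)
    labels = np.asarray(labels)
    # Note: the centroid of `group` includes the suspect samples themselves,
    # so it is pulled toward them and this check is conservative.
    own_centroid = coords[labels == group].mean(axis=0)
    other_centroid = coords[labels == other].mean(axis=0)
    d_own = np.linalg.norm(coords - own_centroid, axis=1)
    d_other = np.linalg.norm(coords - other_centroid, axis=1)
    return np.where((labels == group) & (d_other < d_own))[0]

# Toy 2-D coordinates standing in for MDS dimensions 1 and 2 (illustration only)
toy_coords = [(0, 0), (0, 1), (1, 0), (9, 9), (10, 10), (10, 9), (9, 10)]
toy_labels = ["A", "A", "A", "A", "B", "B", "B"]
flagged = closer_to_other_group(toy_coords, toy_labels, group="A", other="B")
```

In the toy data only the fourth sample (index 3) is labeled A but sits next to the B samples. This only describes the pattern; it does not by itself say whether such samples should be removed.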
What should I do now, given that the heatmap from the differential analysis looks like this? Is it a good idea to remove those 9 samples from the analysis based on clustering alone? If not, any suggestions would be appreciated.