Hi all,

I have a data.frame from an RNA-seq experiment, and I would like to remove some outliers. The data set is large: 350 samples and 32291 genes. The values are log2 RPKM (I applied the log2 transformation because I am planning to run a `WGCNA` analysis and the authors recommend log2-transforming the data).

I am using the `PcaHubert` function from the `rrcov` package to find outliers. Here is the code I am using:

```
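df <- read.table("/path/to/file/rpkm.txt")
dim(df)  # 32291 352
df <- df[, -c(1, 2)]  # drop the first 2 columns (accessory data)
library(rrcov)
# PcaHubert expects observations in rows, so transpose to samples x genes
pcaHub <- PcaHubert(t(df))
# @flag is a logical vector: FALSE marks samples flagged as outliers
outliers <- which(!pcaHub@flag)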
```

The outliers would be those samples with the flag `FALSE` after running the robust PCA. Do you think it is appropriate to remove outliers using this method?
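To make the intent concrete, here is a minimal sketch of how the flagged samples could be listed and dropped. It runs on simulated data rather than my real matrix, and the names `expr`, `pca`, and `expr_clean` as well as the choice `k = 2` are assumptions made up for the example, not recommendations.

```r
library(rrcov)

set.seed(1)
# Simulated stand-in for the real matrix: 50 genes (rows) x 40 samples (columns)
expr <- matrix(rnorm(50 * 40), nrow = 50,
               dimnames = list(paste0("gene", 1:50), paste0("s", 1:40)))
expr[, 1:3] <- expr[, 1:3] + 8  # shift three samples so they look unusual

# PcaHubert expects observations in rows, so transpose to samples x genes
pca <- PcaHubert(t(expr), k = 2)  # k = 2 is an arbitrary choice for this sketch

# Samples whose flag is FALSE are the ones marked as outliers
outliers <- which(!pca@flag)
colnames(expr)[outliers]

# Drop flagged samples only if any were found
expr_clean <- if (length(outliers) > 0) expr[, -outliers, drop = FALSE] else expr
dim(expr_clean)
```

The `length(outliers) > 0` guard matters because `expr[, -integer(0)]` would return a matrix with zero columns instead of the full data.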

Any comments would be greatly appreciated.

Thanks

Hello. There is a parameter `crit.pca.distances` in the `PcaHubert` function. What does this parameter do, and what value should it take other than the default?
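As far as I can tell from the `rrcov` documentation, `crit.pca.distances` is the quantile level (default 0.975) used to compute the cutoff values for the score and orthogonal distances, and a sample that exceeds either cutoff gets `flag = FALSE`. The sketch below, on simulated data, compares the default with a higher value. The data, the object names, and the values `k = 2` and `0.99` are assumptions chosen only for illustration; whether a non-default value is sensible depends on your data.

```r
library(rrcov)

set.seed(1)
# Simulated samples x variables matrix, for illustration only
x <- matrix(rnorm(100 * 10), nrow = 100)

# Same data and number of components, two different cutoff quantiles
pca_default <- PcaHubert(x, k = 2)                              # 0.975 by default
pca_higher  <- PcaHubert(x, k = 2, crit.pca.distances = 0.99)

# Cutoffs for the score distance (sd) and orthogonal distance (od)
c(sd = pca_default@cutoff.sd, od = pca_default@cutoff.od)
c(sd = pca_higher@cutoff.sd,  od = pca_higher@cutoff.od)

# Number of samples flagged as outliers under each setting
sum(!pca_default@flag)
sum(!pca_higher@flag)
```

A higher quantile raises the cutoffs, so the flagging becomes less strict; a lower quantile does the opposite.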