WGCNA dealing with missing data
7.1 years ago

Dear all,

I am running WGCNA, and as a first step I tried to remove samples with too many missing values using the following code:

    # Check all sets for genes and samples with too many missing values
    gsg = goodSamplesGenesMS(multiExpr, verbose = 3);
    gsg$allOK
    if (!gsg$allOK)
    {
      # Print information about the removed genes:
      if (sum(!gsg$goodGenes) > 0)
        printFlush(paste("Removing genes:",
                         paste(names(multiExpr[[1]]$data)[!gsg$goodGenes], collapse = ", ")))
      for (set in 1:exprSize$nSets)
      {
        if (sum(!gsg$goodSamples[[set]]) > 0)
          printFlush(paste("In set", setLabels[set], "removing samples",
                           paste(rownames(multiExpr[[set]]$data)[!gsg$goodSamples[[set]]], collapse = ", ")))
        # Remove the offending genes and samples
        multiExpr[[set]]$data = multiExpr[[set]]$data[gsg$goodSamples[[set]], gsg$goodGenes];
      }
      # Update exprSize
      exprSize = checkSets(multiExpr)
    }

What cutoff does WGCNA use for missing data when deciding whether to remove a sample?

The WGCNA tutorial mentions that samples with too many missing values are removed, but it does not state the exact cutoff.

I would appreciate any help.

Nazanin

WGCNA missing data cutoff • 2.8k views
7.1 years ago
Jake Warner ▴ 830

Hi. You can set the cutoffs yourself with the minFraction and/or minNSamples arguments of goodSamplesGenesMS: https://www.rdocumentation.org/packages/WGCNA/versions/1.41-1/topics/goodSamplesGenesMS
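
To make the idea of a fraction-based cutoff concrete, here is a minimal base-R sketch of a "minimum fraction of non-missing values" filter on samples. It is not WGCNA's internal code: the toy matrix `expr`, the variable `min_fraction`, and the value 0.5 are all made up for illustration, and it assumes the sample filter compares the share of genes measured in each sample against a minFraction-style threshold.

```r
# Illustrative only: a base-R approximation of a minimum-fraction filter.
# This is NOT WGCNA's implementation; names and values are hypothetical.

# Toy expression matrix: rows are samples, columns are genes (WGCNA layout).
expr <- matrix(
  c(1.2, 2.3, 0.7, 1.9,
    NA,  NA,  NA,  2.1,
    0.9, NA,  1.1, 1.5),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("S1", "S2", "S3"), c("g1", "g2", "g3", "g4"))
)

# Hypothetical cutoff playing the role of the minFraction argument.
min_fraction <- 0.5

# Fraction of genes with a non-missing value in each sample.
present_fraction <- rowMeans(!is.na(expr))

# A sample is kept when its non-missing fraction reaches the cutoff.
good_samples <- present_fraction >= min_fraction

print(present_fraction)
print(good_samples)

# Drop the samples that fail the cutoff, keeping the matrix shape.
expr_filtered <- expr[good_samples, , drop = FALSE]
```

With real data you would pass the threshold to the function itself rather than filter by hand, for example goodSamplesGenesMS(multiExpr, minFraction = 0.8, verbose = 3), where 0.8 is only an example value. Please check the help page of your installed WGCNA version for the defaults it actually uses.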
