Question: Goseq: Pwf: Error In If (Hi <= Low) { : Missing Value Where True/False Needed
gravatar for liupfskygre
7.0 years ago by
United States
liupfskygre190 wrote:

Hi, all

Dose anyone familiar with the use of goseq package here, I always get quick reply here, so I post my questions here, hope for some suggestions.

I am trying to use goseq for RNAseq data analysis of a non-native bacteria, I fetch the genelength data and GO term data by using ensembl Perl API. Differential expression was done by DESeq2.

After data run pwf of goseq, error occurred as following:


Error in if (hi <= low) { : missing value where TRUE/FALSE needed

It seems to be the problem of the names or ID problem, but I could not solved it till now.

Have any one met this before? How could I solved this problem?


My workflow were as following

##   import the genes ID which I fetch from Ensembl

HZ254gene_data <- read.delim("E:/HZ_GO_enrichment/HZ254gene_data.txt", quote="")


##  get assayed gene vector




[1] TRUE

head (assayed.genes1)

[1] "Mtc_0001" "Mtc_0002" "Mtc_0003" "Mtc_0004" "Mtc_0005" "Mtc_0006"


chr [1:2512] "Mtc_0001" "Mtc_0002" "Mtc_0003" "Mtc_0004" "Mtc_0005" ...

## get gene length data for pwf



[1] TRUE



Mtc_0001 Mtc_0002 Mtc_0003 Mtc_0004 Mtc_0005 Mtc_0006 
     768      654     1899     1770      351      654


 Named int [1:2512] 768 654 1899 1770 351 654 903 72 966 1071 ...
 - attr(*, "names")= chr [1:2512] "Mtc_0001" "Mtc_0002" "Mtc_0003" "Mtc_0004" ...

##import results from DESeq2 and get the DEgenes 

HZ254DifList_Mer <- read.delim("E:/HZ_GO_enrichment/HZ254DifList_Mer.txt")


##get the genes for pwf 




##import the GO_gene association table from ensembl API 

HZ254gene_go <- read.delim("E:/HZ_GO_enrichment/HZ254gene_go.txt", quote="")


## run pwf, 


##got errors

Error in if (hi <= low) { : missing value where TRUE/FALSE needed


R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)

[1] LC_COLLATE=Chinese (Simplified)_People's Republic of China.936 
[2] LC_CTYPE=Chinese (Simplified)_People's Republic of China.936   
[3] LC_MONETARY=Chinese (Simplified)_People's Republic of China.936
[4] LC_NUMERIC=C                                                   
[5] LC_TIME=Chinese (Simplified)_People's Republic of China.936    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] goseq_1.14.0            geneLenDataBase_0.99.12 BiasedUrn_1.06.1       
[4] BiocInstaller_1.12.0   

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.24.0   Biobase_2.22.0         BiocGenerics_0.8.0    
 [4] biomaRt_2.18.0         Biostrings_2.30.1      bitops_1.0-6          
 [7] BSgenome_1.30.0        DBI_0.2-7              GenomicFeatures_1.14.2
[10] GenomicRanges_1.14.4   grid_3.0.2             IRanges_1.20.6        
[13] lattice_0.20-24        Matrix_1.1-2           mgcv_1.7-28           
[16] nlme_3.1-113           parallel_3.0.2         RCurl_1.95-4.1        
[19] Rsamtools_1.14.3       RSQLite_0.11.4         rtracklayer_1.22.3    
[22] stats4_3.0.2           tools_3.0.2            XML_3.98-1.1          
[25] XVector_0.2.0          zlibbioc_1.8.0
R format data • 14k views
ADD COMMENTlink modified 3.4 years ago by IrK50 • written 7.0 years ago by liupfskygre190

Hi, liupfskygre, It's hard to know exactly what the issue is without a reproducible example, but perhaps the assignment of values to variable names, which are also base R functions, could be a problem. Try replacing length=HZ254gene_data$length with lgth=HZ254gene_data$length. If the nullp function calls length (the function), your use of this name as a variable might be a conflict.

ADD REPLYlink modified 7.0 years ago • written 7.0 years ago by KKeenan0270

thanks for your reminding! I recheck the data and found where the problems is, there were some genes in the dataset that do not have expression data, which were assigned as NA!

ADD REPLYlink written 7.0 years ago by liupfskygre190
gravatar for Michael Dondrup
6.9 years ago by
Bergen, Norway
Michael Dondrup48k wrote:

To not leave this question appear unanswered:

The first thing you have do after seeing such an error is to call traceback() and add the output to the error report.

Error in if (hi <= low) { : missing value where TRUE/FALSE needed

means, as it says, there is a missing value where if requires a TRUE/FALSE value, <= and other comparison operators in R yield 'NA' if and only if at least one of the values is missing itself. In conclusion, your input data contains missing values which goseq does not handle.

In case that happens, missing values need to be removed (e.g. by using complete.cases) or imputed, see for a quick tutorial.

ADD COMMENTlink modified 6.9 years ago • written 6.9 years ago by Michael Dondrup48k

Thanks for your explanations!

ADD REPLYlink modified 6.9 years ago • written 6.9 years ago by liupfskygre190
gravatar for Jeremy Leipzig
6.9 years ago by
Philadelphia, PA
Jeremy Leipzig19k wrote:

This looks like a logical vector casted into an integer - can you print de.genes1 out

ADD COMMENTlink written 6.9 years ago by Jeremy Leipzig19k

thanks, as for my case, missing data "NA" in the de.genes1 caused the problem! At first I tried to convert de.gene1 into numeric, but failed, recheck the data found NA existed.

ADD REPLYlink written 6.9 years ago by liupfskygre190
gravatar for keith.hughitt
6.7 years ago by
United States
keith.hughitt280 wrote:

In case anyone else runs into this problem, one way you can end up with the same error is if the gene lengths are all the same.

ADD COMMENTlink modified 3.3 years ago • written 6.7 years ago by keith.hughitt280
gravatar for IrK
3.4 years ago by
IrK50 wrote:

I had the same error message and in my case it a binary vector had NAs, so basically:

> genes = as.integer(DESeqoutput$padj <= 0.05 & > !$padj))   # Give all values less than 0.05 and no NAs

However, when I call for nullp() I get following warning:

Can't find mm10/ensGene length data in genLenDataBase...
Found the annotaion package, TxDb.Mmusculus.UCSC.mm10.ensGene
Trying to get the gene lengths from it.
Warning messages:
1: In library() : library ‘/usr/lib/R/site-library’ contains no packages
2: In pcls(G) : initial point very close to some inequality constraints

It is warning message but I would like to interpret it. What does it mean? Thanks

ADD COMMENTlink written 3.4 years ago by IrK50

that R is gibberish. What do you think the gene vector should look like?

ADD REPLYlink written 3.4 years ago by Jeremy Leipzig19k

To be honest, I checked that NAs are absent in gene IDs vector, but forgot that numeric values also can be represented as NAs in DESeqoutput$padj vector.

ADD REPLYlink written 3.4 years ago by IrK50

Show us what genes looks like after your faulty code above.

ADD REPLYlink written 3.4 years ago by Jeremy Leipzig19k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2552 users visited in the last hour