Is there a solution to "Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : line 1 did not have 3 elements"?
3.9 years ago
peter.berry5 ▴ 60

The data frame v8 was created from a Progenesis analysis and read into R from a .csv file. I then ran:

    v8$HGNC_Symbol <- trimws(v8$HGNC_Symbol)
    v8 <- v8[1:5, 3:8]
    str(v8)
    # 'data.frame': 5 obs. of  6 variables:
    #  $ v5.max_fold_change       : num  2.15 2.33 5.03 1.75 2.37
    #  $ v5.highest_mean_condition: Factor w/ 5 levels "","---","0% Suspension",..: 3 3 4 3 3
    #  $ v5.lowest_mean_condition : Factor w/ 5 levels "","---","0% Suspension",..: 4 4 3 4 4
    #  $ gene                     : chr  "gene" "gene" "gene" "gene" ...
    #  $ HGNC_Symbol              : chr  "COA3" "LPL" "RDX" "IDH1" ...
    #  $ Refseq                   : chr  "XP_007618708" "XP_007607328" "XP_007616235" "XP_007638219" ...

    dput(v8)
    # structure(list(v5.max_fold_change = c(2.14, 2.33, 5.02, 1.74, 2.36),
    #   v5.highest_mean_condition = structure(c(3L, 3L, 4L, 3L, 3L),
    #     .Label = c("", "---", "0% Suspension", "5% Suspension", "Highest mean condition"), class = "factor"),
    #   v5.lowest_mean_condition = structure(c(4L, 4L, 3L, 4L, 4L),
    #     .Label = c("", "---", "0% Suspension", "5% Suspension", "Lowest mean condition"), class = "factor"),
    #   gene = c("gene", "gene", "gene", "gene", "gene"),
    #   HGNC_Symbol = c("COA3", "LPL", "RDX", "IDH1", "HSPG2"),
    #   Refseq = c("XP_007618708", "XP_007607328", "XP_007616235", "XP_007638219", "XP_007617255")),
    #   row.names = c(NA, 5L), class = "data.frame")

    library("biomaRt")
    mart_1 <- useMart(biomart = "ensembl", dataset = "cgcrigri_gene_ensembl")

    go <- getBM(attributes = c("go_id", "external_gene_name", "namespace_1003"),
                filters = "external_gene_name",
                values = v8$HGNC_Symbol,
                bmHeader = TRUE,
                mart = mart_1)

This results in:

Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  : 
  line 1 did not have 3 elements

Everything I've read suggests the problem lies in the first line of the v8 file, but I've checked and there's no missing entry or special character in that line. Indeed, there's no missing entry anywhere in the v8 file.

I even tried creating a new v8 file with just 3 columns and got the same error.

Any advice/suggestions would be great.

R biomaRt Gene Ontology • 4.6k views

@MikeSmith, thanks for that help. I also found the post "getBM error while using bioMart", which gave some more background on the biomaRt issues and suggested installing the latest version of biomaRt. I did this and am happy to report the issue is fixed.


I just started getting the exact same error from getLDS() in biomaRt with R version 4.0.3 (updated today) and the latest Bioconductor version, 3.11. I Googled it and arrived at this post. Apparently the server issue is still present. I am trying the development version right now; hopefully it fixes the error.


You've posted in two threads with different reports, but haven't included the code you're running or said exactly which error you're encountering.

Is it:

Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  : 
  line 1 did not have 3 elements

or

Error: failed to load external entity "http://www.ensembl.org/info/website/archives/index.html?redirect=no"

It was

"Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : line 1 did not have 3 elements"

It was fixed after I restarted the R session. Thank you for following up.

3.9 years ago
Mike Smith ★ 2.1k

With the latest version of R and biomaRt I'm unable to reproduce this at the moment (see the example code below).

The error almost certainly isn't related to your input file, but rather to whatever is being returned by the Ensembl BioMart server. This is normally a text file, which biomaRt reads back in. It will throw this error if you've asked for 3 columns of data in your attributes argument, but whatever gets sent back doesn't actually have 3 columns. There's very little you can do: either the server needs to be 'fixed' (or return to how it was before), or biomaRt needs to be patched to work with some new server setting.
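To illustrate why the local data frame is not the culprit, here is a minimal sketch in base R that only mimics the parsing step. The response text and the names `bad_response`, `good_response` and `requested` are made up for illustration, and this is not a claim about biomaRt's exact internals. It only relies on the fact that the wording of the error matches the `scan()` call inside `read.table()`, which fails this way when a line has fewer fields than the number of columns it was told to expect.

```r
# The three attributes requested in the getBM() call above.
requested <- c("go_id", "external_gene_name", "namespace_1003")

# Hypothetical stand-in for a server reply that is a plain message
# rather than a tab-separated table with 3 columns.
bad_response <- "Service temporarily unavailable"

# Parsing the reply as a 3-column table fails; no local data is involved.
parse_result <- tryCatch(
  read.table(text = bad_response, sep = "\t", col.names = requested),
  error = function(e) e
)
if (inherits(parse_result, "error")) {
  message("Parsing failed: ", conditionMessage(parse_result))
}

# A well-formed reply with 3 tab-separated fields parses without complaint.
good_response <- "GO:0004465\tLpl\tmolecular_function"
parsed <- read.table(text = good_response, sep = "\t", col.names = requested)
```

The failure depends only on the shape of the text being parsed, which is why editing or rebuilding v8 makes no difference.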

Currently (Mon Oct 12 08:48:39 CEST 2020) the Ensembl site shows the following message, suggesting there are some issues on their side.

This site is currently in maintenance mode and has limited functionality. We apologise for any inconvenience this may cause. We are working to restore full functionality.

Since it's working for me, this was hopefully something transient that has now passed.
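If the failure really is transient, one option is simply to try the query again after a pause. The helper below is a hypothetical sketch in base R, not part of biomaRt: `with_retry()` and `make_flaky_query()` are names invented here, and the flaky function only simulates a server that fails twice before answering.

```r
# Hypothetical helper: call f() up to max_tries times, pausing between
# attempts, and re-raise the last error if every attempt fails.
with_retry <- function(f, max_tries = 3, wait_seconds = 0) {
  last_error <- NULL
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    last_error <- result
    message("Attempt ", attempt, " failed: ", conditionMessage(result))
    if (attempt < max_tries) Sys.sleep(wait_seconds)
  }
  stop(last_error)
}

# Simulated query that fails fail_times times and then returns a small table.
make_flaky_query <- function(fail_times) {
  calls <- 0
  function() {
    calls <<- calls + 1
    if (calls <= fail_times) stop("simulated server failure")
    data.frame(go_id = "GO:0004465", external_gene_name = "Lpl")
  }
}

demo_result <- with_retry(make_flaky_query(fail_times = 2), max_tries = 5)
```

With biomaRt, the same wrapper could be used as `with_retry(function() getBM(...), wait_seconds = 30)`. Retrying only helps while the problem is temporary; it does nothing if the server stays in maintenance mode or the installed biomaRt is out of date.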

library("biomaRt")

HGNC_Symbol <- c("COA3", "LPL")

mart_1 <- useMart(biomart = "ensembl", dataset = "cgcrigri_gene_ensembl")

go <- getBM(attributes=c("go_id", "external_gene_name", "namespace_1003"), 
            filters = 'external_gene_name',
            values = HGNC_Symbol,
            bmHeader = TRUE, 
            mart = mart_1)

go
#>    GO term accession Gene name          GO domain
#> 1         GO:0004465       Lpl molecular_function
#> 2         GO:0006631       Lpl biological_process
#> 3         GO:0005509       Lpl molecular_function
#> 4         GO:0005615       Lpl cellular_component
#> 5         GO:0042803       Lpl molecular_function
#> 6         GO:0009986       Lpl cellular_component

sessionInfo()
#> R version 4.0.2 (2020-06-22)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Linux Mint 19
#> 
#> Matrix products: default
#> BLAS:   /home/msmith/Applications/R/R-4.0.2/lib/libRblas.so
#> LAPACK: /home/msmith/Applications/R/R-4.0.2/lib/libRlapack.so
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
#>  [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] biomaRt_2.45.6
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.5           pillar_1.4.6         dbplyr_1.4.4        
#>  [4] compiler_4.0.2       highr_0.8            prettyunits_1.1.1   
#>  [7] tools_4.0.2          progress_1.2.2       digest_0.6.25       
#> [10] bit_4.0.4            lifecycle_0.2.0      tibble_3.0.3        
#> [13] BiocFileCache_1.13.1 RSQLite_2.2.1        evaluate_0.14       
#> [16] memoise_1.1.0        pkgconfig_2.0.3      rlang_0.4.7         
#> [19] DBI_1.1.0            curl_4.3             yaml_2.2.1          
#> [22] parallel_4.0.2       xfun_0.18            dplyr_1.0.2         
#> [25] stringr_1.4.0        httr_1.4.2           knitr_1.30          
#> [28] xml2_1.3.2           rappdirs_0.3.1       generics_0.0.2      
#> [31] vctrs_0.3.4          S4Vectors_0.27.13    askpass_1.1         
#> [34] IRanges_2.23.10      hms_0.5.3            tidyselect_1.1.0    
#> [37] stats4_4.0.2         bit64_4.0.5          glue_1.4.2          
#> [40] Biobase_2.49.1       R6_2.4.1             AnnotationDbi_1.51.3
#> [43] XML_3.99-0.5         rmarkdown_2.4        purrr_0.3.4         
#> [46] blob_1.2.1           magrittr_1.5         ellipsis_0.3.1      
#> [49] htmltools_0.5.0      BiocGenerics_0.35.4  assertthat_0.2.1    
#> [52] stringi_1.5.3        openssl_1.4.3        crayon_1.3.4