Extracting gene list from chromosome number and location
3.3 years ago
pjsinha07 ▴ 10

Hi All,

I am trying to extract a list of genes from chromosome numbers and positions using biomaRt in R, but I am getting the error message shown below. Here is the code I am using for the extraction:


ensembl <- useMart("ensembl")
datasets <- listDatasets(ensembl)
ensembl = useDataset("rnorvegicus_gene_ensembl",mart=ensembl)
AT_AC_Gene <- read.csv("AT-AC-methylkit_biomart-4-7-21.csv",header=T)
attributes <- c("external_gene_name","ensembl_gene_id","start_position","end_position","rgd_symbol","chromosome_name")
filters <- c("chromosome_name","start","end")
values <- list(AT_AC_Gene$chr,AT_AC_Gene$start,AT_AC_Gene$end)
final_1 <- getBM(attributes=attributes, filters=filters, values=values, mart=ensembl)
Error message:

Batch submitting query [==>---------------------------------------------------------------------]   5% eta: 35s
Error in .processResults(postRes, mart = mart, sep = sep, fullXmlQuery = fullXmlQuery,  : 
  Query ERROR: caught BioMart::Exception::Database: Error during query execution: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'AND (main.seq_region_end_1020 >= '15108600' OR main.seq_region_end_1020 >= '9115' at line 1

Any help will be highly appreciated in resolving this issue.

Thanks in Advance.

BiomaRt R • 2.0k views

I presume this happens at the same point (5%) each time?

My guess is that you have some combination of chromosome, start, and end that is invalid, e.g. a negative coordinate or a coordinate beyond the end of the specified chromosome.

Maybe you can try submitting only a subset of the values to try and narrow down which one is triggering the error.
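One way to narrow this down is to submit the rows in chunks and record which chunks fail. The following is only a sketch: find_failing_chunks is a hypothetical helper name, and stub_query stands in for a function that would wrap the real getBM() call, so that the example runs without a network connection.

```r
# Hypothetical helper: run `query_fun` on consecutive chunks of rows and
# return the row indices of every chunk for which it raised an error.
find_failing_chunks <- function(df, query_fun, chunk_size = 100) {
  rows   <- seq_len(nrow(df))
  chunks <- split(rows, ceiling(rows / chunk_size))
  failed <- vapply(chunks, function(i) {
    res <- try(query_fun(df[i, , drop = FALSE]), silent = TRUE)
    inherits(res, "try-error")
  }, logical(1))
  chunks[failed]
}

# Toy data and a stand-in query that fails on negative coordinates.
# In real use, query_fun would call getBM() on the rows it receives.
toy <- data.frame(chr = "1",
                  start = c(100, 200, -5, 400),
                  end   = c(150, 250, 10, 450))
stub_query <- function(d) {
  if (any(d$start < 0)) stop("invalid coordinate")
  d
}

bad_chunks <- find_failing_chunks(toy, stub_query, chunk_size = 2)
print(bad_chunks)
```

Rerunning with a smaller chunk_size on only the failing rows narrows the problem down to individual records.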


Hi Mike,

I just shortened my data and ran the code again but this time the error is:

Batch submitting query [===============>--------------------------------------------------------]  22% eta: 11s
Error in .processResults(postRes, mart = mart, sep = sep, fullXmlQuery = fullXmlQuery,  : 
  Query ERROR: caught BioMart::Exception::Usage: Wrong format value for Start

Just to update you: after shortening the data, the code runs without any error, but the final_1 output has 0 observations of 6 variables. Why?


If you get 0 observations it's because your query doesn't return anything from the Ensembl BioMart database. Without seeing any example values it's impossible to say why that might be. Perhaps you can share the first few rows of AT_AC_Gene?

If you're consistently getting errors relating to the format of the data it sounds like you might need to do some data cleaning in case there's something fundamentally incompatible between your input data and Ensembl, like using a different reference genome.
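As a rough illustration of what such cleaning could look like, here is a sketch in base R. It assumes the input has chr, start, and end columns, that chromosome names may carry a UCSC-style "chr" prefix (Ensembl uses bare names such as "1" or "X"), and that coordinates may have been read as text or written in scientific notation. clean_positions is a hypothetical helper name, not part of biomaRt.

```r
# Hypothetical helper: normalise chromosome names and coordinates, and drop
# rows that cannot be valid positions.
clean_positions <- function(df) {
  out <- df
  # Remove a leading "chr" so names look like Ensembl's ("chr1" -> "1")
  out$chr <- sub("^chr", "", as.character(df$chr))
  # Force whole-number coordinates; unparseable values become NA
  out$start <- suppressWarnings(as.integer(as.character(df$start)))
  out$end   <- suppressWarnings(as.integer(as.character(df$end)))
  keep <- !is.na(out$start) & !is.na(out$end) &
    out$start >= 1 & out$end >= out$start
  out[keep, , drop = FALSE]
}

# Made-up rows for illustration only
raw <- data.frame(chr   = c("chr1", "chr2", "chrX"),
                  start = c("100", "abc", "1e+08"),
                  end   = c("200", "300", "100000500"),
                  stringsAsFactors = FALSE)

cleaned <- clean_positions(raw)
print(cleaned)
```

Storing coordinates as integers also keeps R from turning a value like 100000000 into the text "1e+08" when it is converted to a string for the query, which is one possible source of a "Wrong format value" error.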

If you're struggling to get results with a biomaRt query, it's also worth visiting https://www.ensembl.org/biomart/martview/ and trying the interactive mode with a small set of values. The text accompanying each filter or attribute can sometimes help you understand whether you're querying the right things, and identify whether it's a code problem or a data problem.


Thank you, Mike, for your suggestion. The first few rows of AT_AC_Gene are below:


chr   start      end        pvalue     qvalue     meth.diff
chr1  181943385  181943385  1.30e-217  4.09e-216  -22.99185
chr1  53666662   53666662   2.82e-90   1.85e-89   -21.94640
chr1  56025446   56025446   2.72e-114  2.66e-113  -19.86795
chr1  55686907   55686907   9.08e-81   4.81e-80   -19.76459
chr1  24840721   24840721   7.32e-59   2.39e-58   -19.04278
chr1  180503143  180503143  3.20e-153  5.30e-152  -18.15721
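Two things stand out in these rows and may explain the empty result, though neither is confirmed in the thread. First, the chr column uses names like "chr1", whereas Ensembl names chromosomes without the prefix ("1"). Second, as far as I know, biomaRt does not pair start and end values row by row when they are passed as separate filters, so per-row positions are usually supplied through the chromosomal_region filter as "chromosome:start:end" strings. A minimal sketch under those assumptions follows; make_regions and query_regions are hypothetical helper names, and query_regions is defined but not called because it needs biomaRt and a live Ensembl connection.

```r
# First rows of the table above
regions_df <- data.frame(
  chr   = c("chr1", "chr1", "chr1"),
  start = c(181943385, 53666662, 56025446),
  end   = c(181943385, 53666662, 56025446),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build one "chromosome:start:end" string per row.
# as.integer() avoids scientific notation in the resulting text.
make_regions <- function(df) {
  chrom <- sub("^chr", "", df$chr)
  paste(chrom, as.integer(df$start), as.integer(df$end), sep = ":")
}

regions <- make_regions(regions_df)
print(regions)

# Hypothetical wrapper around getBM(); `mart` would be the object returned by
# useDataset("rnorvegicus_gene_ensembl", mart = useMart("ensembl")).
query_regions <- function(regions, mart) {
  biomaRt::getBM(
    attributes = c("external_gene_name", "ensembl_gene_id", "start_position",
                   "end_position", "rgd_symbol", "chromosome_name"),
    filters = "chromosomal_region",
    values  = regions,
    mart    = mart
  )
}
```

Positions that do not fall inside any annotated gene will still return no rows, so an empty result for some sites is not necessarily an error.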

