4.0 years ago
pjsinha07
Hi All,
I am trying to get a list of genes from chromosome names and positions. I am using biomaRt in R for this, but I get an error. Below is the code I am using, followed by the error message.
library("biomaRt")
listMarts()
ensembl <- useMart("ensembl")
datasets <- listDatasets(ensembl)
ensembl = useDataset("rnorvegicus_gene_ensembl",mart=ensembl)
AT_AC_Gene <- read.csv("AT-AC-methylkit_biomart-4-7-21.csv",header=T)
attributes <- c("external_gene_name","ensembl_gene_id","start_position","end_position","rgd_symbol","chromosome_name")
filters <- c("chromosome_name","start","end")
values <- list(AT_AC_Gene$chr,AT_AC_Gene$start,AT_AC_Gene$end)
final_1 <- getBM(attributes=attributes, filters=filters, values=values, mart=ensembl)
Error message:
Batch submitting query [==>---------------------------------------------------------------------] 5% eta: 35s
Error in .processResults(postRes, mart = mart, sep = sep, fullXmlQuery = fullXmlQuery, :
Query ERROR: caught BioMart::Exception::Database: Error during query execution: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'AND (main.seq_region_end_1020 >= '15108600' OR main.seq_region_end_1020 >= '9115' at line 1
Any help in resolving this issue would be highly appreciated.
Thanks in advance.
I presume this happens at the same point (5%) each time?
My guess is that some combination of chromosome, start, and end is invalid, e.g. a negative coordinate or a coordinate beyond the end of the specified chromosome.
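A check like this can be run locally before anything is sent to BioMart. The following is only a minimal sketch: the `regions` data frame, the `chr_len` lengths, and the helper name `find_bad_rows` are all made up for illustration, and real chromosome lengths would have to come from the reference assembly in use.

```r
# Toy stand-in for the input table; values are made up for illustration.
regions <- data.frame(
  chr   = c("1", "1", "2", "2"),
  start = c(100, -5, 500, 900),
  end   = c(200, 50, 400, 2000)
)

# Hypothetical chromosome lengths, NOT real rat values.
chr_len <- c("1" = 1000, "2" = 1500)

# Return the row numbers whose coordinates look invalid.
find_bad_rows <- function(df, chr_len) {
  len <- chr_len[as.character(df$chr)]  # NA when the chromosome name is unknown
  bad <- is.na(df$start) | is.na(df$end) |
    df$start < 1 | df$end < df$start |
    is.na(len) | df$end > len
  which(unname(bad))
}

bad_rows <- find_bad_rows(regions, chr_len)
print(regions[bad_rows, ])
```

Rows flagged this way are candidates for removal or correction before querying; an empty result does not prove the data are fine, only that these particular checks passed.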
Maybe you can try submitting only a subset of the values to try and narrow down which one is triggering the error.
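One way to narrow things down is to send the rows in small chunks and record which chunks fail. This is a sketch under assumptions: `probe_chunks` and `fake_query` are hypothetical names, and `fake_query` only simulates a failing query so the example runs without a network connection.

```r
# Run a query function on consecutive chunks of rows and record which chunks fail.
probe_chunks <- function(df, query_fun, chunk_size = 50) {
  starts <- seq(1, nrow(df), by = chunk_size)
  failed <- vapply(starts, function(s) {
    rows <- s:min(s + chunk_size - 1, nrow(df))
    res <- tryCatch(query_fun(df[rows, , drop = FALSE]), error = function(e) e)
    inherits(res, "error")
  }, logical(1))
  data.frame(first_row = starts,
             last_row  = pmin(starts + chunk_size - 1, nrow(df)),
             failed    = failed)
}

# Hypothetical stand-in for the real getBM() call:
# it fails whenever a chunk contains a negative start coordinate.
fake_query <- function(chunk) {
  if (any(chunk$start < 0)) stop("simulated query error")
  chunk
}

toy <- data.frame(chr = "1",
                  start = c(10, 20, -1, 40, 50, 60),
                  end   = c(11, 21, 5, 41, 51, 61))
report <- probe_chunks(toy, fake_query, chunk_size = 2)
print(report)
```

In real use, `fake_query` would be replaced by a small function that calls `getBM()` on the chunk with the same attributes and filters as the full query. If every chunk fails, the problem is more likely in how the query is built than in any single row.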
Hi Mike,
I just shortened my data and ran the code again but this time the error is:
Just to update you: after shortening the data, the code runs without any error, but the final_1 output has 0 observations of 6 variables. Why?
If you get 0 observations it's because your query doesn't return anything from the Ensembl BioMart database. Without seeing any example values it's impossible to say why that might be. Perhaps you can share the first few rows of AT_AC_Gene?
If you're consistently getting errors relating to the format of the data, it sounds like you might need to do some data cleaning in case there's something fundamentally incompatible between your input data and Ensembl, like using a different reference genome.
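One incompatibility that is cheap to test for is chromosome naming: Ensembl names chromosomes without a "chr" prefix ("1", "X", "MT"), so prefixed names would match nothing. The sketch below compares the names in the input against a reference set. Both vectors here are toy values; with a live connection the reference set could probably be fetched with `getBM(attributes = "chromosome_name", mart = ensembl)`.

```r
# Toy input using "chr"-prefixed names.
input_chr <- c("chr1", "chr1", "chr2", "chrX")

# Hypothetical set of names known to the mart (illustrative, not complete).
ensembl_chr <- c("1", "2", "3", "X", "Y", "MT")

# Names present in the input but unknown to the mart.
unmatched <- setdiff(unique(input_chr), ensembl_chr)
print(unmatched)
```

Any name that shows up in `unmatched` cannot return genes, which is one possible explanation for an empty result.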
If you're struggling to get results with a biomaRt query, it's also worth visiting https://www.ensembl.org/biomart/martview/ and trying the interactive mode with a small set of values. The text accompanying each filter or attribute can sometimes help you understand whether you're querying the right things, and identify whether it's a code problem or a data problem.
Thank you Mike for your suggestion. The first few rows of AT_AC_Gene are below:
head(AT_AC_Gene)
chr start end pvalue qvalue meth.diff
chr1 181943385 181943385 1.30e-217 4.09e-216 -22.99185
chr1 53666662 53666662 2.82e-90 1.85e-89 -21.94640
chr1 56025446 56025446 2.72e-114 2.66e-113 -19.86795
chr1 55686907 55686907 9.08e-81 4.81e-80 -19.76459
chr1 24840721 24840721 7.32e-59 2.39e-58 -19.04278
chr1 180503143 180503143 3.20e-153 5.30e-152 -18.15721
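Given rows like these, one thing that may be worth trying is to strip the "chr" prefix (Ensembl uses "1" rather than "chr1") and to query by region string instead of separate start and end vectors. The sketch below assumes the `chromosomal_region` filter, which as far as I know accepts values of the form chromosome:start:end. The helper names `make_regions` and `query_regions` are made up, and `query_regions` is defined but not called because it needs a network connection. It also assumes the coordinates are on the same assembly as the Ensembl dataset.

```r
# First rows from the post above, re-entered by hand.
AT_AC_Gene <- data.frame(
  chr   = c("chr1", "chr1", "chr1"),
  start = c(181943385, 53666662, 56025446),
  end   = c(181943385, 53666662, 56025446)
)

# Drop the "chr" prefix and build "chromosome:start:end" strings.
# format(..., scientific = FALSE) avoids output such as "1e+05".
make_regions <- function(df) {
  chrom <- sub("^chr", "", as.character(df$chr))
  paste(chrom,
        format(df$start, scientific = FALSE, trim = TRUE),
        format(df$end,   scientific = FALSE, trim = TRUE),
        sep = ":")
}

regions <- make_regions(AT_AC_Gene)
print(regions)

# Not run here: queries the mart with one region string per input row.
query_regions <- function(regions, mart) {
  biomaRt::getBM(
    attributes = c("external_gene_name", "ensembl_gene_id", "start_position",
                   "end_position", "rgd_symbol", "chromosome_name"),
    filters = "chromosomal_region",
    values  = regions,
    mart    = mart
  )
}
```

With the `ensembl` mart object from the original code, the call would be `query_regions(regions, ensembl)`. Single-base positions that fall between genes would still return nothing, so an empty result for some rows is not necessarily an error.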