Filtering SNPs by protein
5.5 years ago
jonas • 0

Hello

I'm trying to filter Birdsuite genotype calls (Affymetrix SNP IDs) for a list of 62 proteins (Ensembl peptide IDs) using biomaRt. Since the number of unfiltered genotype calls is large, I cannot look up the genes for each SNP; instead, I have to query for the SNPs of each gene on my list.

Therefore, I first get the list of gene IDs for the ENSP IDs:

ensembl_mart <- biomaRt::useMart("ENSEMBL_MART_ENSEMBL", "hsapiens_gene_ensembl")
ensembl_genes <- biomaRt::getBM(c("ensembl_gene_id", "ensembl_peptide_id"), "ensembl_peptide_id", ensembl_peptide_ids, ensembl_mart)

Then I try to fetch all SNPs (which I later want to match with my genotype calls) according to that list of genes:

snp_mart <- biomaRt::useMart("ENSEMBL_MART_SNP", "hsapiens_snp")
snps <- biomaRt::getBM(c("refsnp_id", "ensembl_gene_stable_id"), "ensembl_gene", ensembl_genes$ensembl_gene_id, snp_mart)

However, the query times out after 5 minutes with the following error message:

Error in curl::curl_fetch_memory(url, handle = handle) :
Timeout was reached: Operation timed out after 300001 milliseconds with 380749 bytes received

I tried submitting separate queries for each gene, but for some genes (presumably the ones with many SNPs) I still get a timeout. My internet connection is fine and I can access the Ensembl homepage without problems. Also, I tried different Ensembl hosts at different times of the day.

Does anyone have an idea how to address this problem, or suggestions for alternative approaches?

SNP biomaRt Ensembl • 1.0k views
5.5 years ago
Emily 23k

BioMart is not designed to get all variants of all genes. It is expected to fail on that call.

What format are your genotype calls in? Could you run them through the Ensembl Variant Effect Predictor (VEP) and then filter the VEP results by your protein IDs?
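
The filtering step could look roughly like the sketch below. It assumes VEP was run with its `--protein` option, so that the default tab-delimited output carries an `ENSP=...` entry in the `Extra` column. The data frame `vep`, its toy rows, and the helper `filter_vep_by_protein` are made up for illustration; real results would be read from the VEP output file instead.

```r
# Toy stand-in for VEP's default output (only the columns used here).
# The rows are invented; real data would come from the VEP results file.
vep <- data.frame(
  Uploaded_variation = c("snp_a", "snp_b", "snp_c"),
  Gene = c("ENSG00000000001", "ENSG00000000002", "ENSG00000000003"),
  Extra = c("IMPACT=MODERATE;ENSP=ENSP00000000001",
            "IMPACT=MODIFIER",
            "ENSP=ENSP00000000003;IMPACT=LOW"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose Extra field names a wanted protein.
filter_vep_by_protein <- function(vep, peptide_ids) {
  # Rows without an ENSP key in the semicolon-separated Extra field get NA.
  has_ensp <- grepl("(^|;)ENSP=", vep$Extra, perl = TRUE)
  ensp <- ifelse(
    has_ensp,
    sub("^(.*;)?ENSP=([^;]*).*$", "\\2", vep$Extra, perl = TRUE),
    NA_character_
  )
  vep[!is.na(ensp) & ensp %in% peptide_ids, , drop = FALSE]
}

# Placeholder IDs standing in for the list of 62 proteins.
ensembl_peptide_ids <- c("ENSP00000000001", "ENSP00000000099")
hits <- filter_vep_by_protein(vep, ensembl_peptide_ids)
unique(hits$Uploaded_variation)
```

Because the filter runs locally on the VEP results, it only touches variants that are actually in the genotype calls, which avoids downloading every known variant for each gene.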
