Duplicate and Different Annotations using biomaRt
4.5 years ago

For the Affymetrix HG-U133A_2 array, I used biomaRt to retrieve the annotation. However, I am finding multiple entries for the same probe ID, and only one of these entries appears in the official annotation file provided by the vendor. For instance...

200033_at has 4 entries for that particular probe, including protein_coding and miRNA. Additionally, the chromosome_name for a couple of the results is HG183_PATCH (I am assuming this is an old entry that has been patched over and replaced by another entry?)

I ended up writing a parser for the original annotation file provided by the vendor, but it seems very strange that biomaRt would retrieve different entries for the exact same probe ID. I was curious whether someone knows why this is occurring and whether there is any way to avoid this behavior.

Code:

require("biomaRt")
require("Biobase")  # provides exprs()
mart <- useMart("ENSEMBL_MART_ENSEMBL", host="http://grch37.ensembl.org")
mart <- useDataset("hsapiens_gene_ensembl", mart)
x <- listAttributes(mart)
# Filter on the same array as the returned attribute (HG-U133A_2, not hg_u133_plus_2)
annotLookup <- getBM(mart=mart,
  attributes=c("affy_hg_u133a_2", "ensembl_gene_id", "gene_biotype", "external_gene_name", "chromosome_name", "start_position", "end_position", "strand"),
  filters="affy_hg_u133a_2", values=rownames(exprs(gset)), uniqueRows=TRUE)
4.5 years ago

Hey,

biomaRt just provides an interface to whatever is stored internally at Ensembl, which can change over time. For these odd situations, it can help to look up the probe ID in the UCSC Genome Browser.


So, the probe is targeting DDX5; however, there are also 2 microRNAs in the region. I am not sure why only 1 of the microRNAs is output by biomaRt, but that could be a question to send back to Ensembl via their help desk.

The other chromosome contig that comes up is a GRCh37 patch, which happens to coincide with that region: Human Genome Issue HG-183 (indicated by the red bar on the UCSC viewer).
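If the patch rows are not wanted, one option is to drop them after the query. Here is a minimal sketch, assuming the result has the same columns as the annotLookup data frame from the question. The toy rows are placeholders I made up for illustration and are not real biomaRt output:

```r
# Toy stand-in for a getBM() result; the rows are illustrative placeholders only
annotLookup <- data.frame(
  affy_hg_u133a_2 = c("200033_at", "200033_at", "200033_at"),
  gene_biotype    = c("protein_coding", "miRNA", "protein_coding"),
  chromosome_name = c("17", "17", "HG183_PATCH"),
  stringsAsFactors = FALSE
)

# Keep only rows on the primary assembly chromosomes (1-22, X, Y, MT);
# patch and haplotype contigs such as HG183_PATCH are dropped
primary <- c(as.character(1:22), "X", "Y", "MT")
annotPrimary <- annotLookup[annotLookup$chromosome_name %in% primary, ]

print(annotPrimary)
```

This only removes the patch duplicates; a probe that overlaps several genes on the primary assembly will still return several rows.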

For annotating these, the manufacturers' annotations are better, in my opinion; however, biomaRt fits neatly into automated workflows.
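If you do stay with biomaRt in an automated workflow, it may help to flag the probes that map to more than one gene and then apply an explicit rule for picking one row per probe. A rough sketch follows; the gene IDs and the second probe are hypothetical, and "prefer protein_coding, then take the first row" is just one possible rule, not anything biomaRt or the vendor prescribes:

```r
# Toy stand-in for a getBM() result; gene IDs and "probe_B" are hypothetical
annotLookup <- data.frame(
  affy_hg_u133a_2 = c("200033_at", "200033_at", "probe_B"),
  ensembl_gene_id = c("gene_2", "gene_1", "gene_3"),
  gene_biotype    = c("miRNA", "protein_coding", "protein_coding"),
  stringsAsFactors = FALSE
)

# Count distinct genes per probe and list the probes with more than one
nGenes <- tapply(annotLookup$ensembl_gene_id, annotLookup$affy_hg_u133a_2,
                 function(g) length(unique(g)))
multiMapped <- names(nGenes)[nGenes > 1]

# Assumed rule: sort protein_coding rows first within each probe,
# then keep the first row per probe
ord <- order(annotLookup$affy_hg_u133a_2,
             annotLookup$gene_biotype != "protein_coding")
sorted <- annotLookup[ord, ]
annotUnique <- sorted[!duplicated(sorted$affy_hg_u133a_2), ]

print(multiMapped)
print(annotUnique)
```

Keeping multiMapped around makes it easy to compare those specific probes against the manufacturer's file later.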

Kevin
