Question: Difference Between Biomart Query And The Ensembl Database
Nicolas Rosewick (Belgium, Brussels) wrote, 9.2 years ago:


I'm using the biomaRt R package (Bioconductor) to retrieve the 3'UTR sequences of a list of genes. I have the Entrez Gene ID for each of them, and I see differences between the biomaRt result and the Ensembl BioMart database.

Here's an example:

For the transcript ENSBTAT00000014489 (I'm working with Bos taurus sequences):

In R:

library(biomaRt)
ensembl <- useMart("ensembl")
ensembl <- useDataset("btaurus_gene_ensembl", mart = ensembl)
# Result of the 3'UTR query:
#                                      3utr entrezgene
# 1 No UTR is annotated for this transcript     522265
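
For reference, a fuller version of this kind of query might look like the sketch below. It is an assumption about what was run, since the post does not show the call that produced the output. It queries by Ensembl transcript ID rather than Entrez Gene ID, and the helper names `is_dna_sequence` and `fetch_3utr` are hypothetical. Because biomaRt returns a placeholder message instead of a sequence when no UTR is annotated, the helper only checks whether the returned string looks like DNA.

```r
# Hypothetical helper: TRUE if a returned string looks like a nucleotide
# sequence rather than a placeholder message.
is_dna_sequence <- function(x) {
  grepl("^[ACGTNacgtn]+$", x)
}

# Sketch of a 3'UTR query (requires the biomaRt package and network access).
# The function is only defined here, not called.
fetch_3utr <- function(transcript_id, dataset = "btaurus_gene_ensembl") {
  mart <- biomaRt::useMart("ensembl", dataset = dataset)
  biomaRt::getSequence(
    id      = transcript_id,
    type    = "ensembl_transcript_id",
    seqType = "3utr",
    mart    = mart
  )
}

# Example (not run here):
# res <- fetch_3utr("ENSBTAT00000014489")
# is_dna_sequence(res[["3utr"]])
```

Checking the returned string this way makes it harder to mistake the placeholder text for a real sequence when processing a long list of genes.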

In Ensembl BioMart: in "Export Data", check only 3'UTR:

result :;flank3_display=0;flank5_display=0;g=ENSBTAG00000010909;output=fasta;r=16:73764976-73768065;strand=feature;t=ENSBTAT00000014489;param=utr3;genomic=unmasked;_format=HTML

So, where is the problem? Why does the biomaRt package not retrieve this sequence?

Thanks a lot,


Neilfws (Sydney, Australia) wrote, 9.2 years ago:

The link to Ensembl in your question does not display a 3'-UTR. At first glance, it seems to display the full coding sequence for the transcript - note that it begins with ATG and ends with TGA.

When I use web BioMart, I get the exact same result as when using R biomaRt (see screenshot below):


BioMart via the web should always give the same result as via R, since they connect to the same data source. If there are discrepancies, it's generally because the data you have is not what you thought it was.


OK, thanks! So biomaRt works great :)

Reply written 9.2 years ago by Nicolas Rosewick

FYI, this is the best browser page to check whether a transcript contains a UTR or not:;g=ENSBTAG00000010909;r=16:73764976-73768065;t=ENSBTAT00000014489

On this page the CDS is shown in black, UTRs in purple, introns in blue and flanking sequence in green. So, indeed, this transcript has no UTRs annotated.

Reply written 9.1 years ago by Bert Overduin (modified 16 months ago by _r_am)
Andeyatz wrote, 9.2 years ago:


I think there may be a difference between explicitly annotated UTR regions and a region which is 3' of a transcript. If you look at this page;db=core;flank3_display=0;flank5_display=0;g=ENSBTAG00000010909;genomic=unmasked;output=fasta;param=utr3;r=16:73764976-73768065;strand=feature;t=ENSBTAT00000014489 then you can see there is no annotated UTR.

The following query against the cow 64 database shows the same result:

-- Transcript span and exon coordinates, with exon rank, for one transcript
SELECT t.seq_region_start,
       t.seq_region_end,
       e.seq_region_start AS exon_start,
       e.seq_region_end   AS exon_end,
       et.rank
FROM transcript_stable_id
JOIN transcript t USING (transcript_id)
JOIN exon_transcript et USING (transcript_id)
JOIN exon e USING (exon_id)
WHERE stable_id = 'ENSBTAT00000014489';
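
The exon coordinates alone do not say where the coding sequence ends. As a rough sketch, and assuming you also look up where translation ends (in the Ensembl core schema this is, as far as I know, the `translation` table's `end_exon_id` and `seq_end` columns), the annotated 3'UTR length can be worked out in R as below. The function name `utr3_length` and the example numbers are made up for illustration.

```r
# exon_lengths:  exon lengths in transcript order (rank 1, 2, ...)
# end_exon_rank: rank of the exon in which translation ends
# seq_end:       1-based position within that exon of the last coding base
utr3_length <- function(exon_lengths, end_exon_rank, seq_end) {
  cdna_length <- sum(exon_lengths)
  # Bases in exons before the end exon, plus coding bases in the end exon
  coding_end <- sum(exon_lengths[seq_len(end_exon_rank - 1)]) + seq_end
  cdna_length - coding_end
}

# Illustrative values only (not taken from ENSBTAT00000014489):
utr3_length(c(100, 200, 150), end_exon_rank = 3, seq_end = 150)  # coding runs to the last base
utr3_length(c(100, 200, 150), end_exon_rank = 3, seq_end = 100)  # 50 bases remain after the CDS
```

A result of 0 means the coding sequence runs to the very end of the last exon, i.e. no 3'UTR is annotated, which is the situation described for this transcript.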

Hope this helps
