Question: Difference Between Biomart Query And The Ensembl Database
Asked 7.4 years ago by Nicolas Rosewick (Belgium, Brussels):

Hi,

I'm using the biomaRt R package (Bioconductor) to retrieve the 3' UTR sequences for a list of genes. I have the Entrez Gene ID for each of them, and I'm seeing differences between the biomaRt result and what the Ensembl BioMart database shows.

Here's an example, for the transcript ENSBTAT00000014489 (I'm working with Bos taurus sequences).

In R:

library("biomaRt")
ensembl <- useMart("ensembl")
ensembl <- useDataset("btaurus_gene_ensembl",mart=ensembl)   
getSequence(seqType="3utr",mart=ensembl,type="entrezgene",id=522265)
                                     3utr entrezgene
1 No UTR is annotated for this transcript     522265

In Ensembl BioMart, under "Export data", I checked only the 3' UTR option.

Result:

http://www.ensembl.org/Bos_taurus/Export/Output/Transcript?db=core;flank3_display=0;flank5_display=0;g=ENSBTAG00000010909;output=fasta;r=16:73764976-73768065;strand=feature;t=ENSBTAT00000014489;param=utr3;genomic=unmasked;_format=HTML

So, where is the problem? Why does the biomaRt package not retrieve this sequence?

Thanks a lot,

N.

Answer by Neilfws (Sydney, Australia), 7.4 years ago:

The link to Ensembl in your question does not display a 3'-UTR. At first glance, it seems to display the full coding sequence for the transcript - note that it begins with ATG and ends with TGA.
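The "begins with ATG and ends with TGA" observation can be turned into a quick sanity check. The sketch below is only a heuristic, and `looks_like_cds` is a hypothetical helper name, not part of biomaRt: it assumes a complete coding sequence starts with ATG, ends with one of the three standard stop codons, and has a length divisible by three. Incomplete or non-standard CDSs would fail it.

```r
# Hypothetical helper (not part of biomaRt): TRUE if a DNA string starts with
# ATG, ends with a standard stop codon, and has a length divisible by 3.
looks_like_cds <- function(seq) {
  # Remove whitespace (e.g. line breaks from a FASTA body) and normalise case.
  seq <- toupper(gsub("[[:space:]]", "", seq))
  n <- nchar(seq)
  if (n < 6 || n %% 3 != 0) return(FALSE)
  starts_with_atg <- substr(seq, 1, 3) == "ATG"
  ends_with_stop  <- substr(seq, n - 2, n) %in% c("TAA", "TAG", "TGA")
  starts_with_atg && ends_with_stop
}

looks_like_cds("ATGGCCAAATGA")   # TRUE: ATG ... TGA, 12 bases
looks_like_cds("GCCAAATTTGGC")   # FALSE: no start codon
```

If a sequence exported as a "3' UTR" passes this check, it is worth confirming which sequence type was actually exported.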

When I use web BioMart, I get the exact same result as when using R biomaRt (see screenshot below):

[Screenshot: biomart.png]

BioMart via the web should always give the same result as via R, since they connect to the same data source. If there are discrepancies, it's generally because the data you have is not what you thought it was.
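When processing many genes, it helps to separate rows that hold real sequence from rows that hold a placeholder message such as the one returned in the question. The following is a minimal sketch, assuming the result is a data frame with a `3utr` column as printed above; `has_real_sequence` is a hypothetical helper name, and the data frame here is a mock built by hand rather than a live query.

```r
# Hypothetical helper: TRUE for rows whose sequence column contains only
# nucleotide letters (IUPAC codes allowed), FALSE for placeholder messages.
has_real_sequence <- function(result, seq_col = "3utr") {
  seqs <- result[[seq_col]]
  grepl("^[ACGTNRYKMSWBDHV]+$", toupper(seqs))
}

# Mock data frame mirroring the getSequence() output shown in the question.
res <- data.frame(
  "3utr" = "No UTR is annotated for this transcript",
  entrezgene = 522265,
  check.names = FALSE,
  stringsAsFactors = FALSE
)

has_real_sequence(res)   # FALSE: the row holds a message, not a sequence
```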


OK, thanks! So biomaRt works fine :)

(Reply by Nicolas Rosewick, 7.4 years ago)

FYI, this is the best browser page to check whether a transcript contains a UTR or not:

http://www.ensembl.org/Bos_taurus/Transcript/Exons?db=core;g=ENSBTAG00000010909;r=16:73764976-73768065;t=ENSBTAT00000014489

On this page the CDS is shown in black, UTRs in purple, introns in blue and flanking sequence in green. So indeed, this transcript has no UTRs annotated.

(Reply by Bert Overduin, 7.4 years ago)
Answer by Andeyatz, 7.4 years ago:

Hi,

I think there may be a difference between explicitly annotated UTR regions and a region which is 3' of a transcript. If you look at this page http://www.ensembl.org/Bos_taurus/Transcript/Sequence_cDNA?_format=HTML;db=core;flank3_display=0;flank5_display=0;g=ENSBTAG00000010909;genomic=unmasked;output=fasta;param=utr3;r=16:73764976-73768065;strand=feature;t=ENSBTAT00000014489 then you can see there is no annotated UTR.

The following query on the cow (release 64) database will show the same result:

select t.seq_region_start, t.seq_region_end, e.seq_region_start as exon_start, e.seq_region_end as exon_end, et.rank
from transcript_stable_id 
join transcript t using (transcript_id)
join exon_transcript et using (transcript_id)
join exon e using (exon_id)
where stable_id = 'ENSBTAT00000014489';
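To illustrate what "no annotated UTR" means in sequence terms, here is a small sketch in R. It assumes you already have the transcript's cDNA as a string and the 1-based position of the last CDS base (including the stop codon) within it; `three_prime_utr` is a hypothetical helper and the sequences are made up. When the CDS runs to the very end of the cDNA, the 3' UTR is simply empty, whatever genomic sequence lies further downstream.

```r
# Hypothetical helper: the 3' UTR is the part of the cDNA that follows the
# last base of the CDS. cds_end is a 1-based position within the cDNA.
three_prime_utr <- function(cdna, cds_end) {
  stopifnot(cds_end >= 1, cds_end <= nchar(cdna))
  substr(cdna, cds_end + 1, nchar(cdna))
}

# Toy cDNA: 2 bases of 5' UTR, CDS ATG AAA TGA (positions 3-11), 4 bases of 3' UTR.
three_prime_utr("GGATGAAATGACCTT", 11)   # "CCTT"

# Toy cDNA that ends at the stop codon: no 3' UTR at all.
three_prime_utr("ATGAAATGA", 9)          # ""
```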

Hope this helps
