Question: Get 5' UTR and 3' UTR Data for Genes from UCSC Genome Browser
1
7.6 years ago, caleb.alleman (10) wrote:

I'm trying to create an SVG image of the various transcripts of genes, using data on human genes from the UCSC Genome Browser, and I'm running into trouble. I have direct MySQL access to the database. I have the exon start and end positions and the transcription start and end positions. I'm looking for either the positions of the 5' UTR and 3' UTR of each gene, or simply the coding region positions, which I could then subtract from the range of the whole gene to leave the remaining parts as the 5' UTR and 3' UTR. Any ideas on where in the hg19 database of the UCSC Genome Browser I could find this data?

orf utr ucsc • 3.6k views
4
7.6 years ago, Vikas Bansal (2.4k), Berlin, Germany, wrote:

I think you can find your solution here, in Pierre's answer:

mysql -h genome-mysql.cse.ucsc.edu -u genome -D hg19 -N -A -e '
  select distinct chrom, strand, txStart, cdsStart from knownGene where txStart < cdsStart
  union
  select distinct chrom, strand, cdsEnd, txEnd from knownGene where cdsEnd < txEnd
' > utrs.txt
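The query above returns genomic intervals but does not say which one is the 5' UTR and which is the 3' UTR; that depends on the strand. Here is a minimal Python sketch of that labelling step. The `Transcript` tuple and the `utr_spans` function are hypothetical names for illustration, not part of any UCSC tool. It assumes the usual UCSC table coordinates (0-based start, exclusive end) and treats rows with cdsStart equal to cdsEnd as non-coding, which to my knowledge is how genePred-style tables mark transcripts without a CDS.

```python
from typing import Dict, NamedTuple, Optional, Tuple

Span = Optional[Tuple[int, int]]


class Transcript(NamedTuple):
    """One knownGene-like row (hypothetical container for this sketch)."""
    chrom: str
    strand: str      # "+" or "-"
    tx_start: int
    tx_end: int
    cds_start: int
    cds_end: int


def utr_spans(t: Transcript) -> Dict[str, Span]:
    """Label the flanking untranslated spans as 5' or 3' according to strand.

    The spans are genomic ranges, so they may still include introns.
    """
    # Assumption: cdsStart == cdsEnd means the transcript has no CDS.
    if t.cds_start == t.cds_end:
        return {"utr5": None, "utr3": None}

    # Span to the left of the CDS and span to the right, in genome order.
    left = (t.tx_start, t.cds_start) if t.tx_start < t.cds_start else None
    right = (t.cds_end, t.tx_end) if t.cds_end < t.tx_end else None

    # On the minus strand, transcription runs right to left,
    # so the right-hand span is the 5' UTR.
    if t.strand == "+":
        return {"utr5": left, "utr3": right}
    return {"utr5": right, "utr3": left}
```

For example, `utr_spans(Transcript("chr1", "-", 100, 1000, 200, 900))` puts the 5' UTR at (900, 1000) and the 3' UTR at (100, 200).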

EDIT: After OP's comment.

image


added 'distinct'

written 7.6 years ago by Pierre Lindenbaum (126k)

Looking through the knownGene table, I see that cdsStart and cdsEnd are often the same value, which doesn't make any sense. How can I trust this data?

written 7.6 years ago by caleb.alleman (10)

So, to clarify: the range from CDS start to CDS end contains no part of the 5' UTR or 3' UTR, correct?

written 7.6 years ago by caleb.alleman (10)

The CDS is the coding sequence, which gets translated into protein. UTRs are untranslated regions. Please have a look at this picture.

written 7.6 years ago by Vikas Bansal (2.4k)
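Since the goal is a drawing with exons, the CDS/UTR distinction described above can also be applied exon by exon. Below is a rough Python sketch that cuts each exon at the CDS boundaries and labels the pieces. The names `parse_blob` and `split_exons` are made up for this example. It assumes exonStarts and exonEnds arrive as comma-separated strings with a trailing comma, that coordinates are 0-based with exclusive ends, and that cdsStart equal to cdsEnd means a non-coding transcript.

```python
from typing import List, Tuple

Segment = Tuple[int, int, str]


def parse_blob(blob: str) -> List[int]:
    """Turn a string such as '100,500,900,' into [100, 500, 900]."""
    return [int(x) for x in blob.split(",") if x]


def split_exons(exon_starts: List[int], exon_ends: List[int],
                cds_start: int, cds_end: int, strand: str) -> List[Segment]:
    """Cut exons at the CDS boundaries and label each piece."""
    # Assumption: equal CDS bounds mark a transcript with no coding region.
    if cds_start == cds_end:
        return [(s, e, "noncoding") for s, e in zip(exon_starts, exon_ends)]

    # Which UTR lies left or right of the CDS depends on the strand.
    left_label, right_label = ("utr5", "utr3") if strand == "+" else ("utr3", "utr5")

    segments: List[Segment] = []
    for s, e in zip(exon_starts, exon_ends):
        if s < cds_start:                      # exon part left of the CDS
            segments.append((s, min(e, cds_start), left_label))
        cs, ce = max(s, cds_start), min(e, cds_end)
        if cs < ce:                            # exon part inside the CDS
            segments.append((cs, ce, "cds"))
        if e > cds_end:                        # exon part right of the CDS
            segments.append((max(s, cds_end), e, right_label))
    return segments
```

Each returned tuple is (start, end, label), so the drawing code can pick a different box height or colour per label.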

So if my data says that the CDS end comes before the transcription end, then something is wrong, correct?

written 7.6 years ago by caleb.alleman (10)