Database or Tool: Identifying Gene Regions (Exon/Intron/UTR) from Chromosome Position
6 weeks ago
kelpotus22 • 0

I'm currently using the BioMart database to identify genes and regulatory elements based on chromosome and position coordinates.

However, I'm also interested in determining whether a given position falls within specific regions of a gene, such as exons, introns, or untranslated regions (UTRs). Are there any databases or tools available where I can input chromosome and position coordinates and obtain information about the specific gene region, like exon, intron, or UTR, that the position is located on? Thank you for any suggestions or recommendations!

gene-region biomart ucsc-browser intron exon • 505 views

How is this different from your previous question, "Tool to Identify Gene, Regulatory Role, and Function at Integration Sites"?


In addition to identifying positions with regulatory roles (previous post), I also want to determine the specific gene regions (e.g., UTR, exon, intron) associated with the chromosome positions. The previous method only provided information on positions with regulatory roles, yielding 21 outputs out of about 70 chromosome:position inputs. For the remaining positions, I am interested in knowing their location within the gene, and whether they fall within UTRs, exons, or introns.


If you are already using BioMart, you should be able to get this information directly from that database.

6 weeks ago

As I said in the previous post, you need to learn how to use tabix.

    # Build a BED-like table from the GENCODE GTF: chrom, 0-based start, end, feature, gene_id, gene_name
    wget -O - "https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_45/gencode.v45.annotation.gtf.gz" |\
    gunzip -c |\
    awk -F '\t' 'BEGIN {OFS="\t";} /^#/ {next;} {gene_id="";gene_name="";N=split($9,a,/[ ]*[;][ ]*/);for(i=1;i<=N;i++) {split(a[i],b,/ /);if(b[1]=="gene_id") gene_id=b[2]; if(b[1]=="gene_name") gene_name=b[2];} print $1,int($4)-1,$5,$3,gene_id,gene_name;}' |\
    tr -d '"' |\
    sort -t $'\t' -k1,1 -k2,2n |\
    bgzip > tmp.tab.bgz

    # Index the table using the BED column layout
    tabix -p bed -f tmp.tab.bgz


    $ tabix tmp.tab.bgz "chr1:2402292-2402292" 
    chr1  2391774  2405442  gene        ENSG00000157916.20  RER1
    chr1  2391774  2405442  transcript  ENSG00000157916.20  RER1
    chr1  2391827  2402325  transcript  ENSG00000157916.20  RER1
    chr1  2391840  2405436  transcript  ENSG00000157916.20  RER1
    chr1  2391842  2403064  transcript  ENSG00000157916.20  RER1
    chr1  2391889  2403751  transcript  ENSG00000157916.20  RER1
    chr1  2393173  2402316  transcript  ENSG00000157916.20  RER1
    chr1  2394004  2403216  transcript  ENSG00000157916.20  RER1
    chr1  2402096  2402292  CDS         ENSG00000157916.20  RER1
    chr1  2402096  2402342  exon        ENSG00000157916.20  RER1
    chr1  2402206  2402292  CDS         ENSG00000157916.20  RER1
    chr1  2402206  2402292  CDS         ENSG00000157916.20  RER1
    chr1  2402206  2402316  CDS         ENSG00000157916.20  RER1
    chr1  2402206  2402316  exon        ENSG00000157916.20  RER1
    chr1  2402206  2402325  exon        ENSG00000157916.20  RER1
    chr1  2402206  2402342  CDS         ENSG00000157916.20  RER1
    chr1  2402206  2402342  CDS         ENSG00000157916.20  RER1
    chr1  2402206  2402342  CDS         ENSG00000157916.20  RER1
    chr1  2402206  2402342  exon        ENSG00000157916.20  RER1
    chr1  2402206  2402342  exon        ENSG00000157916.20  RER1
    chr1  2402206  2402342  exon        ENSG00000157916.20  RER1
    chr1  2402206  2402342  exon        ENSG00000157916.20  RER1
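
The tabix output lists every overlapping feature, so you still have to reduce it to a single label per position. Below is a rough sketch of one way to do that, not a definitive method. The function name `classify_features` and the precedence rule (CDS, then UTR, then exon, then intron) are my own assumptions. It treats a position that overlaps a transcript but no exon as intronic, and it pools all transcripts of a gene, so a position that is exonic in one isoform and intronic in another is reported as exonic.

```shell
# Hypothetical helper (not part of tabix): reads the six-column table
# (chrom, start, end, feature, gene_id, gene_name) on stdin and prints
# one "gene_name<TAB>label" line per gene.
classify_features() {
    awk '
        {
            f = tolower($4); g = $6
            genes[g] = 1
            # Assumption: start/stop codon rows are counted as coding.
            if (f == "cds" || f == "start_codon" || f == "stop_codon") cds[g] = 1
            # Matches GENCODE "UTR" as well as names like five_prime_utr.
            else if (f ~ /utr/) utr[g] = 1
            else if (f == "exon") exon[g] = 1
            else if (f == "transcript") tx[g] = 1
        }
        END {
            for (g in genes) {
                if (g in cds) label = "CDS"
                else if (g in utr) label = "UTR"
                else if (g in exon) label = "exon (non-coding)"
                else if (g in tx) label = "intron"
                else label = "other"
                print g "\t" label
            }
        }' | sort
}
```

Usage would look like `tabix tmp.tab.bgz "chr1:2402292-2402292" | classify_features`. For a per-transcript answer you would also need to carry `transcript_id` through the awk step that builds the table.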