geneid.txt and knownGene.txt files in UCSC database to retrieve exon position in non-human species
8.3 years ago
fransua ▴ 390

Hi,

I would like to get the exon positions for several non-human species. The best option seems to be the knownGene.txt file, as mentioned by Pierre Lindenbaum. The problem is that this file is available for human and mouse but not for other species.

The other file that is available for non-human species is geneid.txt, but I am a bit confused about its content: it doesn't seem to correspond to genes. Here is an example from hg38, showing the head of knownGene.txt:

uc031tla.1    chr1    -    17368    17436    17368    17368    1    17368,    17436,        ENST00000619216.1
uc057aty.1    chr1    +    29553    31097    29553    29553    3    29553,30563,30975,    30039,30667,31097,        ENST00000473358.1
uc057atz.1    chr1    +    30266    31109    30266    30266    2    30266,30975,    30667,31109,        ENST00000469289.1
uc031tlb.1    chr1    +    30365    30503    30365    30365    1    30365,    30503,        ENST00000607096.1
uc001aak.4    chr1    -    34553    36081    34553    34553    3    34553,35276,35720,    35174,35481,36081,        ENST00000417324.1

and the head of geneid.txt:

585    chr1_1.1    chr1    -    16857    35736    16857    35736    7    16857,17232,17605,17914,18267,24737,35720,    17055,17257,17742,18061,18379,24891,35736,    0    chr1_1    incmpl    cmp    0,2,0,0,2,1,0,
73    chr1_2.1    chr1    -    120816    195438    120816    195438    9    120816,129054,164765,185490,187375,187754,188129,188790,195262,    120932,129223,164791,185559,187577,187779,188266,188902,195438,    0    chr1_2    cmpl    cmpl    1,0,1,1,0,2,0,2,0,
586    chr1_3.1    chr1    -    258540    258903    258540    258903    1    258540,    258903,    0    chr1_3    cmpl    cmpl    0,
73    chr1_4.1    chr1    +    353849    393666    353849    393666    3    353849,368835,393552,    354030,370016,393666,    0    chr1_4    cmpl    cmpl    0,1,0,
588    chr1_5.1    chr1    -    450739    485181    450739    485181    2    450739,485039,    451716,485181,    0    chr1_5    cmpl    cmpl    1,0,

Does anyone know what geneid.txt contains?

And, the more important question for me: does anyone know how to retrieve the exon positions for non-human species?

thanks

ucsc database • 3.4k views
8.3 years ago
stevenlang123 ▴ 210

Counting from the leading bin column of geneid.txt (knownGene.txt has no bin column, so the same fields sit one column to the left there):

Columns 5 and 6 show the transcript start and end positions.

Columns 7 and 8 show the coding region start and end positions.

Column 9 shows the number of exons in the transcript.

Column 10 lists the start position of each exon, comma-separated, with one entry per exon counted in column 9.

Column 11 lists the end position of each exon, in the same order.
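As a rough illustration of the column layout above, here is a minimal Python sketch that pulls the exon coordinates out of one row. The function name `parse_genepred_exons` and the `has_bin` flag are made up for this example; it assumes the file is tab-separated, as the UCSC download files are, and that the only layout difference between the two files that matters here is the leading bin column in geneid.txt.

```python
def parse_genepred_exons(line, has_bin=True):
    """Return (name, chrom, strand, exons) for one tab-separated row.

    exons is a list of (start, end) tuples taken directly from the
    exon-start and exon-end columns. Set has_bin=False for files
    such as knownGene.txt that lack the leading bin column.
    """
    fields = line.rstrip("\n").split("\t")
    if has_bin:
        fields = fields[1:]  # drop the bin column so indices line up
    name, chrom, strand = fields[0], fields[1], fields[2]
    exon_count = int(fields[7])
    # Both lists end with a trailing comma, so skip empty pieces.
    starts = [int(x) for x in fields[8].split(",") if x]
    ends = [int(x) for x in fields[9].split(",") if x]
    if not (len(starts) == len(ends) == exon_count):
        raise ValueError(f"exon count mismatch for {name}")
    return name, chrom, strand, list(zip(starts, ends))


# Fourth geneid.txt row from the question, rebuilt with tabs.
geneid_row = "\t".join([
    "73", "chr1_4.1", "chr1", "+", "353849", "393666", "353849", "393666",
    "3", "353849,368835,393552,", "354030,370016,393666,",
    "0", "chr1_4", "cmpl", "cmpl", "0,1,0,",
])
name, chrom, strand, exons = parse_genepred_exons(geneid_row)
for start, end in exons:
    print(chrom, start, end, name, strand, sep="\t")
```

The same function should work on a knownGene.txt row when called with `has_bin=False`.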

Unlike the block starts in BED format, I don't believe these positions are relative to the start of the feature; they look like absolute chromosome coordinates, so I think you are good to go.
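To make the contrast with BED concrete, the sketch below converts absolute exon coordinates into the relative block sizes and block starts that a BED12 line would carry. The helper name `exons_to_bed_blocks` is hypothetical, and the exon list is copied by hand from the fourth geneid.txt row in the question; it assumes the exons are sorted by start position, as they are in the rows shown.

```python
def exons_to_bed_blocks(exons):
    """Convert absolute (start, end) exons to BED12-style blocks.

    Returns (block_sizes, block_starts), where block_starts are
    offsets from the start of the first exon (the feature start).
    """
    feature_start = exons[0][0]
    block_sizes = [end - start for start, end in exons]
    block_starts = [start - feature_start for start, _ in exons]
    return block_sizes, block_starts


# Absolute exon coordinates of chr1_4.1 from the question.
exons = [(353849, 354030), (368835, 370016), (393552, 393666)]
sizes, rel_starts = exons_to_bed_blocks(exons)
print(sizes)       # [181, 1181, 114]
print(rel_starts)  # [0, 14986, 39703]
```

If you only need exon positions on the chromosome, no such conversion is needed: the values in the exon start and end columns can be used as they are.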

https://genome.ucsc.edu/cgi-bin/hgTables
