NCBI, Gene, Exon count
1
0
5.0 years ago
Samira1985 • 0

Hello, I want to find the exact number of exons in my genes. I have two genes: NKX2-5 and GATA4. I know NKX2-5 has 2 exons and GATA4 has 7 exons, 6 of which are in the coding sequence. But these numbers differ from the "Exon count" shown in NCBI Gene. Could you please tell me why? For example, NCBI Gene lists an exon count of 3 for NKX2-5 and 11 for GATA4.

gene sequence • 3.4k views
ADD COMMENT
1

Where are you seeing 3 exons for NKX2-5 and 11 exons for GATA4?

I see the following (assuming these are the human genes). [Data truncated to save space]

$ esearch -db gene -query "NKX2-5 [GENE] AND Homo [ORGN]" | efetch -db gene -format gene_table
NKX2-5 NK2 homeobox 5[Homo sapiens]
Gene ID: 1482, updated on 21-Apr-2019


Reference GRCh38.p12 Primary Assembly NC_000005.10  (minus strand) from: 173235321 to: 173232104
mRNA transcript variant 1 NM_004387.3, 2 exons,  total annotated spliced exon length: 1669
protein isoform 1 NP_004378.1 (CCDS4387.1), 2 coding  exons,  annotated AA length: 324

Exon table for  mRNA  NM_004387.3 and protein NP_004378.1
Genomic Interval Exon  Genomic Interval Coding  Gene Interval Exon  Gene Interval Coding  Exon Length  Coding Length  Intron Length
-----------------------------------------------------------------------------------------------------------------------------------
173235312-173234750    173235083-173234750      10-572              239-572               563          334            1540
173233209-173232104    173233209-173232569      2113-3218           2113-2753             1106         641



$ esearch -db gene -query "GATA4 [GENE] AND Homo [ORGN]" | efetch -db gene -format gene_table
GATA4 GATA binding protein 4[Homo sapiens]
Gene ID: 2626, updated on 4-May-2019


Reference GRCh38.p12 Primary Assembly NC_000008.11  from: 11676919 to: 11760002
mRNA transcript variant 3 NM_001308094.1, 7 exons,  total annotated spliced exon length: 2656
protein isoform 3 NP_001295023.1 (CCDS78304.1), 5 coding  exons,  annotated AA length: 236

Exon table for  mRNA  NM_001308094.1 and protein NP_001295023.1
Genomic Interval Exon  Genomic Interval Coding  Gene Interval Exon  Gene Interval Coding  Exon Length  Coding Length  Intron Length
-----------------------------------------------------------------------------------------------------------------------------------
11676919-11677063                               1-145                                     145                         23447
11700511-11700778                               23593-23860                               268                         48137
11748916-11749085      11748921-11749085        71998-72167         72003-72167           170          165            1025
11750111-11750236      11750111-11750236        73193-73318         73193-73318           126          126            4809
11755046-11755133      11755046-11755133        78128-78215         78128-78215           88           88             1801
11756935-11757083      11756935-11757083        80017-80165         80017-80165           149          149            1209
11758293-11760002      11758293-11758475        81375-83084         81375-81557           1710         183
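
As a cross-check on the GATA4 header lines, the exon table itself can be tallied. This is only a minimal sketch: it assumes the rows have been re-typed as tab-separated fields (exon interval, coding interval, exon length), with an empty coding field marking a non-coding exon. The variable names and the three-column layout are my own simplification, not raw efetch output.

```shell
# GATA4 exon rows from the table above, re-typed by hand as three
# tab-separated fields: exon interval, coding interval, exon length.
# An empty second field means the exon has no coding sequence (assumed layout).
rows=$(printf '%s\t%s\t%s\n' \
  '11676919-11677063' ''                  '145' \
  '11700511-11700778' ''                  '268' \
  '11748916-11749085' '11748921-11749085' '170' \
  '11750111-11750236' '11750111-11750236' '126' \
  '11755046-11755133' '11755046-11755133' '88' \
  '11756935-11757083' '11756935-11757083' '149' \
  '11758293-11760002' '11758293-11758475' '1710')

# Total exons in this transcript: one row per exon.
total=$(printf '%s\n' "$rows" | awk -F'\t' 'END {print NR}')

# Coding exons: rows whose coding-interval field is not empty.
coding=$(printf '%s\n' "$rows" | awk -F'\t' '$2 != "" {n++} END {print n+0}')

# Spliced length: sum of the exon lengths.
spliced=$(printf '%s\n' "$rows" | awk -F'\t' '{s += $3} END {print s}')

echo "exons: $total, coding exons: $coding, spliced length: $spliced"
```

For these rows the tallies agree with the header for NM_001308094.1 (7 exons, 5 coding exons, spliced length 2656). Note that this describes one transcript only, not the gene as a whole.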
ADD REPLY
0

Please check this link: https://www.ncbi.nlm.nih.gov/gene/1482

ADD REPLY
0

Most genes have multiple alternative transcripts, and some of those transcripts have more exons than others. This is perfectly normal. Unless all transcripts have the same number of exons, there is no such thing as "the exact number of exons per gene".

ADD REPLY
0

Please check this link: https://www.ncbi.nlm.nih.gov/gene/1482. Why is the exon count 3 here?

ADD REPLY
0

When you say you "know NKX2-5 has 2 exons and GATA4 has 7 exons", how do you know this? As you point out, NCBI disagrees and says that NKX2-5 has a total of 3 exons with coding sequence, and GATA4 has 11 exons (although they do not all have coding sequence).

As pointed out, genes have different isoforms, so while NKX2-5 has 3 exons, any given isoform only has two of them (in general the isoforms vary in which 3' exon they use).

The situation is more complex for GATA4, which has several isoforms each of which contains some combination of exons (not always the same number).

ADD REPLY
0

Please check the NKX2-5 RefSeqGene record; it has 2 exons.

ADD REPLY
0

Please check this link: https://www.ncbi.nlm.nih.gov/gene/1482

ADD REPLY
1
5.0 years ago

RefSeq annotates transcripts, not genes. The NKX2-5 gene has 4 RefSeq transcripts associated with it: NM_001166175, NM_001166176, NM_004387, and XM_017009071. Each transcript has 2 exons, but they are not the same exons in each case. If the three exons are called A, B and C, then NM_001166175, NM_001166176 and NM_004387 consist of exon A plus different versions of exon C, while XM_017009071 consists of exon A and exon B.
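
The difference between "exons per transcript" and "exons in the gene" can be shown with a small sketch. The labels A, B and C are the placeholder names from the paragraph above, and as a simplifying assumption the different versions of exon C are all treated as the same exon.

```shell
# One line per transcript: accession followed by its exon labels.
# A/B/C are placeholder labels; the variants of exon C are collapsed
# into a single label here (an assumption made for illustration).
transcripts='NM_001166175 A C
NM_001166176 A C
NM_004387 A C
XM_017009071 A B'

# Exons per transcript: number of fields after the accession.
per_transcript=$(printf '%s\n' "$transcripts" | awk '{print $1, NF-1}')

# Distinct exons across the gene: flatten the labels, deduplicate, count.
gene_exons=$(printf '%s\n' "$transcripts" \
  | awk '{for (i = 2; i <= NF; i++) print $i}' \
  | sort -u | wc -l | tr -d ' ')

printf '%s\n' "$per_transcript"
echo "distinct exons in gene: $gene_exons"
```

Every transcript reports 2 exons, while the union across transcripts is 3, which is the kind of number the gene-level "Exon count" field reflects.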

[Image: NKX2-5 structure]

ADD COMMENT
1

To be explicit: the NKX2-5 gene model has a total of 3 exons, where the last exon is spliced three different ways (giving the first three transcripts). In this particular case each resulting transcript contains only 2 of those exons (which is unusual).

$ esearch -db gene -query "GATA4 [GENE] AND Homo [ORGN]" | efetch -db gene -format xml | grep -A1 Exon
      <Gene-commentary_label>Exon count</Gene-commentary_label>
      <Gene-commentary_text>11</Gene-commentary_text>

$ esearch -db gene -query "NKX2-5 [GENE] AND Homo [ORGN]" | efetch -db gene -format xml | grep -A1 Exon
      <Gene-commentary_label>Exon count</Gene-commentary_label>
      <Gene-commentary_text>3</Gene-commentary_text>
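
If only the number is wanted, the value line can be stripped of its tags. A minimal sketch, run here on the two NKX2-5 lines pasted from the output above rather than on a live efetch call; the variable names are illustrative only.

```shell
# Two lines copied from the efetch XML output above (not a full record).
xml='      <Gene-commentary_label>Exon count</Gene-commentary_label>
      <Gene-commentary_text>3</Gene-commentary_text>'

# Take the line after the "Exon count" label and keep only the digits
# between the Gene-commentary_text tags.
exon_count=$(printf '%s\n' "$xml" \
  | grep -A1 '<Gene-commentary_label>Exon count</Gene-commentary_label>' \
  | sed -n 's:.*<Gene-commentary_text>\([0-9][0-9]*\)</Gene-commentary_text>.*:\1:p')

echo "$exon_count"
```

On real data the same grep and sed stages could be appended to the esearch | efetch pipeline in place of the printf, assuming the label and value stay on adjacent lines as they do above.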
ADD REPLY
