Question: NCBI, Gene, Exon count
Samira1985 (Iran) wrote, 14 months ago:

Hello, I want to find the exact number of exons in my genes. I have two genes, NKX2-5 and GATA4. I know that NKX2-5 has 2 exons and that GATA4 has 7 exons, 6 of which are in the coding sequence. But these numbers differ from the "Exon count" shown in NCBI Gene. Can you tell me why? For example, NCBI Gene gives an exon count of 3 for NKX2-5 and 11 for GATA4.

sequence gene • 588 views
modified 14 months ago by WouterDeCoster • written 14 months ago by Samira1985

Where are you seeing 3 exons for NKX2-5 and 11 exons for GATA4?

I see the following (assuming this is for the human genes). [Data truncated to save space]

$ esearch -db gene -query "NKX2-5 [GENE] AND Homo [ORGN]" | efetch -db gene -format gene_table
NKX2-5 NK2 homeobox 5[Homo sapiens]
Gene ID: 1482, updated on 21-Apr-2019


Reference GRCh38.p12 Primary Assembly NC_000005.10  (minus strand) from: 173235321 to: 173232104
mRNA transcript variant 1 NM_004387.3, 2 exons,  total annotated spliced exon length: 1669
protein isoform 1 NP_004378.1 (CCDS4387.1), 2 coding  exons,  annotated AA length: 324

Exon table for  mRNA  NM_004387.3 and protein NP_004378.1
Genomic Interval Exon           Genomic Interval Coding         Gene Interval Exon              Gene Interval Coding            Exon Length     Coding Length   Intron Length
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
173235312-173234750             173235083-173234750             10-572          239-572         563             334             1540
173233209-173232104             173233209-173232569             2113-3218               2113-2753               1106            641



$ esearch -db gene -query "GATA4 [GENE] AND Homo [ORGN]" | efetch -db gene -format gene_table
GATA4 GATA binding protein 4[Homo sapiens]
Gene ID: 2626, updated on 4-May-2019


Reference GRCh38.p12 Primary Assembly NC_000008.11  from: 11676919 to: 11760002
mRNA transcript variant 3 NM_001308094.1, 7 exons,  total annotated spliced exon length: 2656
protein isoform 3 NP_001295023.1 (CCDS78304.1), 5 coding  exons,  annotated AA length: 236

Exon table for  mRNA  NM_001308094.1 and protein NP_001295023.1
Genomic Interval Exon           Genomic Interval Coding         Gene Interval Exon              Gene Interval Coding            Exon Length     Coding Length   Intron Length
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
11676919-11677063               1-145           145             23447
11700511-11700778               23593-23860             268             48137
11748916-11749085               11748921-11749085               71998-72167             72003-72167             170             165             1025
11750111-11750236               11750111-11750236               73193-73318             73193-73318             126             126             4809
11755046-11755133               11755046-11755133               78128-78215             78128-78215             88              88              1801
11756935-11757083               11756935-11757083               80017-80165             80017-80165             149             149             1209
11758293-11760002               11758293-11758475               81375-83084             81375-81557             1710            183
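
The exon and coding-exon counts in the header lines can be re-derived from the intervals in the table. A minimal Python sketch, with the GATA4 rows above copied in by hand; the variable and function names are illustrative only and are not part of any NCBI tool, and intervals are assumed to be 1-based and inclusive:

```python
# Rows copied by hand from the NM_001308094.1 exon table above:
# (exon interval, coding interval), with None when the exon has no coding part.
GATA4_NM_001308094 = [
    ((11676919, 11677063), None),
    ((11700511, 11700778), None),
    ((11748916, 11749085), (11748921, 11749085)),
    ((11750111, 11750236), (11750111, 11750236)),
    ((11755046, 11755133), (11755046, 11755133)),
    ((11756935, 11757083), (11756935, 11757083)),
    ((11758293, 11760002), (11758293, 11758475)),
]


def interval_length(interval):
    """Length of a 1-based, inclusive genomic interval (either strand)."""
    start, end = interval
    return abs(end - start) + 1


def summarize(rows):
    """Count exons and coding exons and sum their lengths."""
    coding = [cds for _, cds in rows if cds is not None]
    return {
        "exons": len(rows),
        "coding_exons": len(coding),
        "spliced_length": sum(interval_length(exon) for exon, _ in rows),
        "coding_length": sum(interval_length(cds) for cds in coding),
    }


summary = summarize(GATA4_NM_001308094)
print(summary)
```

For this transcript the sketch gives 7 exons, 5 of them coding, and a spliced length of 2656, which agrees with the header lines. The 711 coding bases are consistent with 236 annotated amino acids plus a stop codon.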
modified 14 months ago • written 14 months ago by genomax

please check this link: https://www.ncbi.nlm.nih.gov/gene/1482

written 14 months ago by Samira1985

Most genes have multiple alternative transcripts, of which some can have more exons than others. This is perfectly normal. Unless all transcripts have the same number of exons there is no such thing as "the exact number of exons per gene".

written 14 months ago by WouterDeCoster

Why is the exon count 3 here? Please check this link: https://www.ncbi.nlm.nih.gov/gene/1482

written 14 months ago by Samira1985

When you say you "know NKX2-5 has 2 exons and GATA4 has 7 exons", how do you know this? As you point out, NCBI disagrees and says that NKX2-5 has a total of 3 exons with coding sequence, and GATA4 has 11 exons (although they do not all have coding sequence).

As pointed out, genes have different isoforms, so while NKX2-5 has 3 exons, any given isoform only has two of them (in general the isoforms vary in which 3' exon they use).

The situation is more complex for GATA4, which has several isoforms each of which contains some combination of exons (not always the same number).

written 14 months ago by i.sudbery

Please check the NKX2-5 RefSeqGene; it has 2 exons.

written 14 months ago by Samira1985

please check this link: https://www.ncbi.nlm.nih.gov/gene/1482

written 14 months ago by Samira1985
i.sudbery (Sheffield, UK) wrote, 14 months ago:

RefSeq annotates transcripts, not genes. The gene NKX2-5 has 4 RefSeq transcripts associated with it: NM_001166175, NM_001166176, NM_004387, and XM_017009071. Each transcript has 2 exons, but they are not the same exons in each case. If the three exons are called A, B and C, then NM_001166175, NM_001166176 and NM_004387 each consist of exon A plus a different version of exon C, while XM_017009071 consists of exon A and exon B.

[Figure: NKX2-5 structure]
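
The way 2-exon transcripts can add up to a gene-level count of 3 can be sketched in Python. This assumes the gene-level count treats overlapping exon variants as a single exon, which is a reading of the explanation above rather than a documented NCBI rule. Only the NM_004387 coordinates come from the gene table earlier in this thread; the coordinates and names of the other three transcripts are invented for illustration.

```python
def merge_overlapping(intervals):
    """Merge overlapping (start, end) intervals into non-overlapping loci."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous locus: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Exons as (low, high) genomic coordinates, regardless of strand.
transcripts = {
    # Real coordinates, taken from the NM_004387.3 exon table in this thread.
    "NM_004387": [(173234750, 173235312), (173232104, 173233209)],
    # HYPOTHETICAL coordinates: exon A plus two other versions of exon C.
    "C_variant_2": [(173234750, 173235312), (173232104, 173233100)],
    "C_variant_3": [(173234750, 173235312), (173232500, 173233209)],
    # HYPOTHETICAL coordinates: exon A plus a separate exon B.
    "A_plus_B": [(173234750, 173235312), (173233800, 173234000)],
}

per_transcript = {name: len(exons) for name, exons in transcripts.items()}
all_exons = [exon for exons in transcripts.values() for exon in exons]
gene_level = merge_overlapping(all_exons)

print(per_transcript)   # exon count of each transcript
print(len(gene_level))  # number of distinct exon loci across all transcripts
```

With these inputs every transcript has 2 exons, while the merged set contains 3 distinct exon loci (A, B and C).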

ADD COMMENTlink modified 14 months ago • written 14 months ago by i.sudbery8.2k

To be explicit: the NKX2-5 gene model has a total of 3 exons, where the last exon is spliced in three different ways (giving the first three transcripts). In this particular case each resulting transcript contains only 2 exons (which is unusual).

$ esearch -db gene -query "GATA4 [GENE] AND Homo [ORGN]" | efetch -db gene -format xml | grep -A1 Exon
      <Gene-commentary_label>Exon count</Gene-commentary_label>
      <Gene-commentary_text>11</Gene-commentary_text>

$ esearch -db gene -query "NKX2-5 [GENE] AND Homo [ORGN]" | efetch -db gene -format xml | grep -A1 Exon
      <Gene-commentary_label>Exon count</Gene-commentary_label>
      <Gene-commentary_text>3</Gene-commentary_text>
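
The same label/text pair can also be read with an XML parser instead of grep. A minimal Python sketch using only the standard library; the fragment is modelled on the grep output above and wrapped in a single `<Gene-commentary>` element so that it parses on its own. That wrapper and the function name `exon_count` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Fragment modelled on the grep output above. The enclosing <Gene-commentary>
# element is an assumption added so that the two lines form valid XML.
FRAGMENT = """
<Gene-commentary>
  <Gene-commentary_label>Exon count</Gene-commentary_label>
  <Gene-commentary_text>3</Gene-commentary_text>
</Gene-commentary>
"""


def exon_count(xml_text):
    """Return the integer stored under the 'Exon count' label, or None."""
    root = ET.fromstring(xml_text)
    # iter() also visits the root element itself, so a bare fragment works.
    for node in root.iter("Gene-commentary"):
        if node.findtext("Gene-commentary_label") == "Exon count":
            return int(node.findtext("Gene-commentary_text"))
    return None


count = exon_count(FRAGMENT)
print(count)
```

Unlike `grep -A1`, this does not depend on the label and the value sitting on adjacent lines.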
modified 14 months ago • written 14 months ago by genomax