Hello
I want to find the exact number of exons in my genes. I have 2 genes: NKX2-5 and GATA4.
I know NKX2-5 has 2 exons and GATA4 has 7 exons, 6 of which are in the coding sequence. But these exon numbers differ from the exon count shown in NCBI Gene. Please let me know why.
For example, the exon count in NCBI Gene shows 3 exons for NKX2-5 and 11 exons for GATA4.
Most genes have multiple alternative transcripts, of which some can have more exons than others. This is perfectly normal. Unless all transcripts have the same number of exons there is no such thing as "the exact number of exons per gene".
When you say you "know NKX2-5 has 2 exons and GATA4 has 7 exons", how do you know this? As you point out, NCBI disagrees and says that NKX2-5 has a total of 3 exons with coding sequence, and GATA4 has 11 exons (although they do not all have coding sequence).
As pointed out, genes have different isoforms, so while NKX2-5 has 3 exons, any given isoform only has two of them (in general the isoforms vary in which 3' exon they use).
The situation is more complex for GATA4, which has several isoforms each of which contains some combination of exons (not always the same number).
RefSeq is an annotation of transcripts, not genes. The gene NKX2-5 has 4 RefSeq transcripts associated with it: NM_001166175, NM_001166176, NM_004387, and XM_017009071. Each transcript has 2 exons, but they are not the same exons in each case. If the three exons are called A, B and C, then NM_001166175, NM_001166176 and NM_004387 consist of exon A and various different versions of exon C, while XM_017009071 consists of exon A and exon B.
To be explicit: the NKX2-5 gene model has a total of 3 exons, where the last exon is spliced three different ways (resulting in the first three transcripts). In this particular case, each resulting transcript contains only 2 exons (which is unusual).
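The difference between a per-transcript exon count and a gene-level exon count can be sketched in a few lines of Python. This is only an illustration: the coordinates below are made up (they are not real NKX2-5 positions), the pairing of each accession with a particular version of exon C is arbitrary, and merging overlapping exons into one region is an assumption about how a gene-level count could be derived, not a description of how NCBI computes its "Exon count" field.

```python
# Hypothetical exon coordinates (start, end) for illustration only;
# these are NOT real NKX2-5 positions.
transcripts = {
    "NM_004387": [(100, 200), (500, 900)],     # exon A + one version of exon C
    "NM_001166175": [(100, 200), (550, 900)],  # exon A + another version of exon C
    "NM_001166176": [(100, 200), (600, 900)],  # exon A + a third version of exon C
    "XM_017009071": [(100, 200), (300, 400)],  # exon A + exon B
}


def merge_intervals(intervals):
    """Merge overlapping (start, end) intervals into non-overlapping regions."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous region: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Exon count per transcript.
per_transcript = {name: len(exons) for name, exons in transcripts.items()}

# Gene-level count: pool exons from all transcripts, then merge overlaps.
all_exons = [exon for exons in transcripts.values() for exon in exons]
gene_regions = merge_intervals(all_exons)

print(per_transcript)     # every transcript has 2 exons
print(len(gene_regions))  # the pooled gene model has 3 exonic regions
```

With these toy coordinates each transcript reports 2 exons while the pooled gene model reports 3, which mirrors the 2 versus 3 discrepancy discussed in this thread.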
Where are you seeing "3 exons for NKX2-5 and 11 exons for GATA4"? I see the following (assuming this is for the human gene). [Data truncated to save space]
$ esearch -db gene -query "GATA4 [GENE] AND Homo [ORGN]" | efetch -db gene -format xml | grep -A1 Exon
<Gene-commentary_label>Exon count</Gene-commentary_label>
<Gene-commentary_text>11</Gene-commentary_text>
$ esearch -db gene -query "NKX2-5 [GENE] AND Homo [ORGN]" | efetch -db gene -format xml | grep -A1 Exon
<Gene-commentary_label>Exon count</Gene-commentary_label>
<Gene-commentary_text>3</Gene-commentary_text>
Please check this link: https://www.ncbi.nlm.nih.gov/gene/1482
Why is the exon count 3 at this link? https://www.ncbi.nlm.nih.gov/gene/1482
Please check the NKX2-5 RefSeqGene record; it has 2 exons.