I'm reviewing the UniProt data model and I'm confused about what counts as a protein. Originally, I thought that a protein was defined by its sequence of amino acids. However, the fact that the isoforms of a gene are stored in a single UniProt/Swiss-Prot entry leads me to believe that a protein is defined by the gene from which it originates. Otherwise, these alternative splicing products would receive distinct UniProt/Swiss-Prot entries. Or perhaps it is more complicated than that, and isoforms that are different enough do receive separate entries? I'm just a bit confused by the definition of a protein in this light.
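To make my current reading of the data model concrete, here is a minimal Python sketch. The class names, the accession `X00000`, the gene name, and the sequences are all made up for illustration and are not taken from UniProt. The only convention I'm relying on is that an isoform is identified by the entry accession plus a numeric suffix (e.g. `-2`).

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Isoform:
    isoform_id: str  # entry accession plus a numeric suffix, e.g. "X00000-2"
    sequence: str    # amino-acid sequence of this isoform


@dataclass
class Entry:
    accession: str   # hypothetical accession, not a real UniProt record
    gene_name: str   # hypothetical gene name
    isoforms: List[Isoform] = field(default_factory=list)

    def distinct_sequences(self) -> Set[str]:
        """Return the set of different sequences stored under this one entry."""
        return {iso.sequence for iso in self.isoforms}


# One entry (one gene) holding two isoforms with different sequences.
entry = Entry(
    accession="X00000",
    gene_name="GENE1",
    isoforms=[
        Isoform("X00000-1", "MKTAYIAKQRQISFVK"),  # made-up sequence
        Isoform("X00000-2", "MKTAYIAKQR"),        # made-up shorter variant
    ],
)

print(len(entry.distinct_sequences()))
```

In this sketch a single entry holds more than one sequence, which is exactly what makes me doubt that sequence alone defines a protein in this model.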
Looking at the word itself, iso- means equal, so isoform would seem to mean "equal form". Given that structure is more conserved than sequence, I could see form being a better basis for defining a protein than sequence. But from looking at the data, I don't think this is what isoform really means.
Any help appreciated.