I have done modeling and docking studies on a protein. According to the UniProt entry, the sequence of the protein is complete. However, I suspect that about 20-30 residues are missing from its N-terminal region, because the template and the other structures from the same family have the corresponding region, and that portion seems to be catalytically important too (in other structures of this family, these residues play an important role in fixing the ligand in the right orientation). To verify this, I ran a BLAST search against the nr database and found that no sequences match the N-terminal-most region of my protein. For example, all of the homologous proteins start from residues 20-35. In this case, can I propose that the protein sequence is incomplete?
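One way to put numbers on the BLAST observation is to compare, for each hit, where the alignment starts in the homolog with where it starts in the query. The following is only a rough sketch: it assumes the search was saved in BLAST tabular format (`-outfmt 6`) with the default column order, and the function names and the three example rows (`hitA`, `hitB`, `hitC`) are made up for illustration, not real results.

```python
from statistics import median

# Default column order of BLAST tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def n_terminal_offsets(tabular_text):
    """Return {sseqid: sstart - qstart} using the best-scoring HSP per subject.

    A positive offset means the homolog has that many more residues
    upstream of the aligned region than the query does.
    """
    best = {}
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        row = dict(zip(COLUMNS, line.split("\t")))
        bitscore = float(row["bitscore"])
        offset = int(row["sstart"]) - int(row["qstart"])
        sid = row["sseqid"]
        # Keep only the highest-scoring alignment for each subject sequence.
        if sid not in best or bitscore > best[sid][0]:
            best[sid] = (bitscore, offset)
    return {sid: offset for sid, (_, offset) in best.items()}


def summarize(offsets):
    """Summarize the per-hit offsets (assumes at least one hit)."""
    values = sorted(offsets.values())
    return {"n_hits": len(values), "min": values[0],
            "median": median(values), "max": values[-1]}


# Hypothetical rows, for illustration only (not real BLAST output).
EXAMPLE = "\n".join([
    "query\thitA\t62.0\t200\t70\t2\t1\t200\t25\t224\t1e-80\t250",
    "query\thitB\t55.0\t198\t85\t3\t3\t200\t33\t230\t1e-60\t200",
    "query\thitC\t48.0\t190\t95\t4\t2\t191\t22\t211\t1e-50\t170",
])

if __name__ == "__main__":
    print(summarize(n_terminal_offsets(EXAMPLE)))
```

A consistently positive offset across many hits only shows that the homologs extend further upstream than the query; it does not by itself prove that the query sequence is truncated.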
Does your sequence come from SwissProt or TrEMBL?
It came from TrEMBL (Q56917)
The background to Michael's question is almost certainly that UniProt contains both well-curated proteins (from SwissProt and PIR) and automatically translated nucleotide sequences (from TrEMBL). The latter are much more likely to contain errors.
It came from SwissProt.
Yes, sorry, forgot to follow up on this.