Hi guys. I'm a newbie in bioinformatics and have a question about running blastp against the PDB database. When I use blastp on the NCBI website, I enter a sequence as the query and get back a list of hits with PDB IDs. The thing is, I entered a specific protein sequence, 'HSQGTFTSDYSKYLDSRRAQDFVQWLMNT', which is the sequence for PDB ID '1GCN', expecting to get that same PDB ID back, but I can't find it in the result list. On the other hand, when I enter the sequence 'TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN', which is the one for '1CRN', that ID does show up in the result list. Why is that? What am I doing wrong? I'm so curious about this that I can't sleep. Please help me sleep. Thank you so much.
One possibility is that the PDB ID has changed. IDs are sometimes altered or deprecated, so '1GCN' may no longer be the identifier that NCBI returns for this sequence when it pulls entries from the PDB.
It's a guess, and a bit of a stretch, but it's possible.
What's actually happened is that there are multiple resolved structures that contain this particular stretch of amino acids, and the result list shows a different one of them rather than '1GCN'. Why it prioritises one over the other, I'm not 100% sure. It may have something to do with the age or quality of the structures.
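To make the "multiple structures contain the same stretch" idea concrete, here is a minimal sketch in Python. It is not how BLAST or NCBI actually ranks hits; it only shows that an exact query can match more than one entry. The two sequences and their IDs '1GCN' and '1CRN' are taken from the question, while 'XXXX' and its sequence are made-up placeholders, and `entries_containing` is a hypothetical helper name.

```python
# Query from the question (the sequence the poster gives for 1GCN).
QUERY = "HSQGTFTSDYSKYLDSRRAQDFVQWLMNT"

# Stand-in for a table of PDB sequences.
# 1GCN and 1CRN come from the question; XXXX is a made-up placeholder
# representing some other structure that happens to include the same stretch.
sequences = {
    "1GCN": "HSQGTFTSDYSKYLDSRRAQDFVQWLMNT",
    "1CRN": "TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN",
    "XXXX": "MK" + QUERY + "GS",  # hypothetical longer construct
}


def entries_containing(query, seqs):
    """Return the sorted IDs of all entries whose sequence contains the query exactly."""
    return sorted(entry_id for entry_id, seq in seqs.items() if query in seq)


# More than one entry can match the same query, so the ID you expect
# is not necessarily the one a search tool chooses to display first.
print(entries_containing(QUERY, sequences))
```

With real data you would replace the `sequences` dictionary with sequences downloaded from the PDB; any entry whose chain includes the query would appear alongside '1GCN'.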