Question

Question about blastp using pdb database

6.1 years ago

hyunilstaleykang • 0

Hi guys. I'm a newbie in bioinformatics and have a question about blastp against the pdb database. When I run blastp on the NCBI website, I enter a sequence as the query and get a list of hits with PDB IDs. The thing is, I entered a specific protein sequence, 'HSQGTFTSDYSKYLDSRRAQDFVQWLMNT', which corresponds to PDB ID '1GCN', expecting to get that same PDB ID back, but I cannot find it in the result list. On the other hand, when I enter the sequence 'TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN', which is for '1CRN', that entry does show up in the result list. Why is that? What am I doing wrong? I can't sleep because I'm so curious about this. Please help me sleep. Thank you so much.

blast pdb glucagon 1gcn • 2.1k views

ADD COMMENT • link updated 6.1 years ago by Andrzej Zielezinski 11k • written 6.1 years ago by hyunilstaleykang • 0

For some reason, record 1GCN was removed/suppressed from NCBI.

ADD REPLY • link 6.1 years ago by Andrzej Zielezinski 11k

But it did show up in the blast search above.

ADD REPLY • link 6.1 years ago by GenoMax 141k

Ah, you're right. I missed jrj.healey's answer. I've just edited my post and moved it to a comment.

ADD REPLY • link 6.1 years ago by Andrzej Zielezinski 11k

score 2 · Answer 1 · 2018-03-13

6.1 years ago

Joe 21k

One possibility is that the PDB ID has changed. They alter/deprecate them sometimes. '1GCN' may no longer be the current PDB identifier that is returned from NCBI -> PDB API calls.

It's a guess, and a bit of a stretch, but it's possible.

EDIT:

What's actually happened is that there are multiple resolved structures that contain this particular amino acid stretch. Why it chooses to prioritise one over the others, I'm not 100% sure. It may have something to do with age, structure quality, etc.

ADD COMMENT • link 6.1 years ago by Joe 21k

I don't think they are alternate titles. They are separate entries referring to the same sequence. 1GCN_A was submitted back in 1977. Others are newer entries.

ADD REPLY • link 6.1 years ago by GenoMax 141k

Ah yes, right you are. They are separately resolved PDB structures with identical stretches of sequence.

ADD REPLY • link 6.1 years ago by Joe 21k
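
To make the point above concrete, here is a minimal Python sketch of how several entries with an identical sequence can end up under a single hit, so that a given ID (here 1GCN_A) is present but not shown as the headline. This is only an illustration, not how NCBI actually builds or ranks its pdb database. It assumes that identical sequences are collapsed into one record and that the first ID loaded becomes the headline, which is an arbitrary choice. Only the two sequences and the IDs 1GCN and 1CRN come from this thread; `HYP1_A` and `HYP2_B` are made-up placeholder IDs, and `group_identical` and `exact_hits` are hypothetical helper names.

```python
from collections import defaultdict

# Query from the question (the sequence given for PDB 1GCN).
QUERY = "HSQGTFTSDYSKYLDSRRAQDFVQWLMNT"

# Toy database. HYP1_A and HYP2_B are hypothetical placeholder IDs,
# NOT real PDB entries; they stand in for the newer structures that
# share the same sequence as 1GCN_A.
RECORDS = [
    ("HYP1_A", "HSQGTFTSDYSKYLDSRRAQDFVQWLMNT"),
    ("1GCN_A", "HSQGTFTSDYSKYLDSRRAQDFVQWLMNT"),
    ("HYP2_B", "HSQGTFTSDYSKYLDSRRAQDFVQWLMNT"),
    ("1CRN_A", "TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN"),
]


def group_identical(records):
    """Collapse entries with identical sequences into one group per sequence."""
    groups = defaultdict(list)
    for chain_id, seq in records:
        groups[seq].append(chain_id)
    return dict(groups)


def exact_hits(query, groups):
    """Return one hit per distinct sequence containing the query.

    Each hit is (headline ID, list of the other IDs in the same group).
    The headline is simply the first ID loaded (an assumption).
    """
    hits = []
    for seq, ids in groups.items():
        if query in seq:
            hits.append((ids[0], ids[1:]))
    return hits


groups = group_identical(RECORDS)
hits = exact_hits(QUERY, groups)
for headline, others in hits:
    print(f"{headline} (+{len(others)} more: {', '.join(others)})")
```

In this toy setup the query produces a single hit headed by a different ID, with 1GCN_A listed among the grouped entries, so scanning only the headline IDs would miss it.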