This is a general question about NCBI accession numbers.
Suppose I have this sequence:
>myseq
MGQ-----NSPNLLR------LSQ
--TLVGSSLLSSPSSPTTLKVKMPHAFPFLTPDQ-KKELSDIAHKIVAKGKGILAADES-
--TGSVAKRFQSINTENTEENRRLYRQLLFTA-DERAGPCIGGVIFFHETLYQKTDAGKT
FPEHVKSRGWVVGIKVDKGVVPLAGTN-GETTTQ---GLDGL--------YERCAQYKKD
GCDFAKWRCVLKITSTTPSRLAIMENCNVLARYASICQM--HGIVPIVEPEILPDGDHDL
KRTQYVTEKV-LAAMYKALSDHHVYLEGTLLKPNMVTAGHSCSHKYTHQDIAMATITALR
RTVPPAVPG--ITFLSGGQSEEEASINLNVMNQCPLHRPWAITFSYGRALQASALKAWGG
KPGNGKAAQEEFIKRAL------ANSLACQGKYVSSGN-S-A-AAGDSLFVANHAY
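The sequence above comes out of an alignment, so it still contains gap characters. A minimal Python sketch of how I would strip them before submitting the query, assuming the dashes are alignment gaps and not meaningful residues (the helper name `ungap` and the short excerpt used as input are only illustrative):

```python
def ungap(fasta_text, gap_chars="-."):
    """Return (header, sequence) with gap characters and whitespace removed.

    Assumes a single-record FASTA string whose first non-empty line
    is the '>' header line.
    """
    lines = [ln.strip() for ln in fasta_text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    header = lines[0][1:]
    residues = "".join(lines[1:])
    # Drop gap symbols and any stray internal whitespace.
    sequence = "".join(
        ch for ch in residues if ch not in gap_chars and not ch.isspace()
    )
    return header, sequence


# Short excerpt of the aligned sequence, used only as an illustration.
example = ">myseq\nMGQ-----NSPNLLR------LSQ\n--TLVGSS\n"
header, seq = ungap(example)
print(f">{header}\n{seq}")
```

The ungapped record can then be pasted into the blastp query box or written to a file.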
I want to BLAST it (blastp against nr), restricting the search to salmon (Salmo salar). I get three roughly equivalent hits corresponding to three different IDs:
I doubt that these are three different genes. So which sequence(s) should I consider the 'good' one(s)? The most recent? The 'NP' ones? I could not find any detailed information on how NCBI assigns sequence identifiers (but see this). Many thanks for your advice!