Hello
I am trying to run psiblast with the -in_msa argument, but I keep getting a C++ error when it tries to read my input alignment.
The error is:
BLAST query error: CAlnReader::GetSeqEntry(): Seq_entry is not available until after Read()
I've tried several different files as the input MSA, but none of them work, including align1, the example MSA given on many of the BLAST tutorial sites. I'm calling psiblast directly from the command line with the following arguments:
psiblast -db stable/uniprot_sprot.fasta -evalue 0.00001 -in_msa msa -out psioutput -outfmt 4 -num_descriptions 500 -num_alignments 500 -num_iterations 1
I formatted the UniProt database using:
makeblastdb -in uniprot_sprot.fasta -parse_seqids -blastdb_version 5 -dbtype prot
The command runs fine if I use -query queryfile.fasta with a FASTA file as input instead of -in_msa. Both of my attempts at MSA files are posted below. Can anyone help me get this to run?
# Example multiple sequence alignment file: attempt 1
align1
------
26SPS9_Hs IHAAEEKDWKTAYSYFYEAFEGYdsidspkaitslkymllckimlntpedvqalvsgkla
F57B9_Ce LHAADEKDFKTAFSYFYEAFEGYdsvdekvsaltalkymllckvmldlpdevnsllsakl
YDL097c_Sc ILHCEDKDYKTAFSYFFESFESYhnltthnsyekacqvlkymllskimlnliddvkniln
YMJ5_Ce LYSAEERDYKTSFSYFYEAFEGFasigdkinatsalkymilckimlneteqlagllaake
FUS6_ARATH KNYIRTRDYCTTTKHIIHMCMNAilvsiemgqfthvtsyvnkaeqnpetlepmvnaklrc
COS41.8_Ci SLDYKLKTYLTIARLYLEDEDPVqaemyinrasllqnetadeqlqihykvcyarvldyrr
644879 KCYSRARDYCTSAKHVINMCLNVikvsvylqnwshvlsyvskaestpeiaeqrgerdsqt
YPR108w_Sc IHCLAVRNFKEAAKLLVDSLATFtsieltsyesiatyasvtglftlertdlkskvidspe
eif-3p110_Hs SKAMKMGDWKTCHSFIINEKMNGkvw----------------------------------
T23D8.4_Ce SKAMLNGDWKKCQDYIVNDKMNQkvw----------------------------------
YD95_Sp IYLMSIRNFSGAADLLLDCMSTFsstellpyydvvryavisgaisldrvdvktkivdspe
KIAA0107_Hs LYCVAIRDFKQAAELFLDTVSTFtsyelmdyktfvtytvyvsmialerpdlrekvikgae
F49C12.8_Hs LYRMSVRDFAGAADLFLEAVPTFgsyelmtyenlilytvitttfaldrpdlrtkvircne
Int-6_Mm KFQYECGNYSGAAEYLYFFRVLVpatdrnalsslwgklaseilmqnwdaamedltrlket
26SPS9_Hs lryagrqtealkcvaqasknrsladfekaltdy---------------------------
F57B9_Ce alkyngsdldamkaiaaaaqkrslkdfqvafgsf--------------------------
YDL097c_Sc akytketyqsrgidamkavaeaynnrslldfntalkqy----------------------
YMJ5_Ce ivayqkspriiairsmadafrkrslkdfvkalaeh-------------------------
FUS6_ARATH asglahlelkkyklaarkfldvnpelgnsyneviapqdiatygglcalasfdrselkqkv
COS41.8_Ci kfleaaqrynelsyksaiheteqtkalekalncailapagqqrsrmlatlfkdercqllp
644879 qailtklkcaaglaelaarkykqaakclllasfdhcdfpellspsnvaiygglcalatfd
YPR108w_Sc llslisttaalqsissltislyasdyasyfpyllety-----------------------
eif-3p110_Hs ------------------------------------------------------------
T23D8.4_Ce ------------------------------------------------------------
YD95_Sp vlavlpqnesmssleacinslylcdysgffrtladve-----------------------
KIAA0107_Hs ilevlhslpavrqylfslyecrysvffqslavv---------------------------
F49C12.8_Hs vqeqltggglngtlipvreylesyydchydrffiqlaale--------------------
Int-6_Mm idnnsvssplqslqqrtwlihwslfvffnhpkgrdniidlflyqpqylnaiqtmcphilr
26SPS9_Hs ------------------------------------------------------------
F57B9_Ce ------------------------------------------------------------
YDL097c_Sc ------------------------------------------------------------
YMJ5_Ce ------------------------------------------------------------
FUS6_ARATH idninfrnflelvpdvrelindfyssryascleylasl----------------------
COS41.8_Ci sfgilekmfldriiksdemeefar------------------------------------
644879 rqelqrnvissssfklflelepqvrdiifkfyeskyasclkmldem--------------
YPR108w_Sc ------------------------------------------------------------
eif-3p110_Hs ------------------------------------------------------------
T23D8.4_Ce ------------------------------------------------------------
YD95_Sp ------------------------------------------------------------
KIAA0107_Hs ------------------------------------------------------------
F49C12.8_Hs ------------------------------------------------------------
Int-6_Mm ylttavitnkdvrkrrqvlkdlvkviqqesytykdpitefveclyvnfdfdgaqkklrec
26SPS9_Hs RAELRDDPIISTHLAKLYDNLLEQNLIRVIEPFSRVQIEHISSLIKLSKADVERKLSQMI
F57B9_Ce PQELQMDPVVRKHFHSLSERMLEKDLCRIIEPYSFVQIEHVAQQIGIDRSKVEKKLSQMI
YDL097c_Sc EKELMGDELTRSHFNALYDTLLESNLCKIIEPFECVEISHISKIIGLDTQQVEGKLSQMI
YMJ5_Ce KIELVEDKVVAVHSQNLERNMLEKEISRVIEPYSEIELSYIARVIGMTVPPVERAIARMI
FUS6_ARATH KSNLLLDIHLHDHVDTLYDQIRKKALIQYTLPFVSVDLSRMADAFKTSVSGLEKELEALI
COS41.8_Ci QLMPHQKAITADGSNILHRAVTEHNLLSASKLYNNIRFTELGALLEIPHQMAEKVASQMI
644879 KDNLLLDMYLAPHVRTLYTQIRNRALIQYFSPYVSADMHRMAAAFNTTVAALEDELTQLI
YPR108w_Sc ANVLIPCKYLNRHADFFVREMRRKVYAQLLESYKTLSLKSMASAFGVSVAFLDNDLGKFI
eif-3p110_Hs DLFPEADKVRTMLVRKIQEESLRTYLFTYSSVYDSISMETLSDMFELDLPTVHSIISKMI
T23D8.4_Ce NLFHNAETVKGMVVRRIQEESLRTYLLTYSTVYATVSLKKLADLFELSKKDVHSIISKMI
YD95_Sp VNHLKCDQFLVAHYRYYVREMRRRAYAQLLESYRALSIDSMAASFGVSVDYIDRDLASFI
KIAA0107_Hs EQEMKKDWLFAPHYRYYVREMRIHAYSQLLESYRSLTLGYMAEAFGVGVEFIDQELSRFI
F49C12.8_Hs SERFKFDRYLSPHFNYYSRGMRHRAYEQFLTPYKTVRIDMMAKDFGVSRAFIDRELHRLI
Int-6_Mm ESVLVNDFFLVACLEDFIENARLFIFETFCRIHQCISINMLADKLNMTPEEAERWIVNLI
26SPS9_Hs LDKKFHGILDQGEGVLIIFDEPP
F57B9_Ce LDQKLSGSLDQGEGMLIVFEIAV
YDL097c_Sc LDKIFYGVLDQGNGWLYVYETPN
YMJ5_Ce LDKKLMGSIDQHGDTVVVYPKAD
FUS6_ARATH TDNQIQARIDSHNKILYARHADQ
COS41.8_Ci CESRMKGHIDQIDGIVFFERRET
644879 LEGLISARVDSHSKILYARDVDQ
YPR108w_Sc PNKQLNCVIDRVNGIVETNRPDN
eif-3p110_Hs INEELMASLDQPTQTVVMHRTEP
T23D8.4_Ce IQEELSATLDEPTDCLIMHRVEP
YD95_Sp PDNKLNCVIDRVNGVVFTNRPDE
KIAA0107_Hs AAGRLHCKIDKVNEIVETNRPDS
F49C12.8_Hs ATGQLQCRIDAVNGVIEVNHRDS
Int-6_Mm RNARLDAKIDSKLGHVVMGNNAV
# Second attempt
1fqy KLFWRAVVAEFLATTLFVFISIGSALGFKYPVGNNQTAVQDNVKVSLAFGLSIATLAQSV
1lda_A -TLKGQCIAEFLGTGLLIFFGVGCVAALKVA--GAS---FGQWEISVIWGLGVAMAIYL-
1fqy GHISG-AHLNPAVTLGLLL-SCQISIFR-ALMYIIAQCVGAIVATAILSGITSS----LT
1lda_A TAGVSGAHLNPAVTIALWLFAC-F-DKRKVIPFIVSQVAGAFCAAALVYGLYYNLFFDFE
1fqy --G--N-----SL-GRN--DLADG-V-NS-GQGLGIEIIGTLQLVLCVLATTDR----RR
1lda_A QTHHIVRGSVESVDLAGTFST-YPNPHINFVQAFAVEMVITAILMGLILALTDDGNGVP-
1fqy RDLGGSAPLAIGLSVALGHLLAIDYTGCGINPARSFGSAV-I-----T---HN--F--SN
1lda_A -RG-PLAPLLIGLLIAVIGASMGPLTGFAMNPARDFGPKVFAWLAGWGNVAFTGGRDIPY
1fqy HWIFWVGPFIGGALAVLIYDF-ILA--P
1lda_A FLVPLFGPIVGAIVGAFAYRKLIGRHL-
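In case the file layout itself is part of the problem, here is a minimal Python sketch of how the interleaved blocks above could be merged into one row per sequence, checked for equal length, and written out as aligned FASTA. I am only assuming that a cleaner layout might help; I have not confirmed which MSA formats psiblast accepts for -in_msa. The function names (merge_interleaved, check_aligned, to_fasta) are my own and not part of BLAST.

```python
def merge_interleaved(lines):
    """Join the per-block chunks of each sequence into one aligned string.

    Assumes every data row is exactly "name chunk" (two whitespace-separated
    fields); titles, rulers, comments and blank lines are skipped.
    """
    seqs = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2 or line.lstrip().startswith("#"):
            continue
        name, chunk = fields
        # dicts keep insertion order, so sequences stay in file order
        seqs[name] = seqs.get(name, "") + chunk
    return seqs


def check_aligned(seqs):
    """Return the common alignment length, or raise if rows differ."""
    lengths = {name: len(seq) for name, seq in seqs.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"rows differ in length: {lengths}")
    return next(iter(lengths.values()), 0)


def to_fasta(seqs, width=60):
    """Format the merged rows as aligned FASTA, wrapped at `width` columns."""
    out = []
    for name, seq in seqs.items():
        out.append(f">{name}")
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"
```

To use it, I would read the lines of my msa file, pass them to merge_interleaved, call check_aligned to catch rows that were truncated or duplicated when pasting, and write the result of to_fasta to a new file (for example a hypothetical msa.fasta) to try with -in_msa.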