Problem with generating PSSM
2.3 years ago
hasib1624 • 0

I am trying to generate PSSMs for some protein sequences using PSI-BLAST against nrdb. In some cases the output has an erroneous format, and the problem persists across multiple runs. For example, for this protein sequence:

MFLKDLGCGTNTCQNLPFDVTPVYSVFDFKGRIKFETDLNAPEVQANLGKVGYLVNEDIEGLIRYDEYYSEEDDLFAYTNGDYPNQFYACQLREHGNMRYKFVRTLGYAFTAITVPENDPIYTIIVYDDGLPEPSTTTLSPSQHVFQTNPIWTILKNSKNGTQGVGAVNLPWTPPVSASPIEPTNVCLLNGNSSVLSRCGEIDRMLQFFDDTHVYIGFKDSIQDASNTYIYSYIGYAFTSSENTCGIPLEPIRELYKYGVGFTSVAGEKYTELISNGYSLTRKVLGYTIDCASATNGELVVTIPDGLSTATPLSSLSTASTPTKTVGFYTRPVQIVSNPSGGLQGYWIEYFDTSSFPRNPLYSTNVCLFLANSSVITRCGHARNLYQYHDTVDNFVFISLYTFGRTVESLPLGVAFETSEKTYYNVVAGDDYQNLLDQGYNITGNIMGYTIDCKDANDEFVYGYLPNSDEFETTTVAGRYGFKVDTVKIVFPNTGSVEYEDNGYWVNTPKRPLVTFVRDTNVCVFREDLETVEKCGATKPLYLYINKNNGGYFMGIQSFGRNLTSTELVGVAFESTENTCGLKLYPMREFYGRPGYDVHAGEDYEEITAAGYNQTGNIMGYTVDCRDAIGGTVYGHLPNTIPEEPTKPSQSTSPSTPSTPPNRIGFYVERVFLVGPKSGIAGQQGYTLKIVSNLDPDTEVRGETNVCVITGNFLEIDTCADFTQTFNSYFDANDNYYYVARDTGRPVTKAGIIGTTFSSQENLCGLNVVPIRELYKEGVGYNAVAGDDYQSLLDEGYTLNGRIMGYTVDCNDAENDFVYGFLPTTTTITKSTVSTTTAPLNRFRLSPVYIIYPTEYYLQNQQEGSHVATRLGFIQHDGAGVTNACVFLSDPSIVALCGTTAPLYLYFDRVRVAYFVGTHSAGRNVSIERAATVFESAGNTCGFTLLPLREFYKDGTGYNLHAGDDYELLVNGGFVATGNIMGYAPDCRDNGFHEPDLKDYVSPTTTTRTPEPTDTSEPGKCNKNLVTLGTTNKNRTFEMNVQYGAVTTLENKRTMTVYCQGTPTYNIYMTWDGNAVSGRNTVSGLVELNLECSLVDGKDNWVTENSNHRVTFVRCDEAYNYLQQ


I am getting a PSSM that ends as shown in the image "Fault of generated pssm". To my understanding it should end at line 1133, so what do the lines after that mean? Even the values of K and lambda differ between the two cases. The whole PSSM is available at the link "Whole pssm matrix".
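
One way to see exactly where the matrix stops is to separate per-residue rows from everything else in the file. The following is a minimal sketch, not a definitive parser. It assumes the ASCII PSSM layout in which each matrix row starts with a position number and a one-letter residue followed by 20 integer scores, and in which a short K/Lambda summary may follow the last row. The function name `split_pssm` and the sample lines are hypothetical, and the numbers in the sample are invented.

```python
def split_pssm(lines):
    """Split ASCII PSSM lines into matrix rows and all other non-blank lines.

    Assumption: a matrix row is "<position> <residue> <20 integer scores> ...".
    Headers and a trailing K/Lambda summary do not match and go to `other`.
    """
    rows, other = [], []
    for line in lines:
        parts = line.split()
        is_row = (
            len(parts) >= 22
            and parts[0].isdigit()
            and len(parts[1]) == 1
            and parts[1].isalpha()
        )
        if is_row:
            scores = [int(x) for x in parts[2:22]]
            rows.append((int(parts[0]), parts[1], scores))
        elif parts:
            other.append(line.strip())
    return rows, other


# Made-up miniature input (two residues, invented numbers), for illustration only.
sample = [
    "Last position-specific scoring matrix computed",
    "        A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V",
    "    1 M " + " ".join(["-1"] * 20),
    "    2 F " + " ".join(["-2"] * 20),
    "",
    "                      K         Lambda",
    "Standard Ungapped    0.1000     0.3000",
]

rows, other = split_pssm(sample)
print(len(rows), "matrix rows; last position:", rows[-1][0])
for line in other:
    print("non-matrix line:", line)
```

If the number of matrix rows equals the sequence length, the extra lines after the last residue are probably summary text rather than part of the matrix, provided the file really follows the assumed layout.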

pssm psi-blast • 751 views

The sequence as posted above is not in FASTA format. You need a separate ID line like

>This_is_fasta

followed by the sequence.

>This_is_fasta
MFLKDLGCGTNTCQNLPFDVTPVYSVFDFKGRIKFETDLNAPEVQANLG
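
If you only have the bare sequence, a record like the one above can be produced with a few lines of Python. This is a minimal sketch. The helper name `to_fasta` and the 60-character line width are my own choices for illustration, not a requirement of any particular tool.

```python
def to_fasta(seq_id, sequence, width=60):
    """Return a FASTA record: a '>' ID line, then the sequence wrapped to `width`."""
    seq = "".join(sequence.split())  # drop spaces and line breaks inside the sequence
    body = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + seq_id + "\n" + "\n".join(body) + "\n"


record = to_fasta("This_is_fasta", "MFLKDLGCGTNTCQNLPFDVTPVYSVFDFKGRIKFETDLNAPEVQANLG")
print(record, end="")
```

Writing `record` to a file gives an input with the ID line and the sequence on separate lines.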


Thanks for pointing this out. The actual file is in FASTA format. However, on second thought, only the amino acid sequence matters here, so I have rephrased my question.


Please see "How to add images to a Biostars post" to add your images properly. You need the direct link to the image, not the link to the webpage that has the image embedded (which is what you have used here).


Thanks, I have edited the link.


Please follow all the steps in the how-to. Changing the link alone does not embed the image. You need to:

1. Make sure it is added as an image, not as a web link.
2. Use imgbb, not a file sharing service like Google Drive.

The second one, the Google Drive one, is a file link rather than an image link, which is why I used a Google Drive link. And I think the first link is okay.


The first is not OK. It still has the problem I mentioned. Also, if the file is plain text, please share it using GitHub Gist (or paste it directly into Biostars). If the file is neither an image nor representable as plain text, please let us know.