Problem with generating PSSM
14 months ago
hasib1624 • 0

I am trying to generate PSSMs for some protein sequences using PSI-BLAST against nrdb. In some cases the output has an erroneous format, and the problem persists across multiple runs. For example, for this protein sequence:

MFLKDLGCGTNTCQNLPFDVTPVYSVFDFKGRIKFETDLNAPEVQANLGKVGYLVNEDIEGLIRYDEYYSEEDDLFAYTNGDYPNQFYACQLREHGNMRYKFVRTLGYAFTAITVPENDPIYTIIVYDDGLPEPSTTTLSPSQHVFQTNPIWTILKNSKNGTQGVGAVNLPWTPPVSASPIEPTNVCLLNGNSSVLSRCGEIDRMLQFFDDTHVYIGFKDSIQDASNTYIYSYIGYAFTSSENTCGIPLEPIRELYKYGVGFTSVAGEKYTELISNGYSLTRKVLGYTIDCASATNGELVVTIPDGLSTATPLSSLSTASTPTKTVGFYTRPVQIVSNPSGGLQGYWIEYFDTSSFPRNPLYSTNVCLFLANSSVITRCGHARNLYQYHDTVDNFVFISLYTFGRTVESLPLGVAFETSEKTYYNVVAGDDYQNLLDQGYNITGNIMGYTIDCKDANDEFVYGYLPNSDEFETTTVAGRYGFKVDTVKIVFPNTGSVEYEDNGYWVNTPKRPLVTFVRDTNVCVFREDLETVEKCGATKPLYLYINKNNGGYFMGIQSFGRNLTSTELVGVAFESTENTCGLKLYPMREFYGRPGYDVHAGEDYEEITAAGYNQTGNIMGYTVDCRDAIGGTVYGHLPNTIPEEPTKPSQSTSPSTPSTPPNRIGFYVERVFLVGPKSGIAGQQGYTLKIVSNLDPDTEVRGETNVCVITGNFLEIDTCADFTQTFNSYFDANDNYYYVARDTGRPVTKAGIIGTTFSSQENLCGLNVVPIRELYKEGVGYNAVAGDDYQSLLDEGYTLNGRIMGYTVDCNDAENDFVYGFLPTTTTITKSTVSTTTAPLNRFRLSPVYIIYPTEYYLQNQQEGSHVATRLGFIQHDGAGVTNACVFLSDPSIVALCGTTAPLYLYFDRVRVAYFVGTHSAGRNVSIERAATVFESAGNTCGFTLLPLREFYKDGTGYNLHAGDDYELLVNGGFVATGNIMGYAPDCRDNGFHEPDLKDYVSPTTTTRTPEPTDTSEPGKCNKNLVTLGTTNKNRTFEMNVQYGAVTTLENKRTMTVYCQGTPTYNIYMTWDGNAVSGRNTVSGLVELNLECSLVDGKDNWVTENSNHRVTFVRCDEAYNYLQQ

I am getting a PSSM with this type of ending (image: "Fault of generated pssm"). To my understanding, it should end at line 1133. What do the lines after that mean? Even the values of K and lambda differ between the two cases. The whole PSSM is available here: "Whole pssm matrix".
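One way to check where the matrix really ends is to separate residue rows from everything else in the file. The sketch below assumes the ASCII layout that `psiblast -out_ascii_pssm` typically writes: each residue row starts with a position number and a one-letter residue, followed by 20 integer scores, and any header or trailing statistics lines (such as a K/Lambda table) do not match that pattern. The function name `split_pssm` is hypothetical, and the layout is an assumption, since the output file itself is not shown in the post.

```python
def split_pssm(text):
    """Split ASCII PSSM text into residue rows and all other non-blank lines.

    Assumed row layout: <position> <residue> <20 integer scores> ...
    Anything that does not match is returned untouched in `other`.
    """
    rows, other = [], []
    for line in text.splitlines():
        parts = line.split()
        is_row = (
            len(parts) >= 22
            and parts[0].isdigit()
            and len(parts[1]) == 1
            and parts[1].isalpha()
        )
        if is_row:
            position = int(parts[0])
            residue = parts[1]
            scores = [int(x) for x in parts[2:22]]  # the 20 log-odds scores
            rows.append((position, residue, scores))
        elif line.strip():
            other.append(line.strip())
    return rows, other
```

If `len(rows)` equals the length of the query sequence, the matrix itself is complete, and the lines collected in `other` after the last row can be inspected separately to see whether they are a statistics footer rather than malformed matrix rows.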

pssm psi-blast • 490 views

The sequence posted above is not in FASTA format. You need a separate ID line like

> This_is_fasta

followed by sequence.

> This_is_fasta
MFLKDLGCGTNTCQNLPFDVTPVYSVFDFKGRIKFETDLNAPEVQANLG
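As a minimal sketch of that layout, the helpers below wrap a bare sequence in a FASTA record and do a rough check that a text starts with an ID line. The names `to_fasta` and `looks_like_fasta` and the 60-character line width are illustrative assumptions, not part of any BLAST tool.

```python
def to_fasta(seq_id, sequence, width=60):
    """Return a FASTA record: one '>' ID line, then the sequence wrapped to `width`."""
    seq = "".join(sequence.split())  # drop any whitespace or line breaks
    body = "\n".join(seq[i:i + width] for i in range(0, len(seq), width))
    return f">{seq_id}\n{body}\n"


def looks_like_fasta(text):
    """Rough check: first non-blank line is an ID line and some sequence follows."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        return False
    return any(not ln.startswith(">") for ln in lines[1:])
```

For example, `to_fasta("This_is_fasta", "MFLKDLGCGTNTCQNLPF")` yields a two-line record that passes `looks_like_fasta`, while the bare sequence on its own does not.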

Thanks for pointing this out. The actual file is in FASTA format; on second thought, since only the amino acid sequence matters here, I have rephrased my question.


Please see "How to add images to a Biostars post" to add your images properly. You need the direct link to the image, not the link to the webpage that has the image embedded (which is what you have used here).


Thanks, I have edited the link.


Please follow all the steps in the how-to. Changing the link alone does not embed the image; you need to:

  1. Make sure it is added as an image, not as a web link.
  2. Use imgbb, not a file sharing service like Google Drive.

The second one, the Google Drive one, is not an image link; it is a file link, which is why I used Google Drive for it. And I think the first link is okay.


The first is not OK. It still has the problem I mentioned. Also, if the file is plain text, please share it using a GitHub Gist (or paste it directly into Biostars). If the file is neither an image nor representable as plain text, please let us know.

