Question: Compare two proteomes
18 months ago, sontiroy (0) wrote:

I want to do a whole-proteome comparison of two organisms. The genome of one organism is around 2.5 Gb and the other is 8 Mb. Please suggest some tools; a GUI-based tool that runs on Windows would be preferable.

alignment genome • 578 views
written 18 months ago by sontiroy (0)

You'll have to be more specific. What kind of comparison do you have in mind, and for what purpose? You want to compare proteomes but you mention genomes. Does this mean that your genomes are not annotated, that you want to compare the protein-coding parts of the genomes, or something else?

written 18 months ago by Jean-Karim Heriche (21k)

The genomes of both organisms have been sequenced and their proteins annotated. I want to do a comparison to find similarities between the organisms at the protein level.

modified 18 months ago • written 18 months ago by sontiroy (0)

Hopefully the protein complements are of a comparable size. That is a huge discrepancy between the genome sizes.

"GUI based on windows" may pretty much eliminate a reasonable chance of finding "a" program that is freely available.

written 18 months ago by genomax (74k)

Linux-based software will also be fine, but I don't have much experience with the command line, so I thought it would be time-consuming. Can you suggest some Linux-based software?

written 18 months ago by sontiroy (0)

Have you considered using blastp, blat, or DIAMOND? They are all command-line programs, though.

written 18 months ago by genomax (74k)

I want to compare 3700 protein sequences to approx. 27000 protein sequences. These all have cut off which is way below.

written 18 months ago by sontiroy (0)

"These all have cut off which is way below."

Not sure what you mean by that?

written 18 months ago by genomax (74k)

OK, tell me how to compare the proteomes of two organisms, one with 3700 protein sequences and the other with 27000 protein sequences, and find out which proteins are around 90% identical or above.

modified 18 months ago • written 18 months ago by sontiroy (0)

blastp or DIAMOND, as genomax wrote, is what you need. You won't get a better or easier solution. Choose the tabular output format and apply Excel filters for your thresholds.
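If a spreadsheet becomes unwieldy, the same threshold filter can be sketched in a few lines of Python. This is only an illustration: it assumes the standard 12-column tabular layout (blastp -outfmt 6, which DIAMOND's default tabular output also follows), and the function name filter_hits and the example rows are made up.

```python
import csv
import io

# Standard 12 columns of BLAST tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def filter_hits(handle, min_identity=90.0):
    """Yield hits (as dicts) whose percent identity is at least min_identity."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        if float(hit["pident"]) >= min_identity:
            yield hit

# Made-up rows standing in for a real results file.
example = io.StringIO(
    "protA\tprotX\t95.20\t250\t12\t0\t1\t250\t1\t250\t1e-150\t480\n"
    "protA\tprotY\t62.10\t240\t91\t2\t5\t244\t3\t240\t2e-80\t260\n"
    "protB\tprotZ\t90.00\t100\t10\t0\t1\t100\t1\t100\t3e-50\t190\n"
)
kept = [(h["qseqid"], h["sseqid"]) for h in filter_hits(example)]
print(kept)  # [('protA', 'protX'), ('protB', 'protZ')]
```

For a real file, pass open("hits.tsv") (a hypothetical file name) instead of the example handle. Note that pident is measured over the aligned region only, so it may be worth also checking the length column before calling two proteins 90% identical.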

modified 18 months ago • written 18 months ago by Carambakaracho (1.9k)

You will search them against each other (using one set as the query and the other as the database). blat may be the easiest (sounds like you don't want/need evolutionarily distant entities) with the -minIdentity=90 option. Use a tabular format so you can parse the results if needed.

Something similar can also be done using blastp with -outfmt 6 (if you want more distant relationships), but it will be more involved: you would filter on the pident (percent identity) column of the output.
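For someone new to the command line, the two routes could look roughly like the sketch below, which only builds and prints the commands. The file names are placeholders, and using the smaller proteome as the query is an assumption rather than a requirement. The blat flags -prot (protein query and database) and -out=blast8 (tabular output) are added here on the assumption that the inputs are protein FASTA files.

```python
import shutil
import subprocess

def blastp_commands(query_faa, subject_faa, out_tsv, db_name="subject_db"):
    """Return two argv lists: build a protein database, then search it."""
    return [
        ["makeblastdb", "-in", subject_faa, "-dbtype", "prot", "-out", db_name],
        ["blastp", "-query", query_faa, "-db", db_name,
         "-outfmt", "6", "-out", out_tsv],
    ]

def blat_command(query_faa, subject_faa, out_tsv, min_identity=90):
    """Return the argv list for a protein-vs-protein blat search."""
    return ["blat", "-prot", f"-minIdentity={min_identity}", "-out=blast8",
            subject_faa, query_faa, out_tsv]

def run_all(commands):
    """Run each command in order, stopping if a tool is missing or fails."""
    for argv in commands:
        if shutil.which(argv[0]) is None:
            raise FileNotFoundError(f"{argv[0]} not found on PATH")
        subprocess.run(argv, check=True)

# Placeholder file names: smaller proteome as query, larger as database.
cmds = blastp_commands("small_proteome.faa", "large_proteome.faa", "hits.tsv")
cmds.append(blat_command("small_proteome.faa", "large_proteome.faa", "blat_hits.tsv"))
for argv in cmds:
    print(" ".join(argv))
```

The printed lines can be pasted into a terminal as they are, or run_all(cmds) can be called once the tools are installed.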

modified 18 months ago • written 18 months ago by genomax (74k)