Question: How To Derive Pan Proteome
8.0 years ago by Nari, United States
Nari wrote:

How can I derive the pan-proteome of 28 microbial species?
I already have the core genes/orthologs shared by all the species.
I also have the unique genes of each species.
My only problem is getting the accessory genes/proteins.
I am not willing to run BLAST 784 times.
Help, please.

8.0 years ago by Neilfws, Sydney, Australia
Neilfws wrote:

First, pan-proteome is not a widely-used term, nor is it very well defined. I assume you mean it in the sense used in this article:

core (found in all) + accessory (found in 2 or more, but not all) + unique (found in 1)
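
A minimal sketch of that three-way split, assuming you already have ortholog groups and know which genomes each group occurs in. The function name `classify_groups` and the input layout (group ID mapped to a set of genome names) are hypothetical, and "accessory" is taken here to mean present in at least two genomes but not all of them:

```python
def classify_groups(groups, n_genomes):
    """Label each ortholog group as core, accessory, or unique.

    groups: dict mapping a group ID to the set of genomes that contain it
            (hypothetical input layout, not a standard file format).
    n_genomes: total number of genomes being compared.
    """
    labels = {}
    for group_id, genomes in groups.items():
        n = len(genomes)
        if n == 0:
            raise ValueError(f"group {group_id!r} has no genomes")
        if n == n_genomes:
            labels[group_id] = "core"        # found in every genome
        elif n == 1:
            labels[group_id] = "unique"      # found in exactly one genome
        else:
            labels[group_id] = "accessory"   # found in some, but not all
    return labels


# Toy example with 3 genomes (A, B, C)
toy_groups = {
    "g1": {"A", "B", "C"},
    "g2": {"A", "B"},
    "g3": {"C"},
}
toy_labels = classify_groups(toy_groups, n_genomes=3)
print(toy_labels)
```

With the core and unique sets already in hand, the accessory set is simply whatever is left over once those two are removed.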

I don't think the 3 sets taken together are interesting or useful; pan-proteome is just a term used to describe those 3 sets.

Second, why are you "not willing" to run 784 BLAST searches? Perhaps you mean "not able"? If you're able but not willing, perhaps consider another career ;)

Third, BLAST is not necessarily the best tool. You may find it more useful to cluster the protein sequences using e.g. CD-HIT. In fact, the output from that may come close to giving you the groupings that you need.
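
As a rough sketch of how the clustering output could be turned into groupings: CD-HIT writes a `.clstr` file alongside its output FASTA, with one `>Cluster N` header per cluster followed by its member lines. The code below assumes (this is not from the original post) that every sequence ID starts with a species tag followed by `|`, e.g. `sp01|protA`; the function name `clusters_to_genomes` is hypothetical. Keep in mind that identity-based clusters only approximate ortholog groups.

```python
import re
from collections import defaultdict

# Captures the sequence ID between ">" and "..." on a .clstr member line
MEMBER_RE = re.compile(r">(.+?)\.\.\.")


def clusters_to_genomes(clstr_lines, genome_of=lambda seq_id: seq_id.split("|")[0]):
    """Map each CD-HIT cluster name to the set of genomes it contains.

    genome_of extracts a genome/species tag from a sequence ID; the default
    assumes IDs like "sp01|protA" (an assumed naming scheme).
    """
    clusters = defaultdict(set)
    current = None
    for line in clstr_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            current = line[1:]  # e.g. "Cluster 0"
            continue
        match = MEMBER_RE.search(line)
        if match and current is not None:
            clusters[current].add(genome_of(match.group(1)))
    return dict(clusters)


# Toy .clstr content for 3 genomes (sp01, sp02, sp03)
example = [
    ">Cluster 0",
    "0\t310aa, >sp01|protA... *",
    "1\t308aa, >sp02|protA... at 98.05%",
    "2\t309aa, >sp03|protA... at 97.10%",
    ">Cluster 1",
    "0\t150aa, >sp01|protB... *",
    "1\t150aa, >sp03|protB... at 99.33%",
    ">Cluster 2",
    "0\t90aa, >sp02|protC... *",
]

genomes_per_cluster = clusters_to_genomes(example)
n_genomes = 3
# Accessory: present in more than one genome, but not in all of them
accessory = [c for c, g in genomes_per_cluster.items() if 1 < len(g) < n_genomes]
print(accessory)
```

For real data you would read the lines from the `.clstr` file and set `n_genomes` to 28. The identity threshold used for clustering strongly affects which proteins end up grouped together, so it is worth trying more than one.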


Thanks @Neilfws: I am new to the field (specifically bacterial genomics). I am able to run BLAST as many times as needed, but I wanted a smarter way; that's what I meant. I do want this as a career. :) I saw a paper in which they compared the COG distribution in the core and pan-genomes of a set of species, and I wanted to do the same for a different set of species. The pan-genome is also considered useful in vaccine design against pathogenic strains. [http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2752168/]

written 8.0 years ago by Nari
8.0 years ago by bob-lowlow
bob-lowlow wrote:

I don't really understand what you mean. How will 784 BLAST runs give you the pan-proteome? But if you want to save on BLAST runs, and assuming you have amino acid sequences, you can use this: http://www.ncbi.nlm.nih.gov/Structure/bwrpsb/bwrpsb.cgi (it's essentially batch blasting).


I assume they mean that all-versus-all BLAST for 28 species = 28 x 28 = 784.
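
To make the arithmetic explicit: 784 counts every ordered pair of genomes, including each genome against itself. A small sketch (variable names are illustrative only) that also shows the count if each distinct pair is compared just once:

```python
from itertools import combinations, product

n_species = 28
species = range(n_species)

# Every ordered pair, including self-vs-self: 28 * 28
all_ordered = len(list(product(species, repeat=2)))

# Each distinct pair once, no self-comparisons: 28 * 27 / 2
unordered_distinct = len(list(combinations(species, 2)))

print(all_ordered, unordered_distinct)  # 784 378
```

Whether the smaller number is enough depends on the method; approaches based on reciprocal best hits need both directions of each pair.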

written 8.0 years ago by Neilfws

Thanks for replying, @feargalr. I think Neilfws made it clear.

written 8.0 years ago by Nari

@feargalr: it took 35 minutes for one genome.

written 8.0 years ago by Nari