Question: Core proteome using local Blast
Asked 10 months ago by zarodkip:

Hello all,

I am trying to establish a core proteome of A. baumannii, i.e., the set of proteins that all of the strains have in common. I have multiple .fasta proteome files.

What would be the most appropriate way of going about this? Would this do the trick?

blastp -query query.fasta -db db -out output.txt -outfmt "6 qseqid qlen sseqid salltitles pident mismatch gapopen qstart qend qcovs sstart send evalue bitscore" -evalue 0.00001 -max_target_seqs 5 -num_threads 4

Also, is there a way to BLAST one proteome against all of my other proteomes at once, rather than one proteome against one proteome?

I should add that I am super new to local BLAST and to coding of any kind.
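For the one-versus-all question, one common approach is to pool every strain's proteome into a single FASTA file and build one BLAST database from it. The following is only a minimal sketch under stated assumptions: the `proteomes/` directory, the `all_strains.faa` and `all_strains_db` names, and the `strain__` header prefix are hypothetical choices for illustration, not something taken from the post. `makeblastdb` and `blastp` are the standard BLAST+ programs.

```shell
#!/usr/bin/env bash
# Sketch: pool all strains into one FASTA so a single database covers them.
# Directory and file names below are hypothetical.

# Write every sequence from DIR/*.fasta to OUT, prefixing each header with
# the strain name (taken from the file name) so hits can be traced back.
pool_proteomes() {
    local dir="$1" out="$2" f strain
    : > "$out"
    for f in "$dir"/*.fasta; do
        [ -e "$f" ] || continue
        strain="$(basename "$f" .fasta)"
        awk -v s="$strain" '/^>/ { print ">" s "__" substr($0, 2); next } { print }' "$f" >> "$out"
    done
}

# Runs only when the assumed inputs and BLAST+ are actually present.
if [ -d proteomes ] && [ -f query.fasta ] && command -v blastp >/dev/null 2>&1; then
    pool_proteomes proteomes all_strains.faa
    makeblastdb -in all_strains.faa -dbtype prot -out all_strains_db
    blastp -query query.fasta -db all_strains_db -out one_vs_all.tsv \
        -outfmt "6 qseqid sseqid pident qcovs evalue bitscore" \
        -evalue 1e-5 -num_threads 4
fi
```

Note that `-max_target_seqs 5` limits how many database sequences are kept per query, so with more than five strains it could hide hits from some of them; the sketch leaves it at the default for that reason.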


What format is your data in? Multi-FASTA protein sequence files, one per strain? If these are very similar strains (and your dataset is reasonably complete in each case), then you may be able to use CD-HIT to cluster the proteins into a non-redundant set; the clusters that contain a member from every strain would then be equivalent to the core proteome.

written 10 months ago by GenoMax

Yes, they are multi-FASTA sequence files, one per strain. I'll look into CD-HIT. Thank you.

written 10 months ago by zarodkip

I recommend hmmscan (or was it the other hmmsomething) from HMMER against Pfam. You get results that are far easier to interpret this way.

written 10 months ago by 5heikki
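A rough sketch of the HMMER route, for what it is worth: `hmmscan` is the HMMER program that searches protein sequences against a profile database such as Pfam, after the database has been prepared with `hmmpress`. The `proteomes/` directory, the local `Pfam-A.hmm` path, and the `shared_families` helper are assumptions made for illustration. Note that this reports Pfam families found in every strain, which is a domain-level view rather than a list of shared proteins.

```shell
#!/usr/bin/env bash
# Sketch: Pfam families present in every strain, from hmmscan --tblout files.
# Paths and the helper name are hypothetical.

# usage: shared_families strain1.tbl strain2.tbl ...
# Column 1 of an hmmscan --tblout file is the matched profile (family) name.
shared_families() {
    awk '
        FNR == 1 { nfiles++ }                 # count input files
        /^#/ || NF == 0 { next }              # skip comments and blank lines
        !((FILENAME, $1) in seen) {           # count each family once per file
            seen[FILENAME, $1] = 1
            hits[$1]++
        }
        END { for (fam in hits) if (hits[fam] == nfiles) print fam }
    ' "$@" | sort
}

# Runs only when the assumed inputs and HMMER are actually present.
if [ -d proteomes ] && [ -f Pfam-A.hmm ] && command -v hmmscan >/dev/null 2>&1; then
    [ -f Pfam-A.hmm.h3m ] || hmmpress Pfam-A.hmm
    for f in proteomes/*.fasta; do
        [ -e "$f" ] || continue
        # --cut_ga applies Pfam's curated gathering thresholds
        hmmscan --cut_ga --cpu 4 --tblout "$(basename "$f" .fasta).tbl" \
            Pfam-A.hmm "$f" > /dev/null
    done
    shared_families ./*.tbl
fi
```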