B2G4Pipe (Command-Line Blast2Go) Question About Output / Throughput
10.7 years ago
JacobS ▴ 980

I am new to Blast2GO and am using b2g4pipe, but I am puzzled by its output.

Background info: I have lots of XML-formatted BLASTX output, and now all I want to do is associate my BLASTX results with GO-terms when possible. So, I ran b2g4pipe using the following command:

java -Xmx50000m -cp *:ext/*: es.blast2go.prog.B2GAnnotPipe -in $1 -prop b2gPipe.properties -annot -v

This is working well (albeit slowly), but produces confusing results. For a single query, there are between 1 and 10 entries in the output, each with a different GO term. For example:

QUERY_ID_0001      GO:0009060      cytochrome oxidase subunit partial
QUERY_ID_0001      GO:0005743      cytochrome oxidase subunit partial
QUERY_ID_0001      GO:0070469      cytochrome oxidase subunit partial
QUERY_ID_0001      GO:0009055      cytochrome oxidase subunit partial
QUERY_ID_0001      GO:0020037      cytochrome oxidase subunit partial
QUERY_ID_0001      GO:0016021      cytochrome oxidase subunit partial
QUERY_ID_0001      GO:0022900      cytochrome oxidase subunit partial
QUERY_ID_0001      GO:0004129      cytochrome oxidase subunit partial
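
Because each output line pairs one query ID with one GO term, the terms belonging to a query can be collected by grouping on the first column. A minimal Python sketch, assuming the three whitespace-separated columns shown above (query ID, GO ID, description); the function name `group_go_terms` is hypothetical and not part of b2g4pipe:

```python
from collections import defaultdict


def group_go_terms(lines):
    """Map each query ID to the distinct GO IDs listed for it, in input order."""
    terms_by_query = defaultdict(list)
    for line in lines:
        # Split into at most three fields so the free-text description,
        # which itself contains spaces, stays in one piece.
        fields = line.strip().split(None, 2)
        if len(fields) < 2 or not fields[1].startswith("GO:"):
            continue  # skip blank or malformed lines
        query_id, go_id = fields[0], fields[1]
        if go_id not in terms_by_query[query_id]:
            terms_by_query[query_id].append(go_id)
    return dict(terms_by_query)


# Usage with three of the lines from the example above.
sample = [
    "QUERY_ID_0001      GO:0009060      cytochrome oxidase subunit partial",
    "QUERY_ID_0001      GO:0005743      cytochrome oxidase subunit partial",
    "QUERY_ID_0001      GO:0070469      cytochrome oxidase subunit partial",
]
for query_id, go_ids in group_go_terms(sample).items():
    print(query_id, len(go_ids), ",".join(go_ids))
```

For a real run, the list can be replaced by an open file handle on the annotation output, since the function only iterates over lines.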

1) From looking through the BLASTX output, it seems that QUERY_ID_0001 only has a single HSP, so why are there so many different GO terms for this query? Since the terms are all different, are these simply different hierarchical GO-term categorizations?

2) If a particular query has several HSPs (say 5), will b2g4pipe generate a new set of GO-terms for each of the HSPs, or will it simply take the best HSP and analyse that?

3) I am using the public MySQL GO database for querying my BLAST results, and this seems to be going slowly. Will a local installation of the GO MySQL database speed up this process? I have several hundred million queries, and want a very high-throughput method of annotating my BLASTX queries. Will increasing the memory available to Java to something like 100G help very much?
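
For questions 1 and 2, the number of HSPs per query can be checked directly in the BLAST XML rather than by eye. A rough Python sketch, assuming the standard NCBI BLAST XML element names (`Iteration`, `Iteration_query-def`, `Hit`, `Hsp`); `count_hsps_per_query` is a hypothetical helper, and it says nothing about how b2g4pipe itself selects HSPs:

```python
import io
import xml.etree.ElementTree as ET


def count_hsps_per_query(xml_source):
    """Return {query definition: total HSP count} for a BLAST XML file.

    xml_source may be a filename or a binary file-like object.
    """
    counts = {}
    # iterparse streams the file, so very large outputs need not fit in memory.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == "Iteration":
            query = elem.findtext("Iteration_query-def")
            hsps = elem.findall("./Iteration_hits/Hit/Hit_hsps/Hsp")
            counts[query] = len(hsps)
            elem.clear()  # drop the finished iteration's children
    return counts


# Usage with a tiny hand-written document (not real BLAST output).
sample_xml = b"""<BlastOutput><BlastOutput_iterations>
<Iteration>
  <Iteration_query-def>QUERY_ID_0001</Iteration_query-def>
  <Iteration_hits><Hit><Hit_hsps><Hsp/></Hit_hsps></Hit></Iteration_hits>
</Iteration>
</BlastOutput_iterations></BlastOutput>"""
print(count_hsps_per_query(io.BytesIO(sample_xml)))
```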

Thanks for any answers / suggestions!!

EDIT: No ideas? All suggestions are appreciated!

annotation go • 5.2k views

You might find asking this on the Blast2GO Google Group more helpful, since the Blast2GO developers interact with users there.


I'm running into the same problem as in #3: very slow mapping of BLASTX results with b2g4pipe. Did you find that a local installation increased the speed enough to make this a feasible approach? Does increasing the memory available to Java make a difference? Thanks!

10.7 years ago
Björn ▴ 670

I will give it a try, but I'm not an expert in b2g.

1) I would assume that is correct: every time you map against GO you will get multiple hits, representing the GO tree.

2) b2g offers many parameters, for example HSP length, e-value and so on, so I would assume it will take more than just the best HSP.

3) Yes. With that many queries you should use a local installation. AFAIK, Java is just the shell that controls BLAST and the MySQL queries, so 100 GB for Java makes no sense, IMHO. Try tuning MySQL a little (more cache, etc.) and run BLAST with multiple processors; that will help much more. You will find many more tuning parameters in the b2g config file.

