Sorry for the long question, but I am facing this issue because of my hardware limitations (I am using a 32-bit Windows 7 machine with 4 GB of RAM).
I have an arbitrary number of .fa files (with arbitrary names) in a folder named 'seq', each containing only a single FASTA protein sequence, for example:
NP_4500.1.fa NP_4568.1.fa NP_45981.3.fa XM_we679.fa 36498746.fa
In another folder named 'db', I made a database split into 200 segments (because of my computational limitations), which are named as:
hg.part-001.db hg.part-002.db hg.part-003.db .. .. hg.part-200.db
Now I want to run usearch for each sequence against every database segment and generate one result file per segment, e.g. for one .fa file (NP_4500.1.fa):
usearch -ublast ./seq/NP_4500.1.fa -db ./db/hg.part-001.db -evalue 1e-10 -accel 0.5 -blast6out NP_4500.1_part-001.out
usearch -ublast ./seq/NP_4500.1.fa -db ./db/hg.part-002.db -evalue 1e-10 -accel 0.5 -blast6out NP_4500.1_part-002.out
usearch -ublast ./seq/NP_4500.1.fa -db ./db/hg.part-003.db -evalue 1e-10 -accel 0.5 -blast6out NP_4500.1_part-003.out
...
...
usearch -ublast ./seq/NP_4500.1.fa -db hg.part-00200.db -evalue 1e-10 -accel 0.5 -blast6out NP_4500.1_part-00200.out
After that, I want to merge the results in a single file as:
join NP_4500.1_part-001.out NP_4500.1_part-002.out .. NP_4500.1_part-00200.out > NP_4500.1.out
and similarly for the next sequence.
Now, I can run a cmd script for each FASTA file as:
for %%F in ("*.fa") do usearch -ublast ./seq/%%F .......
But my question is: how can I combine this loop with a second loop over the database segments, and merge the .out files into a single result for one sequence before proceeding to the next?
I can use a cmd, Perl or Python script. Thanks for your consideration.
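
For concreteness, the workflow described above could be sketched in Python roughly as follows. This is only a sketch under several assumptions: usearch is on the PATH, "merging" means plain concatenation of the per-segment blast6out files, a segment with no hits might leave no output file, and the function names (build_command, merge_files, search_all) and the output folder are made up for illustration.

```python
import glob
import os
import shutil
import subprocess


def build_command(fasta_path, db_path, out_path):
    """Return the usearch command from the question as an argument list."""
    return [
        "usearch", "-ublast", fasta_path,
        "-db", db_path,
        "-evalue", "1e-10",
        "-accel", "0.5",
        "-blast6out", out_path,
    ]


def merge_files(part_paths, merged_path):
    """Concatenate the per-segment result files, in order, into one file."""
    with open(merged_path, "wb") as merged:
        for part_path in part_paths:
            with open(part_path, "rb") as part:
                shutil.copyfileobj(part, merged)


def search_all(seq_dir, db_dir, out_dir, run=subprocess.check_call):
    """Search every .fa file against every .db segment, merging per sequence."""
    os.makedirs(out_dir, exist_ok=True)
    # Zero-padded names (part-001 ... part-200) sort in numeric order.
    db_parts = sorted(glob.glob(os.path.join(db_dir, "*.db")))
    for fasta_path in sorted(glob.glob(os.path.join(seq_dir, "*.fa"))):
        # NP_4500.1.fa -> NP_4500.1
        name = os.path.splitext(os.path.basename(fasta_path))[0]
        part_outs = []
        for db_path in db_parts:
            # hg.part-001.db -> part-001
            stem = os.path.splitext(os.path.basename(db_path))[0]
            segment = stem.split(".")[-1]
            out_path = os.path.join(out_dir, "%s_%s.out" % (name, segment))
            run(build_command(fasta_path, db_path, out_path))
            # Assumption: a segment without hits may not produce a file.
            if os.path.exists(out_path):
                part_outs.append(out_path)
        # Merge this sequence's results before moving on to the next one.
        merge_files(part_outs, os.path.join(out_dir, name + ".out"))
        for out_path in part_outs:
            os.remove(out_path)
```

Calling search_all("seq", "db", "results") from the folder that contains 'seq' and 'db' would then leave one merged file per sequence (e.g. results/NP_4500.1.out). Only one segment is searched at a time, and the per-segment files are deleted after each merge, so the 200 intermediate files never pile up across sequences.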