Running hmmsearch against all fasta files in a directory
9 months ago
garfield320 ▴ 20

I'm trying to run hmmsearch against a set of fasta files stored in a directory and save the results for each file individually. I came up with the following Python script:

# extract fasta file names for input/output files
Filename = []
for item in os.listdir("mnt/c/Users/username/Desktop/FASTA"):
  if ".fasta" in item:
     Filename.append(item)

__INPUT = Filename
__OUTPUT = mnt/c/Users/username/Desktop/hmmsearch/Filename.hmm

hmmsearch -o $__OUTPUT -T 50 hmmfile.hmm $__INPUT

However, I can't even get the Filename list populated, because I get a "No such file or directory" error. I did manage to run hmmsearch on an individual fasta file using the line below, so I know that the file path exists.

hmmsearch -o /mnt/c/Users/username/Desktop/hmmsearch/hmmoutput.txt -T 50 hmmfile.hmm /mnt/c/Users/username/Desktop/FASTA/file1.fasta

How can I get around this problem?


Are you running this from the root (/) directory? That is where I would expect /mnt/.... to be present. Otherwise, you appear to be missing a leading / before mnt/c/Users/username/Desktop/hmmsearch/Filename.hmm and in mnt/c/Users/username/Desktop/FASTA.
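To illustrate the point about the leading slash, here is a minimal sketch using the standard library. A path without a leading / is resolved against the current working directory. The working directory /home/user is a hypothetical value chosen only for the example.

```python
from pathlib import PurePosixPath

relative = PurePosixPath("mnt/c/Users/username/Desktop/FASTA")
absolute = PurePosixPath("/mnt/c/Users/username/Desktop/FASTA")

# Only the form with a leading "/" is anchored at the root directory.
print(relative.is_absolute())  # False
print(absolute.is_absolute())  # True

# Hypothetical working directory, used to show where the relative form points.
cwd = PurePosixPath("/home/user")
resolved = cwd / relative
print(resolved)  # /home/user/mnt/c/Users/username/Desktop/FASTA
```

Unless the script is started from /, the relative form points somewhere that most likely does not exist, which would explain the "No such file or directory" error.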


No, I'm not in the root directory. I added the missing / in the os.listdir call ("/mnt/c/Users/username/Desktop/FASTA"), and that fixed the problem with Filename not being populated. However, when I added / to the __OUTPUT path, I got a syntax error (invalid syntax), so I removed it again and left the line as it was. Now when I run the script, I get a syntax error on the hmmsearch line, saying that $__OUTPUT is invalid syntax.
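Both syntax errors probably have the same cause: the script mixes shell syntax into Python. A bare /mnt/... is not a string in Python, and $NAME is how a shell reads a variable, not how Python does. A minimal sketch, using the paths from the working command in the question, of how the same command could be assembled in Python:

```python
import shlex

# Paths must be quoted string literals; an unquoted /mnt/... is a syntax error.
output_path = "/mnt/c/Users/username/Desktop/hmmsearch/hmmoutput.txt"
input_path = "/mnt/c/Users/username/Desktop/FASTA/file1.fasta"

# Python variables are placed directly into an argument list, with no "$" prefix.
command = ["hmmsearch", "-o", output_path, "-T", "50", "hmmfile.hmm", input_path]

# shlex.join renders the list as the equivalent shell command line.
command_line = shlex.join(command)
print(command_line)
```

The printed line should match the command that already worked on the terminal. To run it from Python rather than print it, the list can be handed to subprocess.run.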


By using Filename everywhere, you are passing the entire list as input. You would want to use item instead, which holds one matching file name at a time. You will also need to do some additional work to remove the .fasta extension from the name before using it as the base name of the output file.
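As a small sketch of the "additional work" on the name, os.path.splitext from the standard library separates the extension from the rest. The file name file1.fasta comes from the question; the .hmm output suffix follows the original script.

```python
import os

item = "file1.fasta"  # example file name from the question

# splitext splits at the last dot: ("file1", ".fasta")
stem, extension = os.path.splitext(item)
output_name = stem + ".hmm"
print(stem)         # file1
print(extension)    # .fasta
print(output_name)  # file1.hmm

# endswith is stricter than the "in" test, which also matches names
# that merely contain ".fasta" somewhere in the middle.
print(".fasta" in "notes.fasta.bak")         # True
print("notes.fasta.bak".endswith(".fasta"))  # False
```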


Hmm, after reading your comment I tried using this for __INPUT instead, and I'm getting a syntax error at *.fasta. I thought * was supposed to match any characters or numbers, though?

__INPUT = mnt/c/Users/username/Desktop/FASTA/*.fasta
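The * wildcard is expanded by the shell, not by Python, so an unquoted pattern in a Python script is a syntax error. The standard library's glob module does the same kind of matching when given the pattern as a string. A minimal sketch, where list_fasta is a hypothetical helper name and the leading / is added to the path as discussed earlier in the thread:

```python
import glob
import os


def list_fasta(directory):
    """Return the sorted paths of all *.fasta files directly inside directory."""
    pattern = os.path.join(directory, "*.fasta")
    # glob.glob returns an empty list if nothing matches or the directory is missing.
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    for path in list_fasta("/mnt/c/Users/username/Desktop/FASTA"):
        print(path)
```

Unlike os.listdir, glob.glob returns each match with the directory already attached, so the results can be passed to hmmsearch without rebuilding the path.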


This is a horrible implementation but should give you an idea. I am simply reusing your code. You will need to adjust as necessary.

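import os

Filename = []
for item in os.listdir("/mnt/c/Users/username/Desktop/FASTA/"):
    if ".fasta" in item:
        Filename.append(item)
        __INPUT = item
        # split on "." and keep the first part as the base name
        name = __INPUT.split(".")
        __OUTPUT = str(name[0]) + ".hmm"
        # this only prints each command; it does not run hmmsearch,
        # and item is a bare file name without the directory
        print(f"hmmsearch -o {__OUTPUT} -T 50 hmmfile.hmm {__INPUT}")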
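Building on the loop above, one way to actually run the commands instead of printing them is subprocess.run. This is a sketch under stated assumptions, not a tested pipeline: build_commands and run_all are hypothetical helper names, the .txt result suffix is an arbitrary choice modelled on hmmoutput.txt from the question, and hmmfile.hmm is assumed to sit in the current working directory as in the working command.

```python
import os
import subprocess


def build_commands(fasta_dir, out_dir, hmm_file="hmmfile.hmm"):
    """Build one hmmsearch argument list per .fasta file in fasta_dir."""
    commands = []
    for item in sorted(os.listdir(fasta_dir)):
        if not item.endswith(".fasta"):
            continue
        stem = os.path.splitext(item)[0]
        # os.listdir returns bare names, so the directory is joined back on.
        input_path = os.path.join(fasta_dir, item)
        # Assumption: results go to <stem>.txt inside out_dir.
        output_path = os.path.join(out_dir, stem + ".txt")
        commands.append(
            ["hmmsearch", "-o", output_path, "-T", "50", hmm_file, input_path]
        )
    return commands


def run_all(commands):
    """Run each command in turn; check=True raises if hmmsearch exits non-zero."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    fasta_dir = "/mnt/c/Users/username/Desktop/FASTA"
    out_dir = "/mnt/c/Users/username/Desktop/hmmsearch"
    if os.path.isdir(fasta_dir):
        run_all(build_commands(fasta_dir, out_dir))
```

Keeping the command construction separate from the execution makes it possible to print and inspect the commands first, which is what the loop above does, before letting them run.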