Question

Error when concatenating eggNOG HMMER files

8.6 years ago

mwanerhi erfgtr ▴ 30

I am trying to find orthologs among my 6000 functionally annotated protein sequences using the eggNOG database (http://eggnogdb.embl.de/#/app/seqscan). I followed the step-by-step protocol, but at the concatenation step I got the error /bin/cat: Argument list too long. How can I concatenate these files before I run the analysis?

I am running as the superuser, so this is not a permission problem.

next-gen • 3.3k views

ADD COMMENT • link updated 19 months ago by Ram 43k • written 8.6 years ago by mwanerhi erfgtr ▴ 30

The exact command (as you typed it in your terminal, not what's in a tutorial) would help.

ADD REPLY • link 8.6 years ago by Ram 43k

score 0 · Answer 1 · 2015-09-02

8.6 years ago

mwanerhi erfgtr ▴ 30

The command I typed is: cat bactNOG_hmm/*.hmm > bactDB.hmmer

ADD COMMENT • link 8.6 years ago by mwanerhi erfgtr ▴ 30

Ram · Answer 2 · 2015-09-02

8.6 years ago

James Ashmore ★ 3.4k

When the shell expands a wildcard such as *.hmm, every matching filename is passed to the command as a separate argument, and there is a limit on the total size of the argument list a single command can receive. With thousands of files, that limit is exceeded. Instead, use find to list the files you want and xargs to supply them to your command in batches that fit. For your specific example:

find bactNOG_hmm -name "*.hmm" | xargs cat > bactDB.hmmer
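A slightly more defensive variant of the same idea, as a sketch. It assumes a find, sort and xargs that support null-separated output (-print0, -z and -0, available in the GNU tools shipped with Ubuntu), and it builds a small throwaway directory (demo_hmm, a made-up name) so it can be run without the real bactNOG_hmm data. The null-separated form keeps filenames containing spaces intact, sort makes the concatenation order reproducible, and getconf ARG_MAX prints the limit that the original cat command ran into.

```shell
# Throwaway stand-in for bactNOG_hmm (hypothetical name and contents).
workdir=$(mktemp -d)
mkdir "$workdir/demo_hmm"
printf 'HMM one\n' > "$workdir/demo_hmm/a.hmm"
printf 'HMM two\n' > "$workdir/demo_hmm/b c.hmm"   # space in the name on purpose

# Upper bound, in bytes, on the argument list passed to one command.
getconf ARG_MAX

# find emits null-terminated paths, sort -z orders them, and
# xargs -0 runs cat in batches that stay under the limit.
# The output file is written outside the searched directory.
find "$workdir/demo_hmm" -type f -name '*.hmm' -print0 \
  | sort -z \
  | xargs -0 cat > "$workdir/demoDB.hmmer"

# Number of lines that ended up in the concatenated file.
wc -l < "$workdir/demoDB.hmmer"
```

For the data in this thread, the same pipeline would read: find bactNOG_hmm -type f -name '*.hmm' -print0 | sort -z | xargs -0 cat > bactDB.hmmer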

Also, I recommend you pick up a copy of Bioinformatics Data Skills by Vince Buffalo. This is where I learnt about find and xargs, and it has lots of other useful information.

ADD COMMENT • link 8.6 years ago by James Ashmore ★ 3.4k

I tried your command, but it gave me a "permission denied" error, so I reran it as superuser. Now it tells me: bactNOG: is a directory

What am I doing wrong?

Thanks for the book recommendation, I will get it.

ADD REPLY • link 8.6 years ago by mwanerhi erfgtr ▴ 30

Either run the command from the directory above the bactNOG_hmm directory, or provide the full directory path in place of just bactNOG_hmm.

ADD REPLY • link 8.6 years ago by James Ashmore ★ 3.4k

Thanks, that seems to have worked. I am quite new to the Ubuntu environment.

ADD REPLY • link 8.6 years ago by mwanerhi erfgtr ▴ 30

I have another problem that I have been trying to figure out all evening. I am trying to identify orthologs between four strains (a, b, c, d) using their protein sequences. I looked around the internet and, out of the many options, went with Proteinortho (https://www.bioinf.uni-leipzig.de/Software/proteinortho/manual.html). I have done everything, but when I try to run the .pl script I get an error saying it is not there, even though I can see it. I have tried using the full path, but that did not help either. What am I doing wrong?

Thanks

ADD REPLY • link 8.6 years ago by mwanerhi erfgtr ▴ 30

Can you show me which commands you used to run the perl script, and is the perl script in the same directory where you run the command?

ADD REPLY • link 8.6 years ago by James Ashmore ★ 3.4k

I downloaded the tar file and untarred it. Into the folder containing the Perl scripts I copied a folder, samuel, which contains the FASTA files for analysis (a, b, c and d). Then, in that same folder, I call the script from the terminal with this command: $ proteinortho5.pl -project= samuel/*.faa

It returns an error: command not found

I tried writing the whole path, but it is still not found. Even when I use the Tab key to auto-complete proteinortho5.pl, nothing happens, so I am not sure what I am doing wrong.

Thanks for trying to help me

ADD REPLY • link 8.6 years ago by mwanerhi erfgtr ▴ 30

Okay, first you need to make sure you have permission to execute the proteinortho5.pl file. You can set this with:

chmod u+x proteinortho5.pl

After that try running the script using:

perl proteinortho5.pl
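A likely reason for the original "command not found" (an inference from the symptoms, not something confirmed in this thread): when a command is typed as a bare name, the shell searches only the directories listed in PATH, and the current directory is normally not among them. That would also be consistent with Tab completion finding nothing. The sketch below uses a tiny stand-in script (demo.pl, a hypothetical file) in a temporary directory to show which forms work without changing PATH. It assumes perl is installed.

```shell
# Stand-in for proteinortho5.pl (hypothetical file, for illustration only).
workdir=$(mktemp -d)
cd "$workdir"
printf '#!/usr/bin/env perl\nprint "script ran\\n";\n' > demo.pl

# 1. Bare name: looked up in PATH only, so the shell cannot find it.
demo.pl 2>/dev/null || echo "bare name: not found"

# 2. Through the interpreter: works even without the execute bit.
perl demo.pl

# 3. As an executable: needs the execute bit and an explicit ./ prefix.
chmod u+x demo.pl
./demo.pl
```

Applied to the thread, the third form would be ./proteinortho5.pl followed by the same arguments as before.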

ADD REPLY • link 8.6 years ago by James Ashmore ★ 3.4k

You are an absolute genius! It worked after so many attempts, although I had to change some names.

ADD REPLY • link 8.6 years ago by mwanerhi erfgtr ▴ 30

Thanks very much.

ADD REPLY • link 8.6 years ago by mwanerhi erfgtr ▴ 30

Hi, the run worked and produced a graph and a table called myproject.proteinortho. I now want to produce another file of the orthologous proteins, separated by individual strain (a, b, c, d). According to the manual, I thought grab_proteins.pl could do this.

So the command was perl grab_proteins.pl samuel/*.faa myproject.proteinortho

It gives me an error:

fasta2hash(): could not open file: .//L228plasmid.faa

but nothing has changed from the previous files used

What am I doing wrong here?

NB: the usage given in the manual is:

perl grab_proteins.pl [options] proteinortho_output_table
options: -path=path to protein files

ADD REPLY • link updated 19 months ago by Ram 43k • written 8.6 years ago by mwanerhi erfgtr ▴ 30