I am trying to find orthologs among my 6000 functionally annotated protein sequences using the eggNOG database (http://eggnogdb.embl.de/#/app/seqscan). I followed the step-by-step protocol, but when I got to the concatenation step I got the error /bin/cat: Argument list too long. How can I concatenate these files before I run the analysis?
If you're listing files using ls and a wildcard, note that there is a limit on how many arguments (strictly, on the total length of the argument list) can be supplied to a command. Instead, use find to locate all the files you want and xargs to supply them to your command in batches. For your specific example:
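A minimal sketch of the find/xargs pattern, run on a throwaway directory so that it is self-contained. The directory, the file names and the .fa extension are assumptions for illustration; substitute the location and extension of your own eggNOG files. Restricting find to regular files with -type f also keeps directories out of the list handed to cat.

```shell
# Build a small throwaway example (hypothetical names, not the real eggNOG layout).
workdir=$(mktemp -d)
mkdir -p "$workdir/parts/subdir"
for i in 1 2 3; do
    printf '>seq%s\nMKV\n' "$i" > "$workdir/parts/part$i.fa"
done

# find prints the matching file names itself, so the shell never has to expand
# a huge wildcard; xargs then runs cat in batches that fit the argument limit.
# -type f skips directories; -print0 / -0 keep names with spaces intact.
# The output is written outside the searched directory so it is not re-read.
find "$workdir/parts" -type f -name '*.fa' -print0 | xargs -0 cat > "$workdir/combined.fa"

# Count the FASTA headers in the concatenated file.
grep -c '^>' "$workdir/combined.fa"
```

The same pipeline works for any command that accepts a list of files, not just cat.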
Also, I recommend picking up a copy of Bioinformatics Data Skills by Vince Buffalo. That is where I learnt about find and xargs, and it has lots of other useful information.
I tried your command, but it gave me a permission denied error, so I reran it as superuser. It still tells me bactNOG: is a directory.
I have another problem that I have been trying to figure out all evening. I am trying to identify orthologs between four strains (a, b, c, d) using their protein sequences. I looked around the internet and, out of the many options, went with proteinortho (https://www.bioinf.uni-leipzig.de/Software/proteinortho/manual.html). I have done everything, but when I try to run the .pl command I get an error saying it's not there, even though I can see it. I have tried using the full path, but still nothing happens. What am I doing wrong?
So I downloaded the tar file, untarred it, and into the folder containing the Perl scripts I copied a folder called samuel containing the FASTA files for analysis, which are a, b, c and d. Then, in that same folder, I call the script from the terminal with this command: $ proteinortho5.pl -project= samuel/*.faa
It returns an error: command not found
I tried writing the whole path, but it is still not found. Even when I press the Tab key to auto-complete proteinortho5.pl, it doesn't complete, so I am not sure what I am doing wrong.
Hi, so the process worked and it produced a graph and a table called myproject.proteinortho. I now want to produce another file of orthologous proteins separated for each individual strain (a, b, c, d). According to the manual, I thought grab_proteins.pl could do this.
So the command was: perl grab_proteins.pl samuel/*.faa myproject.proteinortho
It gives me an error:
fasta2hash(): could not open file: .//L228plasmid.faa
But nothing has changed from the files I used previously.
What am I doing wrong here?
NB: the command given in the manual is:
perl grab_proteins.pl [options] proteinortho_output_table
options: -path=path to protein files
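Judging by the path in the error message (.//L228plasmid.faa), the script appears to look for each FASTA file by name under a base directory that defaults to the current one, which would miss files that live in samuel/. That reading is an assumption drawn from the message, not from the script's source. The sketch below only imitates such a lookup in plain shell (base_dir and the empty stand-in file are hypothetical) to show why the base directory matters; the -path= option in the manual excerpt above looks like the way to set it.

```shell
# Throwaway folder mirroring the layout described above (stand-in file only).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir samuel
: > samuel/L228plasmid.faa   # empty stand-in for the real FASTA file

# Imitate a lookup of "<base_dir>/<file name>" for two base directories.
for base_dir in ./ samuel/; do
    if [ -r "$base_dir/L228plasmid.faa" ]; then
        echo "$base_dir/L228plasmid.faa: readable"
    else
        echo "$base_dir/L228plasmid.faa: could not open file"
    fi
done

# Untested guess at the corresponding call, built from the manual's usage line:
#   perl grab_proteins.pl -path=samuel/ myproject.proteinortho
```

Only the samuel/ base directory finds the file, which matches the idea that the FASTA folder has to be passed as an option rather than as a list of files.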
The exact command (as you typed it in your terminal, not what's in a tutorial) would help.