Question: error when concatenating eggNOG hmmer files
mwanerhi erfgtr (30), United States, wrote 4.1 years ago:

I am trying to find orthologs among my 6000 functionally annotated protein sequences using the eggNOG database (http://eggnogdb.embl.de/#/app/seqscan). I followed the step-by-step protocol, but when I got to the concatenation step I got the error /bin/cat: Argument list too long. How can I concatenate these files before I run the analysis?

I am the superuser, so it is not a permission problem.

next-gen • 1.4k views
modified 4.1 years ago by James Ashmore (2.7k) • written 4.1 years ago by mwanerhi erfgtr (30)

The exact command (as you typed it in your terminal, not what's in a tutorial) would help.

written 4.1 years ago by RamRS (24k)
mwanerhi erfgtr (30), United States, wrote 4.1 years ago:

The command I typed is: cat bactNOG_hmm/*.hmm>bactDB.hmmer

written 4.1 years ago by mwanerhi erfgtr (30)
James Ashmore (2.7k), UK/Edinburgh/MRC Centre for Regenerative Medicine, wrote 4.1 years ago:

If you're passing files to a command using a wildcard, it's important to note that the shell expands the wildcard into one argument per file, and there is a limit on the total length of the argument list that can be supplied to a command. Instead, use find to locate all the files you want and xargs to supply them to your command in batches. For your specific example:

find bactNOG_hmm -name "*.hmm" | xargs cat > bactDB.hmmer

Also, I recommend you pick up a copy of Bioinformatics Data Skills by Vince Buffalo. This is where I learnt about find and xargs, and it has lots of other useful information.

modified 4.1 years ago • written 4.1 years ago by James Ashmore (2.7k)
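
For readers who hit the same "Argument list too long" error, here is a slightly more defensive sketch of the find and xargs approach. It assumes GNU findutils and coreutils (as on Ubuntu); the function name concat_hmms and the demo file names are made up for illustration. Null-separated names keep file names with spaces intact, and sorting makes the concatenation order reproducible.

```shell
#!/usr/bin/env bash
set -euo pipefail

# concat_hmms DIR OUT: concatenate every *.hmm file under DIR into OUT.
# (concat_hmms is a hypothetical helper name, not part of eggNOG or HMMER.)
concat_hmms() {
    local src_dir=$1 out_file=$2
    # -print0 / -z / -0 separate names with NUL bytes, so spaces in file
    # names are safe; sort -z makes the concatenation order reproducible.
    find "$src_dir" -type f -name '*.hmm' -print0 \
        | sort -z \
        | xargs -0 cat > "$out_file"
}

# Small self-contained demo in a throwaway directory.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/bactNOG_hmm"
printf 'HMM one\n' > "$demo_dir/bactNOG_hmm/a.hmm"
printf 'HMM two\n' > "$demo_dir/bactNOG_hmm/b c.hmm"

# Write the output outside the searched directory.
concat_hmms "$demo_dir/bactNOG_hmm" "$demo_dir/bactDB.hmmer"
cat "$demo_dir/bactDB.hmmer"
```

On a real download the call would be concat_hmms bactNOG_hmm bactDB.hmmer, run from the directory that contains bactNOG_hmm. Running getconf ARG_MAX shows the argument-size limit, in bytes, on the current system.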

I have tried your command, but it gives me a "permission denied" error, so I reran the command as superuser, but it still tells me bactNOG: is a directory.

What am I doing wrong?

Thanks for the book, I will get it.

written 4.1 years ago by mwanerhi erfgtr (30)

Either run the command from the directory that contains the bactNOG_hmm directory, or provide the full path to bactNOG_hmm in place of just bactNOG_hmm.

written 4.1 years ago by James Ashmore (2.7k)

Thanks, that seems to have worked. I am quite new to the Ubuntu environment.

written 4.1 years ago by mwanerhi erfgtr (30)

I have another problem that I have been trying to figure out for the whole evening. I am trying to identify orthologs between four strains (a, b, c, d) using their protein sequences. I looked around the internet and, out of the many options, I went with proteinortho (https://www.bioinf.uni-leipzig.de/Software/proteinortho/manual.html). I have done everything, but when I try to run the .pl command I get an error that it's not there, even though I can see it. I have tried using the full path, but that did not work either. What am I doing wrong?

tx

written 4.1 years ago by mwanerhi erfgtr (30)

Can you show me which commands you used to run the perl script, and is the perl script in the same directory where you run the command?

modified 4.1 years ago • written 4.1 years ago by James Ashmore (2.7k)

I downloaded the tar file and untarred it. Into the folder containing the perl scripts I copied a folder, samuel, containing the fasta files for analysis, which are a, b, c and d. Then, in the same folder, I call the script from the terminal with this command: $ proteinortho5.pl -project= samuel/*.faa

It returns an error: command not found

I tried writing the whole path, but it is still not found. Even when I try using the Tab key to auto-complete proteinortho5.pl, it doesn't complete, so I am not sure what I am doing wrong.

Thanks for trying to help me

written 4.1 years ago by mwanerhi erfgtr (30)

Okay, first you need to make sure you have permission to run the proteinortho5.pl file. You can set this using:

chmod u+x  proteinortho5.pl

After that try running the script using:

perl proteinortho5.pl
written 4.1 years ago by James Ashmore (2.7k)
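
A likely reason for the earlier "command not found" is that the shell looks up a bare command name only in the directories listed in $PATH, and the current directory is normally not among them. The sketch below illustrates this with a stand-in script; hello.sh is a hypothetical placeholder (a shell script rather than a perl one, so the demo needs no perl), and the same behaviour would be expected for proteinortho5.pl.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway directory with a stand-in script (hello.sh is a hypothetical
# placeholder for a script such as proteinortho5.pl).
demo_dir=$(mktemp -d)
cd "$demo_dir"
printf '#!/bin/sh\necho "script ran"\n' > hello.sh

# 1) A bare name is looked up only in the directories listed in $PATH,
#    so a script sitting in the current directory is not found.
if command -v hello.sh >/dev/null 2>&1; then
    bare_lookup="found"
else
    bare_lookup="not found"
fi
echo "bare name: $bare_lookup"

# 2) Passing the file to its interpreter works even without execute
#    permission (the equivalent of: perl proteinortho5.pl).
via_interpreter=$(sh hello.sh)
echo "via interpreter: $via_interpreter"

# 3) After chmod u+x, an explicit ./ path runs the script directly.
chmod u+x hello.sh
via_path=$(./hello.sh)
echo "via ./ path: $via_path"
```

This also fits the observation that Tab completion did not complete the bare script name: command completion draws on $PATH, whereas typing ./ first makes the shell complete file names in the current directory.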

You are an absolute genius. It worked after so many tries, although it required changing some names.

written 4.1 years ago by mwanerhi erfgtr (30)

tx very much

modified 4.1 years ago • written 4.1 years ago by mwanerhi erfgtr (30)

Hi, the process worked and it produced a graph and a table called myproject.proteinortho. I want to produce another file of orthologous proteins, separated for each individual strain a, b, c, d. According to the manual, I thought grab_proteins.pl could do this.

So the command was: perl grab_proteins.pl samuel/*.faa myproject.proteinortho

It gives me an error: fasta2hash(): could not open file: .//L228plasmid.faa

But nothing has changed from the previous files used.

What am I doing wrong here?

NB: the manual gives the command as: perl grab_proteins.pl [options] proteinortho_output_table

options: -path=path to protein files

written 4.1 years ago by mwanerhi erfgtr (30)
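
One possible reading of the error, offered as an assumption and not as a confirmed diagnosis: the path .//L228plasmid.faa suggests the script is looking for the fasta files in the current directory, not in samuel. The sketch below checks which input files named in the table can be found in a given directory. It assumes the first line of the table is tab-separated with the input file names from the fourth column onwards; missing_inputs and the demo file names are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# missing_inputs TABLE DIR: print each input file named in the header of
# TABLE that cannot be found in DIR. (Hypothetical helper; assumes the
# first line is tab-separated with file names from column 4 onwards.)
missing_inputs() {
    local table=$1 dir=$2 name
    head -n 1 "$table" | tr '\t' '\n' | tail -n +4 | while IFS= read -r name; do
        [ -f "$dir/$name" ] || printf '%s\n' "$name"
    done
}

# Self-contained demo: a fake table naming two inputs, only one of which
# exists, and only inside the samuel subdirectory.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/samuel"
touch "$demo_dir/samuel/a.faa"
printf '# Species\tGenes\tAlg.-Conn.\ta.faa\tb.faa\n' > "$demo_dir/myproject.proteinortho"

missing_here=$(missing_inputs "$demo_dir/myproject.proteinortho" "$demo_dir")
missing_in_samuel=$(missing_inputs "$demo_dir/myproject.proteinortho" "$demo_dir/samuel")

echo "missing in current directory:"
echo "$missing_here"
echo "missing in samuel:"
echo "$missing_in_samuel"
```

If nothing is reported missing for samuel on the real data, the usage line quoted above suggests trying the -path option, for example perl grab_proteins.pl -path=samuel myproject.proteinortho. That invocation is inferred from the quoted usage line and has not been tested here.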