Hi, first time posting here. I have 126 protein alignments based on ortho groups from OrthoMCL, from which I want to build a multi-gene phylogeny. Most of them are single copy per species, but many contain duplicates. For concatenation purposes, I need one copy per species. I came across this script, which let me remove duplicates (based on the FASTA header, not sequence identity) while appending the additional information (accession numbers) from the removed sequences to the header that is kept (file.fa, which appears twice, is where you enter your file):
perl -ne 'if (/>(.*?)\s+(.*)/){push(@{$hash{$1}},$2) ;}}{open(I, "<","file.fa");
while(<I>){if(/>(.*?)\s+/){ $t = 0; next if $h{$1};
$h{$1} = 1 if $hash{$1};
$t = 1;
chomp;
print $_ . " @{$hash{$1}}\n"}elsif($t==1){print $_} } close I;
' file.fa
This works perfectly fine when applied to one file. However, when I try to apply it to all 126 files with a loop:
for file in ./*.fasta;
do perl -ne 'if (/>(.*?)\s+(.*)/){push(@{$hash{$1}},$2) ;}}{open(I, "<","$file");
while(<I>){if(/>(.*?)\s+/){ $t = 0; next if $h{$1};
$h{$1} = 1 if $hash{$1};
$t = 1;
chomp;
print $_ . " @{$hash{$1}}\n"}elsif($t==1){print $_} } close I;
' "$file" > single/"$file".1;
done
It gives me the expected output files (files with the same names with ".1" appended), but the files are empty. I'm new to scripting and know even less about Perl, so I'm not sure where it's going wrong, especially since this loop works with other scripts/commands I've used on large numbers of files. Any help is appreciated. JT
Start by writing a Perl script. One-liners are hard to debug, and even harder when you mix languages.
I like Perl, but here I have to agree with Perl detractors: these "one-liners" look like a keyboard puked on the screen.
(For a few dollar signs more)
The lyrics are a one-liner perl script.
jt500: For future reference, please use the formatting bar (especially the code option) to present your post better. I had done it for you, but you seem to have re-edited the post back to where it was before. Thank you!