Question: vcftools not ouputting log file when run from perl
0
gravatar for dec986
5 months ago by
dec986210
United States
dec986210 wrote:

I am running 325 vcftools commands to generate Fst values, which obviously needs to be automated.

An example:

vcftools --vcf big.vcf --weir-fst-pop pop_lists/pop1.txt --weir-fst-pop pop_lists/pop2.txt  --out weir_fst_results/pop1_vs_pop2

and when I run this job, it works fine when I run it one by one by the command line, i.e. there are two files:

pop1_vs_pop2.log

and

pop1_vs_pop2.weir.fst

I need to get the Fst values from the log file.

The problem is that the log files are not generated when I run from my perl script, even though the commands executed are identical.

This sounds so trivial, but if it's not producing the log file, I'm wasting my time.

How can I get vcftools to produce the log file when executing from a script? why isn't vcftools producing the log file?

Perhaps there is a way of calculating this that doesn't involve using vcftools?

the script to run them:

sub execute {
    my $command = shift;
    print "Executing: $command\n";
    if (system($command) != 0) {
        say "$command failed.";
        die
    }
}

my @pop = list_regex_files('^[A-Y]{3}\.txt$', 'ethnicity_lists');
my $outdir = 'weir_fst_results';
if (not -d $outdir) {
    mkdir $outdir
}
my $vcf = 'big.vcf';
if (not -e $vcf) {
    say "$vcf doesn't exist.";
    die
}
foreach my $pop1 (0..$#pop) {
    my $label1;
    if ($pop[$pop1] =~ m/^ethnicity_lists\/([A-Y]{3})\.txt$/) {
      $label1 = $1
   } else {
      say "$pop[$pop1] failed regex.";
      die;
   }
    foreach my $pop2 (($pop1+1)..$#pop) {
        my $label2;
        if ($pop[$pop2] =~ m/^ethnicity_lists\/([A-Y]{3})\.txt$/) {
            $label2 = $1
        } else {
            say "$pop[$pop2] failed regex.";
            die;
        }
        my $out_prefix = "weir_fst_results/$label1" . "_vs_$label2";
        my $cmd = "vcftools --vcf $vcf ";
        $cmd .= "--weir-fst-pop $pop[$pop1] --weir-fst-pop $pop[$pop2] ";
        $cmd .= " --out $out_prefix > $out_prefix.out";
        execute($cmd);
    }
}

If I try to execute each comparison in a separate directory, with a separate nohup command, unfortunately all command output is still routed to the source nohup.out, and then the log file is empty, which defeats the purpose of running this.

vcftools vcf • 340 views
ADD COMMENTlink modified 5 months ago • written 5 months ago by dec986210

Can you show us the relevant section of your perl script that runs these lines?

ADD REPLYlink written 5 months ago by RamRS27k

@RamRS I've updated the question

ADD REPLYlink written 5 months ago by dec986210

Try echo instead of execute and check the command. Also, look at the manual page for execute just to make sure STDIN and STDOUT are not being interfered with (not that it makes a difference when the out is explicitly specified, but just in case)

ADD REPLYlink written 5 months ago by RamRS27k

Have you checked if list_regex_files() does work properly?

What is stored in @pop after running line

my @pop = list_regex_files('^[A-Y]{3}\.txt$', 'ethnicity_lists');

?

ADD REPLYlink modified 5 months ago by RamRS27k • written 5 months ago by zubenel80

@zubenel list_regex_files just contains a list of files, nothing else

ADD REPLYlink written 5 months ago by dec986210

A good way of debugging this is to print out the $cmd variable. I guess that it will be different from what you expect.

ADD REPLYlink written 5 months ago by Michael Dondrup47k

the problem is that the exact same command works differently when executed from the perl script. The output file is absent (and thus it is pointless to run vcftools at all). Is there a different program that calculates Fst? Vcftools does not work very well.

ADD REPLYlink written 5 months ago by dec986210

because the output file is absent, but the .fst file is present, is there a way that I can get the weighted score from the fst file?

ADD REPLYlink written 5 months ago by dec986210

Try using system() instead of execute().

ADD REPLYlink written 5 months ago by RamRS27k

unfortunately, that didn't change anything. Is vcftools really the only tool available that can calculate Fst? This problem sounds so trivial, but it's useless without that log file.

ADD REPLYlink written 5 months ago by dec986210

I'm not really sure. Do Fst files contain F-statistics like this one? If so, you can use plink.

It is definitely odd that a command that works from the shell does not work from a system() call. Can you try running the command from a non-interactive shell?

bash --norc --noprofile -c '<command>'

If the above command fails, then we will know the reason the perl call and the shell command work differently - that the shell command relies on some sort of alias or profile setting.

ADD REPLYlink written 5 months ago by RamRS27k

I thought that it might be my perlbrew installation, but that doesn't appear to be the case. The first cycle of parallel runs writes the log files, after that, no more log files, and no explanation. The commands are run identically. Looks like I have to give those other programs a try. Thanks for your help

ADD REPLYlink written 5 months ago by dec986210

"No explanation" could be because STDERR is being suppressed by perl. Try executing the command from perl but this time, suffix the command with >cmd.log 2>&1. This will capture STDERR and STDOUT to the cmd.log file, which will reveal any problems.

ADD REPLYlink modified 5 months ago • written 5 months ago by RamRS27k

I'm running a test now, it should finish in about a day I think (on a small data set)

ADD REPLYlink written 5 months ago by dec986210

the problem is that at some point vcftools stops making the output file, and switches the important info to STDERR. I've never seen another program do this before.

ADD REPLYlink written 5 months ago by dec986210

That's odd, but then this behavior should have been replicated when you ran it from the shell as well. Why do you think that did not happen?

ADD REPLYlink written 5 months ago by RamRS27k
1

Some programs might try to determine if they are running in an interactive shell or not. Under some circumstances, Perl runs programs via execvp instead of using the shell, see https://perldoc.perl.org/functions/system.html This should only happen in case there are no shell metacharacters. You have ">" but to force a shell, you might try "/bin/sh -c " to force an additional layer, just in case the log file generation depends on how the program is invoked.

ADD REPLYlink modified 5 months ago • written 5 months ago by Michael Dondrup47k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1129 users visited in the last hour