Question: Concatenate multiple gzipped FASTQ files from a multilane run and output a combined gzip file
19 months ago by
PAn10
United States
PAn10 wrote:

I need to write a Perl script that reads gzipped FASTQ files from a text file listing their paths, concatenates them, and writes the result to a new gzipped file. (It has to be in Perl because it will be part of a pipeline.) I am not sure how to do the zcat and concatenation part. The files are gigabytes in size, so I also need to keep storage and run time under control.

Here is what I have so far:

use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError) ;

#-------check the input file specified-------------#

$num_args = $#ARGV + 1;
if ($num_args != 1) {
    print "\nUsage: name.pl Filelist.txt \n";
exit;

$file_list = $ARGV[0];

#-------------Read the file into arrray-------------#

my @fastqc_files;   #Array that contains gzipped files 
use File::Slurp;
my @fastqc_files = $file_list;


#-------use the zcat over the array contents 
my $outputfile = "combined.txt"
open(my $combined_file, '>', $outputfile) or die "Could not open file '$outputfile' $!";

for my $fastqc_file (@fastqc_files) {

    open(IN, sprintf("zcat %s |", $fastqc_file)) 
      or die("Can't open pipe from command 'zcat $fastqc_file' : $!\n");
    while (<IN>) {
        while ( my $line = IN ) {
          print $outputfile $line ;
        }
    }
    close(IN);

my $Final_combied_zip = new IO::Compress::Gzip($combined_file);
  or die "gzip failed: $GzipError\n";

I cannot get this to run. Is there a simpler or more correct way to accomplish this? Thanks!

sequencing next-gen fastqc perl • 672 views
ADD COMMENTlink written 19 months ago by PAn10

Using zcat and then recompressing is unnecessary; see "A: How To Merge Two Fastq.Gz Files?"

ADD REPLYlink written 19 months ago by Pierre Lindenbaum95k

What would be a better way to combine the gzip files, then? I basically need to stitch them together, not just combine the gzip files into one big gzip file (and I need to take the multi-GB file sizes into account too).

ADD REPLYlink modified 19 months ago • written 19 months ago by PAn10

The point is that "stitching them together" just means concatenating them. There is no difference. You can do this in one line (without perl) as Pierre's comment suggests.  
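A minimal sketch of that idea in Perl, for pipelines that cannot simply shell out. It relies on the fact that the gzip format permits several compressed members back to back, so appending the raw bytes of one .gz file to another yields a valid .gz file and nothing has to be decompressed. The sub names `concat_gz` and `read_list` and the file names in the usage comment are hypothetical, not from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Append the raw bytes of each gzip file to $out_path without decompressing.
# A series of gzip members written back to back is itself a valid gzip stream.
sub concat_gz {
    my ($out_path, @gz_paths) = @_;
    open(my $out, '>', $out_path) or die "Cannot write '$out_path': $!";
    binmode($out);
    for my $path (@gz_paths) {
        open(my $in, '<', $path) or die "Cannot read '$path': $!";
        binmode($in);
        my $buf;
        while (1) {
            # Copy in 1 MiB chunks so memory use does not grow with file size.
            my $n = read($in, $buf, 1 << 20);
            die "Read error on '$path': $!" unless defined $n;
            last if $n == 0;
            print {$out} $buf or die "Write error on '$out_path': $!";
        }
        close($in);
    }
    close($out) or die "Cannot close '$out_path': $!";
    return;
}

# Read one path per line from a list file, skipping blank lines.
sub read_list {
    my ($list_path) = @_;
    open(my $fh, '<', $list_path) or die "Cannot read '$list_path': $!";
    chomp(my @paths = <$fh>);
    close($fh);
    return grep { /\S/ } @paths;
}

# Hypothetical usage: perl concat_gz.pl File_list.txt combined.fastq.gz
if (@ARGV == 2) {
    my ($list, $out) = @ARGV;
    concat_gz($out, read_list($list));
}
```

Because the data is copied in fixed-size chunks and never decompressed, memory use stays flat and the run time is essentially that of a plain file copy.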

ADD REPLYlink written 19 months ago by Sean Davis23k

Thanks Pierre and Sean. I understand it's better to run this as a one-line command rather than in Perl, but I really need to run it in Perl, because it has to fit into a pipeline that has other components, config files, an XML caller, etc. I will give it another shot; otherwise I will tell the collaborators to settle for the one-liner (I prefer it as well)!

ADD REPLYlink written 19 months ago by PAn10

You can run a shell command from perl.  A little googling will tell you how.

ADD REPLYlink written 19 months ago by Sean Davis23k

Thanks Sean, I got it running by simply calling zcat through system() in the script. Thanks!

#!/usr/bin/perl
use strict;
use warnings;
use File::Slurp;


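my @data = read_file('./File_list.txt');   # one gzipped FASTQ path per line
my $out  = "./test.txt";                   # decompressed, combined output

foreach my $data_file (@data) {
    chomp($data_file);
    # Decompress the file and append it; the shell handles the >> redirection.
    system("zcat $data_file >> $out");
}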

ADD REPLYlink modified 18 months ago • written 18 months ago by PAn10

Glad it worked out for you. Remember to remove your output file before entering the loop so that, if an earlier run of the script failed, you don't simply append to the "bad" file.
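One way to fold that advice into the script, sketched under a few assumptions: the sub name `zcat_to_file` is hypothetical, `gzip -dc` stands in for `zcat` (the two are equivalent with GNU gzip), and a list-form pipe open replaces the `system` call with shell redirection so that file names are never interpreted by a shell.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decompress each listed .gz file and append the text to $out_path.
# Any stale output is deleted first so a rerun never appends to an old file.
sub zcat_to_file {
    my ($out_path, @gz_paths) = @_;
    if (-e $out_path) {
        unlink($out_path) or die "Cannot remove '$out_path': $!";
    }
    open(my $out, '>>', $out_path) or die "Cannot write '$out_path': $!";
    for my $path (@gz_paths) {
        # List-form pipe open: no shell is involved, so unusual characters
        # in $path are passed to gzip literally.
        open(my $in, '-|', 'gzip', '-dc', $path)
            or die "Cannot start gzip for '$path': $!";
        while (my $line = <$in>) {
            print {$out} $line;
        }
        # close() on a pipe returns false if the command exited non-zero.
        close($in) or die "gzip -dc failed on '$path' (status $?)";
    }
    close($out) or die "Cannot close '$out_path': $!";
    return;
}
```

Checking the result of `close` on the pipe means a corrupt or missing input stops the script instead of silently leaving a truncated output file.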

ADD REPLYlink written 18 months ago by Sean Davis23k

Oh yes, that's right. Thanks for pointing it out. I have another question: can I use ARGV for the input file instead of specifying it in the script? I tried modifying the script to

#!/usr/bin/perl
use strict;
use warnings;
use File::Slurp;

my @data = read_file(ARGV[0]);

instead of hard-coding the path to the input file, but it gives an error. Could you please point out what is wrong? Sorry, it must be something trivial.

Thanks!

 

ADD REPLYlink modified 18 months ago • written 18 months ago by PAn10

You'll definitely need to do a little reading on arguments in perl.  For example:

http://alvinalexander.com/perl/perl-command-line-arguments-read-args
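As a short illustration of the point: command-line arguments arrive in the array `@ARGV`, and a single element is written with the scalar sigil, `$ARGV[0]`; a bare `ARGV[0]` is not valid under `use strict`. The sketch below uses only core Perl, and the sub name `list_path_from_args` is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the list-file path from the argument list, or die with a usage line.
sub list_path_from_args {
    my @args = @_;
    die "Usage: $0 File_list.txt\n" unless @args == 1;
    return $args[0];    # element access needs the $ sigil, as in $ARGV[0]
}

# Only act when the script is actually given arguments.
if (@ARGV) {
    my $list_path = list_path_from_args(@ARGV);
    open(my $fh, '<', $list_path) or die "Cannot read '$list_path': $!";
    chomp(my @data = <$fh>);    # one path per line, newlines stripped
    close($fh);
    print "$_\n" for @data;
}
```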

ADD REPLYlink written 18 months ago by Sean Davis23k

Yes, I am working on the Perl basics. Thanks, it works now!

ADD REPLYlink written 18 months ago by PAn10