Question

How to create a tab delimited file?

2

5.2 years ago

Mimmi Ahlmén ▴ 30

Hi! I'm doing something wrong here. I have a long text file that looks like this:

AB Ana Biba 1029293.34341

And I want to print out the following to a new tab delimited file:

AB         Ana Biba        1029293.34341

Here's my script. Why doesn't it work?

    my $infile = $ARGV[0];

    open (my $infile, "<", "namconvmars.txt")
    or die "Can't read from $infile: $!";

    my (@group1, @group2, @group3);

    while (<$infile>){
        my @cols = split(/\t/);
        push @group1, @cols[0];
        push @group2, @cols[1];
        push @group3, @cols[2];
        print "@group1\t@group2\t@group3";
    }
    close $infile

Thanks in advance!

perl • 1.8k views

updated 5.2 years ago by 5heikki 11k • written 5.2 years ago by Mimmi Ahlmén ▴ 30

0

How is this related to bioinformatics?

Reply • 5.2 years ago by Pierre Lindenbaum 161k

score 2 · Answer 1 · 2019-03-06

5.2 years ago

manuel.belmadani ★ 1.3k

You want to split your input on space, not tab.

e.g.

       # my @cols = split(/\t/); # Change this
       my @cols = split(' ');  # To this.
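
To see why the separator matters, here is a minimal sketch that counts the fields each pattern yields on the sample line from the question. The variable names are made up for illustration. Note that splitting on whitespace also separates "Ana" from "Biba", which may or may not be what is wanted.

```perl
use strict;
use warnings;

# Sample line from the question: fields are separated by spaces, not tabs.
my $line = "AB Ana Biba 1029293.34341\n";

# There is no tab in the input, so the whole line stays in a single field.
my @by_tab = split(/\t/, $line);

# split(' ') splits on any run of whitespace and drops the trailing newline.
my @by_space = split(' ', $line);

print "split on tab:   ", scalar(@by_tab),   " field(s)\n";   # 1 field(s)
print "split on space: ", scalar(@by_space), " field(s)\n";   # 4 field(s)
```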

score 1 · Answer 2 · 2019-03-06

5.2 years ago

Pierre Lindenbaum 161k

BTW, you want:

tr " " "\t" < namconvmars.txt
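
For readers staying in Perl, the built-in tr/// operator does the same character-for-character translation. This is a minimal sketch on the sample line from the question, with illustrative variable names. Every space becomes a tab, including the one inside "Ana Biba", so the line ends up with four columns rather than three.

```perl
use strict;
use warnings;

my $line = "AB Ana Biba 1029293.34341";

# tr/// works character by character: each single space becomes a tab.
# Copy the line first so the original string is left untouched.
(my $converted = $line) =~ tr/ /\t/;

my @fields = split(/\t/, $converted);
print scalar(@fields), " fields: ", join(" | ", @fields), "\n";
# prints: 4 fields: AB | Ana | Biba | 1029293.34341
```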

0

I haven't seen this command before. Awesome!

Reply • 5.2 years ago by Robert Sicko ▴ 630

score 1 · Answer 3 · 2019-03-06

1

5.2 years ago

Bill Pearson ★ 1.0k

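You do not need three "@group"s -- you need either three scalars ($field0, $field1, $field2) or one @group, which you could print with join("\t", @group);

A simpler solution is:

while (my $line = <$infile>) {   # $infile is the filehandle opened in the question
  chomp($line);
  print join("\t", split(/\s+/, $line)), "\n";
}

or, inside the same loop after the chomp:

$line =~ s/\s+/\t/g;   # /g replaces every run of whitespace, not only the first
print "$line\n";       # the newline was removed by chomp, so add it back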

updated 5.2 years ago by GenoMax 141k • written 5.2 years ago by Bill Pearson ★ 1.0k

score 1 · Answer 4 · 2019-03-06

5.2 years ago

JC 13k

There are some Perl basics you need to understand first:

my $infile = $ARGV[0];

This line reads the first command-line argument after your script name and assigns it to the variable $infile.

open (my $infile, "<", "namconvmars.txt")
or die "Can't read from $infile: $!";

You are declaring $infile again (that is what my does), and you are also reusing the variable as a filehandle. So you don't need the first line, my $infile = $ARGV[0], because you never use its value.

my (@group1, @group2, @group3);

while (<$infile>){
    my @cols = split(/\t/);
    push @group1, @cols[0];
    push @group2, @cols[1];
    push @group3, @cols[2];
    print "@group1\t@group2\t@group3";
}
close $infile

In this part I think you want to collect the values, but if your intention is simply to convert each line, you don't need the arrays: just read, modify and print each line. The tricky part is that when you split the line on spaces, the second element is split too ("Ana Biba" -> ["Ana", "Biba"]), so you will need to reconstruct that element. Something like:

#!/usr/bin/perl
use strict;
use warnings;
my $file = "namconvmars.txt";
open (my $infile, "<", $file)
or die "Can't read from $file: $!";

while (<$infile>){
    my @cols = split(/\s+/, $_);  # break line on whitespace (this also drops the trailing newline)
    my $first = shift(@cols);  # grab first element
    my $last  = pop(@cols); # grab last element
    my $mid   = join " ", @cols; # reconstruct middle element
    print join("\t", $first, $mid, $last), "\n";  # end each output row with a newline
}
close $infile;

0

Thank you so much!

Actually, my file has several rows that look the same:

XX Xxxxx_Xxxx YyyYy
XY Xyxyx_Xyxyx YxYx

So I need to go to the next row after each row. How do I do this?

Reply • 5.2 years ago by Mimmi Ahlmén ▴ 30

1

It's complaining about or die "Can't read from $infile: $!";. That makes sense: a variable declared with my is not visible until the statement after its declaration, so inside the same open (my $infile, "<", "namconvmars.txt") or die ... statement, $infile does not yet refer to the new variable, and use strict rejects it. (Even if it did, interpolating a filehandle into the message would print something like GLOB(0x...) rather than the file name, which is probably not what you wanted.) You weren't seeing this error originally because you had already declared $infile before the open statement, so the name in the error message resolved to that earlier variable.

You probably want to do something like:

my $filename = "namconvmars.txt"; # Or set it via $ARGV
open (my $infile, "<", $filename) or die "Can't read from '$filename' !";

So if for some reason $filename is not readable, you'll see: Can't read from 'namconvmars.txt' ! at tabs.pl line 6.
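
As a minimal sketch of the same idea, the file name can come from the command line with a fallback, and the open can be wrapped in a small helper. The helper name open_for_reading is hypothetical, and appending $! (the system's reason for the failure) to the message is an addition not present in the snippet above.

```perl
use strict;
use warnings;

# Hypothetical helper: open a file for reading, or die with a message that
# contains the file name and the system error ($!).
sub open_for_reading {
    my ($filename) = @_;
    open(my $fh, '<', $filename)
        or die "Can't read from '$filename': $!\n";
    return $fh;
}

# Use the first command-line argument if given, else the name from the question.
my $filename = @ARGV ? $ARGV[0] : 'namconvmars.txt';

# eval traps the die so this sketch reports the message instead of aborting.
my $fh = eval { open_for_reading($filename) };
if ($fh) {
    print "Opened '$filename'\n";
    close $fh;
} else {
    print "Error: $@";
}
```

Because the message is built from $filename, a plain string, it is safe to interpolate whether or not the open succeeded.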

Reply • 5.2 years ago by manuel.belmadani ★ 1.3k

0

True, I modified the code to read the file name from another variable.

Reply • 5.2 years ago by JC 13k

0

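The while (<$infile>) {} loop already reads the file line by line, so every row is processed in turn. For each row to land on its own line in the output, the print must end with a newline ("\n").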

Reply • 5.2 years ago by JC 13k

score 1 · Answer 5 · 2019-03-07

5.2 years ago

5heikki 11k

awk 'BEGIN{FS=" ";OFS="\t"}{print $1,$2" "$3,$4}' in > out

Edit: a more general solution, where the first and last spaces are replaced with tabs:

awk 'BEGIN{FS=" "}{L=$NF; NF--; sub(" ","\t",$0); print $0"\t"L}' in > out
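
The same idea can be sketched in Perl, the language of the original question. The helper name first_last_to_tabs is made up for illustration. It treats runs of whitespace as one separator and leaves lines with fewer than three columns unchanged; both choices are assumptions rather than a port of the awk command.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the first and the last run of whitespace in a
# line into tabs and leave any whitespace in between untouched.
sub first_last_to_tabs {
    my ($line) = @_;
    chomp $line;
    # (\S+)  first column
    # (.*\S) everything between the first and last column
    # (\S+)  last column
    if ($line =~ /^\s*(\S+)\s+(.*\S)\s+(\S+)\s*$/) {
        return join("\t", $1, $2, $3);
    }
    return $line;    # fewer than three columns: return the line unchanged
}

print first_last_to_tabs("AB Ana Biba 1029293.34341\n"), "\n";
print first_last_to_tabs("XX Xxxxx_Xxxx YyyYy\n"), "\n";
```

To run it over a file, call the helper on each line inside a while (<$infile>) loop and print the result followed by "\n".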

score 1 · Answer 6 · 2019-03-07

5.2 years ago

cpad0112 21k

With sed (GNU sed syntax; this turns every run of whitespace into a tab):

$ sed 's/\s\+/\t/g' test.txt          
AB  Ana Biba    1029293.34341
