Question: (Closed) Find The Match Entries In The Columns And Print Them Out By Perl
Ar Es (0) wrote:

Hi, I asked this question before, but as the admin said, if I want to change my question I have to ask a new question instead of posting it as an answer to the previous one. Anyhow, my question is:

I want to read through two different tab-delimited files of sequence coordinates, find the entries whose first three columns match in both files, and print them out. One file is the reference, and the other contains the columns that I want to find in the reference file. I want to keep col1, col2 and col3 of file one (the ref file) and col1, col2 and col3 of file2.

file one (ref file) :

Col1      Col2       Col3     Col4  Col5 ColX
chr        start       end    (rest columns)


file2 :

Col1      Col2       Col3     Col4  Col5 ColX
chr        start       end    (rest columns)

I have written a script, but I get errors when I print the results:

Use of uninitialized value $line2 in split at ./match3.pl line 19, <$fh2> line 2458.
Use of uninitialized value in string eq at ./match3.pl line 22, <$fh2> line 2458.
Use of uninitialized value in string eq at ./match3.pl line 22, <$fh2> line 2458.
Use of uninitialized value in string eq at ./match3.pl line 22, <$fh2> line 2458.
Use of uninitialized value in string eq at ./match3.pl line 22, <$fh2> line 2458.
Use of uninitialized value in string eq at ./match3.pl line 22, <$fh2> line 2458.
Use of uninitialized value in string eq at ./match3.pl line 22, <$fh2> line 2458.
Use of uninitialized value $values2[0] in concatenation (.) or string at ./match3.pl line 22, <$fh2> line 2458.
Use of uninitialized value in concatenation (.) or string at ./match3.pl line 22, <$fh2> line 2458.
Use of uninitialized value in concatenation (.) or string at ./match3.pl line 22, <$fh2> line 2458.

My script is:

#!/usr/bin/perl
use strict;
use warnings;

open my $fh1, '<', 'file1';
open my $fh2, '<', 'file2';

my$line1 ;
my$line2 ;

while (<$fh1>) {
  chomp;
  my @values1 = split( "\t", $line1 );
  close $fh1;
  while (<$fh2>) {
  chomp;
    my @values2 = split( "\t", $line2 );
close $fh2; 

if ( $values1[0] eq $values2[0] and $values1[1] eq $values2[1] and $values1[2] eq $values2[2])

{
  print "$values2[0] \t $values2[1] \t $values2[2] \t \n" ;}}}

Thanks In Advance,

perl comparison • 5.3k views
modified 7.0 years ago by Jorge Amigo (11k) • written 7.0 years ago by Ar Es (0)

Actually, admin said edit your original question. And he also said: please try to format your code properly (indent lines with 4 spaces) to make it readable. Normally admin would then delete this question as a duplicate, but admin can't be bothered right now.

written 7.0 years ago by Neilfws (48k)

Anyway: I suggest you stop struggling with Perl and just implement this nice awk solution - http://www.unix.com/shell-programming-scripting/150318-matching-columns-two-files.html - just use $1$2$3 instead of $5$6$7.

written 7.0 years ago by Neilfws (48k)

Thanks for your help admin :) and sorry if I said something wrong

written 7.0 years ago by Ar Es (0)

No problem. Admins get tired and grumpy sometimes :)

written 7.0 years ago by Neilfws (48k)

I would like to ask: how much basic programming skill can we expect before accepting a question? This script contains at least 7 big mistakes (such as failing to create the filehandles properly). So this question has a lot to do with learning Perl programming but nothing to do with bioinformatics. I don't like the general tendency of simply dropping some crappy code here and then having others fix it for you. I understand that the OP is simply a beginner in Perl, and that might excuse a lot, but I really ask you to post this question on Stack Exchange and see how it fares. Closing as off-topic.

written 7.0 years ago by Michael Dondrup (45k)
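
For readers wondering where the warnings in the question come from: `while (<$fh1>)` reads each line into `$_`, so `$line1` and `$line2` are never assigned, and both filehandles are closed inside the loops that are still reading from them. Below is a minimal sketch of the same nested-loop idea with those points addressed. The sub name `matching_rows` is hypothetical, and the choice to reopen file2 for every line of file1 is an assumption made to stay close to the original script, not a recommendation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the chr/start/end triples of $file2 that
# also occur in $file1, using the nested-loop idea from the question.
sub matching_rows {
    my ($file1, $file2) = @_;
    my @matches;

    open my $fh1, '<', $file1 or die "Cannot open $file1: $!";
    while ( my $line1 = <$fh1> ) {    # read into a named variable, not $_
        chomp $line1;
        my @values1 = split /\t/, $line1;
        next if @values1 < 3;         # skip blank or short lines

        # Reopen file2 for each line of file1 so the inner loop starts at the top.
        open my $fh2, '<', $file2 or die "Cannot open $file2: $!";
        while ( my $line2 = <$fh2> ) {
            chomp $line2;
            my @values2 = split /\t/, $line2;
            next if @values2 < 3;
            if (    $values1[0] eq $values2[0]
                and $values1[1] eq $values2[1]
                and $values1[2] eq $values2[2] )
            {
                push @matches, join "\t", @values2[ 0 .. 2 ];
            }
        }
        close $fh2;                   # close only after the inner loop has finished
    }
    close $fh1;
    return @matches;
}

# Print the matches only when the script is given two file names.
if ( @ARGV == 2 ) {
    print "$_\n" for matching_rows(@ARGV);
}
```

Saved under the question's script name, this could be run as `./match3.pl file1 file2`. It still reads file2 once per line of file1, so it is only practical for small inputs.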
Jorge Amigo (11k) wrote:

You are reading the files in a non-optimal way, since you read file2 as many times as there are lines in file1. What I usually do in these kinds of cases (looking for identical columns in several files) is to load the information needed from the smaller file into a hash, if memory is not an issue, since Perl is optimized for that. This is what I would simply do:

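use strict;
use warnings;

# Remember the first three tab-terminated columns of every line in file1.
my %data1;
open my $fh1, '<', 'file1' or die "Cannot open file1: $!";
while (<$fh1>) {
    if (/^((\S+\t){3})/) {
        $data1{$1} = "";
    }
}
close $fh1;

# Print the first three columns of each file2 line that also occur in file1.
open my $fh2, '<', 'file2' or die "Cannot open file2: $!";
while (<$fh2>) {
    if (/^((\S+\t){3})/) {
        if ( exists $data1{$1} ) {
            print "$1 \n";
        }
    }
}
close $fh2;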

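You may modify the regex used while reading file1 and file2 in order to store and print whatever you want to print afterwards. Note that the regex expects a tab after the third column, so a line with exactly three columns will not match.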
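
As one illustration of storing more than the matched columns, here is a minimal sketch that keeps the whole reference line in the hash and pairs it with each matching line of the second file. It uses `split` instead of a regex, so a line with exactly three columns is also accepted. The sub names `coord_key` and `shared_rows` are hypothetical, and so is the output layout (reference line, then query line, joined by a tab).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build a lookup key from the first three
# tab-separated columns (chr, start, end); undef if the line is too short.
sub coord_key {
    my ($line) = @_;
    my @cols = split /\t/, $line;
    return undef if @cols < 3;
    return join "\t", @cols[ 0 .. 2 ];
}

# Hypothetical helper: returns [ref_line, query_line] pairs for every
# query line whose coordinates also occur in the reference file.
sub shared_rows {
    my ($ref_file, $query_file) = @_;

    my %ref_line_for;
    open my $ref_fh, '<', $ref_file or die "Cannot open $ref_file: $!";
    while ( my $line = <$ref_fh> ) {
        chomp $line;
        my $key = coord_key($line);
        # If a key repeats in the reference file, the last line wins.
        $ref_line_for{$key} = $line if defined $key;
    }
    close $ref_fh;

    my @pairs;
    open my $query_fh, '<', $query_file or die "Cannot open $query_file: $!";
    while ( my $line = <$query_fh> ) {
        chomp $line;
        my $key = coord_key($line);
        next unless defined $key && exists $ref_line_for{$key};
        push @pairs, [ $ref_line_for{$key}, $line ];
    }
    close $query_fh;
    return @pairs;
}

# Print "ref line <TAB> query line" when given two file names.
if ( @ARGV == 2 ) {
    print join( "\t", @$_ ), "\n" for shared_rows(@ARGV);
}
```

Each file is read only once, so the cost grows with the combined size of the two files, at the price of holding the reference lines in memory.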

written 7.0 years ago by Jorge Amigo (11k)
The thread is closed. No new answers may be added.
