Question: (Closed) Finding entries common to all columns
3.3 years ago, Bulbul Ahmed (United States) wrote:
s1  s2  s3  s4  s5  s6
a   b   a   a   a   c
c   a   b   b   b   a
b   c   c   c   c   b

In the example above, a, b, and c are common to every column. My real data has columns s1, s2, s3, s4, and s5 with lakhs (hundreds of thousands) of entries. How should I write a Perl script to find the entries that are common to all columns? Please suggest an approach.


First of all, the question is not very clear, and it is not really a biological query of the kind addressed here; it is more of a Stack Overflow question. Still, if you reframe it a bit and explain what you want to do, why, and what you have tried, people might still be able to help you. As written, it is unclear which entries correspond to each column.

written 3.3 years ago by ivivek_ngs

Hello Bulbul Ahmed!

We believe that this post does not fit the main topic of this site.

Not a bioinformatics question

For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.

If you disagree please tell us why in a reply below, we'll be happy to talk about it.

Cheers!

written 3.3 years ago by RamRS
3.3 years ago, ivivek_ngs (Seattle, WA, USA) wrote:

Just to add, though I am not sure whether this will be accepted by the community: if you load the file into R (given enough memory), each column acts as a vector, and you can then do something like this:

s1 <- c("a", "c", "b")
s2 <- c("b", "a", "c")
s3 <- c("a", "b", "c")
s4 <- c("a", "b", "c")
s_com <- Reduce(intersect, list(s1, s2, s3, s4))

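Something like this should work.

P.S.: You have to load your file in R with header = TRUE and make sure the columns are read as character strings rather than factors (for example, stringsAsFactors = FALSE in read.table).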

3.3 years ago, sviatoslav.kendall (United States) wrote:

Here's a Perl solution.

#!/usr/bin/perl

use warnings;
use strict;
use Data::Dumper;

my $file = shift @ARGV; # SUPPLY COLUMN FILE AT COMMAND LINE
die "Usage: $0 FILE\n" unless defined $file;
my %hash;
# THREE-ARGUMENT OPEN WITH A LEXICAL FILEHANDLE; STOP IF THE FILE CANNOT BE READ
open(my $fh, '<', $file) or die "Cannot open $file: $!\n";
while (my $line = <$fh>) {
    my $counter = 0;
    chomp $line;
    my @columns = split("\t", $line); # ASSUMES FILE IS TAB-DELIMITED
    foreach my $value (@columns) {
        $counter++;
        $hash{$counter}{$value} = 1; # LOGS EACH UNIQUE VALUE IN EACH COLUMN
    }
}
close $fh;
print Dumper(\%hash); # THIS LINE IS FOR THE BENEFIT OF THE ORIGINAL POSTER
my @shared;
# ASSUMES EXACTLY 5 COLUMNS; THE HASH REFERENCE MUST BE DEREFERENCED FOR keys
for my $x (keys %{ $hash{5} || {} }) {
    if (   exists $hash{1}{$x} && exists $hash{2}{$x}
        && exists $hash{3}{$x} && exists $hash{4}{$x}) {
        push(@shared, $x);
    }
}
my $output = join("\n", @shared);
print "These values are found in all columns:\n$output\n";
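
The script above hard-codes five columns. As a rough sketch of a more general variant (not part of the original answer), the version below counts, for each value, how many distinct columns contain it, and keeps the values whose count equals the number of columns. The subroutine name common_to_all_columns and the file names in the comments are hypothetical, and it assumes a tab-delimited file whose first line is a header (s1, s2, ...) that defines the number of columns.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the sorted values that occur in every column
# of a tab-delimited table read from the filehandle $fh.
# ASSUMPTION: the first line is a header and its field count defines
# the number of columns.
sub common_to_all_columns {
    my ($fh) = @_;

    my $header = <$fh>;
    return () unless defined $header;
    $header =~ s/\r?\n\z//;
    my @names = split /\t/, $header;
    my $ncol  = @names;

    my @seen;     # $seen[$i]{$value} is true once $value was seen in column $i
    my %columns;  # $columns{$value} = number of distinct columns holding $value

    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;
        my @cells = split /\t/, $line;
        for my $i (0 .. $#cells) {
            my $value = $cells[$i];
            next if $i >= $ncol || $value eq '';  # skip extra or empty cells
            # Count each value at most once per column.
            $columns{$value}++ unless $seen[$i]{$value}++;
        }
    }

    my @common = grep { $columns{$_} == $ncol } keys %columns;
    @common = sort @common;
    return @common;
}

# Run on a file given on the command line, for example
# (hypothetical names): perl common.pl columns.tsv
if (@ARGV) {
    my $file = shift @ARGV;
    open(my $fh, '<', $file) or die "Cannot open $file: $!\n";
    print "$_\n" for common_to_all_columns($fh);
    close $fh;
}
```

Because each value is stored once per column it appears in, memory grows with the number of distinct values rather than with the number of rows, which may matter for files with lakhs of entries.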