Question: Compare consecutive columns of a phased Beagle file to generate the number of elements that match.
aritra90 (United States) wrote, 3.5 years ago:

I have phased Beagle output, and I want to compare consecutive columns of the file and return the number of matched elements. I would prefer to use shell scripting or awk. Here is the bash/awk script I am trying to use.

#!/bin/bash
for i in 3 4 5 6 7 8 9
do
  for j in 3 4 5 6 7 8 9
  do
    awk "$i == $j" phased.txt | wc -l
  done
done

I have a file of size 147189 x 828, and I want to compare every pair of columns and return the number of matched elements as an 828 x 828 similarity matrix. This would be fairly easy in MATLAB, but it takes a long time to load huge files. I can compare two columns and count the matched elements with the following awk command: awk '$3==$4' phased.txt | wc -l, but I need some help doing it for the entire file.
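One way to build the whole matrix without re-reading the file for every pair is to let a single awk pass keep one counter per column pair. The following is only a sketch under assumptions taken from the snippet in this post: fields are whitespace-separated, the data columns start at field 3, and header lines begin with "#". The function name pairwise_matches and the variable first are made up for illustration.

```shell
#!/bin/bash
# Sketch: count, for every pair of data columns, the rows where both agree.
# Assumptions: whitespace-separated fields, data columns start at field 3,
# header lines start with "#". pairwise_matches is a hypothetical name.
pairwise_matches() {
  awk -v first=3 '
    /^#/ || NF == 0 { next }              # skip header and blank lines
    {
      if (NF > n) n = NF                  # remember the widest row seen
      for (i = first; i <= NF; i++)
        for (j = i; j <= NF; j++)         # upper triangle only (symmetric)
          if (($i "") == ($j ""))         # append "" to force string comparison
            count[i, j]++
    }
    END {
      # Print the full symmetric matrix, one tab-separated row per column.
      for (i = first; i <= n; i++) {
        for (j = first; j <= n; j++) {
          c = (i <= j) ? count[i, j] : count[j, i]
          printf "%s%d", (j > first ? "\t" : ""), c
        }
        printf "\n"
      }
    }
  ' "$1"
}
```

It could be called as pairwise_matches phased.txt > matrix.tsv. The work still grows with rows times columns squared, so on a 147189 x 828 file this may take a long time in awk; the gain over the nested shell loop is that the file is read once rather than once per column pair.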

 

A snippet of the data: 

# sampleID     HGDP00511  HGDP00511  HGDP00512  HGDP00512  HGDP00513  HGDP00513
M rs4124251    0          0          A          G          0          A
M rs6650104    0          A          C          T          0          0
M rs12184279   0          0          G          A          T          0
...


Please always show a snippet of the data. I have no idea what a phased Beagle file is, but I can help you with the comparison.

written 3.5 years ago by Sukhdeep Singh

Hi Sukhdeep, 

Thanks for reaching out. I have posted a snippet of the sample data. Your help is much appreciated. 

 

written 3.5 years ago by aritra90

aritra90 wrote, 3.5 years ago:
SOLVED. 

I was missing the \$$ 

Thanks :)
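
For anyone landing here later: the fix presumably refers to escaping the dollar sign inside the double-quoted awk program. Without the escape, the shell expands "$i == $j" to something like "3 == 4", so awk compares two constants instead of two fields. With \$$i the shell emits a literal "$" followed by the loop value, and awk sees $3 == $4. A minimal sketch of the corrected loop, wrapped in a hypothetical function count_pairs so the file name and column numbers are arguments:

```shell
#!/bin/bash
# Sketch of the corrected loop. count_pairs is a hypothetical helper name.
# Usage: count_pairs FILE COL [COL ...]
count_pairs() {
  local file=$1; shift
  local i j
  for i in "$@"; do
    for j in "$@"; do
      printf '%s\t%s\t' "$i" "$j"
      # \$ reaches awk as a literal "$"; $i and $j are expanded by the shell,
      # so awk receives a program such as: $3 == $4 { n++ } END { print n + 0 }
      awk "\$$i == \$$j { n++ } END { print n + 0 }" "$file"
    done
  done
}
```

Calling count_pairs phased.txt 3 4 5 6 7 8 9 prints one line per pair with the two column numbers and the match count. Header lines are counted like any other line, and the file is re-read for every pair, so this only suits a handful of columns.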
