Question: (Closed) How to repeat a row for each value of its corresponding column
ak93sharma wrote:

Hello folks, I am not an expert in programming. I want to repeat a row once for each of its column values, so that every value ends up on its own line next to the row's position. Any help would be appreciated.

My input file looks like this:

pos COL1    COL2    COL3
18691441    C   A   G
18691572    G   C   G
18691620    A   T   G
18691716    C   G   C

I want output like this:

pos COL1    
18691441    C   
18691441    A   
18691441    G   
18691572    G   
18691572    C   
18691572    G   
18691620    A   
18691620    T   
18691620    G   
18691716    C   
18691716    G   
18691716    C

I am trying to repeat each row, but my attempt simply duplicates the whole line. I am using:

while read line; do for i in {1..3}; do echo "$line"; done; done < real2.txt

which gives this output:

pos COL1    COL2    COL3
18691441    C   A   G
18691441    C   A   G
18691441    C   A   G
18691572    G   C   G
18691572    G   C   G
18691572    G   C   G
18691620    A   T   G
18691620    A   T   G
18691620    A   T   G
18691716    C   G   C
18691716    C   G   C
18691716    C   G   C

Then I extracted the pos column from the input file 1.txt into 1_pos.txt (shown first below) and wrote something like this:

 pos
18691441
18691572
18691620
18691716

for i in `cat 1_pos.txt`;
do
x=$(grep -i "^$i" 1.txt | awk 'FNR == 1 {print $1"\t"$2}' ) ;
y=$(grep -i "^$i" 1.txt | awk 'FNR == 1 {print $1"\t"$3}' ) ;
z=$(grep -i "^$i" 1.txt | awk 'FNR == 1 {print $1"\t"$4}' ) ;
echo -e "$x""\n""$y""\n""$z";
done

This gives me the output I want, but without the column header:

18691441    C
18691441    A
18691441    G
18691572    G
18691572    C
18691572    G
18691620    A
18691620    T
18691620    G
18691716    C
18691716    G
18691716    C
Tags: bash, shell, awk, linux, perl
Written 2.9 years ago by ak93sharma, modified 2.9 years ago by cpad0112

Hello ak93sharma!

We believe that this post does not fit the main topic of this site.

This is a general programming question. Please search StackOverflow.

For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.

If you disagree please tell us why in a reply below, we'll be happy to talk about it.

Cheers!

Reply written 2.9 years ago by RamRS

Hi ak93sharma,

This is more of a general programming question than a bioinformatics-specific one. In future, questions like this should be posted to Stack Overflow or a similar forum. Regardless, here's a link to a quick Python script to do what you want. Have a good day.

Reply written 2.9 years ago by James Ashmore

Thanks @James, but I want the column names too. Any help?

Reply written 2.9 years ago by ak93sharma
cpad0112 (India) wrote:

For R users:

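Data:

       pos COL1 COL2 COL3
1 18691441    C    A    G
2 18691572    G    C    G
3 18691620    A    T    G
4 18691716    C    G    C

  1. Save it as a tab-separated file (named test.csv here).
  2. Run this code in R:

library(tidyr)
library(dplyr)
# Read the tab-separated table and keep the letters as plain strings
test <- read.csv("test.csv", sep = "\t", stringsAsFactors = FALSE)
# Stack COL1..COL3 into a single COL column, keep pos and COL, then sort by pos
final <- arrange(gather(test, "column", "COL", COL1:COL3)[, c(1, 3)], pos)
final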

Final output:

        pos COL
1  18691441   C
2  18691441   A
3  18691441   G
4  18691572   G
5  18691572   C
6  18691572   G
7  18691620   A
8  18691620   T
9  18691620   G
10 18691716   C
11 18691716   G
12 18691716   C
Written 2.9 years ago by cpad0112, modified 2.9 years ago
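
Editor's note: the R answer above keeps only the pos and value columns, so the output no longer says which of COL1..COL3 each value came from. If "column information" is read as wanting that source column name on every line, a minimal sketch in Python could look like the following. The function name `melt`, the `id_column` parameter, and the output header `pos/column/value` are assumptions made for illustration and do not appear anywhere in the thread.

```python
import csv
import io


def melt(handle, id_column="pos"):
    """Yield (id, column_name, value) for every non-id column of each row."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        # fieldnames preserves the column order of the header line
        for name in reader.fieldnames:
            if name != id_column:
                yield row[id_column], name, row[name]


if __name__ == "__main__":
    # Two rows of the sample table from the question
    text = "pos\tCOL1\tCOL2\tCOL3\n18691441\tC\tA\tG\n18691572\tG\tC\tG\n"
    print("pos\tcolumn\tvalue")
    for pos, name, value in melt(io.StringIO(text)):
        print(pos, name, value, sep="\t")
```

To run it on a real file, pass an open file handle instead of the `io.StringIO` object. Rows are emitted in input order, so no sorting step is needed.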
5heikki (Finland) wrote:

awk 'BEGIN{OFS=FS="\t"}{for(i=2;i<=NF;i++)print $1,$i}' fileWoutHeader
Written 2.9 years ago by 5heikki

Thanks, can you add the column information too?

Reply written 2.9 years ago by ak93sharma
awk 'BEGIN{OFS=FS="\t"}NR==1{print $1,$2}NR>1{for(i=2;i<=NF;i++)print $1,$i}' file
Reply written 2.9 years ago by 5heikki
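
Editor's note: for readers who prefer Python, the logic of the awk one-liner above can be sketched roughly as follows. The header line keeps its first two fields, and every other line is expanded into one line per value column. The function name `repeat_rows` and its `sep` parameter are assumptions made for illustration, not part of any answer in this thread.

```python
def repeat_rows(lines, sep="\t"):
    """Yield one 'pos<sep>value' line per value column of each input row."""
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(sep)
        if number == 1:
            # Header: keep the first two names, like the awk NR==1 rule
            yield sep.join(fields[:2])
        else:
            # Data row: pair the first field with every remaining field
            for value in fields[1:]:
                yield sep.join([fields[0], value])


if __name__ == "__main__":
    # Two rows of the sample table from the question
    table = [
        "pos\tCOL1\tCOL2\tCOL3",
        "18691441\tC\tA\tG",
        "18691572\tG\tC\tG",
    ]
    for out in repeat_rows(table):
        print(out)
```

Because an open file is an iterable of lines, `repeat_rows(open("file"))` should work unchanged on a tab-separated file.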
Buffo wrote:

This is a Python script. Save it as fast_script.py and run it as described in the usage message :)

import sys

##########################################################################################
syntax = '''
------------------------------------------------------------------------------------

Usage: python fast_script.py table.txt

*Values in table must be separated by tabulation space
*Result is written to stdout
------------------------------------------------------------------------------------
'''
##########################################################################################

if len(sys.argv) != 2:
    print(syntax)
    sys.exit(1)

##########################################################################################

# Stream the table and print one "pos<TAB>value" line per value column.
# Rows are written in input order, and repeated positions are kept.
with open(sys.argv[1], 'r') as in_file:
    for line in in_file:
        if line.startswith('pos'):
            continue  # skip the header line
        col = line.rstrip('\n').split('\t')
        numb = col[0]
        nucls = col[1:]
        for nucl in nucls:
            print(numb + '\t' + nucl)
Written 2.9 years ago by Buffo, modified 2.9 years ago
The thread is closed. No new answers may be added.