Question: (Closed) How to repeat a row for each value of its corresponding column
ak93sharma10 wrote:

Hello folks, I am not an expert in programming. I want to repeat a row once for each value in its columns. Any help would be appreciated.

My input file looks like this:

pos COL1    COL2    COL3
18691441    C   A   G
18691572    G   C   G
18691620    A   T   G
18691716    C   G   C

I want output like this:

pos COL1    
18691441    C   
18691441    A   
18691441    G   
18691572    G   
18691572    C   
18691572    G   
18691620    A   
18691620    T   
18691620    G   
18691716    C   
18691716    G   
18691716    C

I am trying to repeat each row, but my attempt simply duplicates the whole line. I am using:

while read line; do for i in {1..3}; do echo "$line"; done; done < real2.txt

which gives this output:

pos COL1    COL2    COL3
18691441    C   A   G
18691441    C   A   G
18691441    C   A   G
18691572    G   C   G
18691572    G   C   G
18691572    G   C   G
18691620    A   T   G
18691620    A   T   G
18691620    A   T   G
18691716    C   G   C
18691716    C   G   C
18691716    C   G   C

Then I extracted the pos column from the input file 1.txt into 1_pos.txt and wrote something like this:

 pos
18691441
18691572
18691620
18691716

for i in `cat 1_pos.txt`;
do
x=$(grep -i "^$i" 1.txt | awk 'FNR == 1 {print $1"\t"$2}' ) ;
y=$(grep -i "^$i" 1.txt | awk 'FNR == 1 {print $1"\t"$3}' ) ;
z=$(grep -i "^$i" 1.txt | awk 'FNR == 1 {print $1"\t"$4}' ) ;
echo -e "$x""\n""$y""\n""$z";
done

This gives me the output, but without the column header:

18691441    C
18691441    A
18691441    G
18691572    G
18691572    C
18691572    G
18691620    A
18691620    T
18691620    G
18691716    C
18691716    G
18691716    C
bash shell awk linux perl • 609 views
ADD COMMENTlink modified 23 months ago by cpad011211k • written 23 months ago by ak93sharma10
1

Hello ak93sharma!

We believe that this post does not fit the main topic of this site.

This is a general programming question. Please search StackOverflow.

For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.

If you disagree please tell us why in a reply below, we'll be happy to talk about it.

Cheers!

ADD REPLYlink written 23 months ago by RamRS21k

Hi ak93sharma,

This is more of a general programming question than a bioinformatics-specific one. In future, questions like these should be posted to Stack Overflow or similar forums. Regardless, here's a link to a quick Python script to do what you want. Have a good day.

ADD REPLYlink written 23 months ago by James Ashmore2.6k

Thanks @James, but I want the column names too. Any help?

ADD REPLYlink written 23 months ago by ak93sharma10
cpad011211k wrote:

For R users:

data

 pos COL1 COL2 COL3
1 18691441    C    A    G
2 18691572    G    C    G
3 18691620    A    T    G
4 18691716    C    G    C
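  1. Saved it as a TSV file (named test.csv here)
  2. Code in R
library(tidyr)
library(dplyr)
# Read the tab-separated file
test <- read.csv("test.csv", sep = "\t", stringsAsFactors = FALSE)
# Reshape wide to long, keep only pos and the gathered values, then sort by pos
# (the key column gets a placeholder name, "key", and is dropped by the [, c(1, 3)] subset)
final <- arrange(gather(test, "key", "COL", COL1:COL3)[, c(1, 3)], pos)
final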
  

Final output:

        pos COL
1  18691441   C
2  18691441   A
3  18691441   G
4  18691572   G
5  18691572   C
6  18691572   G
7  18691620   A
8  18691620   T
9  18691620   G
10 18691716   C
11 18691716   G
12 18691716   C
ADD COMMENTlink modified 23 months ago • written 23 months ago by cpad011211k
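
For anyone who wants the same reshape-then-sort idea outside R, here is a rough Python sketch. It is not how tidyr or dplyr work internally; it only mirrors the two steps above: stack the value columns one after another, then sort by pos. The function name gather_then_sort and the list-of-rows input are made up for illustration. Python's sorted() is stable, so within one pos the values stay in COL1, COL2, COL3 order.

```python
def gather_then_sort(table):
    """Reshape a wide table (list of rows, header first) into sorted (pos, value) pairs."""
    header, rows = table[0], table[1:]
    long_rows = []
    # Stack column by column: all values of the first value column, then the second, ...
    for col in range(1, len(header)):
        for row in rows:
            long_rows.append((row[0], row[col]))
    # sorted() is stable, so rows sharing a pos keep their column order
    return sorted(long_rows, key=lambda pair: int(pair[0]))


# Small sample taken from the question's data
table = [
    ["pos", "COL1", "COL2", "COL3"],
    ["18691441", "C", "A", "G"],
    ["18691572", "G", "C", "G"],
]
for pos, value in gather_then_sort(table):
    print(pos + "\t" + value)
```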
5heikki8.4k wrote:
awk 'BEGIN{OFS=FS="\t"}{for(i=2;i<=NF;i++)print $1,$i}' fileWoutHeader
ADD COMMENTlink written 23 months ago by 5heikki8.4k

Thanks, can you add the column information too?

ADD REPLYlink written 23 months ago by ak93sharma10
awk 'BEGIN{OFS=FS="\t"}NR==1{print $1,$2}NR>1{for(i=2;i<=NF;i++)print $1,$i}' file
ADD REPLYlink written 23 months ago by 5heikki8.4k
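
If awk is not an option, the same logic could be sketched in Python roughly as follows. This is only an illustration of the one-liner above, assuming tab-separated input with a single header line; the function name melt_lines is made up.

```python
def melt_lines(lines, sep="\t"):
    """Yield one 'pos<sep>value' line per value column, with a two-column header."""
    for n, line in enumerate(lines):
        fields = line.rstrip("\n").split(sep)
        if n == 0:
            # Header: keep only the first two names, like NR==1{print $1,$2}
            yield sep.join(fields[:2])
        else:
            # Data row: pair the first field with every remaining field
            for value in fields[1:]:
                yield sep.join((fields[0], value))


# Small sample taken from the question's data
sample = ["pos\tCOL1\tCOL2\tCOL3\n", "18691441\tC\tA\tG\n"]
for out in melt_lines(sample):
    print(out)
```

Passing an open file handle instead of the sample list should work the same way, since the function only iterates over lines.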
Buffo1.5k wrote:

This is a Python script. Save it as fast_script.py and run it as described in the usage message :)

import sys

##########################################################################################
syntax = '''
------------------------------------------------------------------------------------

Usage: python fast_script.py table.txt

*Values in table must be separated by tabulation space
*Result is written to stdout
------------------------------------------------------------------------------------
'''
##########################################################################################

# Show the usage message and stop unless exactly one input file is given
if len(sys.argv) != 2:
        print(syntax)
        sys.exit(1)

##########################################################################################

# Stream the file so rows keep their input order and repeated positions are not lost
with open(sys.argv[1], 'r') as in_file:
        for line in in_file:
                # Skip the header line
                if line.startswith('pos'):
                        continue
                col = line.rstrip('\n').split('\t')
                numb = col[0]
                nucls = col[1:]
                # One output line per value: position, tab, value
                for num in nucls:
                        print(numb + '\t' + num)
ADD COMMENTlink modified 23 months ago • written 23 months ago by Buffo1.5k
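
It is not entirely clear whether "column names too" meant the header line or the name of the column each value came from. In case it is the latter, here is a small sketch that writes three columns: the position, the source column name, and the value. The function name melt_with_names and the output header names "column" and "value" are made up for illustration; tab-separated input with one header line is assumed.

```python
def melt_with_names(lines, sep="\t"):
    """Yield 'pos<sep>column<sep>value' lines, one per value in the table."""
    rows = (line.rstrip("\n").split(sep) for line in lines)
    header = next(rows, None)
    if header is None:
        return  # empty input, nothing to write
    # Output header; "column" and "value" are illustrative names
    yield sep.join((header[0], "column", "value"))
    for fields in rows:
        # Pair each value with the name of the column it came from
        for name, value in zip(header[1:], fields[1:]):
            yield sep.join((fields[0], name, value))


# Small sample taken from the question's data
sample = ["pos\tCOL1\tCOL2\tCOL3\n", "18691441\tC\tA\tG\n"]
for out in melt_with_names(sample):
    print(out)
```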
The thread is closed. No new answers may be added.
