Question: How to merge two files (genotype and ped) in Linux?
mm wrote:

How can I merge two files, a genotype file and a pre-ped file, in Linux? My sample files are as follows.

pre-ped file:

S949C08 111071 900533 900409 Susceptible 2
S949G08 111064 900533 900469 Susceptible 2
S949E09 111051 910054 890231 Susceptible 2
S949209 111049 910054 910087 Susceptible 2
R949C06 111034 920283 920207 Susceptible 1

genotype file (one animal's genotype as an example):

R949C06 TC TT CC TC CC TT GG CC AG TT AA GG AA TT CC TC -- CC TC GC TC AA TC AG AG TC AA AA AA AG TC CC AT AA TT AA TT GG AA TC AG TC TA TA AG -- TG TT -- AA -- TT TT CC AG GG TC GG CC AA -- CC AC AA GG -- AA CC CC AA TC AG AA TC CG TT GG CC TT GG AG GG TT AA CC AA CC TC AG GG TC AG AG AG GC CC AG GG AA TC GG AA AA GG TC AG CC AG CC TC AA CC CC CC GG CC AG CC CC AG AC CC GG TT CC AG CC AA TC TT GG AG GG CC TC TC AA GG CC TC AG AG TT GG TG AG AA TG
Tags: linux, plink
written 2.4 years ago by mm, modified 15 months ago by mittu1602

See man join, noting that both files must be sorted on the join field before use.

written 2.4 years ago by Devon Ryan

I do not understand what you are saying.

written 2.4 years ago by mm

Devon is asking you to read the manual for the Linux command 'join': https://linux.die.net/man/1/join

"join - join lines of two files on a common field "

noting that you need to sort the files before use.

"Important: FILE1 and FILE2 must be sorted on the join fields. "
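As a rough sketch of the sort-then-join approach described above: the file names (preped.txt, genotype.txt, merged.txt) and the shortened genotype rows are made up for illustration, and the files are assumed to be whitespace-delimited with the animal ID in the first column.

```shell
# Hypothetical small inputs; real genotype rows are much longer.
printf '%s\n' \
  'S949C08 111071 900533 900409 Susceptible 2' \
  'R949C06 111034 920283 920207 Susceptible 1' > preped.txt
printf '%s\n' \
  'R949C06 TC TT CC' \
  'S949C08 AA AG GG' > genotype.txt

# join requires both inputs to be sorted on the join field (field 1 here).
# LC_ALL=C makes sort and join agree on the collation order.
LC_ALL=C sort -k1,1 preped.txt > preped.sorted.txt
LC_ALL=C sort -k1,1 genotype.txt > genotype.sorted.txt

# Join on the first field of each file.
# Each output line is: ID, pre-ped columns, genotype columns.
LC_ALL=C join preped.sorted.txt genotype.sorted.txt > merged.txt
cat merged.txt
```

By default join prints only IDs that appear in both files, so animals without a genotype row are dropped from merged.txt.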

written 2.4 years ago by Pierre Lindenbaum

The SNP chip ID is common to both files. That part is done; now I want to join them.

written 2.4 years ago by mm

mittu1602 wrote:

You can try an awk one-liner:

cat Test1.txt

S949C08 111071 900533 900409 Susceptible 2
S949G08 111064 900533 900469 Susceptible 2
S949E09 111051 910054 890231 Susceptible 2
S949209 111049 910054 910087 Susceptible 2
R949C06 111034 920283 920207 Susceptible 1

cat Test2.txt

R949C06 TC TT CC TC CC TT GG CC AG TT AA GG AA TT CC TC -- CC TC GC TC AA TC AG AG TC AA AA AA AG TC CC AT AA TT AA TT GG AA TC AG TC TA TA AG -- TG TT -- AA -- TT TT CC AG GG TC GG CC AA -- CC AC AA GG -- AA CC CC AA TC AG AA TC CG TT GG CC TT GG AG GG TT AA CC AA CC TC AG GG TC AG AG AG GC CC AG GG AA TC GG AA AA GG TC AG CC AG CC TC AA CC CC CC GG CC AG CC CC AG AC CC GG TT CC AG CC AA TC TT GG AG GG CC TC TC AA GG CC TC AG AG TT GG TG AG AA TG

awk 'FNR==NR{a[$1]=$2 FS $3 FS $4 FS $5 FS $6;next}{ print $0, a[$1]}' Test1.txt Test2.txt > result.txt

cat result.txt

R949C06 TC TT CC TC CC TT GG CC AG TT AA GG AA TT CC TC -- CC TC GC TC AA TC AG AG TC AA AA AA AG TC CC AT AA TT AA TT GG AA TC AG TC TA TA AG -- TG TT -- AA -- TT TT CC AG GG TC GG CC AA -- CC AC AA GG -- AA CC CC AA TC AG AA TC CG TT GG CC TT GG AG GG TT AA CC AA CC TC AG GG TC AG AG AG GC CC AG GG AA TC GG AA AA GG TC AG CC AG CC TC AA CC CC CC GG CC AG CC CC AG AC CC GG TT CC AG CC AA TC TT GG AG GG CC TC TC AA GG CC TC AG AG TT GG TG AG AA TG 111034 920283 920207 Susceptible 1
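The one-liner above appends the pre-ped columns after the genotypes. If you would rather have the pre-ped columns first, a variation along these lines might work. It is only a sketch: the genotype row is shortened, result_ped_first.txt is a made-up output name, and it assumes a single space after the ID with no leading whitespace.

```shell
# Hypothetical small inputs reusing the Test1.txt / Test2.txt names from above.
printf '%s\n' \
  'S949C08 111071 900533 900409 Susceptible 2' \
  'R949C06 111034 920283 920207 Susceptible 1' > Test1.txt
printf '%s\n' 'R949C06 TC TT CC' > Test2.txt

# First pass (Test2.txt): store everything after the ID, keyed by ID.
# Second pass (Test1.txt): print the pre-ped line followed by the genotypes,
# but only for IDs that were seen in the genotype file.
awk 'FNR==NR { geno[$1] = substr($0, length($1) + 2); next }
     ($1 in geno) { print $0, geno[$1] }' Test2.txt Test1.txt > result_ped_first.txt

cat result_ped_first.txt
```

Because of the ($1 in geno) test, animals without a genotype row are skipped rather than printed with an empty genotype field. No sorting is needed here, since awk looks IDs up in an in-memory array.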
written 15 months ago by mittu1602