Question: compare two text files
3.1 years ago, Sam wrote:

Hello all, could you please help me with this? I have two tab-delimited files. I want to compare them to find the IDs common to both files, keeping only the rows of the second file whose 7th column contains one or all of these reports: "Luciferase reporter assay//qRT-PCR//Western blot". For instance:

1st file 
hsa-miR-654-5p
hsa-miR-182-5p

 2nd file

MIRT733442  hsa-miR-654-5p      Homo sapiens    RPS6KB1 6198    Homo sapiens    PAR-CLIP
MIRT733429  hsa-miR-654-5p      Homo sapiens    EPSTI1  94240   Homo sapiens    Luciferase reporter assay//qRT-PCR//Western blot

Output
MIRT733429  hsa-miR-654-5p      Homo sapiens    EPSTI1  94240   Homo sapiens    Luciferase reporter assay//qRT-PCR//Western blot

Thanks in advance

bash awk • 1.0k views

in bash shell with grep:

 $ grep -w "blot$" test2.txt  | grep -f test1.txt

Here test1.txt is the 1st file and test2.txt is the 2nd file from the OP. This assumes that every line of interest ends with "Western blot" and that no other line contains the word "blot".

with awk:

 $ awk 'FNR==NR {a[$1]; next} ($2 in a) && /blot$/' test1.txt test2.txt

with join:

 $ sed -n '/blot$/p' test2.txt | join --nocheck-order -1 1 -2 2 test1.txt -
written 3.1 years ago by cpad0112
grep -f test1.txt test2.txt | grep "Luciferase reporter assay\|qRT-PCR\|Western blot"
written 3.1 years ago by lessismore
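
Since grep -f treats each ID as a pattern that may match anywhere in a line, a stricter variant compares column 2 exactly and tests only column 7. What follows is a minimal sketch, assuming both files are tab-delimited as the question states, that test1.txt holds one ID per line, and that the assay names appear in column 7 joined by "//" as in the example. The sample rows are copied from the question; the temporary directory and the `result` variable exist only for the demonstration.

```shell
#!/bin/sh
# Demo data copied from the question (a 7-column tab-delimited layout is assumed).
tmp=$(mktemp -d)
printf 'hsa-miR-654-5p\nhsa-miR-182-5p\n' > "$tmp/test1.txt"
printf 'MIRT733442\thsa-miR-654-5p\tHomo sapiens\tRPS6KB1\t6198\tHomo sapiens\tPAR-CLIP\n' > "$tmp/test2.txt"
printf 'MIRT733429\thsa-miR-654-5p\tHomo sapiens\tEPSTI1\t94240\tHomo sapiens\tLuciferase reporter assay//qRT-PCR//Western blot\n' >> "$tmp/test2.txt"

# First file: remember every ID.
# Second file: print a row only when column 2 is a known ID
# and column 7 names at least one of the three assays.
result=$(awk -F'\t' '
    FNR == NR { ids[$1]; next }
    ($2 in ids) && $7 ~ /Luciferase reporter assay|qRT-PCR|Western blot/
' "$tmp/test1.txt" "$tmp/test2.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With the two sample rows, only the MIRT733429 row is printed; the PAR-CLIP row is dropped because its 7th column names none of the assays.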

didn't see the second requirement in OP.

written 3.1 years ago by cpad0112

Thanks lessismore, you are a real biostars :)

written 3.1 years ago by Sam

In R

# file1 and file2 are data frames, e.g. read with read.delim(..., header=FALSE)
test <- merge(file1, file2, by.x="column_1stfile", by.y="column_2ndfile", all.x=TRUE)
written 3.1 years ago by lessismore

The 2nd file is huge, and with grep -f test1.txt test2.txt > out.put the output contains all the data of test2.txt, not just the data common to both files. How can I solve this problem? For instance, the output file contains hsa-miR-99, but it is not in the test1.txt file.

Output:

MIRT027394,hsa-miR-99a-5p,Homo sapiens,AGO2,27161,Homo sapiens,Sequencing,Functional MTI (Weak),20371350
MIRT027394,hsa-miR-99a-5p,Homo sapiens,AGO2,27161,Homo sapiens,Luciferase reporter assay//qRT-PCR//Western blot,Functional MTI,24732044
MIRT027395,hsa-miR-99a-5p,Homo sapiens,MEF2D,4209,Homo sapiens,Sequencing,Functional MTI (Weak),20371350
MIRT027396,hsa-miR-99a-5p,Homo sapiens,SKI,6497,Homo sapiens,Sequencing,Functional MTI (Weak),20371350
MIRT027397,hsa-miR-99a-5p,Homo sapiens,COQ2,27235,Homo sapiens,Sequencing,Functional MTI (Weak),20371350
MIRT027398,hsa-miR-99a-5p,Homo sapiens,TRIB1,10221,Homo sapiens,Sequencing,Functional MTI (Weak),20371350
MIRT027398,hsa-miR-99a-5p,Homo sapiens,TRIB1,10221,Homo sapiens,PAR-CLIP,Functional MTI (Weak),26701625
written 3.1 years ago by Sam

I also tried this command:

grep "Luciferase reporter assay\|qRT-PCR\|Western blot" text2.txt | grep -f test.1.text > 3

but again the output file contains hsa-miR-99!

written 3.1 years ago by Sam
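
The thread does not establish why grep -f returned rows for IDs missing from the list. Two behaviours of grep -f are worth ruling out, offered here as possibilities rather than a diagnosis: an empty line in the pattern file matches every input line, and each pattern matches as a substring, so a shorter ID can match inside a longer one. A minimal sketch that sidesteps both by comparing the ID column exactly, assuming the pasted rows reflect a comma-separated file with the ID in column 2 and the assays in column 7. It also strips a trailing carriage return in case the ID list was saved with Windows line endings, which is another assumption. The file names `ids.txt` and `targets.csv` are hypothetical.

```shell
#!/bin/sh
tmp=$(mktemp -d)
# Hypothetical ID list containing a blank line and Windows line endings.
printf 'hsa-miR-654-5p\r\n\r\nhsa-miR-182-5p\r\n' > "$tmp/ids.txt"
# First row copied from the output above; second row adapted from the
# question's example (7 columns only, comma-separated layout assumed).
cat > "$tmp/targets.csv" <<'EOF'
MIRT027394,hsa-miR-99a-5p,Homo sapiens,AGO2,27161,Homo sapiens,Luciferase reporter assay//qRT-PCR//Western blot,Functional MTI,24732044
MIRT733429,hsa-miR-654-5p,Homo sapiens,EPSTI1,94240,Homo sapiens,Luciferase reporter assay//qRT-PCR//Western blot
EOF

# Naive comma splitting: assumes no field contains a quoted comma.
result=$(awk -F',' '
    { sub(/\r$/, "") }                          # drop a trailing carriage return, if any
    FNR == NR { if ($1 != "") ids[$1]; next }   # load IDs, skipping blank lines
    ($2 in ids) && $7 ~ /Luciferase reporter assay|qRT-PCR|Western blot/
' "$tmp/ids.txt" "$tmp/targets.csv")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Here the hsa-miR-99a-5p row is not printed, because that ID is absent from the list and column 2 must match a listed ID in full.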
3.1 years ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

"find common Id between 2 file"

it's a job for comm https://linux.die.net/man/1/comm

comm -12 <(sort -u file1.txt) <(cut -f 2 file2.txt | sort -u)
written 3.1 years ago by Pierre Lindenbaum

Thanks for your comment, but I want output only for rows that contain one or all of these strings, "Luciferase reporter assay//qRT-PCR//Western blot", not just the common IDs.

written 3.1 years ago by Sam

Then it's join: follow genomax's link.

written 3.1 years ago by Pierre Lindenbaum

Here is the link to that thread: How to retrieve rows from OTU table

written 3.1 years ago by genomax
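
The linked thread is not reproduced here, so the following is only a minimal sketch of how join could be applied to this question, assuming tab-delimited files with the ID in column 1 of the first file and column 2 of the second. join expects both inputs sorted on the join field, so each is sorted first under the same locale. The sample rows come from the question; the intermediate file names are hypothetical.

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL          # same collation for sort and join
tmp=$(mktemp -d)
tab=$(printf '\t')

printf 'hsa-miR-654-5p\nhsa-miR-182-5p\n' > "$tmp/test1.txt"
printf 'MIRT733442\thsa-miR-654-5p\tHomo sapiens\tRPS6KB1\t6198\tHomo sapiens\tPAR-CLIP\n' > "$tmp/test2.txt"
printf 'MIRT733429\thsa-miR-654-5p\tHomo sapiens\tEPSTI1\t94240\tHomo sapiens\tLuciferase reporter assay//qRT-PCR//Western blot\n' >> "$tmp/test2.txt"

# Sorted, de-duplicated ID list.
sort -u "$tmp/test1.txt" > "$tmp/ids.sorted"

# Keep rows whose 7th column names one of the assays, then sort on column 2.
awk -F'\t' '$7 ~ /Luciferase reporter assay|qRT-PCR|Western blot/' "$tmp/test2.txt" \
    | sort -t "$tab" -k2,2 > "$tmp/filtered.sorted"

# Join column 1 of the ID list with column 2 of the filtered rows.
# join prints the join field first, followed by the remaining fields.
result=$(join -t "$tab" -1 1 -2 2 "$tmp/ids.sorted" "$tmp/filtered.sorted")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Note that the output column order differs from the input: the miRNA ID comes first, then the MIRT accession and the other fields of the second file.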