Question: sequence extract problem
15 months ago, mxlsherry1992 wrote:

Dear all, I have some IDs in file1, and I want to extract the corresponding lines from file2, but the IDs in the two files do not match completely. Does anyone know a command line I could use for that?

I have two command lines here, but they don't seem to work.

grep -Fwf file1.txt file2.txt > results

awk 'NR==FNR{x[$0];next}{for(i in x)if($0~i)print}' file1.txt file2.txt

Here are the IDs from file 1:

TRINITY_DN100263_c0_g1_i13
TRINITY_DN100263_c1_g1_i1
TRINITY_DN100330_c0_g1_i1
TRINITY_DN100330_c0_g2_i14
TRINITY_DN100529_c0_g1_i3
TRINITY_DN100620_c0_g1_i2

Here is file 2:

TRINITY_DN132010_c5_g4  0   0   0   0   0.18    0.93    0.67    0.61    0   0.45    00.25   0   0
TRINITY_DN100263_c1_g1  0.08    0.06    0.06    0.09    0.1 0.07    0.43    0.2 0.16    0.36    0.06    0.42    0   0
TRINITY_DN50647_c0_g1   0   0   0   0.9 0   0   0   0   0   0   00
TRINITY_DN100330_c0_g2  0   0   0   0   0   0   0   0   0   0   01.06   0   0
TRINITY_DN137407_c4_g1  0   0   0.19    0   0   0   0.17    0.15    0   0.12    0.
15 months ago, JC (Mexico) wrote:

You need to remove the non-matching part (the trailing _i<number> suffix) from the IDs in the first file before doing your search, for example:

perl -pe 's/_i\d+$//' file1.txt > file1_mod.txt

Then you can search with grep, awk, or perl.
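To make the two steps concrete, here is a minimal sketch, not a tested pipeline for the real data. It assumes that file2 is tab-separated with the gene ID in the first column, and that every ID in file1 ends in an isoform suffix of the form _i<number>. The scratch directory, the shortened sample files (first three value columns only, copied from the question), and the output name results.txt are illustration choices, not part of the original post.

```shell
# Work in a scratch directory so the demo does not overwrite real files.
cd "$(mktemp -d)" || exit 1

# Sample inputs copied from the question (file2 shortened to three value columns).
printf '%s\n' TRINITY_DN100263_c0_g1_i13 TRINITY_DN100263_c1_g1_i1 \
    TRINITY_DN100330_c0_g1_i1 TRINITY_DN100330_c0_g2_i14 > file1.txt
printf '%s\t%s\t%s\t%s\n' \
    TRINITY_DN132010_c5_g4 0 0 0 \
    TRINITY_DN100263_c1_g1 0.08 0.06 0.06 \
    TRINITY_DN50647_c0_g1 0 0 0 \
    TRINITY_DN100330_c0_g2 0 0 0 > file2.txt

# Step 1: strip the trailing _i<number> suffix and drop duplicate gene IDs.
sed 's/_i[0-9][0-9]*$//' file1.txt | sort -u > file1_mod.txt

# Step 2: keep the file2 lines whose first field is exactly one of those gene IDs.
awk -F '\t' 'NR==FNR { ids[$1]; next } $1 in ids' file1_mod.txt file2.txt > results.txt

cat results.txt
```

Matching on the whole first field avoids partial hits such as g1 inside g10. Once the suffix is removed, grep -Fwf file1_mod.txt file2.txt should also work, since underscores and digits count as word characters for -w.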

15 months ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:
join -t $'\t' -1 1 -2 1 <(sort -t $'\t' -k1,1 file1.txt) <(sort -t $'\t' -k1,1 file2.txt) > results
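Note that join only pairs lines whose key fields are identical, so with the IDs shown in the question the _i<number> suffix would probably have to be removed from file1 first. The following is a hedged sketch of that variant, assuming file2 is tab-separated. The scratch directory, the shortened sample data (first three value columns from the question), and the intermediate file names are made up for illustration. It uses temporary files instead of process substitution so it also runs in a plain POSIX shell.

```shell
# Work in a scratch directory so the demo does not overwrite real files.
cd "$(mktemp -d)" || exit 1

# Use byte-wise collation so sort and join agree on the ordering.
export LC_ALL=C
tab=$(printf '\t')

# Sample inputs copied from the question (file2 shortened to three value columns).
printf '%s\n' TRINITY_DN100263_c0_g1_i13 TRINITY_DN100263_c1_g1_i1 \
    TRINITY_DN100330_c0_g1_i1 TRINITY_DN100330_c0_g2_i14 > file1.txt
printf '%s\t%s\t%s\t%s\n' \
    TRINITY_DN132010_c5_g4 0 0 0 \
    TRINITY_DN100263_c1_g1 0.08 0.06 0.06 \
    TRINITY_DN50647_c0_g1 0 0 0 \
    TRINITY_DN100330_c0_g2 0 0 0 > file2.txt

# Reduce isoform IDs to gene IDs; sort -u both sorts and removes duplicates.
sed 's/_i[0-9][0-9]*$//' file1.txt | sort -u > ids.sorted.txt

# join needs both inputs sorted on the join field (column 1).
sort -t "$tab" -k1,1 file2.txt > file2.sorted.txt

# Print the file2 rows whose first column appears in the gene ID list.
join -t "$tab" -1 1 -2 1 ids.sorted.txt file2.sorted.txt > results.txt

cat results.txt
```

Unlike the grep or awk approaches, the output comes back in sorted key order rather than in the original order of file2.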