grep multiple lines from file
4.0 years ago
harry ▴ 30

I have an output.txt file containing 20,000 strings, a sample of which is shown below. It is a genome coordinate file.

X:10063445|10067490
X:10063445|10079804
X:10063445|10098579
X:10063445|10110020
X:100914621|100915083
X:101018971|101019403
X:101018971|101020588
X:101020487|101021315
X:101020487|101023616
X:101035613|101038051
X:101035613|101041371
X:101035613|101042312
X:101041317|101042312
X:101101058|101109998
X:101101058|101148161
X:101120402|101120784
X:101126709|101148161
X:101895814|101898042

I want to grep these strings in my result.txt file, a 10-column text file whose 1st column looks like the entries in output.txt. I just want to know which of those strings are present or absent in result.txt, with that output saved to another text file. I also want to extract the first 5 columns of the rows in result.txt that match.

Thank you so much in advance.

shell-script grep • 1.1k views

You can work with bed files and use bedtools intersect. You can do some grep and awk magic (how do you define intersection? Do you want exactly the same coordinates?) but it will probably be messy.
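For the exact-match case, the grep/awk route mentioned above can be sketched roughly as follows. This assumes result.txt is tab-separated and that "present" means column 1 equals a line of output.txt exactly. The file names presence.txt and matched_first5.txt, and the placeholder values in result.txt, are made up for illustration.

```shell
# Toy stand-ins for the real files; the result.txt rows are invented placeholders.
printf '%s\n' 'X:10063445|10067490' 'X:10063445|10079804' 'X:100914621|100915083' > output.txt
printf 'X:10063445|10067490\ta\tb\tc\td\te\tf\tg\th\ti\n' > result.txt
printf 'X:100914621|100915083\tj\tk\tl\tm\tn\to\tp\tq\tr\n' >> result.txt
printf 'X:999|1000\ts\tt\tu\tv\tw\tx\ty\tz\tzz\n' >> result.txt

# 1) Label every query string as present or absent in column 1 of result.txt.
awk 'NR == FNR { seen[$1] = 1; next }
     { print $1 "\t" (($1 in seen) ? "present" : "absent") }' result.txt output.txt > presence.txt

# 2) Keep the first five columns of result.txt rows whose column 1 is a query string.
awk 'BEGIN { FS = OFS = "\t" }
     NR == FNR { want[$1] = 1; next }
     $1 in want { print $1, $2, $3, $4, $5 }' output.txt result.txt > matched_first5.txt
```

A plain `grep -F -f output.txt result.txt` would also find the lines, but it matches the strings anywhere in a line as substrings, so comparing column 1 in awk is the stricter option.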

I just want to know whether there is any way to include both intersecting and non-intersecting strings in my output intersect.bed file.

Please add representative output.

And a few lines from both input files.

Here is the 1st input file:

X   10063445    10098579    X:10063445|10098579 
X   101020487   101021315   X:101020487|101021315   
X   101041317   101042312   X:101041317|101042312   
X   101120402   101120784   X:101120402|101120784   
X   101126709   101148161   X:101126709|101148161   
X   107088436   107088839   X:107088436|107088839   
X   110020352   110067396   X:110020352|110067396

2nd input file:

X   10063445    10098579    2
X   11055936    11110981    2
X   13666317    13680598    5
X   14843660    14859334    13
X   14850505    14859334    5
X   16818574    16829770    2
X   19541925    19546050    4
X   19683823    19695741    4
X   19965044    19970298    2
X   20188497    20204103    2
X   24073601    24074959    11
X   24172715    24179770    9
X   24179183    24179770    2
X   24540246    24546477    2
X   24809898    24843677    4
X   24809898    24888122    3
X   38666121    38687674    2
X   44524002    44527365    8
X   45010961    45020730    3
X   45010961    45037689    2
X   46984884    46998277    2
X   47222261    47228644    2

Until now I have used bedtools intersect on both files, but it reports only the intersecting records, and I also want the non-intersecting ones in the same result file. I used this command:

bedtools intersect -wa -wb -a input1 -b input2 -f 1 -r >intersect.bed

So is there any way to include both intersecting and non-intersecting records in the same intersect.bed file? I want my result to look like this:

X   10063445    10098579    X:10063445|10098579         X   10063445    10098579    2
X   101020487   101021315   X:101020487|101021315   
X   101041317   101042312   X:101041317|101042312   X   101041317   101042312   3
X   101120402   101120784   X:101120402|101120784   
X   101126709   101148161   X:101126709|101148161   X   101126709   101148161   4
X   107088436   107088839   X:107088436|107088839   X   107088436   107088839   4
X   110020352   110067396   X:110020352|110067396   
X   110020352   110109146   X:110020352|110109146   X   110020352   110109146   3
X   110067347   110109146   X:110067347|110109146   X   110067347   110109146   4
X   11055936    11110981            X:11055936|11110981

So this is the output I expect, which includes both intersecting and non-intersecting records. Thanks.
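Since the command uses -f 1 -r, the match is effectively on identical coordinates, so one possible sketch is a left outer join in awk: every input1 row is printed, and the input2 row is appended only when chromosome, start and end all agree. This assumes both files are tab-separated; the toy rows are copied from the inputs above. bedtools intersect also documents a -loj (left outer join) option that reports every -a feature with a null -b entry when nothing overlaps, which may be worth trying in place of -wa -wb.

```shell
# Toy copies of a few rows from the two inputs (tab-separated).
printf 'X\t10063445\t10098579\tX:10063445|10098579\n' > input1
printf 'X\t101020487\t101021315\tX:101020487|101021315\n' >> input1
printf 'X\t10063445\t10098579\t2\n' > input2
printf 'X\t11055936\t11110981\t2\n' >> input2

awk 'BEGIN { FS = OFS = "\t" }
     # First file (input2): index each row by chrom, start, end.
     NR == FNR { hit[$1 FS $2 FS $3] = $0; next }
     # Second file (input1): always print the row, append the input2 row if found.
     { key = $1 FS $2 FS $3
       if (key in hit) print $1, $2, $3, $4, hit[key]
       else            print $1, $2, $3, $4 }' input2 input1 > intersect.bed
```

Two limits of this sketch: if input2 has several rows with the same coordinates only the last one is kept, and rows that occur only in input2 are not reported.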

I lost you. If you want the intersecting and non-intersecting, isn't it all of the input file?

No, what I posted contains only part of the input files; I didn't upload the whole input. Can you suggest how I can get both intersecting and non-intersecting records in the same file?

Take a look at this SO post, I think it answers your question: https://stackoverflow.com/questions/2619562/joining-multiple-fields-in-text-files-on-unix
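One common way to join text files on several fields with the standard `join` utility is to fold those fields into a single key first. The sketch below is not taken from the linked post; it assumes whitespace-separated inputs, reuses the "chrom:start|end" string that input1 already carries in column 4, and uses hypothetical intermediate file names (input1.keyed, input2.keyed, joined.txt).

```shell
# Toy copies of a few rows from the two inputs.
printf 'X\t10063445\t10098579\tX:10063445|10098579\n' > input1
printf 'X\t101020487\t101021315\tX:101020487|101021315\n' >> input1
printf 'X\t10063445\t10098579\t2\n' > input2
printf 'X\t11055936\t11110981\t2\n' >> input2

# Put the key first in both files and sort on it (join needs sorted input).
awk '{ print $4, $1, $2, $3 }' input1 | LC_ALL=C sort -k1,1 > input1.keyed
awk '{ print $1 ":" $2 "|" $3, $4 }' input2 | LC_ALL=C sort -k1,1 > input2.keyed

# -a 1 also prints input1 rows that have no partner in input2.
LC_ALL=C join -a 1 input1.keyed input2.keyed > joined.txt
```

Each output line starts with the key, followed by the input1 fields and, where a match exists, the input2 value. Adding `-a 2` as well would also list rows that occur only in input2.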

The query is confusing. Can you post a small worked-out example? @harry
