Question: How to remove all characters after a specific pattern?
star230 wrote (9 months ago):

I have a big table like 'df' below. I would like to remove everything from the first ':' onward in each field of every row.

I have tried:

cat df.bed | cut -f1 -d":" | head

and

cat df.bed | sed 's/:.*//' | head

but both removed everything after the first ':' on the line, including all of the later columns.

df:

rs1006501   T   A   0/0:14,0:14:42:0,42,596  A    0/0:5,0:5:15:0,15,177
rs1006502   NA  NA  NA                       C,T  ./.
rs1015190   NA  NA  NA                       T    1/1:0,2:2:6:75,6,0
rs10164686  G   A   0/0:1,0:1:3:0,3,46       NA   NA

desired output:

rs1006501   T   A   0/0  A    0/0
rs1006502   NA  NA  NA   C,T  ./.
rs1015190   NA  NA  NA   T    1/1
rs10164686  G   A   0/0  NA   NA

I would replace every instance of <TAB><SOMETHING_MINIMAL_WITH_NO_TABS>:<SOMETHING_GREEDY_WITH_NO_TABS> with <TAB><SOMETHING_MINIMAL_WITH_NO_TABS>, and I would do it in perl rather than sed, because sed is pretty ugly when matching on tabs.

(comment written 9 months ago by russhh)
finswimmer wrote (9 months ago):

Try this:

sed 's/:[^\t]*//g' df.bed
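
Some context on why this works where 's/:.*//' did not: '.' also matches tabs, so ':.*' runs to the end of the line, whereas '[^\t]*' stops at the next tab, and the g flag repeats the substitution for every field on the line. One caveat: \t inside a bracket expression is a GNU sed extension, and other sed implementations may read [^\t] as "neither a backslash nor the letter t". A minimal sketch of a more portable variant, assuming tab-delimited input (the sample file and the names tab, sample and result are made up for illustration):

```shell
# Hypothetical sample: two tab-delimited rows taken from the question.
sample=$(mktemp)
printf 'rs1006501\tT\tA\t0/0:14,0:14:42:0,42,596\tA\t0/0:5,0:5:15:0,15,177\n' > "$sample"
printf 'rs1006502\tNA\tNA\tNA\tC,T\t./.\n' >> "$sample"

# Store a literal tab so the bracket expression does not rely on \t support.
tab=$(printf '\t')

# Delete each ':' together with everything up to (not including) the next tab.
result=$(sed "s/:[^${tab}]*//g" "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```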
Jeffin Rockey wrote (9 months ago):
awk -F$'\t' -v OFS="\t" '{split($4,fourth,":");split($6,sixth,":");print $1,$2,$3,fourth[1],$5,sixth[1]}' df.bed
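
The command above hard-codes columns 4 and 6. If the ':' can show up in other columns too, a possible generalization (a sketch only, assuming tab-delimited input) is to loop over every field and strip from the first ':' onward. The sample file and the names sample and result are made up for illustration.

```shell
# Hypothetical sample: three tab-delimited rows taken from the question.
sample=$(mktemp)
printf 'rs1006501\tT\tA\t0/0:14,0:14:42:0,42,596\tA\t0/0:5,0:5:15:0,15,177\n' > "$sample"
printf 'rs1006502\tNA\tNA\tNA\tC,T\t./.\n' >> "$sample"
printf 'rs1015190\tNA\tNA\tNA\tT\t1/1:0,2:2:6:75,6,0\n' >> "$sample"

# Split and join on tabs; in each field, remove the first ':' and all that follows it.
result=$(awk 'BEGIN { FS = OFS = "\t" }
              { for (i = 1; i <= NF; i++) sub(/:.*/, "", $i); print }' "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Setting FS and OFS in a BEGIN block also avoids the bash-specific $'\t' quoting, so the command should behave the same under other shells.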