Question: How to remove all characters after a specific pattern?
star (Netherlands) wrote 12 days ago:

I have a big table like 'df' below. In each row, I would like to remove everything from the first ':' to the end of the field, while keeping the columns that follow.

I have tried:

cat df.bed | cut -f1 -d":" | head

and

cat df.bed | sed 's/:.*//' | head

but both removed all columns after the first ':'.

df:

rs1006501   T   A   0/0:14,0:14:42:0,42,596    A    0/0:5,0:5:15:0,15,177
rs1006502   NA  NA  NA                         C,T  ./.
rs1015190   NA  NA  NA                         T    1/1:0,2:2:6:75,6,0
rs10164686  G   A   0/0:1,0:1:3:0,3,46         NA   NA

desired output:

rs1006501   T   A   0/0  A    0/0
rs1006502   NA  NA  NA   C,T  ./.
rs1015190   NA  NA  NA   T    1/1
rs10164686  G   A   0/0  NA   NA
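
The two attempts above can be reproduced on a single row to see why the later columns disappear. This is only a sketch: the row is a shortened, hypothetical sample, and it assumes the columns are separated by tabs (the variable names are made up for illustration). cut treats the whole line as ':'-separated, and the '.*' in the sed pattern runs to the end of the line rather than the end of the field.

```shell
# Hypothetical one-row sample, assuming tab-separated columns.
row=$(printf 'rs1006501\tT\tA\t0/0:14,0:14\tA\t0/0:5,0:5')

# cut splits the whole line on ':' and keeps only the first piece.
cut_result=$(printf '%s\n' "$row" | cut -f1 -d":")

# '.*' is greedy, so sed deletes from the first ':' to the end of the line.
sed_result=$(printf '%s\n' "$row" | sed 's/:.*//')

printf '%s\n' "$cut_result" "$sed_result"
```

Both commands keep only the text before the very first ':' on the line, so the fifth and sixth columns are lost.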

I'd try replacing every instance of <TAB><SOMETHING_MINIMAL_WITH_NO_TABS>:<SOMETHING_GREEDY_WITH_NO_TABS> with <TAB><SOMETHING_MINIMAL_WITH_NO_TABS>. I would do it in Perl rather than sed, because matching on tabs in sed is pretty ugly.

(comment written 12 days ago by russhh)

finswimmer (Germany) wrote 12 days ago:

Try this:

sed 's/:[^\t]*//g' df.bed
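
The command above deletes each ':' together with everything up to the next tab, which keeps the deletion inside one field. It relies on GNU sed interpreting \t as a tab; other sed implementations may read [^\t] as "not a backslash and not t". A sketch of a more portable variant, assuming a tab-separated file, passes a literal tab through a shell variable (the function name strip_with_sed is hypothetical):

```shell
# Hypothetical helper: same substitution, but with a literal tab character
# in the bracket expression instead of the GNU-specific \t escape.
strip_with_sed() {
  tab=$(printf '\t')
  sed "s/:[^${tab}]*//g" "$@"
}

# Usage on one sample row from the question (tabs assumed between columns).
printf 'rs1006501\tT\tA\t0/0:14,0:14:42:0,42,596\tA\t0/0:5,0:5:15:0,15,177\n' | strip_with_sed
```

Unlike a pattern anchored on a leading tab, this one would also trim the first column if it ever contained a ':'.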

Jeffin Rockey (Karimannoor) wrote 12 days ago:

awk -F$'\t' -v OFS="\t" '{split($4,fourth,":");split($6,sixth,":");print $1,$2,$3,fourth[1],$5,sixth[1]}' df.bed
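
The awk command above hard-codes columns 4 and 6. If the number or position of the genotype columns varies, a sketch that loops over every field may be easier to reuse. It assumes a tab-separated file, and the function name strip_with_awk is hypothetical.

```shell
# Hypothetical helper: in every tab-separated field, delete the first ':'
# and everything after it, then print the row with tabs restored.
strip_with_awk() {
  awk 'BEGIN { FS = OFS = "\t" }
       { for (i = 1; i <= NF; i++) sub(/:.*/, "", $i); print }' "$@"
}

# Usage on one sample row from the question (tabs assumed between columns).
printf 'rs10164686\tG\tA\t0/0:1,0:1:3:0,3,46\tNA\tNA\n' | strip_with_awk
```

Here '.*' is safe because sub() is applied to one field at a time, so the greedy match cannot cross a tab.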