manipulation of text by sed command
3.9 years ago

Hi, I have a file containing genome IDs like the following: NZ_FLAT01000030.1_173. I need to trim each ID to this form: NZ_FLAT01000030.1

How can I do that using the sed command?

shell • 1.2k views

What have you tried? This is a good place to enhance your solve-by-google skills, and if you need help with that, I can walk you through it.


I tried sed 's/_/\t/', but the output is
NZ FLAT01000030.1_173, whereas I need NZ_FLAT01000030.1

3.9 years ago
Ram 36k

Your command substitutes (s) the first underscore _ with a tab \t. What you want is for everything from the last underscore to the end to be replaced with nothing (which is the same as removing it).

So, you're looking to replace an underscore _ followed by any character until the end of the word .+\b. However, you must ensure the regex does not greedily pick everything starting from the first underscore. So, the second regex (.+\b) is better off ensuring the characters being matched are not underscores, which means the . can be replaced by a [^_] to make the regex [^_]+\b.

Combine those two, and you get _[^_]+\b. Use sed -r to replace that with nothing. Something like sed -r 's/<pattern>//' or sed -r 's/<pattern>//g' should do it.

How to get to this by Google: Search for sed replace characters upto word boundary. Experiment with sed a bit and you'll get there.
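
As a minimal sketch of the idea above (remove the last underscore and everything after it), the variant below anchors the match to the end of the line with `$` instead of relying on `\b`. The swap is deliberate: `_` counts as a word character, so `\b` does not fall between `1` and `_` but does fall next to the `.`, which can let the `\b` version match earlier than intended. The function name `strip_suffix` and the file name `ids.txt` are hypothetical.

```shell
# Strip the last underscore and everything after it on each line.
# [^_]* cannot cross an underscore and $ pins the match to the end of the line,
# so only the final "_173"-style suffix is removed.
# Basic regex syntax only, so no -r or -E flag is needed.
strip_suffix() {
    sed 's/_[^_]*$//'
}

printf '%s\n' 'NZ_FLAT01000030.1_173' | strip_suffix
# For a whole file (ids.txt is a hypothetical name):
#   strip_suffix < ids.txt > ids_trimmed.txt
```

For the example ID this prints NZ_FLAT01000030.1.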


Thanks, but it didn't edit my file as desired:

sed -r 's/_//' output: NZFLAT01000030.1_173
sed -r 's/_//g' output: NZFLAT01000030.1173


This works for me:

sed -E 's/_[0-9]+//g'
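
A slightly more defensive variant of the command above, as a sketch: anchoring to the end of the line with `$` and dropping the `g` flag removes only a trailing `_<digits>` suffix. This matters only under the assumption that some IDs have digits directly after an earlier underscore, as in the second input below, which is illustrative and not taken from the original file. The function name is hypothetical.

```shell
# Remove only a trailing "_<digits>" suffix from each line.
# -E enables extended regular expressions.
strip_numeric_suffix() {
    sed -E 's/_[0-9]+$//'
}

printf '%s\n' 'NZ_FLAT01000030.1_173' | strip_numeric_suffix
# Illustrative ID with digits right after the first underscore:
printf '%s\n' 'NC_000913.3_42' | strip_numeric_suffix
```

The first call prints NZ_FLAT01000030.1 and the second prints NC_000913.3; without the `$` anchor and with `g`, the second ID would also lose its `_000913` part.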


That -E is the BSD sed spelling. GNU sed's traditional option is -r, though it accepts -E as well. If you're using a Mac, I highly recommend you switch to the GNU tools (GNU sed, coreutils) so your scripts can run across Linux distributions. Also, BSD sed sucks.


You're missing the word boundary \b, which is why your g-modified sed is removing all _s. Your command is completely unlike the one I showed you, so I cannot help you.