Question: Replace two or more spaces with a tab in the terminal
2.3 years ago
julien-le.roy wrote:

Hi all, I have an issue when I try to replace multiple spaces with a tab in a file (in other words, convert the file into a delimited table that is easy to import into Excel). Here is the file (basically, there are 5 columns separated by two or more spaces):

lcl|tetur19g00650  length:863 (mRNA) (E75)  (Ecdysone-induced pro...   132    2e-32
lcl|tetur03g08440  length:544 (mRNA) (HR3)  (Hormone Receptor 3)      85.1    5e-18
lcl|tetur04g01460  length:396 (mRNA) (SVP)  (Seven Up)                66.6    1e-12
lcl|tetur07g00140  length:1063 (mRNA) (HR4)  (Hormone Receptor 4)     63.5    1e-11
lcl|tetur10g04690  length:854 (mRNA) (HR38 (1))  (Hormone Recepto...  61.6    5e-11
lcl|tetur08g06490  length:645 (mRNA) (FTZ-F1)  (Fushi tarazu - Fa...  61.2    6e-11
lcl|tetur01g11050  length:726 (mRNA) (n/a)  (Zinc finger, nuclear...  60.8    9e-11
lcl|tetur01g11040  length:726 (mRNA) (NR2E3)  (Zinc finger, nucle...  60.8    9e-11
lcl|tetur01g09240  length:431 (mRNA) (RXR (2))  (Retinoid X Recep...  60.1    1e-10
lcl|tetur10g04710  length:867 (mRNA) (HR38 (2))  (Hormone Recepto...  59.7    2e-10
lcl|tetur34g00430  length:561 (mRNA) (kni)  (Hypothetical knrl) (...  57.4    1e-09
lcl|tetur03g02550  length:497 (mRNA) (PNR-like)  (Photocell recep...  57.0    1e-09
lcl|tetur31g01930  length:497 (mRNA) (RXR (1))  (Retinoid X Recep...  56.6    1e-09
lcl|tetur05g04280  length:499 (mRNA) (HNF4)  (Hepatocyte Nuclear ...  56.6    2e-09
lcl|tetur01g07700  length:338 (mRNA) (HR83-like)  (Possibly HR83-...  52.0    4e-08
lcl|tetur01g02690  length:576 (mRNA) (dissatisfaction)  (dissatis...  51.6    6e-08
lcl|tetur08g01210  length:249 (mRNA) (Tll)  (Tailless)                51.2    7e-08
lcl|tetur11g04570  length:901 (mRNA) (HR39)  (Hormone receptor-li...  47.4    9e-07
lcl|tetur28g00490  length:370 (mRNA) (ERR)  (Estrogen-related Rec...  47.0    1e-06
lcl|tetur11g01960  length:567 (mRNA) (HR96-like g)  (HR96-like nu...  45.8    2e-06
lcl|tetur01g15140  length:430 (mRNA) (EcR)  (Ecdysone Receptor)       44.7    6e-06
lcl|tetur01g07820  length:564 (mRNA) (HR96-like d)  (HR96-like nu...  42.7    2e-05
lcl|tetur34g00750  length:579 (mRNA) (HR96-like a)  (HR96-like nu...  42.7    3e-05
lcl|tetur30g01210  length:669 (mRNA) (HR96-like b)  (HR96-like nu...  39.3    2e-04
lcl|tetur36g00260  length:501 (mRNA) (HR96-like h)  (HR96-like nu...  39.3    3e-04
lcl|tetur20g01820  length:499 (mRNA) (HR96-like e)  (HR96-like nu...  38.9    3e-04
lcl|tetur04g03100  length:490 (mRNA) (HR96-like f)  (HR96-like nu...  38.9    3e-04
lcl|tetur17g03630  length:483 (mRNA) (HR96-like c)  (HR96-like nu...  38.5    4e-04
lcl|tetur07g04810  length:311 (mRNA) (E78)  (Ecdysone-induced pro...  33.9    0.012

I use this command line, but nothing happens (I get a new file, but with the same space separators):

sed 's/ \+ /\t/g' inputfile > outputfile

Do you have any ideas? Thank you very much!

I added (code) markup to your post for increased readability.

Thanks! It's much better!

For better readability, use sed 's/\s\s\+/\t/g' input (for two or more spaces). However, in one of the columns I see text that is itself separated by spaces. Make sure that you have uniform spacing between columns, not within a column.

Thank you for your reply. Actually, when there are 2 or more spaces I would like to replace them with a tab, but when there is only 1 space I want to leave it as it is. I hope that's clear.
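
For reference, the rule described in this comment (runs of two or more spaces become one tab, single spaces are left alone) can be sketched without relying on `\+`, `\s` or `\t`, which are GNU sed extensions that the sed shipped with macOS may not interpret the same way. This is only a minimal sketch: the function name `spaces_to_tabs` is made up for illustration, and it assumes the separators are plain space characters rather than a mix of spaces and tabs.

```shell
# Hypothetical helper (name chosen for illustration only).
# Reads text on standard input and writes it to standard output with every
# run of two or more spaces replaced by a single tab.
spaces_to_tabs() {
    # Build a literal tab with printf instead of relying on sed
    # understanding "\t" in the replacement.
    tab=$(printf '\t')
    # " \{2,\}" is a POSIX basic-regex interval: a space repeated 2+ times.
    sed "s/ \{2,\}/${tab}/g"
}

# Demo on one line taken from the question.
printf '%s\n' 'lcl|tetur08g01210  length:249 (mRNA) (Tll)  (Tailless)                51.2    7e-08' | spaces_to_tabs
```

With the function defined, something like `spaces_to_tabs < inputfile > outputfile` would convert a whole file.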

2.3 years ago
Kevin Blighe wrote:

You're running that sed command incorrectly for what you want to do. It should just be:

sed 's/ \+/\t/g' inputfile > outputfile

Thank you. It is still not working and I don't know why... I'm working on a Mac, but I'm not sure whether that is the issue?

On Mac, it would just be this:

sed -E $'s/[[:blank:]]+/\t/g' inputfile

However, I see your issue. Even the spaces in the gene descriptions will change.

Are you sure that using a rule whereby only 2 or more spaces combined are changed is valid?
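
One way to probe that question is to count the tab-separated fields on each line after the conversion: if the two-or-more-spaces rule holds for the whole file, every line should report the same number (5, going by the description in the question). The sketch below is an illustration only; the function name `distinct_field_counts` is hypothetical, and it assumes the input has already been converted to tab-separated text.

```shell
# Hypothetical helper (name chosen for illustration only).
# Reads tab-separated text on standard input and prints each distinct
# per-line field count, one per line, in ascending order.
distinct_field_counts() {
    awk -F'\t' '{ print NF }' | sort -n | uniq
}

# Demo: the first line has 3 fields and the second has 2,
# so the counts 2 and 3 are both reported.
printf 'a\tb c\td\ne\tf\n' | distinct_field_counts
```

A single number in the output suggests the rule split every line consistently; more than one number points at lines where a column gap was only one space wide, or where a description contained a run of two or more spaces.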

