How to delete spaces in rows from csv file in Python
18 months ago
Paula ▴ 60

Hi! I have a csv file and I need to format it. The input csv file looks like this:

old_name,new_name
"NODE_1_length_592822_cov_338.586386
", SOL_1_3_cov_338.586386_N_1
"NODE_1_length_592822_cov_338.586386
",SOL_1_3_cov_338.586386_N_2

And the final result should look like this:

old_name,new_name
NODE_1_length_592822_cov_338.586386, SOL_1_3_cov_338.586386_N_1
NODE_1_length_592822_cov_338.586386,SOL_1_3_cov_338.586386_N_2

I have tried multiple strategies but none of them has given the desired result:

with open('file.csv','r') as f:
    content = f.readlines()
cleaned = ''
for line in content:
    if line != '\n':
        cleaned += line
print(cleaned.replace(" ",""))

Another one

text = open("file.csv", "r", encoding="utf8")
    text = ''.join([i for i in text]) \
        .replace("  ", "")
    x = open("file1" + i + ".csv", "w", encoding="utf8")
    x.writelines(text)
    x.close()

And another one

import csv
with open('file.csv', newline='') as in_file:
    with open('new_file.csv', 'w', newline='') as out_file:
        # csv.writer and csv.reader take open file objects, not file names
        writer = csv.writer(out_file)
        for row in csv.reader(in_file):
            if row:
                writer.writerow(row)

Do you have any ideas as to how to solve it?

Thanks!

python csv • 3.8k views

It would be easier to read the whole file into memory.

fh = open('file.csv')
content = fh.read()
fh.close()

content = content.replace(' ', '').replace('"', '').replace('\n,', ',')    
print(content)

Output:

old_name,new_name
NODE_1_length_592822_cov_338.586386,SOL_1_3_cov_338.586386_N_1
NODE_1_length_592822_cov_338.586386,SOL_1_3_cov_338.586386_N_2

Thanks Andrzej! I'd like to ask one additional question, if you don't mind. What if I want to keep the spaces inside the values in the first column? For example:

I want to obtain

NODE_1_length_592822_cov_338.586386_4 # 1409 # 3598

Instead of:

NODE_1_length_592822_cov_338.586386_4#1409#3598

How can I modify the script?

Thanks a lot!


Could you show me the first few lines of the csv?


Sure!

old_name,new_name
"NODE_1_length_592822_cov_338.586386_1 # 2 # 169 # -1 # 
",SOL_1_3_cov_338.586386_N_1
"NODE_1_length_592822_cov_338.586386_2 # 417 # 695 # 1 # 
",SOL_1_3_cov_338.586386_N_2

Thanks!
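For a sample like the one above, one possible way to adapt the whole-file approach is to remove only the quote characters and the blanks right next to the broken line, so that spaces inside the values survive. This is a minimal sketch, not the code from the thread: the function name `join_split_rows` is hypothetical, and it assumes that every double quote in the file is unwanted and that each record is split exactly at a newline followed by a comma.

```python
import re

def join_split_rows(content):
    """Rejoin records whose quoted first column was broken across two lines."""
    # Drop the double quotes that wrap the first column
    # (assumption: no quote in the file is meaningful).
    content = content.replace('"', '')
    # Replace "trailing blanks, newline, comma, leading blanks" with one comma.
    # Spaces inside the values are left untouched.
    return re.sub(r'[ \t]*\n,[ \t]*', ',', content)

sample = (
    'old_name,new_name\n'
    '"NODE_1_length_592822_cov_338.586386_1 # 2 # 169 # -1 # \n'
    '",SOL_1_3_cov_338.586386_N_1\n'
    '"NODE_1_length_592822_cov_338.586386_2 # 417 # 695 # 1 # \n'
    '",SOL_1_3_cov_338.586386_N_2\n'
)
cleaned = join_split_rows(sample)
print(cleaned)
```

The only change from the earlier one-liner is that the blanket `replace(' ', '')` is gone; the regular expression trims blanks only around the line break.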


Please do not add any " or other extraneous characters to data examples. Use the 101010 button to format the data you want to show properly.


Thanks GenoMax!

18 months ago

This should do the job:

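with open('file.csv') as fh, open('new_file.csv', 'w') as oh:
    # Copy the header line unchanged.
    oh.write(fh.readline())
    lst = []
    for i, line in enumerate(fh):
        # Drop surrounding whitespace, then the leading quote and comma
        # left over from the field that was split across two lines.
        line = line.strip().lstrip('"').lstrip(',')
        lst.append(line)
        if i % 2:
            # Every second line completes a record: write both halves as one row.
            oh.write(f'{lst[0]},{lst[1]}\n')
            lst = []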
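
As an alternative to pairing lines by hand, the standard `csv` module can parse a quoted field that contains a line break as a single value, so each field only needs to be stripped afterwards. The sketch below is an illustration under assumptions, not part of the original answer: the helper name `clean_rows` is hypothetical, and `io.StringIO` stands in for the real files so the example is self-contained.

```python
import csv
import io

def clean_rows(in_file, out_file):
    """Copy CSV rows, stripping whitespace (including line breaks) around each field."""
    reader = csv.reader(in_file)
    writer = csv.writer(out_file, lineterminator='\n')
    for row in reader:
        if row:  # skip completely empty lines
            writer.writerow([field.strip() for field in row])

sample = (
    'old_name,new_name\n'
    '"NODE_1_length_592822_cov_338.586386_1 # 2 # 169 # -1 # \n'
    '",SOL_1_3_cov_338.586386_N_1\n'
    '"NODE_1_length_592822_cov_338.586386_2 # 417 # 695 # 1 # \n'
    '",SOL_1_3_cov_338.586386_N_2\n'
)
out = io.StringIO()
# newline='' keeps line endings as-is, as the csv docs recommend for input files.
clean_rows(io.StringIO(sample, newline=''), out)
print(out.getvalue())
```

With real files, the same function could be called on the objects returned by `open('file.csv', newline='')` and `open('new_file.csv', 'w', newline='')`. Unlike the line-pairing loop, this does not depend on every record spanning exactly two lines.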

Works perfectly, thank you!
