I'm trying to learn how to merge tables with code. I have two files, 1.csv and 2.csv.
Part of 1.csv:
GENE1 MAP DESCRIPTION
5-HT3C2 3q27.1 5-hydroxytryptamine receptor 3E pseudogene
A1BG 19q13.43 alpha-1-B glycoprotein
A1BG-AS1 19q13.43 A1BG antisense RNA 1
A1CF 10q11.23 APOBEC1 complementation factor
A2M 12p13.31 alpha-2-macroglobulin
A2M-AS1 12p13.31 A2M antisense RNA 1 (head to head)
A2ML1 12p13.31 alpha-2-macroglobulin like 1
Part of 2.csv:
GENE CHROM POS
AADACL2-AS1 chr3 151545320
AADAT chr4 170999757
ABCA10 chr17 67178785
ABCA12 chr2 215884092
ABCA2 chr9 139904734
ABCA4 chr1 94528499
ABCA5 chr17 67304275
ABCA6 chr17 67096772
I'm trying to get a CSV file with the columns
GENE CHROM POS MAP DESCRIPTION
where the genes common to both files are merged.
I use this code:
import csv
diseases = {}
# Load the disease file in memory
with open('1.csv', 'rb') as dfile:
    reader = csv.reader(dfile)
    next(reader, None)
    # Skip the header
    dfile.next()
    for GENE1, MAP, DESCRIPTION in dfile:
        diseases[GENE1] = (MAP, DESCRIPTION)
with open('2.csv', 'rb') as idfile:
    reader = csv.reader(idfile)
    next(reader, None)
    # Skip the header
    idfile.next()
    for GENE, CHROM, POS in idfile:
        if GENE in diseases:
            output.writerow((GENE, CHROM, POS) + diseases[GENE1])
But I get this error message:
python TEST.py
Traceback (most recent call last):
  File "TEST.py", line 11, in <module>
    for GENE1, MAP, DESCRIPTION in dfile:
ValueError: too many values to unpack
What am I doing wrong? I'm not a programmer, so it is pretty hard for me to learn all of this. If there is another, easier way to do it...
Thanks!!!
The lines/rows in dfile are not what you think they are. Try to just print them out to get a better idea:
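For example, something along these lines shows what each row actually contains. This is only a sketch: the `sample` string is a made-up stand-in for your 1.csv (a few rows from your post, assumed to be tab-separated), and it is written for Python 3.

```python
import csv
import io

# Hypothetical stand-in for 1.csv: a few rows from the question,
# assumed here to be tab-separated.
sample = (
    "GENE1\tMAP\tDESCRIPTION\n"
    "A1BG\t19q13.43\talpha-1-B glycoprotein\n"
    "A2M\t12p13.31\talpha-2-macroglobulin\n"
)

# Iterate over the csv reader, not over the file object itself:
# the file object yields raw text lines, the reader yields lists of fields.
reader = csv.reader(io.StringIO(sample))
rows = list(reader)

for row in rows:
    # repr() makes tabs and other hidden characters visible.
    print(repr(row))
```

If every row prints as a list with a single long string in it, the reader did not split the line where you expected it to.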
OK, let me see if I can do it.
I get my three rows, like in the CSV file.
What's the len() of row? I'm not sure whether you need to set the field delimiter for csv.reader() explicitly, e.g. to a tab character.
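A quick way to check is to count the fields per row with and without an explicit delimiter. The sketch below assumes Python 3 and a tab-separated file; `field_counts` is a hypothetical helper and `sample` is made up from a line in this thread, not read from your real file.

```python
import csv
import io

# Hypothetical sample: a header plus one row, assumed tab-separated.
sample = (
    "GENE1\tMAP\tDESCRIPTION\n"
    "ZSCAN30\t18q12.2\tzinc finger and SCAN domain containing 30\n"
)

def field_counts(text, **reader_kwargs):
    """Return the number of fields csv.reader finds in each row."""
    reader = csv.reader(io.StringIO(text), **reader_kwargs)
    return [len(row) for row in reader]

# Default settings: csv.reader splits on commas.
default_counts = field_counts(sample)
# Explicit tab delimiter.
tab_counts = field_counts(sample, delimiter="\t")

print(default_counts, tab_counts)
```

If the second list shows 3 for every row and the first does not, the delimiter is the problem.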
OK, now I think I understand what you mean. It is because I have lines like this:
ZSCAN30 18q12.2 zinc finger and SCAN domain containing 30
So csv takes it as 9 rows instead of 3, right? The strange thing is that in Excel I see it as just three rows...
I converted my files from xls, renamed to csv... I probably did that wrong, right?
The length of each row is about 59,000 fields.
Thanks for the help!
You are mixing up rows and columns.
By default csv.reader() splits fields on commas, while Excel may have split your file on tabs. If your file is tab-separated, setting the delimiter explicitly might solve this. But having no access to your data, I can only guess and suggest what to look at.
You are right, I mixed up the words... ^^"
Can I send you the two files? If you can help me, I would really appreciate it. I just want to learn how to merge files, so I can easily add information to my CSV files.
Have you tried setting the delimiter in csv reader explicitly to tab?
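For reference, a merge with an explicit tab delimiter could look roughly like the sketch below. It is an assumption-laden example, not a tested fix for your files: it assumes Python 3, tab-separated input with exactly three columns per row, and `merge_tables` is a hypothetical helper name. The MAP and DESCRIPTION values in the sample are placeholders, since the excerpts you posted share no gene.

```python
import csv
import io

def merge_tables(genes_file, positions_file, out_file, delimiter="\t"):
    """Write GENE, CHROM, POS, MAP, DESCRIPTION for genes found in both inputs."""
    descriptions = {}
    reader = csv.reader(genes_file, delimiter=delimiter)
    next(reader, None)  # skip the header exactly once
    for row in reader:  # iterate over the reader, not the file
        if len(row) != 3:
            continue  # ignore blank or malformed lines
        gene, map_, description = row
        descriptions[gene] = (map_, description)

    writer = csv.writer(out_file, delimiter=delimiter, lineterminator="\n")
    writer.writerow(["GENE", "CHROM", "POS", "MAP", "DESCRIPTION"])
    reader = csv.reader(positions_file, delimiter=delimiter)
    next(reader, None)  # skip the header exactly once
    merged = 0
    for row in reader:
        if len(row) != 3:
            continue
        gene, chrom, pos = row
        if gene in descriptions:
            # Look up by the gene of the current row.
            writer.writerow((gene, chrom, pos) + descriptions[gene])
            merged += 1
    return merged

# Made-up in-memory stand-ins for 1.csv and 2.csv.
genes = io.StringIO(
    "GENE1\tMAP\tDESCRIPTION\n"
    "A1BG\t19q13.43\talpha-1-B glycoprotein\n"
    "ABCA4\tmap-placeholder\tdescription placeholder\n"
)
positions = io.StringIO(
    "GENE\tCHROM\tPOS\n"
    "ABCA4\tchr1\t94528499\n"
    "ABCA5\tchr17\t67304275\n"
)
out = io.StringIO()
count = merge_tables(genes, positions, out)
print(out.getvalue())
```

With real files you would pass file objects instead, e.g. opened with open('1.csv', newline='') in Python 3. Compared with the code in the question, this version skips the header once, loops over the reader, creates the writer before using it, and looks up the current gene rather than the last one loaded.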
What was the result of my code in C: How to merge two csv tables?
I got this... no idea what it means, I'm googling it.
That means your code block indentation is not right, i.e. that you probably have this:
instead of this:
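As a hypothetical illustration of the difference (these two snippets are made up, not your actual code), the sketch below checks a wrongly indented loop and a correctly indented one with Python's built-in compile(). The helper name `compiles` is an assumption for this example.

```python
# A loop whose body is not indented: Python rejects this.
bad_source = (
    "for row in [1, 2]:\n"
    "print(row)\n"
)

# The same loop with the body indented by four spaces.
good_source = (
    "for row in [1, 2]:\n"
    "    print(row)\n"
)

def compiles(source):
    """Return True if the source compiles, False on an indentation error."""
    try:
        compile(source, "<example>", "exec")
        return True
    except IndentationError:
        return False

bad_ok = compiles(bad_source)
good_ok = compiles(good_source)
print(bad_ok, good_ok)
```

Everything that belongs to a with, for or if line has to be indented consistently under it; mixing tabs and spaces can trigger the same kind of error.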