Hi all,
Sorry to bother you all again. I have a text file that contains PDB IDs and the corresponding missing coordinates from each PDB file, for example:
1FZ2 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1FZ4 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1FZ5 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1FZ8 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1FZ9 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1FZH 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
I also have another text file that contains PDB IDs and the SEG signal (the signal that marks low-complexity regions in the protein sequence), for example:
1FZ2 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354
1FZ4 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354
1FZ5 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354
1FZ8 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354
1FZ9 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354
1FZH 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354
The numbers in each file are coordinates. I want to compare the two files and generate a third file that contains, for each PDB ID, the coordinates that appear in both the SEG signal and the missing coordinates.
In this case I want to generate a file like:
1FZ2 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1FZ4 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1FZ5 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1FZ8 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1FZ9 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1FZH 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Here is my Python code so far:
total = []
fin = open('file1.txt')
# I want to make the missing coordinates file a set called 'a'
for lines in fin:
    l = lines.split()
    a = set(l[2:])
    print a
with open('file2.txt') as seg_num:
    # I want to make the SEG signal another set called 'b'
    for seg_signal in seg_num:
        signal = seg_signal.split()
        b = set(signal[1:])
        print("lol" * 10)
        print b
        c = a & b  # and pick the intersection between a and b called c
        space = ' '
        newlines = '\n'
        total.append([signal[0], space, str(c), newlines])
with open('file3.txt', 'w') as f:
    for t in total:
        f.write(" ".join(t))
f.close()
But for some reason it does not give the desired output, and I don't know how to fix it.
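For comparison, here is a minimal sketch of the per-ID intersection described above. It is not a confirmed fix. It assumes every line in both files starts with the PDB ID followed by whitespace-separated integer coordinates, and that IDs missing from either file should be skipped. The function names `read_coords` and `write_overlap` are made up for illustration. As far as I can tell, the attempt above overwrites `a` on every pass of the first loop, so only the last line of file1.txt is ever compared; the sketch keeps one set per PDB ID in a dictionary instead.

```python
def read_coords(path):
    """Map each PDB ID to the set of integer coordinates listed on its line."""
    coords = {}
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            pdb_id, numbers = fields[0], fields[1:]
            # Assumption: if an ID appears on several lines, merge its coordinates.
            coords.setdefault(pdb_id, set()).update(int(n) for n in numbers)
    return coords


def write_overlap(missing_path, seg_path, out_path):
    """Write one line per PDB ID with the coordinates present in both files."""
    missing = read_coords(missing_path)
    seg = read_coords(seg_path)
    with open(out_path, "w") as out:
        # Only IDs present in both files can have an overlap.
        for pdb_id in sorted(missing.keys() & seg.keys()):
            # Sort numerically so the output reads 2 3 4 ... rather than in set order.
            overlap = sorted(missing[pdb_id] & seg[pdb_id])
            out.write(" ".join([pdb_id] + [str(n) for n in overlap]) + "\n")
```

With the file names used above, this would be called as write_overlap('file1.txt', 'file2.txt', 'file3.txt'). Converting to int before sorting matters, because sorting the strings would put "10" before "2".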