I have two files that I need to parse in order to build a matrix.

File 1 is a list of my cleaned outputs from the analysis, all in the same directory:

NC_000964.parsed
NC_002570.parsed
NC_003909.parsed
NC_003997.parsed
NC_004721.parsed
NC_005945.parsed
NC_005957.parsed
NC_006274.parsed
NC_006322.parsed
NC_006510.parsed
NC_006582.parsed
..
..

Each of these files holds genes for certain species as combinations taken from the BLAST output file, i.e. in the format (one line from file 2 \t another line from file 2), written if a gene in file 2 aligned with a gene in file 1.

File 2 is a list of genes from some of the species in the experiment. It has more than 4000 genes:

gi|56418536|ref|YP_145854.1
gi|56418537|ref|YP_145855.1
gi|56418538|ref|YP_145856.1
gi|56418539|ref|YP_145857.1
gi|56418540|ref|YP_145858.1
gi|56418541|ref|YP_145859.1
..
..
I want to build a matrix whose first column holds the entries of file 1 and whose first row holds the entries of file 2.
Then I will open each file listed in file 1 and compare its contents with the list in file 2. If a gene matches, the corresponding cell in the matrix is marked as present; otherwise it is marked as absent. That will give me a presence/absence matrix for my list in file 2 against the outputs in file 1.
Urgent help needed, since this forms the basis of my next step.
My script so far:
#!/usr/bin/env python
import os

path = "./xxxxxx"

mybk = []    # array of file2
listbk = []  # array of file1

with open('file2.txt', 'r') as mychecklist:  # list of resistant genes
    for line in mychecklist:
        line = line.strip()
        mybk.append(line)

with open('file1.txt', 'r') as mylist:  # list of parsed files from blast output
    for line in mylist:
        line = line.strip()
        listbk.append(line)

for name in listbk:  # open parsed files to read and analyze content
    parsed = os.path.join(path, name)
    with open(parsed, 'r') as text:
        for line in text:
            pass  # stuck here: the lines from all the parsed files get read together
Can you put your code in a code block? That would make it easier to debug.
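One way past the point where the script gets stuck is to treat each parsed file separately: read it into its own set of gene IDs, then test every gene from file 2 against that set. The sketch below is only a rough outline under several assumptions that are not stated in the question: Python 3, tab-separated fields in the parsed files, a gene counting as present when its ID appears as a whole field on any line, and 1/0 as the present/absent markers. The function names and the output file name are hypothetical.

```python
import csv
import os


def read_list(list_path):
    """Return the non-empty, stripped lines of a text file."""
    with open(list_path) as handle:
        return [line.strip() for line in handle if line.strip()]


def genes_in_parsed_file(parsed_path):
    """Collect every tab-separated field of one parsed file into a set."""
    found = set()
    with open(parsed_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            found.update(field.strip() for field in fields)
    found.discard("")
    return found


def build_matrix(parsed_names, genes, directory):
    """Return one row per parsed file: [file name, then 1 or 0 per gene]."""
    rows = []
    for name in parsed_names:
        # A fresh set per file keeps the contents of different files apart.
        found = genes_in_parsed_file(os.path.join(directory, name))
        rows.append([name] + [1 if gene in found else 0 for gene in genes])
    return rows


def write_matrix(out_path, genes, rows):
    """Write the matrix as tab-separated text with the genes as the header row."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["file"] + genes)
        writer.writerows(rows)
```

With the names from the question this could be called as `rows = build_matrix(read_list('file1.txt'), genes, './xxxxxx')` after `genes = read_list('file2.txt')`, followed by `write_matrix('matrix.tsv', genes, rows)`. Using a set per file keeps each lookup cheap even with more than 4000 genes. If the IDs in the parsed files carry extra text around them, the exact-field match would need to be loosened.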