Presence-absence matrix from BLAST results
2.5 years ago

I have many BLAST output files, one per genome, which look like this: blast.jpg

The first column of each file contains all the identified query UIDs. I want to make a presence-absence matrix in CSV format in which the columns are the BLAST output filenames and the rows are the UIDs. If a BLAST file contains a given UID, that cell should be marked 1; if the UID is not present, it should be marked 0. For a few files this can be done manually, but for a large number of files I want a Python script that runs through all the files and writes a CSV file as described. Please help me with this.

blast python genome biopython • 875 views

What have you tried so far? Provide some code from which we can start.


And in addition to anything you've tried, provide the text version of the lines in the picture you posted, either here or at a code snippet-posting service such as http://gist.github.com (then post the URL here). Sharing a picture of many columns of a text file isn't really a productive way to share an example of input.
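
Since the input was only shared as a picture, the following is just a minimal sketch of the matrix described in the question, not a tested solution for those files. It assumes each file is tab-delimited (for example BLAST+ `-outfmt 6`), that the query UID is the first column, and that lines starting with `#` are comments. The function names `read_uids`, `presence_absence`, and `write_matrix` are made up for illustration.

```python
import csv
from pathlib import Path


def read_uids(path):
    """Collect the first-column values (assumed query UIDs) of one file."""
    uids = set()
    with open(path, newline="") as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and comment lines (assumption: '#' marks comments).
            if not line or line.startswith("#"):
                continue
            # Assumption: columns are tab-separated.
            uids.add(line.split("\t")[0])
    return uids


def presence_absence(paths):
    """Return (header, rows): one row per UID, one 0/1 column per file."""
    per_file = {Path(p).name: read_uids(p) for p in paths}
    all_uids = sorted(set().union(*per_file.values()))
    header = ["UID"] + list(per_file)
    rows = [
        [uid] + [int(uid in per_file[name]) for name in per_file]
        for uid in all_uids
    ]
    return header, rows


def write_matrix(paths, out_csv):
    """Write the presence-absence matrix for the given files as CSV."""
    header, rows = presence_absence(paths)
    with open(out_csv, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
```

A call might look like `write_matrix(sorted(Path("blast_results").glob("*.tsv")), "presence_absence.csv")`, where the directory name and file extension are placeholders. Columns are keyed by bare filename, so two files with the same name in different directories would collide; if the real output is not tab-delimited, the `split("\t")` call needs adjusting.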
