I have five BLASTn tabular output files, each produced by querying the same large gene list (the same query) against a different subject genome database. The goal was to identify possible orthologous sequences between the query gene list and the five subject genomes.
I was able to identify most, if not all, of the same genes in each genome.
Now I would like to combine the five files and sort them by gene identifier so that the names and sequences for the same gene across all five genomes end up in the same row, with each genome in its own column. The goal is to then extract the sequences for all five species for every gene for an alignment. I am working with a very large number of genes here.
Is there a way to do this using cat and sort, or would Python work better? I am a bit clueless as to how to do this in Python.
Thanks in advance, Zach
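For illustration, here is a rough sketch of how the merge described above might look in Python. It is not a tested pipeline, and it rests on several assumptions: that each file uses the default 12-column BLAST tabular layout (query ID in column 1, subject ID in column 2, bit score in column 12), that only the highest-scoring hit per query is wanted from each genome, and that "NA" is an acceptable placeholder when a gene has no hit. The function names and genome labels are hypothetical. Plain cat and sort would stack the hits in one long list rather than placing the genomes side by side, which is why a small script is used here.

```python
import csv


def best_hits(path):
    """Return {query_id: (subject_id, bitscore)}, keeping the top-scoring hit per query.

    Assumes the default 12-column BLAST tabular layout (an assumption, adjust
    the column indices if the files were written with a custom format).
    """
    best = {}
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            # Skip blank lines and comment lines (present in commented tabular output).
            if not row or row[0].startswith("#"):
                continue
            query_id, subject_id, bitscore = row[0], row[1], float(row[11])
            if query_id not in best or bitscore > best[query_id][1]:
                best[query_id] = (subject_id, bitscore)
    return best


def merge_hits(files):
    """Build one row per query gene with one column per genome.

    files: dict mapping a genome label (hypothetical names) to a file path.
    Returns (labels, rows), where rows are sorted by query ID.
    """
    labels = list(files)
    per_genome = {label: best_hits(path) for label, path in files.items()}
    # Every query ID that produced a hit in at least one genome.
    all_queries = sorted(set().union(*per_genome.values()))
    rows = []
    for query_id in all_queries:
        row = [query_id]
        for label in labels:
            hit = per_genome[label].get(query_id)
            row.append(hit[0] if hit else "NA")  # "NA" marks a missing hit
        rows.append(row)
    return labels, rows


def write_table(labels, rows, out_path):
    """Write the merged table as tab-separated text with a header line."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["query_id"] + labels)
        writer.writerows(rows)
```

It could be called along the lines of `labels, rows = merge_hits({"genomeA": "genomeA.tsv", "genomeB": "genomeB.tsv"})` followed by `write_table(labels, rows, "merged.tsv")`, where the labels and file names are placeholders. Note that the default tabular output does not contain the sequences themselves. If the files were produced with an extra sequence column, that column's index could be stored alongside the subject ID. Otherwise the subject IDs in the merged table would need to be looked up in the genome FASTA files.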