Hello everyone, I am a beginner in bioinformatics, so my questions are very basic. Sorry about that.

I ran Trinity on an RNA sample library and got the Trinity.fasta file. Next I ran blastx, got an output.txt file, and extracted the sequence IDs from it. With that ID list I ran fastacmd and got an ID-list.fasta file. This file has a long list of >gi headers, each with a gi number, a protein name, and a virus name. I have 160 >gi entries from different viruses (some viruses correspond to 20 or more >gi entries).

My questions are:
1) Which Linux command can I use to organize these viruses according to their >gi numbers?
2) How can I extract the contigs that correspond to these virus >gi numbers? I want to run blastn with the Trinity.fasta contigs as the reference sequences.

Thanks
Please post a few example input lines and the expected output. That would make it much easier to address your questions.
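Until the real lines are posted, here is a rough Python sketch of both steps, in case it helps. It rests on assumptions I cannot check from the post: that the headers in ID-list.fasta look like `>gi|12345|ref|YP_000001.1| coat protein [Some virus]` with the virus name in the last pair of square brackets, that the blastx output is tab-separated (`-m 8` in legacy BLAST, `-outfmt 6` in BLAST+) with the contig ID in column 1 and the hit ID in column 2, and that the hit IDs start with `gi|number|`. The function names and the sample data are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed header layout (hypothetical, not taken from the post):
#   >gi|12345|ref|YP_000001.1| coat protein [Some virus]
HEADER_RE = re.compile(r"^>gi\|(\d+)\|.*\[([^\]]+)\]\s*$")


def group_gi_by_virus(fasta_lines):
    """Map each virus name to the sorted gi numbers found in the headers."""
    groups = defaultdict(list)
    for line in fasta_lines:
        match = HEADER_RE.match(line.rstrip("\n"))
        if match:  # sequence lines and unexpected headers are skipped
            groups[match.group(2)].append(int(match.group(1)))
    return {virus: sorted(gis) for virus, gis in sorted(groups.items())}


def contigs_for_gis(blast_lines, wanted_gis):
    """Collect contig IDs (column 1) whose hit (column 2) has a wanted gi.

    Assumes tab-separated BLAST output with hit IDs like gi|12345|ref|...|
    """
    contigs = set()
    for line in blast_lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 2:
            continue
        match = re.match(r"gi\|(\d+)\|", cols[1])
        if match and int(match.group(1)) in wanted_gis:
            contigs.add(cols[0])
    return contigs


def extract_contigs(fasta_lines, contig_ids):
    """Yield the FASTA lines of every record whose ID is in contig_ids.

    The record ID is the first word after '>' in the header line.
    """
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            keep = bool(fields) and fields[0] in contig_ids
        if keep:
            yield line


# Tiny in-memory demo with made-up data.
demo_headers = [
    ">gi|222|ref|YP_2| coat protein [Virus A]\n",
    ">gi|111|ref|YP_1| replicase [Virus A]\n",
    ">gi|333|gb|AAA3| polyprotein [Virus B]\n",
]
demo_blast = [
    "contig_1\tgi|111|ref|YP_1|\t90.0\n",
    "contig_2\tgi|999|ref|YP_9|\t80.0\n",
]
demo_trinity = [">contig_1 len=4\n", "ACGT\n", ">contig_2 len=4\n", "TTTT\n"]

by_virus = group_gi_by_virus(demo_headers)
for virus, gis in by_virus.items():
    print(virus, gis)

hits = contigs_for_gis(demo_blast, set(by_virus["Virus A"]))
print("".join(extract_contigs(demo_trinity, hits)), end="")
```

On real data the same functions accept open file handles, for example `group_gi_by_virus(open("ID-list.fasta"))`, and the lines from `extract_contigs` can be written to a new FASTA file for the blastn step. If your headers or BLAST columns look different, the regular expressions will need adjusting, which is why the example lines would still help.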