Hi! I'm fairly new to all this and have been trying to filter a FASTA file based on a BLAST results file. I'm having a lot of trouble, as I don't really understand how to code this. I have file 1 in this format:
>SampleName11.16891;size=8308
GATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACATTGC
GCCCCTTGGTATTCCGGGGGGCATGCCTGTTCGAGCGTCATTACAACCCTCAAGCTCTGCTTGGAATTGGGCACCGTCCT
CACTGCGGACGCGCCTCAAAGACCTCGGCGGTGGCTGTTCAGCCCTCAAGCGTAGTAGAATACACCTCGCTTTGGAGCGG
TTGGCGTCGCCCGCCGGACGAACCTTCTGAACTTTTCTCAAGGTTGACCTCGGATCAGGTAGGGATACCCGCTGAACTTA
AGCATATCAATAAGCGGAGGAAAAGAAACCAACAGGGATTGCCTTAGTAACGGCGAGTGAAGCGGCAACAG
>SampleName10.739;size=7844
GATGAAGAACGTAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACATTGC
GCCCCTTGGTATTCCGGGGGGCATGCCTGTTCGAGCGTCATTACAACCCTCAAGCTCTGCTTGGAATTGGGCACCGTCCT
CACTGCGGACGCGCCTCAAAGACCTCGGCGGTGGCTGTTCAGCCCTCAAGCGTAGTAGAATACACCTCGCTTTGGAGCGG
TTGGCGTCGCCCGCCGGACGAACCTTCTGAACTTTTCTCAAGGTTGACCTCGGATCAGGTAGGGATACCCGCTGAACTTA
AGCATATCAATAAGCGGAGGAAAAGAAACCAACAGGGATTGCCTTAGTAACGGCGAGTGAAGCGGCAACAG
...
And then file 2 in this format:
SampleName11.4501;size=7446 Fungi_sp|MT237019|SH3568601.08FU|reps_singleton|k__Fungi;p__unidentified;c__unidentified;o__unidentified;f__unidentified;g__unidentified;s__Fungi_sp    80.7
SampleName11.1591;size=7622 Lasiodiplodia_citricola|GU945354|SH2131013.08FU|refs|k__Fungi;p__Ascomycota;c__Dothideomycetes;o__Botryosphaeriales;f__Botryosphaeriaceae;g__Lasiodiplodia;s__Lasiodiplodia_citricola   100.0
SampleName11.1591;size=7622 Fungi_sp|MT237019|SH3568601.08FU|reps_singleton|k__Fungi;p__unidentified;c__unidentified;o__unidentified;f__unidentified;g__unidentified;s__Fungi_sp    81.0
SampleName10.611;size=6979  Lasiodiplodia_citricola|GU945354|SH2131013.08FU|refs|
I would like to filter file 1 so that it only keeps the samples (name and sequence) whose names appear in the first column of file 2. Could any of you shed some light on how to do that? Thank you.
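One possible approach, as a minimal sketch rather than a definitive solution: collect the query names from the first whitespace-separated column of file 2 into a set, then stream through file 1 and copy only the records whose header matches. It assumes the header text after `>` in file 1 is identical to the first column of file 2 (including the `;size=...` part) and that those names contain no spaces. The function names `read_hit_ids` and `filter_fasta` are made up for this illustration, and it uses only the Python standard library.

```python
def read_hit_ids(blast_path):
    """Collect the query names found in the first column of the BLAST table."""
    ids = set()
    with open(blast_path) as handle:
        for line in handle:
            fields = line.split()  # splits on any run of spaces or tabs
            if fields:             # skip blank lines
                ids.add(fields[0])
    return ids


def filter_fasta(fasta_path, keep_ids, out_path):
    """Copy FASTA records whose name is in keep_ids; return how many were kept."""
    kept = 0
    keep = False  # whether the record currently being read should be written
    with open(fasta_path) as fin, open(out_path, "w") as fout:
        for line in fin:
            if line.startswith(">"):
                # Assumption: the name is the first token after ">"
                header = line[1:].split()
                keep = bool(header) and header[0] in keep_ids
                if keep:
                    kept += 1
            if keep:
                # Writes the header and every sequence line that follows it
                fout.write(line)
    return kept
```

With hypothetical file names, it could be called as `filter_fasta("file1.fasta", read_hit_ids("file2.txt"), "filtered.fasta")`. Because the names go into a set, a query that has several hits in file 2 is still written only once, and multi-line sequences are kept intact.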
Thank you very much, I'm going to try that!
Update: That worked very well. Thank you so much :)