Hello! I'm building a database of a certain gene family. I downloaded the FASTA files from UniProt and concatenated them using `cat`, and the FASTA headers of each sequence have the following...that the gene information (the `GN=` part) is the first string after the first pipe sign (`|`) on each FASTA header. Is there a way to do that using awk or R string manipulation?
I want that all my…