I have a tab-delimited table of protein IDs that looks like this:
45	FBpp0070037
46	FBpp0070039;FBpp0070040
47	FBpp0070041;FBpp0070042;FBpp0070043
48	FBpp0070044;FBpp0110571
...
For each of these protein IDs I would like to add the corresponding gene ID (FBgn...) in a third column. The output table should look like this:
45	FBpp0070037	FBgn001234
46	FBpp0070039;FBpp0070040	FBgn00094432;FBgn002345
47	FBpp0070041;FBpp0070042;FBpp0070043	FBgn0001936;FBgn000102;FBgn004527
48	FBpp0070044;FBpp0110571	FBgn0097234;FBgn00183
...
I was thinking of using biomaRt, but I could not find a way to automate the lookup for all of the protein IDs on each line.
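To make the per-line step concrete, here is a minimal Python sketch of what I am after. It assumes a protein-to-gene lookup table is already available (for example, fetched once for all unique IDs rather than once per line). The function name `add_gene_column`, the `lookup` dictionary, and the `"NA"` marker for unmapped IDs are hypothetical, and the gene IDs are just the placeholder values from my example above, not real annotations.

```python
import csv
import io


def add_gene_column(rows, pp_to_gn, missing="NA"):
    """Yield (row_id, protein_ids, gene_ids) for each (row_id, protein_ids) row.

    Protein IDs within a row are separated by ';' and the gene IDs are
    written back in the same order, also separated by ';'.
    """
    for row_id, proteins in rows:
        genes = [pp_to_gn.get(p, missing) for p in proteins.split(";")]
        yield row_id, proteins, ";".join(genes)


# Hypothetical lookup table; the values are placeholders taken from the
# example output, not real FlyBase annotations.
lookup = {
    "FBpp0070037": "FBgn001234",
    "FBpp0070039": "FBgn00094432",
    "FBpp0070040": "FBgn002345",
}

# Two rows of the input table, tab-delimited as in the real file.
table = "45\tFBpp0070037\n46\tFBpp0070039;FBpp0070040\n"
reader = csv.reader(io.StringIO(table), delimiter="\t")

result = list(add_gene_column(reader, lookup))
for row in result:
    print("\t".join(row))
```

The open part is how to build `lookup` for every protein ID in the file in one go.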
I would appreciate your ideas.