1 day ago
城玮
Hello everyone,
I have run antiSMASH on the whole-genome sequence of a microbial isolate which (based on my lab's biochemical data) produces eicosapentaenoic acid (EPA). I now have a number of predicted biosynthetic gene clusters (BGCs) from antiSMASH, but I'm not sure how to identify which cluster corresponds to EPA biosynthesis, and what further bioinformatics analyses I can perform to strengthen this prediction.
Some details of my current status:
- EPA production was detected by untargeted metabolomics of this strain's culture supernatant.
- I have the antiSMASH output: region summaries, BGC types, gene annotations, “KnownClusterBlast” hits, etc.
You can view the results through the link here.
- I have not yet identified which BGC is likely responsible for EPA production, nor done downstream network/phylogenetic analysis.
Questions I'd appreciate help with:
- What features in the antiSMASH output should I inspect to select the candidate BGC that likely encodes EPA biosynthesis? For example, domain architecture, PKS/PUFA synthase type clusters, gene synteny, presence of key enzyme types, similarity to known clusters, etc.
- Once a candidate is selected, what bioinformatic analyses would you recommend for further support? For example, phylogenetic analysis of individual biosynthetic enzyme domains (KS, ACP, MAT), comparative genomics with known EPA gene clusters, transcriptome/RT-qPCR correlation, substrate specificity prediction, metabolic network integration, etc.
- Are there any specific pitfalls or best practices when going from antiSMASH annotation to metabolic product (EPA) assignment, especially for long‐chain polyunsaturated fatty acids?
- Can you recommend any tools, scripts, or pipelines for extracting the antiSMASH output (e.g., JSON, GenBank files), parsing and tabulating clusters, and performing downstream analyses?
Thank you in advance for any suggestions or pointers to relevant literature or tutorial resources.
Kind regards,
Zhang Chengwei
I guess it is possible that someone here will know off the top of their head what enzymes are involved in eicosapentaenoic acid biosynthesis, but I consider it unlikely. To find the details about that I think you will need to look through primary literature, as this pathway has been described before in other organisms. For many of the antiSMASH clusters the program should already have a good guess as to what the gene clusters are making, and usually that's not long-chain fatty acids.
As always, a simple Google search reveals half a dozen promising papers just on the first search page.
https://www.google.com/search?q=eicosapentaenoic+acid+biosynthesis
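On the parsing part of the question, here is a minimal Python sketch for tabulating regions from the antiSMASH JSON output, to be read as a starting point rather than a tested pipeline. It assumes the JSON has a top-level "records" list whose "features" entries carry "type", "location", and "qualifiers" keys, and that region locations look like "[0:45210]"; please check this against your own file. The function names and the product labels "PUFA" and "hglE-KS" used in the filter are also assumptions for illustration, so adjust them to whatever product types your antiSMASH version reports.

```python
import json
import re


def load_results(path):
    """Load an antiSMASH results JSON file into a dict."""
    with open(path) as handle:
        return json.load(handle)


def tabulate_regions(results):
    """Return one row (dict) per 'region' feature in a loaded results dict."""
    rows = []
    for record in results.get("records", []):
        for feature in record.get("features", []):
            if feature.get("type") != "region":
                continue
            qualifiers = feature.get("qualifiers", {})
            # Assumed location format: "[start:end]"; pull out the two integers.
            coords = re.findall(r"\d+", feature.get("location", ""))
            rows.append({
                "record": record.get("id"),
                "region": qualifiers.get("region_number", [None])[0],
                "products": qualifiers.get("product", []),
                "start": int(coords[0]) if len(coords) >= 2 else None,
                "end": int(coords[1]) if len(coords) >= 2 else None,
            })
    return rows


def filter_by_product(rows, keywords=("PUFA", "hglE-KS")):
    """Keep rows whose product list contains any keyword (case-insensitive)."""
    wanted = {k.lower() for k in keywords}
    return [r for r in rows if wanted & {p.lower() for p in r["products"]}]
```

For example, `filter_by_product(tabulate_regions(load_results("my_strain.json")))` (the file name is a placeholder) would list the regions worth comparing against published EPA gene clusters first. A region that passes this filter is only a candidate; the literature comparison is still needed to support the assignment.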
Many thanks for your valuable guidance. Your suggestion is very helpful; I will consult the relevant literature and proceed with further analysis accordingly.