magpurify errors
4 hours ago
shevch2009 ▴ 20

Hello all! I am trying to clean the MAGs I obtained with MAGpurify. For some MAGs it worked fine, but for others I keep getting an error in the phylo-markers module:

magpurify phylo-markers /path/bin.5.fa /path/magpurify_results --threads 16
Calling genes with Prodigal
 all genes: /path/magpurify_results/phylo-markers/genes.[ffn|faa]

Identifying PhyEco phylogenetic marker genes with HMMER
  mm results: /path/magpurify_results/phylo-markers/phyeco.hmmsearch
  marker genes: /path/magpurify_results/phylo-markers/markers

Performing pairwise BLAST alignment of marker genes against database
 blast results: /path/magpurify_results/phylo-markers/alns

Finding taxonomic outliers
Traceback (most recent call last):
  File "/path/miniconda3/bin/magpurify", line 10, in <module>
    sys.exit(cli())
             ^^^^^
  File "/path/miniconda3/lib/python3.12/site-packages/magpurify/cli.py", line 116, in cli
    args["func"](args)
  File "/path/miniconda3/lib/python3.12/site-packages/magpurify/modules/phylo.py", line 419, in main
    flagged = flag_contigs(args["db"], args["tmp_dir"], args)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/miniconda3/lib/python3.12/site-packages/magpurify/modules/phylo.py", line 372, in flag_contigs
    bin.genes[aln["qname"]].annotations.append(annotation)
    ~~~~~~~~~^^^^^^^^^^^^^^
KeyError: 'k127_4584534_11'

The key in this KeyError ('k127_4584534_11' here) is different for each failing bin. Is it just the name of a contig that the tool doesn't recognize?

When I try gc-content, it also gives an error for the same bin that failed with phylo-markers:

magpurify gc-content /path/bin.5.fa /path/magpurify_results
Computing mean contig GC content
Traceback (most recent call last):
  File "/path/miniconda3/bin/magpurify", line 10, in <module>
    sys.exit(cli())
             ^^^^^
  File "/path/miniconda3/lib/python3.12/site-packages/magpurify/cli.py", line 116, in cli
    args["func"](args)
  File "/path/miniconda3/lib/python3.12/site-packages/magpurify/modules/gc.py", line 68, in main
    contig.gc = round(SeqUtils.GC(seq), 2)
                      ^^^^^^^^^^^
AttributeError: module 'Bio.SeqUtils' has no attribute 'GC'

The tetra-freq, clade-markers and known-contam modules run fine on that bin:

magpurify clade-markers /path/bin.5.fa /path/magpurify_results --threads 16
Reading database info

Calling genes with Prodigal
  all genes: /path/magpurify_results/clade-markers/genes.[ffn|faa]

Performing pairwise alignment of genes against MetaPhlan2 database of clade-specific genes
  alignments: /path/magpurify_results/clade-markers/genes.m8

Finding top hits to database
  2118 genes with a database hit

Classifying genes at each taxonomic rank
  kingdom: 104 classified genes
  phylum: 0 classified genes
  class: 0 classified genes
  order: 0 classified genes
  family: 0 classified genes
  genus: 0 classified genes
  species: 0 classified genes

Taxonomically classifying contigs
  total contigs: 995
  kingdom: 94 classified contigs
  phylum: 0 classified contigs
  class: 0 classified contigs
  order: 0 classified contigs
  family: 0 classified contigs
  genus: 0 classified contigs
  species: 0 classified contigs

Taxonomically classifying genome
  consensus taxon: None

Identifying taxonomically discordant contigs
  0 flagged contigs: /path/magpurify_results/clade-markers/flagged_contigs

Has anyone encountered something like this? Are there any solutions?

I would appreciate any suggestions.

Thanks, Alla

magpurify data shotgun • 412 views
3 hours ago
Mensur Dlakic ★ 30k

A general answer is to always Google the error message. Hopefully you do that in the future. This is the outcome of Googling AttributeError: module 'Bio.SeqUtils' has no attribute 'GC':

https://www.google.com/search?q=AttributeError%3A+module+%27Bio.SeqUtils%27+has+no+attribute+%27GC%27

To save you some reading: newer Biopython versions no longer have a GC function in Bio.SeqUtils (it was deprecated in favor of gc_fraction and later removed). That means you either have to dig through the magpurify source and replace GC with its current equivalent (not recommended), or downgrade Biopython:

pip install biopython==1.79
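
For anyone curious what the "current equivalent" looks like, here is a minimal sketch, not a tested patch for magpurify. The helper name gc_percent is hypothetical. The main thing to watch is units: the old GC returned a percentage, while gc_fraction returns a value between 0 and 1, so a naive rename would shrink every GC value a hundredfold. Passing ambiguous="ignore" should keep ambiguous bases in the length the way the old function did.

```python
# Sketch of a version-tolerant GC helper. gc_percent is a hypothetical name,
# not something magpurify or Biopython defines.
try:
    # Newer Biopython: gc_fraction returns a fraction between 0 and 1.
    from Bio.SeqUtils import gc_fraction

    def gc_percent(seq):
        # "ignore" keeps ambiguous bases in the length, as the old GC did.
        return 100 * gc_fraction(seq, ambiguous="ignore")

except ImportError:
    try:
        # Older Biopython: GC already returns a percentage.
        from Bio.SeqUtils import GC as gc_percent
    except ImportError:
        # No Biopython at all: plain-Python fallback counting G, C and S.
        def gc_percent(seq):
            seq = str(seq).upper()
            if not seq:
                return 0.0
            return 100 * sum(seq.count(base) for base in "GCS") / len(seq)


# Mirrors the failing line in gc.py, which rounds to two decimals.
example_gc = round(gc_percent("ATGCGC"), 2)
```

Downgrading is still the less invasive route, since it leaves the installed package untouched.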

Thanks

What about the phylo-markers module error? I tried to look for it :) but haven't found anything. And why does it fail for some bins but not all?


To me that sounds like an error in sequence header formatting. Maybe some contig headers are repeated? Or their sequences are short or missing? Are there N or X characters in such contigs?

Another piece of general advice: if a tool works for some contigs but not others, see what is different about those contigs.
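
Those checks can be scripted. Below is a rough sketch in plain Python (no Biopython needed) that reports repeated contig IDs, contigs with no sequence, and contigs containing N or X. The function names scan_fasta and contig_of_gene are made up for illustration. The second helper assumes the key in the KeyError is a Prodigal-style gene ID of the form contig name, underscore, gene index, so dropping the last field should give the contig to look for in the bin.

```python
def scan_fasta(lines):
    """Summarize a FASTA file given as an iterable of lines."""
    seen = {}       # contig ID -> number of headers carrying that ID
    length = {}     # contig ID -> total sequence length
    ambiguous = {}  # contig ID -> count of N or X characters
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The contig ID is the first whitespace-delimited word of the header.
            fields = line[1:].split()
            current = fields[0] if fields else ""
            seen[current] = seen.get(current, 0) + 1
            length.setdefault(current, 0)
            ambiguous.setdefault(current, 0)
        elif current is not None:
            seq = line.upper()
            length[current] += len(seq)
            ambiguous[current] += seq.count("N") + seq.count("X")
    return {
        "ids": set(seen),
        "duplicated": sorted(n for n, c in seen.items() if c > 1),
        "empty": sorted(n for n, size in length.items() if size == 0),
        "with_n_or_x": sorted(n for n, a in ambiguous.items() if a > 0),
    }


def contig_of_gene(gene_id):
    """Drop the trailing gene index from a Prodigal-style ID (assumption)."""
    return gene_id.rsplit("_", 1)[0]
```

To use it, open the bin (for example /path/bin.5.fa) and pass the file handle to scan_fasta, then test whether contig_of_gene('k127_4584534_11') is in the returned "ids" set. If it is not, the gene ID in the error did not come from this bin's FASTA, and comparing the files under the output directory against the bin would be the next thing to look at.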


Thank you


Strange, but there are no such headers or contig names in my FASTA files.
