DIAMOND blast imported into MEGAN6 has no taxonomic assignment
5.6 years ago
Farbod ★ 3.3k

Dear Friends, Hi (not native in Eng...)

I used DIAMOND blastx to align my transcriptome assembly (transcriptome.fasta) against the NCBI nr database and produce a .daa file, with this command:

> diamond blastx -d nr -q '/home/Trinity_pathless.fasta' -o diamond-Trinity-daa -p 22 -f 100 --evalue 0.000001  --sensitive


Then I imported it into the MEGAN6 community version. I tried both approaches: (1) importing directly and letting MEGAN6 create an "RMA" file, and (2) using the Meganizer tool, as described in the last lines of the MEGAN manual,

but the result has no taxonomic data!

NOTE: It seems that the MEGAN6 manual does not offer any guidance about BLAST taxonomy parameters.

NOTE2: If you are aware of any MEGAN6 problem-solver groups, please let me know.

blast MEGAN6 Taxonomy DIAMOND • 5.1k views

MEGAN has a dedicated user forum: http://megan.informatik.uni-tuebingen.de

DIAMOND doesn't put taxonomic IDs in the blast hits, so you have to add them yourself or have MEGAN map them (GI/accession to TaxID; you need the big NCBI files for that). If you've done that (is that your RMA file?), then make sure minSupport is set to 1 and minSupportPercent is set to 0 (off). These control the minimum number of sequences that must be assigned to a taxon for it to be displayed.
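To illustrate what a minimum-support threshold does, here is a minimal sketch in shell. It assumes a hypothetical two-column, tab-separated list of read-to-taxon assignments; the function name `filter_by_min_support` is made up for this example. It only mimics the idea of a display threshold and is not MEGAN's implementation, whose handling of under-supported taxa may differ.

```shell
#!/usr/bin/env bash
# Read "read_id<TAB>taxon" pairs on stdin and print "taxon<TAB>count" for
# every taxon that has at least MIN_SUPPORT reads assigned to it.
# Hypothetical helper for illustration only; not part of MEGAN.
filter_by_min_support() {
    local min_support=${1:-1}
    awk -F '\t' -v min="$min_support" '
        { count[$2]++ }                      # tally reads per taxon
        END {
            for (t in count)
                if (count[t] >= min)         # keep taxa meeting the threshold
                    print t "\t" count[t]
        }
    ' | sort
}

# Usage (assignments.tsv is a placeholder file name):
#   filter_by_min_support 1 < assignments.tsv   # threshold 1 keeps every taxon
#   filter_by_min_support 50 < assignments.tsv  # hides taxa with fewer than 50 reads
```

With a threshold of 1, every taxon that received at least one read stays visible, which is why setting it to 1 is a useful first check when a tree looks nearly empty.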


Dear Tonor, Hi.

Thank you for telling me about the MEGAN forum.

1- After I imported my diamond.daa file into MEGAN6, it created the RMA file by itself. Does that solve the problem?

2- What do you mean by "big NCBI files"? If you kindly provide the links, I will download them.


The diamond blast output will typically just have the GI or Accession number in the blast hit (NCBI recently abolished GIs).

NCBI provides gi2_taxid_prot and prot_accession2taxid files from the FTP site: ftp://ftp.ncbi.nih.gov/pub/taxonomy/

You can use these in MEGAN to map your blast hits to the taxonomy. When you load your data, there should be options to specify the location of these taxonomy mapping files.


Have you ever tried "Diamond_BLAST_add_taxonomic_info"? If so, is it appropriate for my situation, or is the NCBI approach you suggested better?


Haven't tried it, but it looks suitable, although a little old (it uses GIs instead of accessions). It might be best to ask on the MEGAN forum which tools are best for this.


Dear Tonor, Hi.

At the link you provided I found:

1- gi_taxid_prot.zip (instead of gi2_taxid_prot)

2- an "accession2taxid" directory that contains the "prot.accession2taxid" file


Hi - it depends on which BLAST db you are using. An older one will give you GI numbers in your BLAST hits (so you need gi2_taxid_prot), whereas a newer one will give you accession numbers (so you need prot.accession2taxid). Does that make sense?


Thank you, Yes.

I have a local copy of the NCBI nr database, which consists of multiple files downloaded from here (NCBI FTP); I downloaded nr.58.tar.gz recently.

Do I need only "prot.accession2taxid" in this case?


In MEGAN, go to File -> Import from BLAST, select your diamond file under File, and then in the Taxonomy tab click either Use Accession or Use GI and select the corresponding file. I've found this takes ages on my machine, though (it's not very high spec).


So the fact that DIAMOND's .daa output does not work very well for taxonomic analysis in MEGAN6 is a little disappointing.


Dear Tonor, I imported the .daa blast result into MEGAN6 and then loaded the huge "prot.accession2taxid" file, but it still shows only two nodes!


Check that your minSupport and minSupportPercent are set to 1 and 0 respectively. I think this is under Options -> LCA parameters.

Also double check that your DIAMOND blast results use accessions rather than GIs.
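One way to run that check is sketched below, assuming you have a tabular copy of the hits (one hit per line, subject ID in the second column). The function name `classify_subject_ids` is hypothetical, and the test for GIs relies on the assumption that old-style IDs start with `gi|` followed by digits.

```shell
#!/usr/bin/env bash
# Count how many subject IDs (column 2 of tabular BLAST-style output) look
# like old-style GI identifiers ("gi|12345|...") versus plain accessions.
# Reads hits on stdin; columns may be separated by tabs or spaces.
# Hypothetical helper for illustration only.
classify_subject_ids() {
    awk '
        $2 ~ /^gi[|][0-9]+/ { gi++; next }   # subject ID starts with "gi|<digits>"
        NF >= 2             { acc++ }        # anything else is treated as an accession
        END { printf "gi=%d accession=%d\n", gi, acc }
    '
}

# Usage (hits.tsv is a placeholder file name):
#   classify_subject_ids < hits.tsv
```

If the GI count is zero, the accession-based mapping file is the one to load; if the hits are all GIs, the GI-based file is needed instead.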


Dear Tonor, Hi, and thank you. I will try the minSupport option you mentioned.

About accessions and GIs: the DIAMOND .daa file is a binary file, but the ordinary tabular blastx run I did on the same data with DIAMOND gave results like these:

TRINITY_DN212758_c0_g1_i1...XP_002531646.1...81.3 107...20 0 3323 199 305 2.9e-41 176.4

TRINITY_DN212728_c0_g1_i1...XP_014502021.1...89.2 37...4 0 3113 403 439 8.6e-10 71.2

TRINITY_DN212793_c0_g1_i1...XP_015200040.1...91.8 61...5 0 665 483 238 298 9.7e-23 115.9

In this situation, should I use "prot.accession2taxid" or some other file?


Yes, prot.accession2taxid is the one. If it still doesn't work, I reckon you should go to the MEGAN community site. The developer is pretty active there, and there are probably a few other ways to approach this.
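As a sanity check outside MEGAN, the subject IDs from a tabular hits file can be looked up directly in the mapping file. This is only a sketch under stated assumptions: the hits file is tab-separated with the subject ID in column 2, the mapping file has been decompressed, and it has a header line followed by the tab-separated columns accession, accession.version, taxid, gi. The function name `map_hits_to_taxids` is hypothetical.

```shell
#!/usr/bin/env bash
# Print "accession.version<TAB>taxid" for every subject ID in a tabular hits
# file that also appears in a prot.accession2taxid-style mapping table.
# Assumed mapping layout: one header line, then tab-separated columns
#   accession  accession.version  taxid  gi
# Hypothetical helper for illustration only.
# Usage: map_hits_to_taxids hits.tsv prot.accession2taxid
map_hits_to_taxids() {
    local hits=$1 mapping=$2
    awk -F '\t' '
        NR == FNR { wanted[$2] = 1; next }                # first file: remember subject IDs
        FNR > 1 && ($2 in wanted) { print $2 "\t" $3 }    # second file: skip header, emit matches
    ' "$hits" "$mapping"
}
```

Only the subject IDs are held in memory while the mapping file is streamed once, so this should cope with a very large mapping file. If nothing is printed, the IDs in the hits and the IDs in the mapping file do not match, which would also explain an empty taxonomy in MEGAN.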


May I ask which version of DIAMOND you are using? I have never used MEGAN, but the FAQ in the manual of an older DIAMOND version says:

Q: Reads imported into MEGAN lack taxonomic or functional assignment.
A: MEGAN requires mapping files which need to be downloaded separately at the MEGAN website and configured to be used.