Question: DIAMOND blast imported into MEGAN6 has no taxonomic assignment
Asked 3.6 years ago by Farbod (Toronto):

Dear Friends, Hi (not native in Eng...)

I used DIAMOND to run a blastx search of my transcriptome FASTA assembly against the NCBI nr database and write a .daa file, using this command:

> diamond blastx -d nr -q '/home/Trinity_pathless.fasta' -o diamond-Trinity-daa -p 22 -f 100 --evalue 0.000001  --sensitive

Then I imported it into the MEGAN6 community version. I tried both approaches: (1) direct import, which then creates a MEGAN6 "RMA" file, and (2) the Meganizer tool, as described in the last lines of the MEGAN manual,

but the result has no taxonomic data!

Please help me in this regard and thank you in advance

NOTE: It seems that the MEGAN6 manual does not offer any guidance about blast taxonomy parameters.

NOTE2: If you are aware of any MEGAN6 problem-solver groups, please let me know.

diamond megan6 blast taxonomy • 3.7k views
ADD COMMENTlink modified 3.6 years ago • written 3.6 years ago by Farbod3.3k

MEGAN has a dedicated user forum: http://megan.informatik.uni-tuebingen.de

Diamond doesn't give taxonomic IDs in the blast hits, so you have to add them yourself or use MEGAN to map them (GI/accession to TaxID; you need the big NCBI files). If you've done that (is that your RMA file?), then make sure minSupport is set to 1 and minSupportPercent is set to 0 (off). They control the minimum number of sequences that a taxon must have assigned to it for it to be displayed.

ADD REPLYlink modified 3.6 years ago • written 3.6 years ago by Tonor420

Dear Tonor , Hi.

Thank you for informing me about MEGAN forum.

1- After I imported my diamond.daa file into MEGAN6, it created the RMA file by itself. Does that solve anything?

2- What do you mean by "big NCBI files"? If you kindly provide the links, I will download them.

3- After downloading the NCBI files, how can I map them using MEGAN?

ADD REPLYlink written 3.6 years ago by Farbod3.3k

The diamond blast output will typically just have the GI or Accession number in the blast hit (NCBI recently abolished GIs).

NCBI provides gi2_taxid_prot and prot_accession2taxid files from the FTP site: ftp://ftp.ncbi.nih.gov/pub/taxonomy/

You can use these in MEGAN to map your blast hits to the taxonomy. When you load your data, there should be options to specify the location of these taxonomy mapping files.
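As a rough cross-check outside MEGAN, the lookup that such a mapping file enables can be sketched in Python. This is only a minimal sketch, assuming the accession file is tab-separated with a header row that names `accession.version` and `taxid` columns. The function name `load_taxids` and the sample rows (including the taxids) are made up for illustration.

```python
import csv
import io


def load_taxids(mapping_lines, wanted):
    """Return {accession.version: taxid} for the accessions listed in `wanted`."""
    reader = csv.reader(mapping_lines, delimiter="\t")
    header = next(reader)
    # Assumption: the header row names these two columns.
    acc_col = header.index("accession.version")
    tax_col = header.index("taxid")
    found = {}
    for row in reader:
        if row[acc_col] in wanted:
            found[row[acc_col]] = int(row[tax_col])
    return found


# Tiny in-memory stand-in for the real file; accessions and taxids are invented.
sample = io.StringIO(
    "accession\taccession.version\ttaxid\tgi\n"
    "XP_000000001\tXP_000000001.1\t1111\t0\n"
    "XP_000000002\tXP_000000002.1\t2222\t0\n"
)
taxids = load_taxids(sample, {"XP_000000002.1", "XP_999999999.1"})
print(taxids)
```

For a real, compressed mapping file, the same function could be given a handle from `gzip.open(path, "rt")`. Accessions that come back without a taxid are the ones that would end up unassigned.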

ADD REPLYlink written 3.6 years ago by Tonor420

Thank you for your help.

Have you ever tried "Diamond_BLAST_add_taxonomic_info" and, if so, is it appropriate for my situation, or is the NCBI approach you suggested better?

ADD REPLYlink written 3.6 years ago by Farbod3.3k

Haven't tried it, but it looks suitable, although a little old (it uses GIs instead of accessions). It might be best to ask on the MEGAN forum which tools are best for this.

ADD REPLYlink written 3.6 years ago by Tonor420

Dear Tonor, Hi.

At the link you provided, I see:

1- gi_taxid_prot.zip (instead of gi2_taxid_prot)

2- an "accession2taxid" directory containing the "prot.accession2taxid" file

Do I need to download and use both of these files?

ADD REPLYlink written 3.6 years ago by Farbod3.3k

Hi - it depends on which BLAST db you are using. An older one will give you GI numbers in your BLAST hits (so you need gi2_taxid_prot), whereas a newer one will give you accession numbers (so you need prot.accession2taxid). Does that make sense?
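The choice between the two files can be sketched as a small check on one subject ID from the hits. This is a minimal sketch, assuming that old-style headers start with `gi|` and that anything else is a bare accession. The helper `id_style`, the dictionary `MAPPING_FILE`, and the old-style example ID are hypothetical; the GI file name follows the one seen on the FTP site in this thread.

```python
def id_style(subject_id):
    """Classify a BLAST subject ID as 'gi' (old-style header) or 'accession'."""
    # Assumption: old-style NCBI headers look like "gi|<number>|ref|<accession>|".
    if subject_id.startswith("gi|"):
        return "gi"
    return "accession"


# Which taxonomy mapping file matches which ID style.
MAPPING_FILE = {
    "gi": "gi_taxid_prot",
    "accession": "prot.accession2taxid",
}

old_style = "gi|12345|ref|XP_000000001.1|"  # invented example
new_style = "XP_000000001.1"                # invented example

print(id_style(old_style), "->", MAPPING_FILE[id_style(old_style)])
print(id_style(new_style), "->", MAPPING_FILE[id_style(new_style)])
```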

ADD REPLYlink written 3.6 years ago by Tonor420

Thank you, Yes.

I have a local copy of the NCBI nr database, which consists of multiple files downloaded from here (NCBI FTP); I downloaded nr.58.tar.gz recently.

Do I need only "prot.accession2taxid" in this case?

ADD REPLYlink written 3.6 years ago by Farbod3.3k

In MEGAN, go to File -> Import from BLAST, select your diamond file in File, and then in the Taxonomy tab click either Use Accession or Use GI and select the corresponding file. I've found this takes ages on my machine (not very high spec), though.

ADD REPLYlink written 3.6 years ago by Tonor420

So the fact that DIAMOND's .daa output does not work smoothly with MEGAN6 for taxonomic purposes is a little disappointing.

ADD REPLYlink written 3.6 years ago by Farbod3.3k

Dear Tonor, I imported the .daa blast result into MEGAN6 and then the huge "prot.accession2taxid" file, but it again shows only two nodes!

ADD REPLYlink written 3.6 years ago by Farbod3.3k

Check that your minSupport and minSupportPercent are set to 1 and 0 respectively. I think this is in Options -> LCA parameters.

Also double-check that your diamond blast results use accessions rather than GIs.
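To illustrate why these two settings matter, here is a simplified sketch of a minimum-support filter. It assumes that a taxon is shown only if its read count reaches both the absolute minimum and the percentage of all assigned reads. How MEGAN actually combines the two settings, and what it does with reads on a dropped taxon, is not described in this thread, so treat `visible_taxa` and the counts below as hypothetical.

```python
import math


def visible_taxa(counts, min_support=1, min_support_percent=0.0):
    """Keep only taxa whose read count reaches the support threshold."""
    total = sum(counts.values())
    # Assumption: the percentage is taken over all assigned reads.
    by_percent = math.ceil(total * min_support_percent / 100.0)
    threshold = max(min_support, by_percent)
    return {taxon: n for taxon, n in counts.items() if n >= threshold}


counts = {"Taxon A": 120, "Taxon B": 3, "Taxon C": 1}  # invented read counts

strict = visible_taxa(counts, min_support=50)
relaxed = visible_taxa(counts, min_support=1, min_support_percent=0.0)
print(strict)
print(relaxed)
```

With a high threshold only the dominant taxon survives, which could look like a tree with very few nodes; with 1 and 0 every taxon that has at least one read is kept.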

ADD REPLYlink modified 3.6 years ago • written 3.6 years ago by Tonor420

Dear Tonor, hi and thank you. I will try the minSupport option you mentioned.

About accessions and GIs: the Diamond .daa file is binary, but a normal tabular blastx run with Diamond on the same data gave results like these:

TRINITY_DN212758_c0_g1_i1...XP_002531646.1...81.3 107...20 0 3323 199 305 2.9e-41 176.4

TRINITY_DN212728_c0_g1_i1...XP_014502021.1...89.2 37...4 0 3113 403 439 8.6e-10 71.2

TRINITY_DN212793_c0_g1_i1...XP_015200040.1...91.8 61...5 0 665 483 238 298 9.7e-23 115.9

In this situation, do I need to use "prot.accession2taxid" or some other file?
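To see which identifiers a tabular result actually contains, the subject IDs can be pulled out of the second column. This is a minimal sketch, assuming tab-separated BLAST-style tabular output with the subject ID in column 2. The function name `subject_ids` is made up, and the rows below are placeholders rather than the real hits above.

```python
import csv
import io


def subject_ids(tabular_lines):
    """Collect the distinct subject IDs (second column) from tabular output."""
    ids = set()
    for row in csv.reader(tabular_lines, delimiter="\t"):
        # Skip blank lines and comment lines.
        if len(row) >= 2 and not row[0].startswith("#"):
            ids.add(row[1])
    return ids


# Placeholder rows in the 12-column layout; all values are invented.
sample = io.StringIO(
    "query_1\tXP_000000001.1\t81.3\t107\t20\t0\t1\t321\t199\t305\t2.9e-41\t176.4\n"
    "query_2\tXP_000000002.1\t89.2\t37\t4\t0\t1\t111\t403\t439\t8.6e-10\t71.2\n"
    "query_3\tXP_000000001.1\t91.8\t61\t5\t0\t1\t183\t238\t298\t9.7e-23\t115.9\n"
)
ids = subject_ids(sample)
print(sorted(ids))
```

IDs of this form (a prefix, digits, and a `.version` suffix, with no `gi|` prefix) are accessions, so they would be looked up in the accession-based mapping file.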

ADD REPLYlink modified 3.6 years ago • written 3.6 years ago by Farbod3.3k

Yes, prot.accession2taxid is the one. If it still doesn't work, I reckon you should go to the MEGAN community site; the developer is pretty active there, and there are probably a few other approaches to try.

ADD REPLYlink written 3.6 years ago by Tonor420

May I ask which version of diamond you are using? I have never used MEGAN, but the FAQ in the manual of an older diamond version says:

Q: Reads imported into MEGAN lack taxonomic or functional assignment. A: MEGAN requires mapping files which need to be downloaded separately at the MEGAN website and configured to be used.

ADD REPLYlink written 3.6 years ago by Earendil20