Question

Do ASV IDs from dada2 have specific taxonomic assignments?

0

19 months ago

v.berriosfarias ▴ 140

Hello Community, I'm trying to subset matrices of ASV associations that were produced from 2 different phyloseq objects.

For downstream analysis I need to subset the square matrices so that both contain exactly the same taxa (matrix 1 has 700 taxa and matrix 2 has 400 taxa).

The row and column names of the matrices are the ASV IDs (ASV1, ASV2, ASV3, ASV4, ..., ASVn). Assuming that each ASV ID corresponds to a specific taxon (e.g. ASV451 is E. coli) and that I can locate the matching ASVs in the other matrix with R's which() function, I did the following:

# Rows/columns of matrix 1 whose ASV IDs also occur in matrix 2
keep1 <- rownames(matrix1_ASVs) %in% rownames(matrix2_ASVs)
new_matrix1_ASVs <- matrix1_ASVs[keep1, keep1]

# Rows/columns of matrix 2 whose ASV IDs also occur in matrix 1
keep2 <- rownames(matrix2_ASVs) %in% rownames(matrix1_ASVs)
new_matrix2_ASVs <- matrix2_ASVs[keep2, keep2]

This gives me two matrices with the same dimensions, but I'm not sure whether the same ASV ID refers to the same taxon in both. Is the assumption above correct?

dada2 R ASVs • 1.3k views

ADD COMMENT • link updated 6 months ago by andres.firrincieli 3.6k • written 19 months ago by v.berriosfarias ▴ 140

1

ASV IDs are not taxon-specific.

ADD REPLY • link 19 months ago by andres.firrincieli 3.6k

0

OK, so subsetting the matrices that way was a bad approach.

ADD REPLY • link 19 months ago by v.berriosfarias ▴ 140

0

Hi, did you manage to reconcile your two datasets, i.e. to give the same ASV names to the same sequences in both?

ADD REPLY • link 6 months ago by pablo ▴ 300

score 3 · Accepted Answer · 2022-09-15

3

19 months ago

Asaf 10k

The ASV IDs you mentioned (ASV1, ASV2 ...) are arbitrary labels assigned separately within each dataset, so you can't directly compare them between two datasets. What you _can_ do is either:

1. Get a taxonomic identification for your ASVs (using DADA2 or qiime, for instance), then agglomerate the data at the Genus level using phyloseq tax_glom (or at a higher level if you wish, but I wouldn't recommend the Species level unless most of the ASVs have a Species assignment). Once you have a genera x samples table of summed read counts, you can either merge the two tables and compare them, or compute the genus associations in each experiment and then compare those.
2. Rerun the two datasets together in DADA2 to get consistent ASV IDs.
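
To make the first option concrete, here is a minimal base-R sketch of what genus-level agglomeration followed by merging amounts to. It does not use phyloseq; the count tables, sample names and genus assignments are all made-up toy data, and in a real analysis tax_glom would do the summing step on the phyloseq object.

```r
# Toy ASV count tables (rows = ASVs, columns = samples); all values are made up.
counts1 <- matrix(c(10, 0, 5,
                     3, 7, 2,
                     1, 1, 8),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(c("ASV1", "ASV2", "ASV3"), c("S1", "S2", "S3")))
counts2 <- matrix(c(4, 6,
                    9, 0),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("ASV1", "ASV2"), c("T1", "T2")))

# Hypothetical genus assignments per ASV.
# Note that "ASV1" belongs to a different genus in each dataset.
genus1 <- c(ASV1 = "Escherichia", ASV2 = "Bacteroides", ASV3 = "Escherichia")
genus2 <- c(ASV1 = "Bacteroides", ASV2 = "Lactobacillus")

# Sum the read counts of all ASVs that share a genus (rows become genera).
genus_counts1 <- rowsum(counts1, group = genus1[rownames(counts1)])
genus_counts2 <- rowsum(counts2, group = genus2[rownames(counts2)])

# Genera present in both datasets are the comparable units.
shared_genera <- intersect(rownames(genus_counts1), rownames(genus_counts2))

# Combined genus x sample table restricted to the shared genera.
merged <- cbind(genus_counts1[shared_genera, , drop = FALSE],
                genus_counts2[shared_genera, , drop = FALSE])
print(merged)
```

In this toy example, matching on "ASV1" would pair an Escherichia with a Bacteroides, whereas matching on the genus name pairs like with like.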

ADD COMMENT • link 19 months ago by Asaf 10k

0

Hi Asaf,

I have 3 datasets, and therefore 3 phyloseq ps objects after running DADA2. As in the original question, I need the same ASV names across my 3 datasets (e.g. ASV1 should be the same sequence in all 3 datasets).

I ran tax_glom on my 3 ps objects at the Genus level with tax_glom(ps.pool1, taxrank="genus"), then used ps.merged <- merge_phyloseq(ps.pool1.genus, ps.pool2.genus, ps.pool3.genus) to create a single object.

Then I exported this object to get an abundance table (to read in Excel), but the result is wrong. Each ASV has one line per merged sample (e.g. 240 lines for ASV1, 240 lines for ASV2 ...), which creates a huge file. I need one line per ASV, one column per sample, and the corresponding abundance in each cell.

Any help?
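
For reference, the layout described above (one line per ASV, one column per sample) can be obtained from a long table like this base-R sketch shows. The column names OTU, Sample and Abundance and all values are assumptions made for the toy example, and the output file name is hypothetical. With a phyloseq object, as(otu_table(ps.merged), "matrix") should give the wide count matrix directly, without going through the long format.

```r
# Toy long-format table, one row per ASV/sample pair (values are made up).
long <- data.frame(
  OTU       = c("ASV1", "ASV1", "ASV2", "ASV2"),
  Sample    = c("S1", "S2", "S1", "S2"),
  Abundance = c(12, 0, 3, 7)
)

# Cross-tabulate: sums Abundance for each OTU x Sample combination.
wide <- xtabs(Abundance ~ OTU + Sample, data = long)

# Convert to a plain data frame with one row per ASV and one column per sample.
wide_df <- as.data.frame.matrix(wide)
print(wide_df)

# write.csv(wide_df, "abundance_table.csv")  # hypothetical output file name
```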

ADD REPLY • link 6 months ago by pablo ▴ 300

0

I think the cleanest way to achieve this is to run dada2 on each dataset and then use qiime feature-table merge and qiime feature-table merge-seqs (if you are using qiime2) to create a single ASV table and reference fasta file. If you are working with just dada2, the same thing can be achieved with mergeSequenceTables.

P.S. This method only works if the same primer set was used for each dataset and the reads were trimmed in the same way.
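
As a rough illustration of what merging by sequence amounts to, here is a base-R sketch on toy data. It is not the dada2 implementation: the sequences, counts and the helper expand_to are made up for the example, and in practice mergeSequenceTables would do the merging. The point is that tables are aligned on the sequences themselves and ASV IDs are assigned only once, on the merged table.

```r
# Toy sequence tables in DADA2 orientation: rows = samples, columns = sequences.
# Sequences and counts are made up.
tab1 <- matrix(c(5, 2,
                 1, 0),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("A1", "A2"), c("ACGT", "GGCC")))
tab2 <- matrix(c(4, 9),
               nrow = 1,
               dimnames = list("B1", c("GGCC", "TTAA")))

# Union of all sequences seen in either dataset.
all_seqs <- union(colnames(tab1), colnames(tab2))

# Hypothetical helper: expand a table to the full set of sequences,
# filling sequences absent from that dataset with 0.
expand_to <- function(tab, seqs) {
  out <- matrix(0, nrow = nrow(tab), ncol = length(seqs),
                dimnames = list(rownames(tab), seqs))
  out[, colnames(tab)] <- tab
  out
}

merged <- rbind(expand_to(tab1, all_seqs), expand_to(tab2, all_seqs))

# Assign ASV IDs once, on the merged table, so one ID means one sequence everywhere.
asv_ids <- setNames(paste0("ASV", seq_along(all_seqs)), all_seqs)
colnames(merged) <- unname(asv_ids[colnames(merged)])
print(merged)
```

Keeping asv_ids around gives a lookup from sequence to ID, which is what makes the labels comparable across datasets.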

ADD REPLY • link 6 months ago by andres.firrincieli 3.6k