Importing MetaPhlAn3 profile table into phyloseq to use decontam
3.7 years ago
plicht ▴ 20

Hi there,

I am new to R and would like to import the taxonomy profile table of MetaPhlAn3 into the R package phyloseq to make use of the package decontam.

To do this, I merged several MetaPhlAn analyses with MetaPhlAn's bundled merge script (merge_metaphlan_tables.py). Then I imported the data into R using the read.table command:

merged_metaphlan <- read.table("/media/sf_projects/microbiome/Analysis_of_microbiome/WiP/KneadData/firsttry/Validation_Samples_PL018/PL0183103_5/subsamples/visualization/merged_subsamples_samples_1,2,5.txt", header = TRUE)

After that, I wanted to convert this into an otu_table and subsequently load it into a phyloseq-class object:

otu_table(merged_metaphlan, taxa_are_rows = TRUE)

But it seems that phyloseq expects a numeric matrix. Since the MetaPhlAn table contains characters (the clade names) alongside the corresponding relative abundances, I receive the following error:

Error in validObject(.Object) : invalid class “otu_table” object: 
 Non-numeric matrix provided as OTU table.
Abundance is expected to be numeric.

Is there a way to directly import MetaPhlAn tables into phyloseq? Or do I need a work around, and if so, how can I do it?
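The error above can be reproduced without phyloseq. The following is a minimal sketch in base R, assuming a tab-separated merged table whose first column is clade_name; the sample names sample_A and sample_B are placeholders, and the two rows reuse values from the example table posted further down in this thread. It shows that the default import keeps the clade names as a character column, while row.names = 1 leaves only numeric columns.

```r
# Two rows in the layout of a merged MetaPhlAn table (tab-separated).
# sample_A / sample_B are placeholder sample names.
txt <- paste(
  "clade_name\tsample_A\tsample_B",
  "k__Bacteria\t91.40268\t71.86512",
  "k__Bacteria|p__Actinobacteria\t86.36566\t49.51296",
  sep = "\n"
)

# Default import: clade_name stays an ordinary character column,
# so the table converted to a matrix is not numeric.
as_read <- read.table(text = txt, header = TRUE, sep = "\t")
is.numeric(as.matrix(as_read))   # FALSE

# Using the first column as row names leaves only numeric columns.
abund <- as.matrix(
  read.table(text = txt, header = TRUE, sep = "\t", row.names = 1)
)
is.numeric(abund)                # TRUE

# With phyloseq installed, this matrix could then be passed on:
# phyloseq::otu_table(abund, taxa_are_rows = TRUE)
```

For a real file, the text = txt argument would be replaced by the file path.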

software error R • 3.9k views

Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

Using the "software error" tag is pretty meaningless. MetaPhlAn3 would be a more useful tag to add.


Hi, did you manage to figure this out? I'm trying to do the same thing and I'm getting this error!


@c.e.chong, please add some demo/example data to your post.


I have run MetaPhlAn3 and have a merged abundance output file that looks like this:

clade_name  healthy_mphlan  dandruff_mphlan dandruffhealthy_mphlan
k__Bacteria 91.40268    71.86512    89.7509
k__Bacteria|p__Actinobacteria   86.36566    49.51296    77.30806
k__Bacteria|p__Actinobacteria|c__Actinobacteria 86.36566    49.51296    77.30806
k__Bacteria|p__Actinobacteria|c__Actinobacteria|o__Actinomycetales  0.1044  0.11737 0.62909
k__Bacteria|p__Actinobacteria|c__Actinobacteria|o__Actinomycetales|f__Actinomycetaceae  0.1044  0.11737 0.62909
k__Bacteria|p__Actinobacteria|c__Actinobacteria|o__Actinomycetales|f__Actinomycetaceae|g__Actinobaculum 0.01359 0.03944 0.09785

I want to input this into phyloseq. I used the command

metaphlan <- read.table("statemerged_abundance_table_reformatted.txt", header = TRUE)
otu_table(metaphlan, taxa_are_rows = TRUE)

This gave me the error:

Error in validObject(.Object) : invalid class "otu_table" object:  Non-numeric matrix provided as OTU table. Abundance is expected to be numeric.

I want to get this data into phyloseq so I can analyse it with DESeq2 afterwards. I'm not sure how to create phyloseq objects from the MetaPhlAn table. Do you have any expertise in this?

Thanks in advance!

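library(phyloseq)
# Tab-separated merged table; row.names = 1 turns the clade_name column into row names, leaving only numeric columns
df <- read.csv("test.txt", sep = "\t", strip.white = TRUE, stringsAsFactors = FALSE, row.names = 1)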

test

Copied from https://github.com/wipperman/wipperman/blob/master/R/microbiota.R:
##########################################################################################
metaphlanToPhyloseq <- function(
    tax,
    metadat=NULL,
    simplenames=TRUE,
    roundtointeger=FALSE,
    split="|"){
    ## tax is a matrix or data.frame of taxonomic abundances; rows are taxa, columns are samples
    ## metadat is an optional data.frame of specimen metadata; rows are samples, columns are variables
    ## if simplenames=TRUE, use only the most detailed level of the taxa names in the final object
    ## if roundtointeger=TRUE, values are multiplied by 1e4 and then rounded to the nearest integer
    xnames = rownames(tax)
    shortnames = gsub(paste0(".+\\", split), "", xnames)
    if(simplenames){
        rownames(tax) = shortnames
    }
    if(roundtointeger){
        tax = round(tax * 1e4)
    }
    x2 = strsplit(xnames, split=split, fixed=TRUE)
    taxmat = matrix(NA, ncol=max(sapply(x2, length)), nrow=length(x2))
    colnames(taxmat) = c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species", "Strain")[1:ncol(taxmat)]
    rownames(taxmat) = rownames(tax)
    for (i in seq_len(nrow(taxmat))){
        taxmat[i, 1:length(x2[[i]])] <- x2[[i]]
    }
    taxmat = gsub("[a-z]__", "", taxmat)
    taxmat = phyloseq::tax_table(taxmat)
    otutab = phyloseq::otu_table(tax, taxa_are_rows=TRUE)
    if(is.null(metadat)){
        res = phyloseq::phyloseq(taxmat, otutab)
    }else{
        res = phyloseq::phyloseq(taxmat, otutab, phyloseq::sample_data(metadat))
    }
    return(res)
}
##########################################################
> metaphlanToPhyloseq(df)
phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 6 taxa and 3 samples ]
tax_table()   Taxonomy Table:    [ 6 taxa by 6 taxonomic ranks ]

Thank you so much for your help. I have managed to reproduce what you did with this function. However, the function from the wipperman GitHub repository does not create a sample_data() in addition to the tax table and OTU table, and I need one. The function from the Waldron lab that you linked first on this post does, but I cannot get it to work. Do you have any experience with this function? I have described my issues in this post (https://www.biostars.org/p/456397/#456575).

I'm very grateful for your help!
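The metaphlanToPhyloseq function quoted above takes an optional metadat argument, which it passes to phyloseq::sample_data(). The following is a minimal sketch of preparing such a metadata table in base R, assuming the three sample names from the example table earlier in the thread; the condition column and its values are hypothetical and only illustrate the shape. The key requirement is that the metadata row names equal the sample (column) names of the abundance table, because phyloseq matches samples by name.

```r
# Sample names as they appear in the header of the example merged table.
samples <- c("healthy_mphlan", "dandruff_mphlan", "dandruffhealthy_mphlan")

# Hypothetical metadata: the column name and its values are made up
# for illustration only.
meta <- data.frame(
  condition = c("healthy", "dandruff", "mixed"),
  row.names = samples
)

# The metadata row names must equal the abundance table's column names.
stopifnot(setequal(rownames(meta), samples))

# With phyloseq installed and `df` read as in the answer above:
# ps <- metaphlanToPhyloseq(df, metadat = meta)
# phyloseq::sample_data(ps)
```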


@c.e.chong, unless input/example files and the expected output are added to the post, it is difficult to address it.


Hey c.e.chong,

Sorry for getting back so late. Did you manage to get MetaPhlAn output into phyloseq? Do you use read counts (-t rel_ab_w_read_stats) or relative abundances (-t rel_ab) for the calculations in that package? Do you also plan to use decontam?

I guess we are working on quite the same topics, so maybe we should join forces here?

Best Philipp


Hey everyone! Did anyone manage to compute alpha and beta diversity after importing MetaPhlAn3 output into phyloseq? If so, please help me get there. All I know is that MetaPhlAn3 can report read counts with -t rel_ab_w_read_stats, after which I would build the merged table with the merge_metaphlan_tables.py script. How do I import this table into phyloseq to get alpha and beta diversity? I can provide example MetaPhlAn3 output files if needed.

Thanks in advance!!
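One point to watch for diversity calculations: a merged MetaPhlAn table lists every taxonomic level (kingdom, phylum, and so on), so the same abundance is counted several times unless the table is restricted to a single level first. The following is a minimal base R sketch, assuming species-level rows can be recognised by a "|s__" field without a "|t__" strain field. The toy matrix, its clade names, and its values are invented for illustration, and the Shannon index is computed by hand rather than through phyloseq.

```r
# Toy relative abundances (percent); names and values are invented.
tab <- matrix(
  c(100, 100,
     60,  50,
     40,  50),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    c("k__Bacteria",
      "k__Bacteria|p__P|c__C|o__O|f__F|g__G|s__G_one",
      "k__Bacteria|p__P|c__C|o__O|f__F|g__G|s__G_two"),
    c("sample_A", "sample_B")
  )
)

# Keep species-level rows only: they have an "|s__" field but no "|t__".
is_species <- grepl("\\|s__", rownames(tab)) & !grepl("\\|t__", rownames(tab))
species <- tab[is_species, , drop = FALSE]

# Shannon index per sample: H = -sum(p * log(p)) over non-zero proportions.
shannon <- apply(species, 2, function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
})
shannon
```

The filtered species matrix could equally be handed to phyloseq::otu_table(); note that phyloseq's estimate_richness() is designed for integer counts, so relative abundances may trigger warnings there.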
