Question: Import metaphlan output to phyloseq
0
gravatar for dpc
6 months ago by
dpc150
India
dpc150 wrote:

Hi there

Can anyone please tell me how can I import MetaPhlAn biom or profile file to phyloseq for subsequent statistical analysis? (I have analysed my data with MetaPhlAn 3.0)

Thanks
DC7

ADD COMMENTlink modified 6 months ago by _r_am32k • written 6 months ago by dpc150
1

Hi,

Did you try?

physeq <- import_biom("<path-to-biom>")

Though this function is intended to import biom files produced in QIIME, with an option to parse the taxonomy. So, I'm not sure if it'll work in your case.

What you can do and it ensures some safety level is to convert your biom into a text file with the biom-format utilities and, then, import this table in text format to phyloseq. It may need a bit of parsing first.

António

António

ADD REPLYlink modified 6 months ago • written 6 months ago by antonioggsousa1.9k

Hi, I wam going to try according to your suggestion. Before that, a follow up query. I have 60 biom files. So, I have to merge them. Right? Do you know any such tools to merge them?

thanks, DC7

ADD REPLYlink written 6 months ago by dpc150

The different biom files, represent different samples?

António

ADD REPLYlink written 6 months ago by antonioggsousa1.9k

eaxctly....the different biom files, represent different samples...

ADD REPLYlink written 6 months ago by dpc150

I think you can merge samples inside Metaphlan: https://github.com/biobakery/metaphlan/

That would be the safest option.

António

ADD REPLYlink written 6 months ago by antonioggsousa1.9k

No, that's a different thing. There you can merge the abundance table only. You can't merge the biom tables.

ADD REPLYlink written 6 months ago by dpc150

But your biom files are not abundance tables too?

You could convert the biom tables to .txt and then merge them. I'm not very familiar with merging biom. It is possible on QIIME to merge multiple biom tables, but I don't know if it will work in your case.

António

ADD REPLYlink written 6 months ago by antonioggsousa1.9k

Yes... from metaphlan I get one biom file (with abundance), one abundance file... But biom from qiime and from metaphlan are of two different types. The first one is hdf5 and the later is in json format. So, there is the problem. Also, when I tried to convert the json into hdf5, it wasn't converted properly. If you please don't mind, will you like to do a check from your side if I send you one file?

Thanks, DC7

ADD REPLYlink written 6 months ago by dpc150

Did you try to convert your biom to text file with biom-convert and them merge then with the merge option of metaphlan?

António

ADD REPLYlink modified 6 months ago • written 6 months ago by antonioggsousa1.9k

Hi,

I've seen a a few of your posts @dpc and it seems like we're trying to do the same thing. I have used metphlantophyloseq.r functions to get my metaphlan rel ab data into phyloseq and want to put this into Deseq2.

Did you manage to do this? Did you covert the relative abundance into raw counts?

Thanks

ADD REPLYlink written 5 months ago by c.e.chong40

Hi Chong!!! Where did you get the function? Can you please share the link? What I have done is to format the metaphlan output exactly like that of 16S output from motur/qiime and then feed into phyloseq. Although you have to spend some time for prior formatting, it will easily be imported. Please also let me know if you get any solution of metphlantophyloseq.r package.

Thanks, dpc

ADD REPLYlink written 5 months ago by dpc150

Hi dpc,

These were my R commands :

##########
library(phyloseq)
library(DESeq2)
library(ggplot2)
library(curatedMetagenomicData)
mphlanin <- read.csv("merged_abundance_table_reformatted.txt", sep = "\t", strip.white = T, stringsAsFactors = F, row.names = 1)
metadata <- read.delim("metadata.txt", header=TRUE, sep = "\t")
metadatadf <- data.frame(metadata)
row.names(metadatadf) <- metadatadf$sample
samples_df <- metadatadf %>% select (-sample)
sample <- sample_data(samples_df)

phyloseqin= metaphlanToPhyloseq(mphlanin, metadat = sample)
phyloseqin
#phyloseq-class experiment-level object
#otu_table()   OTU Table:         [ 229 taxa and 22 samples ]
#sample_data() Sample Data:       [ 22 samples by 1 sample variables ]
#tax_table()   Taxonomy Table:    [ 229 taxa by 7 taxonomic ranks ]

This uses the function, from https://rdrr.io/github/wipperman/wipperman/src/R/microbiota.R:

#' convert the output of a metaphlan2_taxonomic_table_joined.tsv object to a otu_table + tax_table object
#' 
#' 
#' @param phyloseq object 
#' @param 
#' @export
#' @examples
#' mtph_tbru_phy <- metaphlanToPhyloseq(tax = mtph_tbru, split = "|")
metaphlanToPhyloseq <- function(
  tax,
  metadat=NULL,
  simplenames=TRUE,
  roundtointeger=FALSE,
  split="|"){
  ## tax is a matrix or data.frame with the table of taxonomic abundances, rows are taxa, columns are samples
  ## metadat is an optional data.frame of specimen metadata, rows are samples, columns are variables
  ## if simplenames=TRUE, use only the most detailed level of taxa names in the final object
  ## if roundtointeger=TRUE, values will be rounded to the nearest integer
  xnames = rownames(tax)
  shortnames = gsub(paste0(".+\\", split), "", xnames)
  if(simplenames){
    rownames(tax) = shortnames
  }
  if(roundtointeger){
    tax = round(tax * 1e4)
  }
  x2 = strsplit(xnames, split=split, fixed=TRUE)
  taxmat = matrix(NA, ncol=max(sapply(x2, length)), nrow=length(x2))
  colnames(taxmat) = c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species", "Strain")[1:ncol(taxmat)]
  rownames(taxmat) = rownames(tax)
  for (i in 1:nrow(taxmat)){
    taxmat[i, 1:length(x2[[i]])] <- x2[[i]]
  }
  taxmat = gsub("[a-z]__", "", taxmat)
  taxmat = phyloseq::tax_table(taxmat)
  otutab = phyloseq::otu_table(tax, taxa_are_rows=TRUE)
  if(is.null(metadat)){
    res = phyloseq::phyloseq(taxmat, otutab)
  }else{
    res = phyloseq::phyloseq(taxmat, otutab, phyloseq::sample_data(metadat))
  }
  return(res)
}

I did this with the output of Metaphlan3 with -t rel_ab. Do you know if I can still use this in phyloseq and deseq2? You can't merge the tables from other options with metaphlan. Did you use relative abundance?

Thanks!

ADD REPLYlink written 5 months ago by c.e.chong40

I am using relative abundance... I think you can't use deseq2. Because it uses absolute count. Thanks

ADD REPLYlink written 4 months ago by dpc150
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1709 users visited in the last hour
_