Hello everyone,
I want to normalise the counts of Nanopore long reads that map to resistance genes in the contigs I obtained from a metagenomic assembly.
My input would be a table with the number of reads mapped to each resistance gene. I don't know whether to apply TPM: it was developed for short-read RNA-seq and mainly corrects for gene length, which I don't think is a problem for long reads. Is CPM, which only adjusts for library size, too simple to be sufficient? I also understand that many authors consider metagenomic data to be highly compositional. Should I apply a compositional approach? Finally, I understand that downsampling can help, but I don't want to lose data. Do you know of any normalisation method or software that would be useful in this case?
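To make the options concrete, here is a minimal sketch of the three transformations I am comparing: CPM, TPM, and a centred log-ratio (CLR) as one example of a compositional approach. The gene names, counts, lengths, library size, and pseudocount are all made-up values for illustration, not real data, and the function names are my own.

```python
import numpy as np

# Hypothetical example values (assumptions, not real data).
genes = ["geneA", "geneB", "geneC"]
counts = np.array([120.0, 30.0, 0.0])          # reads mapped to each gene
lengths_bp = np.array([1200.0, 900.0, 600.0])  # gene lengths in base pairs
total_reads = 2_000_000                        # all reads in the sample


def cpm(counts, total_reads):
    """Counts per million: corrects for library size only."""
    return counts / total_reads * 1e6


def tpm(counts, lengths_bp):
    """TPM: divide by gene length in kb, then rescale to sum to one million.

    The rescaling is over the genes in the table only, so the values
    depend on which genes are included.
    """
    rate = counts / (lengths_bp / 1e3)
    return rate / rate.sum() * 1e6


def clr(counts, pseudocount=0.5):
    """Centred log-ratio: log counts minus their mean (log of geometric mean).

    The pseudocount is an arbitrary choice to avoid log(0).
    """
    log_counts = np.log(counts + pseudocount)
    return log_counts - log_counts.mean()


cpm_values = cpm(counts, total_reads)
tpm_values = tpm(counts, lengths_bp)
clr_values = clr(counts)

for name, c, t, r in zip(genes, cpm_values, tpm_values, clr_values):
    print(f"{name}\tCPM={c:.2f}\tTPM={t:.2f}\tCLR={r:.3f}")
```

In this sketch CPM uses the total number of reads in the sample, while TPM and CLR only see the genes in the table, which is part of why I am unsure which one suits my data.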