Question

Metagenomics normalization for long reads

2.5 years ago

fonteneaudam ▴ 20

Hello everyone,

I am currently working on a metagenomics analysis using ONT, and I have used EPI2ME WIMP to classify the reads. I want to plot the relative abundance of each species present in the sample.

EPI2ME outputs a read count for each taxon, so I can plot the proportion of reads per taxon. But since each read has a different length, should I normalize by read length?

metagenomics normalization nanopore • 796 views

score 3 · Accepted Answer · 2021-10-12

2.5 years ago

colindaven 6.3k

The more relevant normalizations would be:

Normalize to one million sequenced reads per sample (a sample that was sequenced more deeply yields more hits).
Normalize to bacterial genome size (a bacterium with a large genome yields more reads than one with a small genome at the same cell abundance).

We cover both of these in our pipeline, Wochenende: https://github.com/MHH-RCUG/Wochenende
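
For illustration, the two normalizations above could be sketched in Python roughly like this. This is not taken from Wochenende: the function names, the taxa, the read counts and the genome sizes are all made up, and it assumes you can look up a reference genome size for each taxon.

```python
def normalize_counts(read_counts, genome_sizes_bp, total_reads):
    """Return reads per million sequenced reads, per megabase of genome, for each taxon.

    read_counts      -- dict: taxon -> number of reads assigned to it
    genome_sizes_bp  -- dict: taxon -> reference genome size in base pairs (assumed known)
    total_reads      -- total number of reads sequenced in the sample
    """
    normalized = {}
    for taxon, count in read_counts.items():
        # Step 1: scale to one million sequenced reads (corrects for sequencing depth).
        reads_per_million = count * 1_000_000 / total_reads
        # Step 2: divide by genome size in Mbp (corrects for larger genomes yielding more reads).
        genome_size_mbp = genome_sizes_bp[taxon] / 1_000_000
        normalized[taxon] = reads_per_million / genome_size_mbp
    return normalized


def relative_abundance(values):
    """Rescale the normalized values so that they sum to 1."""
    total = sum(values.values())
    return {taxon: value / total for taxon, value in values.items()}


# Hypothetical example data, for illustration only.
read_counts = {"taxon_A": 600, "taxon_B": 400}
genome_sizes_bp = {"taxon_A": 6_000_000, "taxon_B": 2_000_000}
total_reads = 2000  # includes reads that were not assigned to either taxon

normalized = normalize_counts(read_counts, genome_sizes_bp, total_reads)
abundance = relative_abundance(normalized)
for taxon, fraction in abundance.items():
    print(f"{taxon}: {fraction:.3f}")
```

With these made-up numbers, taxon_A has more reads (600 vs 400) but ends up at about one third of the normalized abundance, because its genome is three times larger.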

I get it: one read equals one fragment no matter its size, but genome size influences the number of fragments, so I have to normalize for that. Thanks.

Reply • 2.5 years ago by fonteneaudam ▴ 20