7 months ago
bioinfo
Hello,
We ran the same samples using two different kits. With one kit we got about 30 million reads per sample, and with the other about 10 million. My initial plan was to make a scatter plot of the counts from each kit. However, I am not sure I can do this, given such different numbers of reads per sample.
Is there another way that I can compare them?
Thank you
Unequal numbers of reads are normal; that is why normalization is needed. If you want to do a simple analysis, such as counting detected genes or making correlation plots, why not just subsample the larger file down to 10 million reads as well? Alternatively, use some sort of scaling normalization, e.g. as implemented in edgeR or DESeq2.
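To illustrate the scaling idea, here is a minimal Python sketch that converts raw counts to counts per million (CPM) before correlating the two kits. This is plain library-size scaling, which is simpler than edgeR's TMM or DESeq2's median-of-ratios, and the gene counts below are made up purely for illustration.

```python
import numpy as np

def counts_per_million(counts):
    """Scale raw gene counts so that each library sums to one million."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum() * 1e6

# Hypothetical raw counts for the same five genes from the two kits.
kit_a = np.array([300, 1500, 0, 4200, 24000])  # deeper library
kit_b = np.array([110, 480, 2, 1400, 8008])    # shallower library

# Log-transform with a pseudocount of 1 so zero counts stay defined.
log_a = np.log2(counts_per_million(kit_a) + 1)
log_b = np.log2(counts_per_million(kit_b) + 1)

# Pearson correlation between the two kits on the normalized scale.
r = np.corrcoef(log_a, log_b)[0, 1]
print(f"Pearson r on log2(CPM + 1): {r:.3f}")
```

After this scaling the two libraries are on the same scale, so a scatter plot of `log_a` against `log_b` is no longer dominated by the threefold difference in depth.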
Thank you. I ended up using seqtk to downsample the data.
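For anyone who wants to see what downsampling to a fixed number of reads looks like in code, here is a rough Python sketch using reservoir sampling. It is only an illustration of the idea, not seqtk's implementation; it assumes unwrapped 4-line FASTQ records, and the toy reads are invented.

```python
import io
import random
from itertools import islice

def read_fastq(handle):
    """Yield FASTQ records as 4-line tuples (assumes unwrapped records)."""
    while True:
        record = tuple(islice(handle, 4))
        if len(record) < 4:
            return
        yield record

def reservoir_sample(records, k, seed=100):
    """Return k records chosen uniformly at random in a single pass."""
    rng = random.Random(seed)
    reservoir = []
    for i, rec in enumerate(records):
        if i < k:
            reservoir.append(rec)      # fill the reservoir first
        else:
            j = rng.randint(0, i)      # inclusive on both ends
            if j < k:
                reservoir[j] = rec     # replace with probability k / (i + 1)
    return reservoir

# Toy in-memory FASTQ with 10 reads (hypothetical data).
fastq_text = "".join(f"@read{i}\nACGT\n+\nIIII\n" for i in range(10))
all_reads = list(read_fastq(io.StringIO(fastq_text)))

subset = reservoir_sample(read_fastq(io.StringIO(fastq_text)), k=3)
print(len(subset), "reads kept out of", len(all_reads))
```

For paired-end data, running this with the same seed on both mate files (which have the same number of records) picks the same record positions, so the pairs stay matched.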