I am trying to run deduplication on BAM files using the following command:
umi_tools dedup -I 0ng_Rep1.sorted.bam --output-stats 0ng_Rep1.sorted.deDup.stats -S 0ng_Rep1.sorted.deDup.bam --method directional --log 0ng_Rep1.sorted.deDup.log.txt
However, it sometimes works and sometimes the process gets killed by the system (possibly because of excessive memory use).
I am running this on a system with 64 GB of RAM and an i9-12900 processor, which has not given me any trouble so far with bacterial NGS data analysis. Any suggestions for avoiding this problem would be helpful.
This is what is happening
The process is getting killed by Linux. Is there any way to avoid it being killed by the OOM (out-of-memory) killer?
i.sudbery may have some input.
If the process is running out of memory, you may not be able to run this on this machine and may have to find a different one. Deduplication algorithms may need to keep a large amount of data in memory, since they compare records against each other as they are processed.
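To check whether memory really is the limit before looking for a bigger machine, it can help to measure how much the run actually peaks at. Below is a minimal sketch, assuming a Linux system and Python 3. The helper name `run_and_report_peak_memory` is made up for illustration and is not part of umi_tools. It runs any command as a child process and reads the peak resident memory from the operating system afterwards.

```python
import resource
import subprocess
import sys


def run_and_report_peak_memory(cmd):
    """Run cmd (a list of arguments) and return (exit_code, peak_rss_gib).

    Hypothetical helper for illustration only.
    Peak RSS comes from RUSAGE_CHILDREN, which reports the largest child
    this Python process has waited for so far, so use one call per script.
    Assumption: Linux, where ru_maxrss is in KiB (macOS reports bytes).
    """
    completed = subprocess.run(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    peak_gib = usage.ru_maxrss / (1024 ** 2)  # KiB -> GiB
    print(
        f"exit code {completed.returncode}, peak RSS {peak_gib:.2f} GiB",
        file=sys.stderr,
    )
    # A negative exit code means the child was terminated by a signal;
    # -9 is SIGKILL, which is the signal the kernel OOM killer sends.
    return completed.returncode, peak_gib


# Example call using the command from the question (not executed here):
# run_and_report_peak_memory([
#     "umi_tools", "dedup",
#     "-I", "0ng_Rep1.sorted.bam",
#     "--output-stats", "0ng_Rep1.sorted.deDup.stats",
#     "-S", "0ng_Rep1.sorted.deDup.bam",
#     "--method", "directional",
#     "--log", "0ng_Rep1.sorted.deDup.log.txt",
# ])
```

If the reported peak is close to the 64 GB of RAM and the exit code is -9, that is consistent with an OOM kill. A peak well below the installed RAM would suggest looking for another cause.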