I need some help/ideas in calculating average depth per SNP across samples. I have already generated depth files for each sample (around 18000 samples in total) using
samtools depth input.bam > output.txt and I have merged all the
output.txt's into 1 txt.gz. The input file is sorted by position and it looks like this.
sample1 17 123456 1000 sample2 17 123456 500 sample3 17 123456 10 sampleN 17 123456 50 sample1 17 123457 1100 sampleN 17 123457 50
so first column contains sample names second column is chromosome number third column is the snp position and the last column is depth. I am just wondering if there is quick way to get average depth per SNP across samples as my input file is huge (around 38gb)? or if there is an alternate way?
Looking forward to your suggestions and ideas.