I have 18,000 text files, each containing the depth at every position for one sample (18,000 samples in total). I want to get the mean coverage, standard deviation, and total count of positions per sample in a single output file. Is there an easy way to do this? Depths were calculated using
samtools depth input.bam. All the text files look like this:
sample_name chromosome position depth
So the desired output is:
sample1 mean_depth standard_deviation total_number_of_positions
sample2 mean_depth standard_deviation total_number_of_positions
sample3 mean_depth standard_deviation total_number_of_positions
Are the solutions in your last question (average depth across samples) not suitable?
This is a different question, so I thought I should ask it in a separate post. There I wanted the average depth per position across all the samples. Here I need the per-sample average depth, SD, and total number of positions.
You can probably calculate that using some variation of the
datamash solution that was posted by @cpad0112 in your last question. Tinkering with things is a great way to learn.
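For anyone who prefers a script over datamash, here is a minimal Python sketch of the per-sample summary. It assumes each file is whitespace-delimited with the four columns shown in the question (sample_name, chromosome, position, depth), has no header line, and holds a single sample. The function names depth_stats and summarize, the "*.txt" pattern, and the tab-separated output layout are illustrative choices, not something taken from the earlier posts. The standard deviation is the sample SD (n - 1 denominator), computed in one pass with Welford's method so a large file never has to be held in memory.

```python
import math
from pathlib import Path


def depth_stats(path):
    """Return (sample, mean, sd, n) for one depth file.

    Assumes four whitespace-separated columns with no header:
    sample_name, chromosome, position, depth.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    sample = None
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 4:
                continue  # skip blank or truncated lines
            if sample is None:
                sample = fields[0]  # sample name taken from the first data line
            depth = float(fields[3])
            # Welford's one-pass update of the mean and squared deviations
            n += 1
            delta = depth - mean
            mean += delta / n
            m2 += delta * (depth - mean)
    # Sample standard deviation; undefined for fewer than two positions
    sd = math.sqrt(m2 / (n - 1)) if n > 1 else float("nan")
    return sample, mean, sd, n


def summarize(depth_dir, out_path, pattern="*.txt"):
    """Write one 'sample, mean, sd, count' line per depth file in depth_dir."""
    with open(out_path, "w") as out:
        for path in sorted(Path(depth_dir).glob(pattern)):
            sample, mean, sd, n = depth_stats(path)
            if n == 0:
                continue  # empty file, nothing to report
            out.write(f"{sample}\t{mean:.4f}\t{sd:.4f}\t{n}\n")
```

Calling summarize("depth_files", "per_sample_depth.tsv") (both names hypothetical) would write one line per sample. Keep in mind that samtools depth skips zero-coverage positions unless it is run with -a, so the mean and the position count only cover positions that actually appear in each file.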
You should also validate answers to your past questions if they helped you solve the issue (green check mark beside the answers). You can accept more than one if they all work.
OK, thank you so much for the information.