I'm going to leave this here in case it is helpful, or someone has more up-to-date information to add. But I just did some smarter googling and found some similar questions with answers:
Should Samtools Pileup Be Performed On Uniquely Mapped Reads Or All The Reads?
Genomic Coverage - Samtool's undocumented "depth" verses the poorly documented pileup.
Discrepancy In Samtools Mpileup/Depth And Bedtools Genomecoveragebed Counts
I'm doing ChIP-Seq analysis and want the read depth at every genome position across a set of BAM files, including 0s for positions covered in one BAM file but not another. I want totally raw counts, with no score filtering. Both depth and mpileup accept a list of BAM files.
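To make concrete what I'm after, here is a toy sketch in plain Python. It does not use samtools at all: the `depth_table` function and the read intervals are made up for illustration, and each "BAM file" is just a list of 1-based inclusive (start, end) alignments. The point is only the shape of the output I want, one row per position with a 0 wherever a file has no coverage.

```python
from collections import Counter

def depth_table(samples, chrom_length):
    """Per-position read depth for several samples, zeros included.

    samples: dict mapping a sample name to a list of (start, end) read
             alignments, 1-based and inclusive (toy stand-in for a BAM file).
    Returns a list of rows: (position, depth_in_sample_1, depth_in_sample_2, ...).
    """
    counts = {}
    for name, reads in samples.items():
        per_position = Counter()
        for start, end in reads:
            # Every position spanned by the read adds one to the depth.
            for pos in range(start, end + 1):
                per_position[pos] += 1
        counts[name] = per_position

    names = list(samples)
    # A Counter returns 0 for positions it never saw, which gives the
    # explicit zeros for uncovered positions.
    return [
        (pos, *(counts[name][pos] for name in names))
        for pos in range(1, chrom_length + 1)
    ]

# Hypothetical toy data: two "files" on a 6 bp reference.
toy = {
    "a.bam": [(1, 3), (2, 4)],
    "b.bam": [(4, 5)],
}
rows = depth_table(toy, chrom_length=6)
for row in rows:
    print(*row, sep="\t")
```

Position 5 is covered only in the second file and position 6 in neither, and both still get a row. That is the behaviour I need from whichever tool I end up using.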
Computes the depth at each position or region.
-a -a, -aa: Output absolutely all positions, including unused reference sequences
In the pileup format (without -g), each line represents a genomic position, consisting of chromosome name, 1-based coordinate, reference base, the number of reads covering the site
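As far as I understand the format, after the first three columns each input file contributes three more (read count, read bases, base qualities), so the per-file counts sit in columns 4, 7, 10 and so on. Below is a minimal sketch of pulling them out of one line. The `pileup_depths` helper and the example line are my own invention, and it assumes tab-separated text output with no optional extra columns requested.

```python
def pileup_depths(line):
    """Return (chrom, pos, ref_base, [read count per input file]) for one pileup line."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, ref_base = fields[0], int(fields[1]), fields[2]
    # Assumption: three columns per input file after the first three
    # (read count, read bases, base qualities), nothing else appended.
    depths = [int(fields[i]) for i in range(3, len(fields), 3)]
    return chrom, pos, ref_base, depths

# Made-up line for two input files; the second has no reads at this site.
example = "chr1\t100\tA\t2\t.,\tII\t0\t*\t*"
parsed = pileup_depths(example)
print(parsed)
```

If extra per-read columns are switched on, the stride of three no longer holds and the indices would need adjusting.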
It sounds like both more or less do what I'm looking for. But I'm a fairly green bioinformaticist, and I don't know whether I'm missing a difference between "number of reads covering the site" (mpileup) and "depth" (depth). The mpileup docs go into a lot more detail about the output format.
It seems like mpileup has more options and offers more detail than depth. For my purposes, is there a difference in 1) function or 2) performance?