I am interested in downloading aligned nucleotide sequences for humans and Neanderthals. I want two human sequences (e.g. French and San) and one Neanderthal sequence, so that I can perform a test for gene flow between humans and Neanderthals similar to the one in "A Draft Sequence of the Neandertal Genome" by Green et al.
On page 130 of the supplementary material, they outline a test for gene flow that compares the states of SNPs in two human populations (French and San), Neanderthal and chimpanzee. On page 138 they give the frequencies of all possible patterns of matches and mismatches across the four populations. For example, the frequency of AAAA is the number of sites where all four populations carry the same base (i.e. the sites that are all A's, all G's, all C's or all T's). Likewise, ABBA (reading French, San, Neanderthal, chimpanzee from left to right) is the number of sites where French and chimpanzee are the same, and San and Neanderthal are the same as each other but different from French and chimpanzee (e.g. AGGA, ACCA, ATTA, GAAG, ...).
These frequencies are almost what I want, but I want the counts broken down by individual base instead (e.g. AAAA, GGGG, CCCC, TTTT, AGGA, ACCA, ...), that is, the frequencies of all 4^4 = 256 possible base combinations. If someone has done this already then that would be great. If not, then I will need to get the aligned sequences and do it myself.
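To make the target concrete, here is a minimal sketch of the tally I have in mind. It assumes I already have four equal-length, position-aligned strings (French, San, Neanderthal, chimpanzee), which is exactly the part I don't know how to obtain. The function names `count_site_patterns` and `collapse` are my own, and labelling the chimpanzee base as "A" in `collapse` is my reading of the pattern names above, not something taken from the paper's code.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"

def count_site_patterns(french, san, neanderthal, chimp):
    """Count each of the 256 four-base columns across four aligned sequences."""
    seqs = (french, san, neanderthal, chimp)
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must have equal length")
    # Start every possible pattern at zero so absent patterns are reported too.
    counts = Counter({"".join(p): 0 for p in product(BASES, repeat=4)})
    for column in zip(*seqs):
        pattern = "".join(column).upper()
        if pattern in counts:  # skip columns containing gaps, N, etc.
            counts[pattern] += 1
    return counts

def collapse(pattern):
    """Map a base pattern to its match/mismatch class, e.g. 'AGGA' -> 'ABBA'.

    Assumption: the last position (chimpanzee) is always labelled 'A';
    other bases get 'B', 'C', 'D' in order of first appearance.
    """
    labels = {pattern[-1]: "A"}
    out = []
    for base in pattern:
        if base not in labels:
            labels[base] = "ABCD"[len(labels)]
        out.append(labels[base])
    return "".join(out)
```

Summing the per-base counts over all patterns that `collapse` maps to the same class should recover the coarser table, which would give me a sanity check against the published numbers.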
If I have to download the data myself then I will need some help. I have spent hours trying to find the relevant data, but it is all gobbledygook to me. My background is in mathematics and statistics, not bioinformatics. I have tried reading the manuals, but with little success. I need an ELI5-level explanation of how to obtain aligned nucleotide sequences for French, San and Neanderthals.
I suspect the data I am interested in is here: https://genome.ucsc.edu/Neandertal/. However, I don't know how to interpret it.
Other places where it might be are:
Any help would be greatly appreciated. Thanks.
Edit: I've downloaded the BAM files from here:
I also downloaded samtools and have worked out how to view the BAM files. How do I combine multiple BAM files so that the samples line up position by position, and then extract the nucleotide at each site for each sample?
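In case it helps to show what I mean by "aligned", here is a rough sketch of the join I imagine doing once each BAM has been reduced to one base call per position. It assumes all BAMs are mapped to the same reference so that (chromosome, position) means the same thing in every sample, and that some earlier step (perhaps built on `samtools mpileup`, I am not sure) has produced a dictionary of calls per sample. The function name `shared_sites` and the sample names are hypothetical.

```python
def shared_sites(calls_by_sample, order):
    """Join per-sample base calls on genomic coordinate.

    calls_by_sample: {sample_name: {(chrom, pos): base}}
    order: sample names in the order the pattern should be read.
    Returns {(chrom, pos): pattern} for sites covered in every sample.
    """
    # Keep only coordinates that have a call in every sample.
    common = set.intersection(*(set(calls_by_sample[s]) for s in order))
    return {
        site: "".join(calls_by_sample[s][site] for s in order)
        for site in sorted(common)
    }

# Toy example with made-up calls, just to show the shape of the data.
calls = {
    "French": {("chr1", 100): "A", ("chr1", 101): "G"},
    "San": {("chr1", 100): "G", ("chr1", 101): "G", ("chr1", 102): "T"},
    "Neanderthal": {("chr1", 100): "G", ("chr1", 102): "T"},
    "Chimp": {("chr1", 100): "A", ("chr1", 101): "G", ("chr1", 102): "T"},
}
patterns = shared_sites(calls, ["French", "San", "Neanderthal", "Chimp"])
```

What I am missing is the step that turns each BAM into something like the `calls` dictionary above, including how to pick a single base when several reads cover the same position.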