This is my first question on the site, so thanks for all the help I've gotten from lurking, and thanks for the help I'm about to get. I'll try to share my research and follow all the etiquette. Can anyone help me with the following situation?
I'm working with the python packages Pysam and Pysamstats. I am using BAM and FASTA files from the 1000genomes project. I would like to provide a 50 base-pair interval on chromosome 22 in HG18 coordinates and in turn receive the average depth of coverage from that interval. My attempts so far have opened the files first, then fed them into pysamstats.stat_coverage_binned, like this.
fasta_file = pysam.Fastafile(<fasta_ftp_path>) bamfile = pysam.AlignmentFile(<alignment_ftp_path>) pysamstats.stat_coverage_binned(bamfile, fasta_file, <chrom>, <start_pos>, <end_pos>).
From the first line, I encounter this:
ValueError: could not locate index file
The same error appears on the third line if I try to feed in the ftp addresses rather than file objects. My understanding is that these packages look for index files by appending ".bai" or ".fai" to the filename I feed them, which should work (you can look at the files available further down my post). This is supported when I RTFS at line 142 of this file, which has my error message. I can't quite read cython, but given "filename.fa" I think it says to look for "filename.fa.fai". Here's TFS:
A minimal example to generate the error (sorry about the formatting; can't figure out how this will render):
import os import pysam import pysamstats alignment_ftp_path = "ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/technical/" \ "other_exome_alignments/HG00096/exome_alignment/" alignment_filename = "HG00096.mapped.illumina.mosaik.GBR.exome.20111114.bam" alignment_ftp_path = os.path.join(alignment_ftp_path, alignment_filename) fasta_ftp_path = "ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/technical/reference/" \ "GRCh38_reference_genome/GRCh38_full_analysis_set_plus_decoy_hla.fa" fasta_file = pysam.Fastafile(fasta_ftp_path) bamfile2 = pysam.AlignmentFile(filename = alignment_ftp_path) ex_bin_read_depth = pysamstats.stat_coverage_binned(bamfile2, fasta_ftp_path,\ chrom="22", \ start=14481425, \ end=14481475)