We have coordinates for putative promoter regions, of the form <chromosome number>:<start>-<end>, e.g. 1:1-1500, one set for each of four breeds of Potato (S. tuberosum).
We have also assembled genomes, whose output is in the form of sorted.bam files, again one for each of four potato breeds, which we have created by using bwa mem by mapping four sets of read-pairs against the soltub3.0 reference genome from Ensembl plants.
My immediate task is to obtain a single FASTA sequence for every putative promoter for just one of our assembled genome breeds, which we can call the "Georgina" breed.
Progress so far
Since we have both (a) a sorted.bam genome, and (b) a list of coordinates for our promoters for the Georgina breed, for each coordinate I can do this:
samtools view input.bam 1:18000-45500
Which outputs to the terminal all reads falling within that region.
Is there a way to output/pipe the samtools view statement, above, either using a flag/other program so that instead of multiple reads falling within a region, the output is a single "consensus" sequence of the multiple reads falling within a region, whole output is perhaps as a FASTA file rather than a bam/sam/etc. output?
Huge thanks in advance, and the your time in reading this.