This question could be considered a follow-up to this discussion.
What I need is to extract reads from a BAM file that fall entirely within a given region (not merely overlap it), where the regions are given as a GFF or BED file. Overlapping reads can be extracted by several methods (as in the discussion mentioned, or with BEDTools). The idea is to be reasonably sure of excluding 5´ UTRs when detecting intergenic transcripts. I saw a tool in BamUtil (http://genome.sph.umich.edu/wiki/BamUtil) called "writeRegion" which would pretty much do what I want, but I could not get it running on my dataset.
I was wondering if you might have an R solution, or some other solution, for this.
Thanks in advance
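To make the "within, not overlapping" requirement concrete, here is a minimal Python sketch of the containment test only. It assumes BED-style coordinates (0-based, half-open) and that read start/end positions have already been pulled out of the BAM file by some other means. The function names load_bed and is_contained are made up for illustration and do not come from any tool mentioned above.

```python
from bisect import bisect_right
from collections import defaultdict


def load_bed(lines):
    """Parse BED lines into sorted, merged intervals per chromosome.

    Coordinates are assumed to be 0-based, half-open (BED convention).
    Overlapping or touching regions are merged so that a read spanning
    two adjacent regions still counts as contained.
    """
    raw = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        raw[chrom].append((int(start), int(end)))

    merged = {}
    for chrom, intervals in raw.items():
        intervals.sort()
        out = [intervals[0]]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:
                out[-1] = (out[-1][0], max(out[-1][1], end))
            else:
                out.append((start, end))
        merged[chrom] = out
    return merged


def is_contained(regions, chrom, start, end):
    """Return True only if [start, end) lies entirely inside one merged region."""
    intervals = regions.get(chrom)
    if not intervals:
        return False
    # Find the last region whose start is <= the read start.
    i = bisect_right(intervals, (start, float("inf"))) - 1
    if i < 0:
        return False
    region_start, region_end = intervals[i]
    return region_start <= start and end <= region_end


if __name__ == "__main__":
    regions = load_bed(["chr1\t100\t200", "chr1\t150\t300", "chr2\t0\t50"])
    print(is_contained(regions, "chr1", 120, 280))  # inside the merged region
    print(is_contained(regions, "chr1", 90, 150))   # hangs over the left edge
```

If the reads are iterated with pysam, read.reference_start and read.reference_end are already 0-based, half-open and could be passed in directly. Note that this treats a spliced read as one span from first to last aligned base. Within BEDTools itself, the -f 1.0 option of bedtools intersect (minimum overlap as a fraction of the read) looks like the closest equivalent, but that is worth checking against the BEDTools documentation for your version.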
Hi, is there any option where you can supply a FASTA file containing your regions of interest?
No, since fasta contains no positional information.
Thanks! Then how can I obtain the positional information from a fasta file?
Map the sequences to a reference genome; this process is called alignment. One tool for this is bwa mem. If you need more guidance, please open a new question describing the problem and providing details on exactly what you want to do and which data you have.
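As a rough illustration of how alignment yields positions: after something like "bwa index ref.fa" followed by "bwa mem ref.fa regions.fa > regions.sam" (file names are placeholders), each mapped SAM record carries a reference name, a 1-based start position, and a CIGAR string. The sketch below, which assumes plain tab-separated SAM alignment lines and uses a made-up helper name sam_line_to_bed, turns one such line into a BED-style interval.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# CIGAR operations that consume reference bases, per the SAM specification.
REF_CONSUMING = set("MDN=X")


def sam_line_to_bed(line):
    """Convert one SAM alignment line to (chrom, start, end, name, strand).

    Returns None for unmapped records. The output uses BED coordinates
    (0-based, half-open); SAM POS is 1-based.
    """
    fields = line.rstrip("\n").split("\t")
    name, flag, chrom = fields[0], int(fields[1]), fields[2]
    pos, cigar = int(fields[3]), fields[5]

    # Flag bit 0x4 marks an unmapped record.
    if flag & 4 or chrom == "*" or cigar == "*":
        return None

    ref_len = sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in REF_CONSUMING)
    start = pos - 1
    # Flag bit 0x10 marks alignment to the reverse strand.
    strand = "-" if flag & 16 else "+"
    return (chrom, start, start + ref_len, name, strand)


if __name__ == "__main__":
    example = "region1\t0\tchr1\t101\t60\t50M2D48M\t*\t0\t0\t" + "A" * 98 + "\t*"
    print(sam_line_to_bed(example))
```

Header lines (starting with "@") would need to be skipped before calling this, and sequences that map to several places or produce supplementary alignments need a decision about which record to keep.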