Question: Filtering a BAM file to keep only selected genes using samtools
0
3.2 years ago by
Sara90
Sara90 wrote:

I have some BAM files that I would like to use for further analysis, but I only want to analyse some of the genes, not everything in the BAM file. I have a list of the genes I want to use in the next step. The list contains both gene symbols and gene IDs, like this:

AAAS ENSG00000094914
ACO2 ENSG00000100412

I thought samtools could produce such a BAM file (with something like the following command), but I don't know what should replace the ? marks, or whether this is possible at all.

samtools view -h in.bam | ??????  > out.bam

I searched for this but did not find anything useful. Do you know how I can get a new BAM file containing only the genes in my list?

alignment • 2.0k views
modified 3.2 years ago by Carlo Yague (4.9k) • written 3.2 years ago by Sara (90)
2
3.2 years ago by
Carlo Yague4.9k
Canada
Carlo Yague4.9k wrote:

If you mapped the reads to the genome, you will need the genomic coordinates of the genes you are interested in. Once you have them, simply pass them as "regions" to samtools view:

samtools view --help
Usage: samtools view [options] <in.bam>|<in.sam>|<in.cram> [region ...]
[...]
 A region should be presented in one of the following formats:
 `chr1', `chr2:1,000' and `chr3:1000-2,000'. When a region is
 specified, the input alignment file must be a sorted and indexed
 alignment (BAM/CRAM) file.

samtools view -h -b in.bam region1 region2 ... regionX > out.bam
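
As a rough sketch of the region approach, the snippet below builds the region arguments from a small coordinate table and then sorts and indexes the BAM before querying it, since the help text above requires a sorted and indexed file. The file names gene_coords.tsv and in.sorted.bam and the coordinates themselves are made up for illustration; real coordinates have to come from an annotation that matches the reference used for mapping.

```shell
# Toy table: symbol, chromosome, start, end (1-based). These coordinates are
# invented placeholders, NOT the real positions of AAAS or ACO2.
printf 'AAAS\tchr12\t1001\t2000\nACO2\tchr22\t501\t900\n' > gene_coords.tsv

# Turn each row into a samtools region string such as chr12:1001-2000,
# joined by single spaces.
regions=$(awk -F'\t' '{ printf "%s%s:%s-%s", (NR > 1 ? " " : ""), $2, $3, $4 }' gene_coords.tsv)
echo "$regions"

# Only attempted when samtools and in.bam are actually present.
if command -v samtools >/dev/null 2>&1 && [ -f in.bam ]; then
  samtools sort -o in.sorted.bam in.bam   # region queries need a coordinate-sorted file
  samtools index in.sorted.bam            # ... and an index next to it
  # $regions is left unquoted on purpose so each region is a separate argument.
  samtools view -h -b in.sorted.bam $regions > out.bam
fi
```

The chromosome names in the regions must match those in the BAM header (for example chr12 versus 12), otherwise samtools will not find the region.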

If you have many genes of interest, it can be more convenient to first convert their genomic coordinates to BED format (note that BED start positions are 0-based and end positions are exclusive), then use samtools with the -L option:

-L FILE  only include reads overlapping this BED FILE [null]

samtools view -h -b -L my_genes_coordinates.bed in.bam > out.bam
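
One possible way to produce such a BED file, sketched under some assumptions: the gene list is tab-separated with the gene ID in the second column, and the coordinates come from a GTF annotation whose gene_id attribute matches those IDs exactly (GENCODE files append a version suffix such as ".14", which would need stripping first). The files genes.txt and toy.gtf and all coordinates in them are invented so the example runs on its own.

```shell
# Toy gene list (symbol <TAB> gene id), taken from the question.
printf 'AAAS\tENSG00000094914\nACO2\tENSG00000100412\n' > genes.txt

# Toy GTF with made-up coordinates; a real run would use the annotation
# that matches the reference genome the reads were mapped to.
printf 'chr12\ttoy\tgene\t1001\t2000\t.\t-\t.\tgene_id "ENSG00000094914"; gene_name "AAAS";\n' > toy.gtf
printf 'chr22\ttoy\tgene\t501\t900\t.\t+\t.\tgene_id "ENSG00000100412"; gene_name "ACO2";\n' >> toy.gtf
printf 'chr1\ttoy\tgene\t1\t100\t.\t+\t.\tgene_id "ENSG00000000000"; gene_name "OTHER";\n' >> toy.gtf

awk -F'\t' -v OFS='\t' '
  # First file: remember every wanted gene id and its symbol.
  NR == FNR { wanted[$2] = $1; next }
  # Second file: keep "gene" records whose gene_id is in the list.
  $3 == "gene" && match($9, /gene_id "[^"]+"/) {
    id = substr($9, RSTART + 9, RLENGTH - 10)
    # GTF is 1-based inclusive; BED is 0-based with an exclusive end,
    # so only the start is shifted by one.
    if (id in wanted) print $1, $4 - 1, $5, wanted[id]
  }
' genes.txt toy.gtf > my_genes_coordinates.bed

cat my_genes_coordinates.bed

# Only attempted when samtools and in.bam are actually present.
if command -v samtools >/dev/null 2>&1 && [ -f in.bam ]; then
  samtools view -h -b -L my_genes_coordinates.bed in.bam > out.bam
fi
```

Matching on the gene ID rather than the symbol avoids problems with symbols that have changed between annotation releases.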
modified 2.9 years ago • written 3.2 years ago by Carlo Yague (4.9k)
1

should be :

samtools view -h -b -L my_genes_coordinates.bed in.bam > out.bam
written 2.9 years ago by tiago211287 (1.1k)

Nice catch, I edited it, thank you!

written 2.9 years ago by Carlo Yague (4.9k)