Get sequence from aligned BAM file according to interval in BED file
2
0
Entering edit mode
8.2 years ago
bharata1803 ▴ 560

Hello,

I have a BED file which define an interval of 3'UTR. The example of my BED is like this:

7   127591299   127591705   ENSG00000004059 0   +
12  8940364 8941817 ENSG00000003056 0   -
11  64315966    64316738    ENSG00000173153 0   +
12  2803258 2805423 ENSG00000004478 0   +

The order is : Chromosome Start End Gene Name Length Strand

I need to extract sequence in aligned BAM with corresponding interval per gene name. Is there any tools to do that? Thank you.

Update :

I use samtool view to see specific sequence in the given interval. So, according to my BED file from the first line, I use this command :

samtools view 33_realigned.bam 7:127591429-127591705

And it gives subset of BAM (in SAM format) corresponds to that interval but not merged. Is it possible to get the merged sequence? Thank you

alignment sequence • 5.9k views
ADD COMMENT
0
Entering edit mode

It looks like you are trying to make consensus sequence out of a bam file.

ADD REPLY
1
Entering edit mode
8.2 years ago
george.ry ★ 1.2k

Samtools itself will do that for you (the new versions, anyway):

samtools view -L your.bed -b your.bam > subset.bam
ADD COMMENT
0
Entering edit mode
8.2 years ago
venu 7.1k

Make your BED a tab delimited file and use bedtools. For this task, you would use

intersectBed -abam aligned.bam -b ToExtract.bed > subset.bam
ADD COMMENT
0
Entering edit mode

Thank you for your suggestion. I tried it and it seems the result is different from what I expect. The bam result converted to fastq doesn't give me what I need. So, I want an output which will consist of the sequence corresponds to the BED. For example for BED with line:

7   127591299   127591705   ENSG00000004059 0   +

I want to get string of ACGT between those region. So, probably the output is like this :

7   127591299   127591705   ENSG00000004059 0   + ACTGACAGT..

Is it possible?

The result I want is like if I use

bedtools getfasta -fi fasta.in -bed bed.in -fo csv.out -tab -s -fullHeader

I don't have the fasta.in and all I have is bam file.

ADD REPLY

Login before adding your answer.

Traffic: 2100 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6