Extract consensus gene sequence (from gff file) from bam file
2
0
Entering edit mode
8.5 years ago
AW ▴ 350

Hi,

I have a bam file from Illumina DNA-seq that I obtained by mapping reads to the reference genome using Bowtie. I want to extract the consensus sequence of a specific gene from the bam file irrespective of the reference. The gene coordinates are present in the gff file but I did not use this when mapping to the reference.

Thanks very much!

bam DNA-seq gff • 3.6k views
ADD COMMENT
0
Entering edit mode

You can use SAMtools to specify the chromosome/interval of the gene, and its variant-calling workflow to identify the differences b/t the reference and your data.

ADD REPLY
0
Entering edit mode

Thanks! How do I go about doing this? I've only seen examples specifying one region but I want to extract the multiple exons and not intron?

ADD REPLY
0
Entering edit mode
8.4 years ago
AW ▴ 350

I have tried the following but get an error

samtools view -b input.bam "chr1:14382-15521" > subset.bam
samtools sort subset.bam subset_sort
samtools mpileup -d 10000000 -uf ref.fna subset_sort.bam | bcftools view -cg - | vcfutils.pl vcf2fq > cns.fq

Error: Could not parse --min-ac g
[mpileup] 1 samples in 1 input files
Use of uninitialized value $l in numeric lt (<) at /bcftools/bin/vcfutils.pl line 566.
Use of uninitialized value $l in numeric lt (<) at /bcftools/bin/vcfutils.pl line 566.

Any ideas?

ADD COMMENT
0
Entering edit mode

The 'c' flag in bcftools view requires an integer value.

ADD REPLY

Login before adding your answer.

Traffic: 2826 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6