Question: working with chromosome bam files
4.7 years ago by
yotsus20110 wrote:

I first subset my .bam files to the chromosomes of interest using samtools with the following commands:

samtools sort temp.bam temp.sorted
samtools index temp.sorted.bam 
samtools view -bh temp.sorted.bam xx > temp.chrxx.bam

I am planning to align these sequences to the corresponding chromosome using bwa. I have already downloaded the chromosome-specific sequence of chrxx from UCSC.

bwa mem chrxx.fa mybam.bam > bwa.outxx.sam
bwa aln -t 4 chrxx.fa mybam.bam > outxx.bwa.sai
bwa samse chrxx.fa  outxx.bwa.sai mybam.bam > bwa.samse.outxx.sam 

Since the output is a SAM file, I would like to convert it to a BAM file, then sort and index it again with samtools before quality control.

I used the command samtools view -bT chrxx.fa bwa.samse.outxx.sam > outxx.bwa.bam

However, an error occurs with the SAM output from the bwa alignment. When I try to show the first ten lines, only two lines are shown:

@SQ    SN:chr17    LN:81195210
@PG    ID:bwa    PN:bwa    VN:0.7.10-r789    CL:chr1xx.fa    outxxbam.sai /Volumes/Pegasus/tmp/out17.bam

The error seen is 

[samopen] SAM header is present: 1 sequences.
[sam_read1] reference 'ID:bwa    PN:bwa    VN:0.7.10-r789    CL:bwa samse chrxx.fa  /outxx.bam
' is recognized as '*'.
[main_samview] truncated file.

Any help fixing this issue would be appreciated. I would like to know what I am doing wrong. Thank you.

sequence genome • 1.5k views
You don't align a BAM, you align FASTQ files.

 bwa mem [options] <idxbase> <in1.fq> [in2.fq]
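
For what it's worth, here is a rough sketch of what going back to FASTQ first could look like. The file names are placeholders modelled on the question, single-end reads are assumed (since bwa samse was used), and the derived output names are my own invention. `samtools fastq` is the name used in samtools 1.3 and later; older releases call the same tool `samtools bam2fq`.

```shell
#!/bin/sh
# Sketch: convert a per-chromosome BAM back to FASTQ, then align with bwa mem.
# All file names are placeholders, not real paths.
bam="temp.chrxx.bam"
ref="chrxx.fa"
fq="${bam%.bam}.fq"          # derived name: temp.chrxx.fq
sam="${bam%.bam}.bwa.sam"    # derived name: temp.chrxx.bwa.sam

# Only run when both tools and both input files are actually present.
if command -v samtools >/dev/null 2>&1 && command -v bwa >/dev/null 2>&1 \
   && [ -f "$bam" ] && [ -f "$ref" ]; then
    samtools fastq "$bam" > "$fq"     # BAM -> FASTQ (single-end assumed)
    bwa index "$ref"                  # build the index that bwa mem expects
    bwa mem "$ref" "$fq" > "$sam"     # align the FASTQ, SAM goes to stdout
else
    echo "skipping: need samtools, bwa, $bam and $ref" >&2
fi
```

The resulting SAM should then contain alignment records after the header, so the samtools view -bT step from the question has something to convert.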

Is there any reason to align each read to a single chromosome rather than to the whole genome?

I am only working with 2 chromosomes from each patient and I have a huge patient pool, so I would like to minimize the processing time. Do I need to convert all the *.bam files into *.fq files?


Firstly, are you sure that the BAM files contain only unaligned reads? While this can be the case, it's typically not.
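
One way to check is to look at the FLAG column: bit 0x4 is set for unmapped reads. The sketch below counts total and unmapped records in SAM text with awk. The function name `count_unmapped` and the three demo records are made up purely for illustration. On a real BAM, `samtools view -c -f 4 file.bam` gives the unmapped count directly and `samtools flagstat file.bam` gives a fuller summary.

```shell
#!/bin/sh
# Print "<total records> <unmapped records>" for a SAM text file.
# A record is unmapped when FLAG (column 2) has bit 0x4 set.
count_unmapped() {
    awk -F '\t' '!/^@/ { total++; if (int($2 / 4) % 2 == 1) unmapped++ }
         END { printf "%d %d\n", total, unmapped }' "$1"
}

# Made-up, tab-separated example: one header line and three reads.
demo=$(mktemp)
printf '@SQ\tSN:chr17\tLN:81195210\n' > "$demo"
printf 'r1\t0\tchr17\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n' >> "$demo"
printf 'r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n' >> "$demo"
printf 'r3\t16\tchr17\t200\t60\t4M\t*\t0\t0\tACGT\tIIII\n' >> "$demo"

result=$(count_unmapped "$demo")
echo "$result"
rm -f "$demo"
```

If the unmapped count is far below the total, the reads are already aligned, and realigning them means extracting the sequences again first.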

Edit: I was wrong about BWA and BAM input, I've removed that line. Though I should mention that I only know that bwa aln can do that (I'm not familiar with bwa mem having that capability).

