Question

Working with chromosome BAM files

10.9 years ago

yotsus2011 • 0

I first subset my .bam files to the chromosomes of interest using samtools with the following commands:

samtools sort temp.bam temp.sorted
samtools index temp.sorted.bam 
samtools view -bh temp.bam xx > temp.chrxx.bam
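
For reference, here is a minimal sketch of the same extraction written with the samtools 1.x syntax, which differs from the legacy `samtools sort in.bam prefix` form used above. It assumes that the region query is meant to run on the coordinate-sorted, indexed file, since region queries need an index. The function name `extract_chromosome` and all file names are hypothetical.

```shell
# Sketch only: assumes samtools >= 1.3 is on PATH.
extract_chromosome() {
    local in_bam="$1" region="$2" out_bam="$3"
    local sorted="${in_bam%.bam}.sorted.bam"
    # Coordinate-sort, then index, so that region queries are possible.
    # The final step writes the reads overlapping the region as BAM (-b).
    samtools sort -o "$sorted" "$in_bam" &&
        samtools index "$sorted" &&
        samtools view -b "$sorted" "$region" > "$out_bam"
}

# Example call (placeholder names):
# extract_chromosome temp.bam chr17 temp.chr17.bam
```

Note that the region name has to match the sequence names in the BAM header exactly, for example `chr17` versus `17`.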

I am planning to align these sequences to the corresponding chromosome using bwa. I have already downloaded the chromosome-specific sequence of chrxx from UCSC.

bwa mem chrxx.fa mybam.bam > bwa.outxx.sam
bwa aln -t 4 chrxx.fa mybam.bam > outxx.bwa.sai
bwa samse chrxx.fa  outxx.bwa.sai mybam.bam > bwa.samse.outxx.sam

Since the output is a SAM file, I would like to convert it to a BAM file with samtools, then sort and index it again before quality control.

I used the command samtools view -bT chrxx.fa bwa.samse.outxx.sam > outxx.bwa.bam

However, the SAM output from the bwa alignment triggers an error. When I print the first ten lines of that file, only two lines are shown:

@SQ    SN:chr17    LN:81195210
@PG    ID:bwa    PN:bwa    VN:0.7.10-r789    CL:chr1xx.fa    outxxbam.sai /Volumes/Pegasus/tmp/out17.bam

The error is:

[samopen] SAM header is present: 1 sequences.
[sam_read1] reference 'ID:bwa    PN:bwa    VN:0.7.10-r789    CL:bwa samse chrxx.fa  /outxx.bam
' is recognized as '*'.
[main_samview] truncated file.

I would like to know what I am doing wrong and how to fix it. Any help is appreciated. Thank you.
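
The SAM-to-BAM conversion, sort and index described above could be sketched as follows with samtools 1.x. Because the SAM shown above contains only header lines, the sketch first checks that the file has at least one alignment record and stops otherwise. The function name `sam_to_sorted_bam` and the file names are hypothetical, and samtools >= 1.3 is assumed.

```shell
# Sketch only: assumes samtools >= 1.3 is on PATH.
sam_to_sorted_bam() {
    local in_sam="$1" out_bam="$2" n_records
    # Count non-header lines; a header-only SAM means the alignment
    # step produced no records, so converting it is pointless.
    n_records=$(grep -vc '^@' "$in_sam" || true)
    if [ "$n_records" -eq 0 ]; then
        echo "no alignment records in $in_sam" >&2
        return 1
    fi
    # samtools sort reads SAM directly and writes BAM by default.
    samtools sort -o "$out_bam" "$in_sam" && samtools index "$out_bam"
}

# Example call (placeholder names):
# sam_to_sorted_bam bwa.samse.outxx.sam outxx.bwa.sorted.bam
```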

genome sequence • 2.7k views

updated 3.6 years ago by Ram 45k • written 10.9 years ago by yotsus2011 • 0

You don't align a BAM; you align FASTQ files.

bwa mem [options] <idxbase> <in1.fq> [in2.fq]

Is there any reason to align each read to a single chromosome rather than to the whole genome?
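
A minimal sketch of that route, assuming single-end reads, samtools >= 1.3 (for `samtools fastq`; older releases called it `bam2fq`) and bwa on PATH. The function name `realign_bam` and the file names are placeholders.

```shell
# Sketch only: converts a BAM back to FASTQ, then aligns with bwa mem.
realign_bam() {
    local ref="$1" in_bam="$2" prefix="$3"
    # bwa needs an index of the reference; build it if it is missing.
    if [ ! -e "${ref}.bwt" ]; then
        bwa index "$ref" || return 1
    fi
    # Write the reads back out as FASTQ (alignment information is dropped).
    samtools fastq "$in_bam" > "${prefix}.fq" || return 1
    # Align the FASTQ reads; bwa mem writes SAM to stdout.
    bwa mem "$ref" "${prefix}.fq" > "${prefix}.sam"
}

# Example call (placeholder names):
# realign_bam chrxx.fa temp.chrxx.bam outxx
```

For paired-end data the BAM would first need to be name-sorted and the two mates written to separate FASTQ files, which this sketch does not cover.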

updated 3.6 years ago by Ram 45k • written 10.9 years ago by Pierre Lindenbaum 166k

I am only working with 2 chromosomes from each patient, and I have a huge patient pool, so I would like to minimize the processing time. Do I need to convert all the .bam files into .fq files?

updated 3.6 years ago by Ram 45k • written 10.9 years ago by yotsus2011 • 0

Firstly, are you sure that the BAM files contain only unaligned reads? While this can be the case, it's typically not.

Edit: I was wrong about BWA and BAM input, so I've removed that line. I should mention that I only know that bwa aln can take BAM input; I'm not aware of bwa mem having that capability.
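
A hedged sketch of the BAM-input route with `bwa aln`: the `-b` flag declares that the read file is in BAM format, and `bwa samse` is then given the same BAM as its read file. Without `-b`, `bwa aln` treats its input as FASTA/FASTQ. The sketch assumes single-end reads and a reference already indexed with `bwa index`; the function name `align_bam_with_aln` and the file names are placeholders.

```shell
# Sketch only: assumes bwa (0.7.x) is on PATH and the reference is indexed.
align_bam_with_aln() {
    local ref="$1" in_bam="$2" prefix="$3"
    # -b: read the input sequences from a BAM file; -t 4: use four threads.
    bwa aln -t 4 -b "$ref" "$in_bam" > "${prefix}.sai" || return 1
    # samse combines the .sai with the original reads and writes SAM.
    bwa samse "$ref" "${prefix}.sai" "$in_bam" > "${prefix}.sam"
}

# Example call (placeholder names):
# align_bam_with_aln chrxx.fa temp.chrxx.bam outxx.bwa
```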

10.9 years ago by Devon Ryan 105k