Question: Empty Cfg File For Breakdancer
Charles Warden5.5k wrote:

I have some paired-end alignment data from BWA, and I am trying to use BreakDancer to call structural variants (well, more precisely, I'm trying to use SVMerge, which is a pipeline that includes a step using BreakDancer).

I think I've partially solved my problem by using Picard AddOrReplaceReadGroups to add an @RG line to the header (I no longer get an error message complaining about the read counts). This allowed the bam2cfg command to run for a lot longer (~12 minutes, instead of failing immediately). However, the .cfg file is still empty.

I've looked at the breakdancer-help archive, but it looks like there are a lot of similar unanswered questions. So, I'm guessing this is a common problem.

I've tried running bam2cfg.pl with no optional arguments as well as with "-q 20" (to lower the threshold for the quality filter, although I wouldn't have expected this to be a problem). I noticed a lot of people ran bam2cfg.pl with the -g and -h parameters, so I also tried that. However, nothing is produced beyond an empty .config file.

In other words, here is my current command:

 /opt/breakdancer-1.2/bin/bam2cfg.pl -q 20 -g -h file.bam > bd.config

This is my .bam file header:

@HD    VN:1.4    SO:unsorted
@SQ    SN:chr10    LN:135534747
@SQ    SN:chr11    LN:135006516
@SQ    SN:chr12    LN:133851895
@SQ    SN:chr13    LN:115169878
@SQ    SN:chr14    LN:107349540
@SQ    SN:chr15    LN:102531392
@SQ    SN:chr16    LN:90354753
@SQ    SN:chr17    LN:81195210
@SQ    SN:chr18    LN:78077248
@SQ    SN:chr19    LN:59128983
@SQ    SN:chr1    LN:249250621
@SQ    SN:chr20    LN:63025520
@SQ    SN:chr21    LN:48129895
@SQ    SN:chr22    LN:51304566
@SQ    SN:chr2    LN:243199373
@SQ    SN:chr3    LN:198022430
@SQ    SN:chr4    LN:191154276
@SQ    SN:chr5    LN:180915260
@SQ    SN:chr6    LN:171115067
@SQ    SN:chr7    LN:159138663
@SQ    SN:chr8    LN:146364022
@SQ    SN:chr9    LN:141213431
@SQ    SN:chrM    LN:16571
@SQ    SN:chrX    LN:155270560
@SQ    SN:chrY    LN:59373566
@RG    ID:1    PL:illumina    PU:barcode    LB:AGTGGT    SM:4_AGTGGT
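
A header like this one can be sanity-checked before running bam2cfg.pl. The sketch below is a hypothetical helper, not part of BreakDancer or SVMerge. It assumes the header text was obtained with something like `samtools view -H file.bam` and that fields are tab-separated, as in the SAM format. It reports the declared sort order and the read group IDs.

```python
def summarize_sam_header(header_text):
    """Return (sort_order, read_group_ids) parsed from SAM header text.

    Hypothetical helper for illustration; assumes tab-separated fields.
    """
    sort_order = None
    read_groups = []
    for line in header_text.splitlines():
        fields = line.split("\t")
        # Every field after the record type has the form TAG:VALUE.
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        if fields[0] == "@HD":
            sort_order = tags.get("SO")
        elif fields[0] == "@RG":
            read_groups.append(tags.get("ID"))
    return sort_order, read_groups


# A shortened copy of the header above, with real tab separators.
example = "\n".join([
    "@HD\tVN:1.4\tSO:unsorted",
    "@SQ\tSN:chr10\tLN:135534747",
    "@RG\tID:1\tPL:illumina\tPU:barcode\tLB:AGTGGT\tSM:4_AGTGGT",
])
sort_order, read_groups = summarize_sam_header(example)
print(sort_order, read_groups)
```

Note that the `SO` tag only records what the header declares; it does not prove whether the records themselves are in coordinate order.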

Does anyone know any other strategies for getting the initial bam2cfg.pl step to work?

breakdancer • 2.6k views
ADD COMMENTlink modified 5.0 years ago by ernfrid380 • written 5.1 years ago by Charles Warden5.5k
ernfrid380 wrote:

bam2cfg looks at properly paired reads to determine the insert size distribution. If you have a file where most of the reads are not properly paired (as in your HaloPlex sample), then bam2cfg will not find enough reads to infer the insert size distribution and will generate an empty file. BreakDancer will not work on your data.
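
One way to check this before running bam2cfg is to measure what fraction of paired reads carry the "properly paired" flag (bit 0x2 of the SAM FLAG field). The following is a minimal sketch, not bam2cfg's own logic: the function name is hypothetical, and it assumes the input is tab-separated SAM text such as the output of `samtools view file.bam`.

```python
# SAM FLAG bits used below (values from the SAM format).
PAIRED = 0x1
PROPER_PAIR = 0x2
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def proper_pair_fraction(sam_lines):
    """Fraction of paired, primary records flagged as properly paired."""
    paired = 0
    proper = 0
    for line in sam_lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])
        # Ignore single-end records and secondary/supplementary alignments.
        if not flag & PAIRED or flag & (SECONDARY | SUPPLEMENTARY):
            continue
        paired += 1
        if flag & PROPER_PAIR:
            proper += 1
    return proper / paired if paired else 0.0


# Made-up records for illustration: one proper pair, one improper pair.
demo_records = [
    "r1\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
    "r1\t147\tchr1\t300\t60\t50M\t=\t100\t-250\t*\t*",
    "r2\t65\tchr1\t500\t60\t50M\tchr2\t100\t0\t*\t*",
    "r2\t129\tchr2\t100\t60\t50M\tchr1\t500\t0\t*\t*",
]
fraction = proper_pair_fraction(demo_records)
print(f"properly paired: {fraction:.2%}")
```

A value close to zero would suggest that there are too few properly paired reads for an insert size distribution to be estimated.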

ADD COMMENTlink written 5.0 years ago by ernfrid380

Ok - thank you for confirming this!

ADD REPLYlink written 5.0 years ago by Charles Warden5.5k
Charles Warden5.5k wrote:

It looks like I have figured out what is going on.

If I run the same command (including SORT_ORDER=coordinate just to be safe, even though the file is already sorted) on another sample, it works.

The difference between the samples is that my first sample was a HaloPlex exon capture experiment, whereas my second sample was a SureSelect exon capture experiment. I think something about the HaloPlex sample preparation is causing the program to crash. I know that HaloPlex introduced a lot of incorrect SNP calls (especially near the ends of the reads), so this seems reasonable to me.

ADD COMMENTlink written 5.1 years ago by Charles Warden5.5k

So, AddOrReplaceReadGroups is absolutely the right thing to do. I don't know much about the HaloPlex prep, but the last person I helped with this issue had a BAM file that didn't have read pairs that were oriented correctly (they had actually removed them). What do the BAM flags indicate in terms of read orientation? Also, can you tell me what version of BreakDancer you are running?
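
To answer the orientation question from the flags, the strand bits (0x10 for the read, 0x20 for its mate) can be combined with the mapping positions. This is a rough sketch with hypothetical function names, assuming tab-separated SAM text as input (for example from `samtools view file.bam`); it is not how BreakDancer itself classifies pairs.

```python
from collections import Counter

PAIRED, UNMAPPED, MATE_UNMAPPED = 0x1, 0x4, 0x8
REVERSE, MATE_REVERSE, FIRST_IN_PAIR = 0x10, 0x20, 0x40
SECONDARY, SUPPLEMENTARY = 0x100, 0x800


def pair_orientation(flag, rname, pos, rnext, pnext):
    """Classify one record's pair from its FLAG, references and positions."""
    if not flag & PAIRED:
        return "unpaired"
    if flag & (UNMAPPED | MATE_UNMAPPED):
        return "unmapped"
    if rnext not in ("=", rname):
        return "different_reference"
    read_rev = bool(flag & REVERSE)
    mate_rev = bool(flag & MATE_REVERSE)
    if read_rev == mate_rev:
        return "same_strand"
    # The leftmost read decides: forward means FR, reverse means RF.
    # Ties in position are treated as if this read were leftmost.
    leftmost_rev = read_rev if pos <= pnext else mate_rev
    return "RF" if leftmost_rev else "FR"


def orientation_counts(sam_lines):
    """Tally orientations, counting each pair once via its first read."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        f = line.split("\t")
        flag = int(f[1])
        if flag & (SECONDARY | SUPPLEMENTARY) or not flag & FIRST_IN_PAIR:
            continue
        counts[pair_orientation(flag, f[2], int(f[3]), f[6], int(f[7]))] += 1
    return counts


# Made-up records for illustration.
demo = [
    "r1\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
    "r1\t147\tchr1\t300\t60\t50M\t=\t100\t-250\t*\t*",
    "r2\t81\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
    "r3\t65\tchr1\t500\t60\t50M\tchr2\t100\t0\t*\t*",
    "r4\t65\tchr1\t100\t60\t50M\t=\t400\t350\t*\t*",
]
counts = orientation_counts(demo)
print(dict(counts))
```

Standard Illumina paired-end libraries are typically expected to be dominated by the FR class, so a tally dominated by other classes would be worth investigating.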

ADD REPLYlink written 5.0 years ago by ernfrid380

I am using BreakDancer 1.2

Do you mean checking the flagstat statistics? If so, the "properly paired" percentage is much higher for the SureSelect sample (97% versus 0.03%). The analysis pipelines are essentially the same, except the HaloPlex was run a while back (February versus August) and the duplicate removal step has to be skipped for the HaloPlex sample (otherwise, I believe almost all the reads would be removed because there is one amplicon per exon).

ADD REPLYlink written 5.0 years ago by Charles Warden5.5k

Were you able to figure this out? I have tried using Samtools and Picard to sort my BAM file and I am still getting an empty .cfg file. I have tried with and without many of the options but have had no luck.

ADD REPLYlink written 7 months ago by durwa0040