breakdancer - cfg creation and chi square problem
7.4 years ago
TriS ★ 4.6k

Hi all

I'm running BreakDancer to identify fusion transcripts, but I keep getting warnings/errors that make me doubt the output.

1) cfg file creation: -g -h -q 40 -v 3.5 Tumor/kayrotypic.bam Normal/kayrotypic.bam > out.cfg

I keep getting the following messages (I'm not sure whether they are actually errors):

[Mon Oct 26 17:06:03 2015] Processing bam: Results_RSN_01926694/kayrotypic.bam
[Mon Oct 26 17:06:10 2015] $recordcounter > $expected_max: 30001 > 30000
[Mon Oct 26 17:06:10 2015] Closing BAM file
[Mon Oct 26 17:06:10 2015] Send TERM signal for 28546
[Mon Oct 26 17:06:12 2015] samtools pid process 28546 is still there...
[Mon Oct 26 17:06:12 2015] invoking kill -9 on 28546 ...
[Mon Oct 26 17:06:12 2015] Closing samtools process : 28546
[Mon Oct 26 17:06:13 2015] Processing bam: Results_RSN_01926692/kayrotypic.bam
[Mon Oct 26 17:06:21 2015] $recordcounter > $expected_max: 30001 > 30000
[Mon Oct 26 17:06:21 2015] Closing BAM file
[Mon Oct 26 17:06:21 2015] Send TERM signal for 28563
[Mon Oct 26 17:06:23 2015] samtools pid process 28563 is still there...
[Mon Oct 26 17:06:23 2015] invoking kill -9 on 28563 ...
[Mon Oct 26 17:06:23 2015] Closing samtools process : 28563

The cfg file is created and looks like this:

readgroup:RS-01926694    platform:Illumina    map:Tumor/kayrotypic.bam    readlen:101.00    lib:lib1    num:5684    lower:0.00    upper:8696.40    mean:484.47    std:922.33    SWnormality:minus infinity    flag:0(38.70%)18(37.81%)2(4.72%)20(0.57%)32(6.23%)4(0.01%)64(11.97%)8(0.01%)30001    exe:samtools view
readgroup:RS-01926692    platform:Illumina    map:Normal/kayrotypic.bam    readlen:101.00    lib:lib1    num:6826    lower:0.00    upper:12653.07    mean:506.75    std:1212.53    SWnormality:minus infinity    flag:0(25.52%)1(0.01%)18(45.25%)2(6.16%)20(0.51%)32(4.16%)4(0.02%)64(18.37%)30001    exe:samtools view

However, because of these messages I don't know whether the cfg file is correct.
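One way to sanity-check a cfg file like this is to parse each line into its `key:value` fields and compare the reported mean and standard deviation. The sketch below is only an illustration: the helper names and the `max_cv` threshold are hypothetical and not part of BreakDancer, and it assumes fields are separated by tabs or by runs of two or more spaces.

```python
import re

NUMERIC_KEYS = ("readlen", "num", "lower", "upper", "mean", "std")

def parse_cfg_line(line):
    """Split one cfg line into a dict of key -> value.

    Assumes fields are separated by tabs or by two or more spaces, so
    values containing a single space (e.g. "minus infinity") stay whole.
    """
    fields = {}
    for token in re.split(r"\t+| {2,}", line.strip()):
        key, sep, value = token.partition(":")
        if sep:
            fields[key] = value
    for key in NUMERIC_KEYS:
        if key in fields:
            fields[key] = float(fields[key])
    return fields

def looks_overdispersed(fields, max_cv=0.5):
    """Hypothetical heuristic: flag a library whose std/mean exceeds max_cv."""
    return fields["std"] / fields["mean"] > max_cv

# First cfg line from the post, with the long flag field left out for brevity.
example = (
    "readgroup:RS-01926694    platform:Illumina    map:Tumor/kayrotypic.bam    "
    "readlen:101.00    lib:lib1    num:5684    lower:0.00    upper:8696.40    "
    "mean:484.47    std:922.33    SWnormality:minus infinity    exe:samtools view"
)

fields = parse_cfg_line(example)
print(fields["mean"], fields["std"], looks_overdispersed(fields))
```

For the tumor library above, the standard deviation is roughly twice the mean, so this check flags it. That fits an insert-size distribution that is far from normal.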

2) running breakdancer:

breakdancer-max -t -q 10 -f -d out.ctx out.cfg > out.ctx

It runs in a few minutes, while the website says it should take anywhere from 12 hours to a few days.

The output looks legit:

#Software: 1.4.5-unstable-60-3876c5f (commit 3876c5f)
#Command: breakdancer-max -t -q 10 -f -d BR9193.ctx BR9193.cfg 
#Library Statistics:
#Results_RSN_01926691/kayrotypic.bam    mean:495.81    std:1063.22    uppercutoff:10865.8    lowercutoff:0    readlen:101    library:lib1    reflen:3010442741    seqcov:3.07041    phycov:7.53633    32:1003520
#Chr1    Pos1    Orientation1    Chr2    Pos2    Orientation2    Type    Size    Score    num_Reads    num_Reads_lib    kayrotypic.bam    kayrotypic.bam
chr1    233971929    114+20-    chr2    264915    7+0-    CTX    -495    99    7    Results_RSN_01926691/kayrotypic.bam|7
chr1    233971929    114+20-    chr2    271874    10+0-    CTX    -495    99    10    Results_RSN_01926691/kayrotypic.bam|10
chr1    233971929    114+20-    chr2    272041    44+0-    CTX    -495    99    43    Results_RSN_01926691/kayrotypic.bam|43
chr1    233971929    114+20-    chr2    272221    18+0-    CTX    -495    99    18    Results_RSN_01926691/kayrotypic.bam|18

But I do get the following warnings/errors:

WARNING: at line 2, library lib1 overwritten!
chi squared problem: N=2, log(p)=-inf, -2*log(p) = inf
chi squared problem: N=2, log(p)=-inf, -2*log(p) = inf
chi squared problem: N=2, log(p)=-inf, -2*log(p) = inf

which makes me doubt the output too.
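The form of the message (`log(p)=-inf`, `-2*log(p) = inf`) resembles a Fisher-style combination of p-values, where the statistic is -2 times the sum of log p-values. I have not checked BreakDancer's source, so the sketch below is not its implementation. It only illustrates, with hypothetical p-values, how a `-inf` log can arise in general when very small probabilities underflow to zero, and how summing logs avoids that.

```python
import math

def safe_log(p):
    """Natural log that returns -inf for p == 0 instead of raising."""
    return math.log(p) if p > 0.0 else float("-inf")

def fisher_statistic(log_pvalues):
    """Fisher's combined statistic, -2 * sum(log p_i), from log p-values."""
    return -2.0 * sum(log_pvalues)

# Two hypothetical, extremely small p-values (not taken from BreakDancer).
p_values = [1e-200, 1e-200]

# Multiplying first underflows to exactly 0.0 in double precision ...
product = p_values[0] * p_values[1]
naive_log_p = safe_log(product)      # -inf
naive_stat = -2.0 * naive_log_p      # inf

# ... whereas summing the individual logs keeps the statistic finite.
stable_stat = fisher_statistic(math.log(p) for p in p_values)

print(naive_log_p, naive_stat, stable_stat)
```

If something like this is what happens internally, the message would indicate a p-value too small to represent rather than a malformed input, but that reading is an assumption.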

Also, the paper indicates that the insert-size distribution should be normal-like, while my distribution is heavily skewed (most of the mass is on the left), like most RNA-seq read data.

So... I'm kind of confused about how to interpret these results!
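To put a number on how far an insert-size distribution is from normal-like, one option is the sample skewness: it is close to 0 for a symmetric distribution and positive when there is a long right tail. This is a minimal sketch using made-up insert sizes, not values from my BAM files; `sample_skewness` is a hypothetical helper.

```python
def sample_skewness(values):
    """Moment-based skewness: third central moment / (variance ** 1.5)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical insert sizes: a symmetric set, and one with a long right
# tail such as read pairs that span an intron might produce.
symmetric = [180, 190, 200, 210, 220]
right_tailed = [180, 190, 200, 210, 5000]

print(sample_skewness(symmetric))     # close to 0
print(sample_skewness(right_tailed))  # clearly positive
```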

Any help would be great

Thanks :)

rnaseq breakdancer • 2.5k views

I do get the "overwritten" warning, but no other messages.

However, I have also noticed that BreakDancer takes about an hour or so to run if you execute the full command rather than just the -t option. You should definitely filter based on score; the default value is -h 30. Similarly, the default for -q is 35, whereas you used 10, so you may want to change that.

How do you plan to identify high-confidence CTX (translocation) calls?


I was going to keep the CTXs in the ctx file that have a confidence score > 90.
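That filter could be sketched as below. It assumes the column layout shown in the header of the output above (Type in the 7th column, Score in the 9th); `filter_ctx` is a hypothetical helper, and the last example line is made up to show a call being dropped.

```python
def filter_ctx(lines, min_score=90.0, sv_type="CTX"):
    """Keep calls of the given type whose Score column exceeds min_score."""
    kept = []
    for line in lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.split()
        if len(cols) < 9:
            continue
        if cols[6] == sv_type and float(cols[8]) > min_score:
            kept.append(line.rstrip("\n"))
    return kept

calls = [
    "#Chr1\tPos1\tOrientation1\tChr2\tPos2\tOrientation2\tType\tSize\tScore\tnum_Reads\tnum_Reads_lib",
    "chr1\t233971929\t114+20-\tchr2\t264915\t7+0-\tCTX\t-495\t99\t7\tResults_RSN_01926691/kayrotypic.bam|7",
    # Hypothetical low-scoring call, not from the real output.
    "chr3\t1000\t2+0-\tchr5\t2000\t2+0-\tCTX\t-495\t45\t2\texample.bam|2",
]

for call in filter_ctx(calls):
    print(call)
```

With a real file, the lines could come from `open("out.ctx")` instead of the in-memory list.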

For the cfg file quality filter: yes, the default is -h 30, but I kept getting the message

$recordcounter > $expected_max: 30001 > 30000

which went away when I used a higher quality filter. I will try -q 35 and see what happens. Thanks for the input.


What command did you actually use to get rid of this error? I am now getting it in some of my libraries.

Previously, I was using: -g -h $P1T $P1N > P1.cfg

I used the command shown in the post.

