Question: breakdancer - cfg creation and chi square problem
Asked 2.1 years ago by TriS (3.1k), United States, Buffalo

hi all

I'm running BreakDancer to identify fusion transcripts, but I keep getting warnings/errors that make me doubt the output.

1) cfg file creation:

bam2cfg.pl -g -h -q 40 -v 3.5 Tumor/kayrotypic.bam Normal/kayrotypic.bam > out.cfg

I keep getting the following messages (I'm not sure whether they are actually errors):

[Mon Oct 26 17:06:03 2015 bam2cfg.pl] Processing bam: Results_RSN_01926694/kayrotypic.bam
[Mon Oct 26 17:06:10 2015 bam2cfg.pl] $recordcounter > $expected_max: 30001 > 30000
[Mon Oct 26 17:06:10 2015 bam2cfg.pl] Closing BAM file
[Mon Oct 26 17:06:10 2015 bam2cfg.pl] Send TERM signal for 28546
[Mon Oct 26 17:06:12 2015 bam2cfg.pl] samtools pid process 28546 is still there...
[Mon Oct 26 17:06:12 2015 bam2cfg.pl] invoking kill -9 on 28546 ...
[Mon Oct 26 17:06:12 2015 bam2cfg.pl] Closing samtools process : 28546
[Mon Oct 26 17:06:13 2015 bam2cfg.pl] Processing bam: Results_RSN_01926692/kayrotypic.bam
[Mon Oct 26 17:06:21 2015 bam2cfg.pl] $recordcounter > $expected_max: 30001 > 30000
[Mon Oct 26 17:06:21 2015 bam2cfg.pl] Closing BAM file
[Mon Oct 26 17:06:21 2015 bam2cfg.pl] Send TERM signal for 28563
[Mon Oct 26 17:06:23 2015 bam2cfg.pl] samtools pid process 28563 is still there...
[Mon Oct 26 17:06:23 2015 bam2cfg.pl] invoking kill -9 on 28563 ...
[Mon Oct 26 17:06:23 2015 bam2cfg.pl] Closing samtools process : 28563

the cfg file is created and looks like this:

readgroup:RS-01926694    platform:Illumina    map:Tumor/kayrotypic.bam    readlen:101.00    lib:lib1    num:5684    lower:0.00    upper:8696.40    mean:484.47    std:922.33    SWnormality:minus infinity    flag:0(38.70%)18(37.81%)2(4.72%)20(0.57%)32(6.23%)4(0.01%)64(11.97%)8(0.01%)30001    exe:samtools view
readgroup:RS-01926692    platform:Illumina    map:Normal/kayrotypic.bam    readlen:101.00    lib:lib1    num:6826    lower:0.00    upper:12653.07    mean:506.75    std:1212.53    SWnormality:minus infinity    flag:0(25.52%)1(0.01%)18(45.25%)2(6.16%)20(0.51%)32(4.16%)4(0.02%)64(18.37%)30001    exe:samtools view

however, because of the error I don't know if that's correct
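As a quick sanity check on the cfg file, the insert-size statistics can be pulled out of each line and compared. The sketch below is not part of BreakDancer: `parse_cfg_line` and `insert_size_summary` are hypothetical helper names, and it assumes the cfg fields are `key:value` pairs separated by tabs (or runs of spaces), as in the output above. The sample line is the tumor line from the post, cut off after the SWnormality field for brevity.

```python
import re

# Tumor cfg line from the post, truncated after SWnormality (flag/exe omitted).
TUMOR_CFG_LINE = (
    "readgroup:RS-01926694\tplatform:Illumina\tmap:Tumor/kayrotypic.bam\t"
    "readlen:101.00\tlib:lib1\tnum:5684\tlower:0.00\tupper:8696.40\t"
    "mean:484.47\tstd:922.33\tSWnormality:minus infinity"
)


def parse_cfg_line(line):
    """Split one cfg line into a {key: value} dict of strings."""
    fields = {}
    # Fields are separated by tabs or by runs of two or more spaces;
    # single spaces may occur inside a value ("minus infinity").
    for token in re.split(r"\t+| {2,}", line.strip()):
        key, sep, value = token.partition(":")
        if sep:
            fields[key] = value
    return fields


def insert_size_summary(line):
    """Return read group, mean, std and std/mean for one cfg line."""
    fields = parse_cfg_line(line)
    mean = float(fields["mean"])
    std = float(fields["std"])
    return {
        "readgroup": fields["readgroup"],
        "mean": mean,
        "std": std,
        "cv": std / mean,  # coefficient of variation
    }


if __name__ == "__main__":
    summary = insert_size_summary(TUMOR_CFG_LINE)
    print(summary["readgroup"], "std/mean = %.2f" % summary["cv"])
```

A standard deviation larger than the mean, as in both lines above, is at least consistent with an insert-size distribution that is far from normal-like, which may be what the "SWnormality:minus infinity" field is reporting.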

2) running breakdancer:

breakdancer-max -t -q 10 -f -d out.ctx out.cfg > out.ctx

 

it runs in a few minutes while the website says it should take from 12 hours to a few days 

the output looks legit:

#Software: 1.4.5-unstable-60-3876c5f (commit 3876c5f)
#Command: breakdancer-max -t -q 10 -f -d BR9193.ctx BR9193.cfg 
#Library Statistics:
#Results_RSN_01926691/kayrotypic.bam    mean:495.81    std:1063.22    uppercutoff:10865.8    lowercutoff:0    readlen:101    library:lib1    reflen:3010442741    seqcov:3.07041    phycov:7.53633    32:1003520
#Chr1    Pos1    Orientation1    Chr2    Pos2    Orientation2    Type    Size    Score    num_Reads    num_Reads_lib    kayrotypic.bam    kayrotypic.bam
chr1    233971929    114+20-    chr2    264915    7+0-    CTX    -495    99    7    Results_RSN_01926691/kayrotypic.bam|7
chr1    233971929    114+20-    chr2    271874    10+0-    CTX    -495    99    10    Results_RSN_01926691/kayrotypic.bam|10
chr1    233971929    114+20-    chr2    272041    44+0-    CTX    -495    99    43    Results_RSN_01926691/kayrotypic.bam|43
chr1    233971929    114+20-    chr2    272221    18+0-    CTX    -495    99    18    Results_RSN_01926691/kayrotypic.bam|18

but I do get the following warning/error:

WARNING: at line 2, library lib1 overwritten!
chi squared problem: N=2, log(p)=-inf, -2*log(p) = inf
chi squared problem: N=2, log(p)=-inf, -2*log(p) = inf
chi squared problem: N=2, log(p)=-inf, -2*log(p) = inf

which makes me doubt the output too...
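One possible reading of the "overwritten" warning, not confirmed against the BreakDancer source, is that both cfg lines carry the same library name (lib:lib1), so the entry on line 2 replaces the one from line 1. A minimal sketch for spotting repeated library names in a cfg file follows; `find_duplicate_libs` is a hypothetical helper, and the sample lines keep only a few fields from the cfg shown earlier.

```python
import re
from collections import defaultdict

# Shortened versions of the two cfg lines from the post (other fields omitted).
CFG_LINES = [
    "readgroup:RS-01926694\tplatform:Illumina\tmap:Tumor/kayrotypic.bam\tlib:lib1",
    "readgroup:RS-01926692\tplatform:Illumina\tmap:Normal/kayrotypic.bam\tlib:lib1",
]


def find_duplicate_libs(cfg_lines):
    """Map each library name used on more than one line to its line numbers."""
    seen = defaultdict(list)
    for lineno, line in enumerate(cfg_lines, start=1):
        # Look for a whitespace-delimited "lib:<name>" field.
        match = re.search(r"(?:^|\s)lib:(\S+)", line)
        if match:
            seen[match.group(1)].append(lineno)
    return {lib: nums for lib, nums in seen.items() if len(nums) > 1}


if __name__ == "__main__":
    for lib, nums in find_duplicate_libs(CFG_LINES).items():
        print("library %s appears on cfg lines %s" % (lib, nums))
```

If this reading is right, giving the tumor and normal samples distinct library names in the cfg would be the thing to try, but that is an assumption to verify rather than a documented fix.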

Also, the paper indicates that the insert-size distribution should be normal-like, while my distribution is heavily skewed on the left, like most RNA-seq read data...

So I'm kind of confused about how to interpret these results!

any help would be great,

thanks :)

breakdancer rnaseq • 794 views

I do get "overwritten" warning, but no other messages.

However, I have also noticed that it takes about an hour or so to run BreakDancer if you execute the full command, not just the -t parameter. You should definitely filter based on score; the default value is -h 30. Similarly for -q, 35 is the default, whereas you used 10; maybe you want to change that.

How do you plan to identify high confidence CTX (translocation)?

written 2.1 years ago by Chirag Nepal (1.9k)

I was going to keep CTXs in the ctx file with confidence score > 90
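The score filter described here could be sketched as below. This is only an illustration: `filter_calls` is a hypothetical helper, the column positions (Type in column 7, Score in column 9) are taken from the header line of the output shown in the question, and the sample rows are the first two calls from that output with the header shortened.

```python
# Header (shortened) and first two calls from the output in the question.
CTX_LINES = [
    "#Chr1\tPos1\tOrientation1\tChr2\tPos2\tOrientation2\tType\tSize\tScore\tnum_Reads\tnum_Reads_lib",
    "chr1\t233971929\t114+20-\tchr2\t264915\t7+0-\tCTX\t-495\t99\t7\tResults_RSN_01926691/kayrotypic.bam|7",
    "chr1\t233971929\t114+20-\tchr2\t271874\t10+0-\tCTX\t-495\t99\t10\tResults_RSN_01926691/kayrotypic.bam|10",
]

TYPE_COL = 6   # 0-based index of the "Type" column
SCORE_COL = 8  # 0-based index of the "Score" column


def filter_calls(lines, min_score=90, sv_type="CTX"):
    """Keep calls of the given type whose score is strictly above min_score."""
    kept = []
    for line in lines:
        # Skip blank lines and the '#'-prefixed header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split()
        if len(cols) <= SCORE_COL:
            continue  # malformed or truncated row
        if cols[TYPE_COL] == sv_type and float(cols[SCORE_COL]) > min_score:
            kept.append(line.rstrip("\n"))
    return kept


if __name__ == "__main__":
    for call in filter_calls(CTX_LINES):
        print(call)
```

To apply it to a real file, the list can be replaced with the lines read from the ctx output, for example `filter_calls(open("out.ctx"))`.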

For the cfg file quality filter: yes, the default is -h 30, but I kept getting this message:

$recordcounter > $expected_max: 30001 > 30000

which went away when I used a higher quality filter. I will try -q 35 and see what happens. Thanks for the input.

written 2.1 years ago by TriS (3.1k)

What command did you actually use to get rid of this error? I am now getting it in some of my libraries.

Previously, I was using: bam2cfg.pl -g -h $P1T $P1N > P1.cfg

 

written 2.1 years ago by Chirag Nepal (1.9k)

I used the code in the post

written 2.1 years ago by TriS (3.1k)