Question

Script to compute bin boundaries only outputs chromosome 1 instead of all chromosomes

8.0 years ago

lien ▴ 90

Hi all,

I downloaded a Python script from the supplementary data of the paper 'Genome-wide copy number analysis of single cells' (Baslan et al., Nature Protocols, 2012). The script computes bin boundaries for the entire human genome and takes 3 different files as input. I have checked these files, and they look OK. However, the output file only lists boundaries for (part of) chromosome 1, not for all chromosomes. I've looked through the script but cannot find where it goes wrong. I've added a link to the script with the Gist below:

Any thoughts will be appreciated, Thanks.

python script compute bin boundaries • 1.5k views

ADD COMMENT • link 8.0 years ago by lien ▴ 90

Does hg19.goodzones.bwa.k50.bed contain more than the first chromosome and is it sorted to match the order of hg19.chrom.sizes.txt?
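Both questions can be checked with a few lines of Python. This is only a minimal sketch: it assumes the chromosome name is the first whitespace-separated column in both files, and the helper names `chrom_order` and `compare_orders` are made up for illustration. The toy lines stand in for the real files.

```python
from itertools import groupby


def chrom_order(lines):
    """Return chromosome names in file order, one entry per consecutive run."""
    names = (line.split()[0] for line in lines if line.strip())
    return [name for name, _ in groupby(names)]


def compare_orders(bed_lines, sizes_lines):
    """Summarise how the BED chromosome order relates to the sizes file."""
    bed = chrom_order(bed_lines)
    sizes = chrom_order(sizes_lines)
    seen = set(bed)
    return {
        "bed_chroms": bed,
        # True if every chromosome forms a single uninterrupted block
        "contiguous": len(bed) == len(seen),
        # True if the BED blocks appear in the same order as in the sizes file
        "same_order": bed == [c for c in sizes if c in seen],
    }


# Toy data (made-up lengths and coordinates), not the real hg19 files
sizes_demo = ["chr1\t1000", "chr2\t900", "chr10\t500"]
bed_demo = ["chr1\t100\t200", "chr10\t50\t90", "chr2\t10\t20"]
report = compare_orders(bed_demo, sizes_demo)
print(report)
```

On the real data you would pass open file handles instead, e.g. `compare_orders(open("hg19.goodzones.bwa.k50.bed"), open("hg19.chrom.sizes.txt"))`.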

ADD REPLY • link 8.0 years ago by Devon Ryan 104k

hg19.goodzones.bwa.k50.bed contains all the chromosomes of the genome (1-22, X and Y).

I have sorted the 3 files that are used as input:

sort hg19.chrom.sizes.txt > hg19.chrom.sizes.sort.txt
sort hg19.goodzones.bwa.k50.bed > hg19.goodzones.bwa.k50.sort.bed
sort hg19.chrom.mappable.bwa.k50.txt > hg19.chrom.mappable.bwa.k50.sort.txt
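One thing worth knowing about a plain `sort`: it compares names as text, so chr10 ends up before chr2. The sketch below illustrates the difference between that and a "natural" chromosome order. Whether the script expects one order or the other is an assumption on my part, not something confirmed here, and `natural_key` is just an illustrative helper.

```python
import re


def natural_key(chrom):
    """Split a name like 'chr10' into text and integer parts for sorting."""
    return [int(part) if part.isdigit() else part
            for part in re.split(r"(\d+)", chrom)]


chroms = ["chr1", "chr2", "chr10", "chr11", "chrX", "chrY"]

lexicographic = sorted(chroms)             # plain string comparison
natural = sorted(chroms, key=natural_key)  # numeric parts compared as numbers

print(lexicographic)
print(natural)
```

If all three input files are sorted the same way they at least stay consistent with each other, but the order may still differ from what the script assumes.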

The Python script seems to start out correctly, but at some point it appears to stop writing to the output file. Free disk space is not an issue, as more than 2 TB is still available.

ADD REPLY • link 8.0 years ago by lien ▴ 90

You could try adding a few print("1") sorts of statements inside each instance of:

if goodEOF:
    print("1") # or some other number. make them different each time
    break

That'd allow you to at least see where it's breaking its loop over regions.

ADD REPLY • link 8.0 years ago by Devon Ryan 104k

I see you already added statements like print chromarray. What's the output of those?

ADD REPLY • link 8.0 years ago by WouterDeCoster 47k

I get a lot (!) of the errors below:

ERROR: Past end of chrom. chr11 chr1
ERROR: Past end of chrom. chr11 chr1
ERROR: Past end of chrom. chr11 chr1
ERROR: Past end of chrom. chr11 chr1
ERROR: Past end of chrom. chr11 chr1

So I guess there must be something wrong with the input files I'm using, but I have checked these already and they look okay.
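A more systematic way to check the inputs than reading them by eye would be something like the sketch below. It assumes the sizes file has two whitespace-separated columns (name, length) and the BED file has at least three (name, start, end). It does not reproduce what the script itself tests when it prints that error; `load_sizes` and `find_bad_intervals` are hypothetical helpers, and the data is a toy example.

```python
def load_sizes(sizes_lines):
    """Map chromosome name -> length from 'name<TAB>length' lines."""
    sizes = {}
    for line in sizes_lines:
        fields = line.split()
        if len(fields) >= 2:
            sizes[fields[0]] = int(fields[1])
    return sizes


def find_bad_intervals(bed_lines, sizes):
    """Return (line number, chromosome, reason) for suspicious BED lines."""
    problems = []
    for lineno, line in enumerate(bed_lines, start=1):
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if chrom not in sizes:
            problems.append((lineno, chrom, "unknown chromosome"))
        elif not (0 <= start < end <= sizes[chrom]):
            problems.append((lineno, chrom, "invalid or out-of-range coordinates"))
    return problems


# Toy data (made-up lengths and coordinates), not the real hg19 files
sizes = load_sizes(["chr1\t1000", "chr2\t900"])
bed = ["chr1\t0\t500", "chr1\t800\t1200", "chr3\t10\t20"]
problems = find_bad_intervals(bed, sizes)
for problem in problems:
    print(problem)
```

An empty result would suggest the coordinates themselves are fine and that the problem lies elsewhere, for example in the order of the chromosomes.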

ADD REPLY • link 8.0 years ago by lien ▴ 90