Question: cn.mops - Error in if (n > 512) n <- 2^ceiling(log2(n))
1
gravatar for emmasreid
5.7 years ago by
emmasreid20
United Kingdom
emmasreid20 wrote:

Hi all,

Firstly, apologies if this is a daft/duplicated question - I am very new to this forum and data analysis using R in general.

I am trying to analyse some gene panel data using the cn.mops package in R using the code provided under Section 6 of the package vignette ("Exome sequencing data"). The first part of the code seems to work with no errors:

> BAMfiles <-c("~/Alignments/Filename.bam","~/Alignments/Filename2.bam","~/Alignments/Filename3.bam","~/Alignments/Filename4.bam","~/Alignments/Filename5.bam","~/Alignments/Filename6.bam","~/Alignments/Filename7.bam")

> segments <- read.table("~/Alignments/NeuroMetabolicCodingExons.bed",sep="\t",as.is=TRUE)

> gr <- GRanges(segments[,1],IRanges(segments[,2],segments[,3]))

> X <- getSegmentReadCountsFromBAM(BAMfiles,GR=gr,mode="unpaired")

However, when I then enter the following command as instructed:

> resCNMOPS <- exomecn.mops(X)

It comes up with the following error:

Normalizing...
Error in if (n > 512) n <- 2^ceiling(log2(n)) : 
  missing value where TRUE/FALSE needed
In addition: Warning message:
In normalizeChromosomes(X, chr = chr, normType = normType, qu = normQu) :
  There exists a reference sequence with zero reads for some samples.

If anyone could enlighten me as to what this error means/how to fix it I would be eternally grateful.

Many thanks. 

ADD COMMENTlink modified 5.7 years ago by Devon Ryan94k • written 5.7 years ago by emmasreid20
1

Hello,

Sorry for the inconveniences with the package. Indeed, the problem comes from chromosomes that have predominantly zeros in the read count matrix. Typically the median read count is used to calculate normalization factors...

I am now catching this case and I am returning an error message saying that people should remove this samples or chromosomes (usually, chromosome "Y").

 

For copy number detection on chromosomes X and Y, I suggest to run cn.mops separately for males an females. Please do not hesitate to contact me, if you encounter problems with the package.

 

The updated version should be on Bioconductor in a few days.

 

Regards,

Günter Klambauer

 

 

 

ADD REPLYlink written 5.3 years ago by klambauer20

Thanks for updating the package!

ADD REPLYlink written 5.3 years ago by Devon Ryan94k

I am sorry but it still gives the same error. The worst thing is that you have to make all the calculations again after trying to exclude some scaffolods with near zero alignments. Could you make it to write error log and automatically exclude bad chromosomes while working on all others without stopping?

Thank you very much!

ADD REPLYlink modified 5.0 years ago • written 5.0 years ago by Andrey Yurchenko10
0
gravatar for Devon Ryan
5.7 years ago by
Devon Ryan94k
Freiburg, Germany
Devon Ryan94k wrote:

The warning message is actually the cause of the error. In short, exomecn.mops() calls normalizeChromosomes(), which in turn iterates over all of the chromosomes, of which one seems to lack reads. Thus, there's a step where a matrix is subset by chromosome, producing a 0-dimensional matrix. Various things are done to that (e.g., median), which all produce NA. The actual error ends up being caused by normalizeChromosomes() calling density() with n=NA (since n depends on the aforementioned matrix). I would suspect that fixing that warning message will solve the error.

ADD COMMENTlink modified 5 weeks ago by RamRS25k • written 5.7 years ago by Devon Ryan94k

Thank you so much for your swift reply! I think I understand your explanation of the problem - any ideas on the command(s) I should use instead to circumvent this problem?

(Apologies again for my ignorance - I've been struggling with this for days now...!)

ADD REPLYlink modified 5 weeks ago by RamRS25k • written 5.7 years ago by emmasreid20

You should be able to use seqlevels(X) to both see and set the chromosomes that will be looked at. If you table(seqlevels(X)), you'll see which chromsome has no alignments, so just remove that one.

ADD REPLYlink written 5.7 years ago by Devon Ryan94k

Thank you! So I would use that code to determine which chromosome needs to be removed, and then I assume this line of code would need to be inserted somewhere:

> normalizeChromosomes(X, chr, normType = "poisson", qu = 0.25, ploidy)

Any chance you could hazard a guess as to where I should put this (or a similar) line? From what I've read, neither the exomecn.mops() nor the cn.mops() command seem to have the ability to specify particular chromosomal regions. From what I understand, these are the lines which determine the regions of interest:

> segments <- read.table("~/Alignments/NeuroMetabolicCodingExons.bed",sep="\t",as.is=TRUE)
> gr <- GRanges(segments[,1],IRanges(segments[,2],segments[,3]))

But I think I would still need to specify the .bed file, and I can't seem to see an easy way to do this and also exclude certain regions. Hope that made sense?

Thanks again for all your help.

ADD REPLYlink modified 5 weeks ago by RamRS25k • written 5.7 years ago by emmasreid20

exomecn.mops() calls normalizeChromosomes(), so I don't think you'll need to add that line anywhere. The list of all chromosomes is held in seqlevels(X), so you can just change that:

table(seqlevels(X)) #Just look at which one has 0 reads. Supposing it's the second one...
seqlevels(X) <- seqlevels(X)[-2]
ADD REPLYlink modified 5 weeks ago by RamRS25k • written 5.7 years ago by Devon Ryan94k
1

Hello! Just an update - I did as you suggested and did table(seqlevels(X)). This gave a value of '1' for all chromosome:

chr1 chr10 chr11 ...etc.
    1     1     1      ...etc.

I then tried the head(X[,1:3]) command to see if the counts are even being called correctly at all and got this:

GRanges with 6 ranges and 3 metadata columns:
      seqnames               ranges strand | Sample_1.sorted.bam Sample_2.sorted.bam
         <Rle>            <IRanges>  <Rle> |                  <integer>                  <integer>
  [1]     chr1 [76190472, 76190502]      * |                          0                          0
  [2]     chr1 [76194085, 76194173]      * |                          0                          0
  [3]     chr1 [76198328, 76198426]      * |                          0                          0
  [4]     chr1 [76198537, 76198607]      * |                          0                          0
  [5]     chr1 [76199212, 76199313]      * |                          0                          0
  [6]     chr1 [76200475, 76200556]      * |                          0                          0

chr1  chr2  chr3  chr4  chr5  chr6  chr7  chr8  chr9 ... chr14 chr15 chr16 chr17 chr18 chr19 chr20 chr21 chr22
      NA    NA    NA    NA    NA    NA    NA    NA    NA ...    NA    NA    NA    NA    NA    NA    NA    NA    NA

So clearly something has gone very wrong somewhere...! It seems to be choosing the correct regions from my .bed file but clearly not getting any counts - would you think this is a problem with my .bam files or the code?

ADD REPLYlink modified 5 weeks ago by RamRS25k • written 5.7 years ago by emmasreid20
1

Hi emmasreid,

I was struggling with the very same problem for a few days now and I think I came over it a few hours ago... simply switch to mode="paired" in getSegmentReadCountsFromBAM and you will eventually get read count values in "X" for your samples. A look into the Table S3 of this paper lead me to this as they also use mode="paired" on 16 apparently unrelated Exomes.

Hope that helps!

cheers Max

ADD REPLYlink modified 5 weeks ago by RamRS25k • written 5.7 years ago by schmoo30

This worked perfectly - thank you so much Max! Finally manged to pick up some CNVs after days of struggling. Thank you again for solving my problem!

Cheers,
Emma

ADD REPLYlink modified 5 weeks ago by RamRS25k • written 5.7 years ago by emmasreid20
1

I used mode="paired" in getSegmentReadCountsFromBAM and still am getting this error.

This is how my seqlevels look like:

> seqlevels(X)
 [1] "chr1"  "chr10" "chr11" "chr12" "chr13" "chr14" "chr15" "chr16" "chr17" "chr18" "chr19" "chr2"  "chr20" "chr21" "chr22" "chr3"  "chr4"  "chr5"  "chr6"  "chr7"  "chr8"  "chr9"  "chrX"  "chrY"
> table(seqlevels(X))

chr1 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19  chr2 chr20 chr21 chr22  chr3  chr4  chr5  chr6  chr7  chr8  chr9  chrX  chrY
    1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1

Can someone please help me solve this error?

ADD REPLYlink modified 5 weeks ago by RamRS25k • written 5.0 years ago by mkaushal110
1

Try changing the normalization method, i.e.:

res <- cn.mops(X, normType="mean")

Default is normType="poisson"

Hope that helps.

ADD REPLYlink modified 5 weeks ago by RamRS25k • written 4.9 years ago by Ettore Z20

Ah, that's informative. My guess would be that the chromosome names in the BAM and BED files are different (e.g., one uses Ensemble chromosome names and the other UCSC names). Have a quick look at a few reads in the BAM file and some positions in the BED file to check this. If that's not the problem, have a look at both files together in IGV or another browser.

ADD REPLYlink written 5.7 years ago by Devon Ryan94k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 852 users visited in the last hour