Question: Problems After Loading VarScan 2 Output Into Bioconductor DNAcopy
sousuffer wrote (5.7 years ago):

I am a bit new to this, but I was able to generate copy number calls (and GC-adjusted calls). I then moved on to the segmentation step of the workflow and successfully imported the file for the Bioconductor DNAcopy package using the command:

cn=read.table("varScan.copynumber.called",header=F)

I am having trouble executing the next step:

CNA.object <-CNA(genomdat = cn[,6], chrom = cn[,1], maploc = cn[,2], data.type = 'logratio') 
Error in CNA(genomdat = cn[, 6], chrom = cn[, 1], maploc = cn[, 2], data.type = "logratio") : genomdat must be numeric

I looked at cn[,6] and it contains entries such as:

[99889] 34.8 19.0 46.8 55.1 56.6 52.6 
[99895] 29.7 44.4 33.8 30.6 21.0 40.7 
[99901] 42.4 46.5 65.4 98.7 82.5 83.6

These are numeric, so I cannot figure out what my problem is. Any help would be greatly appreciated.

Modified 19 months ago by Biostar; written 5.7 years ago by sousuffer

Have you tried replacing cn[, 6] with as.numeric(cn[, 6]) ?

Written 5.7 years ago by Christof Winter

Chris Miller (Washington University in St. Louis, MO) wrote (5.7 years ago):

Data types are a little confusing in R. Those numbers are probably being read in as factors. I'd try this:

#Do a quick sanity check to make sure the conversion works as expected:
head(as.numeric(cn[,6]),50)

#if that looks good, change your command to:
CNA.object <-CNA(genomdat = as.numeric(cn[,6]), chrom = cn[,1], maploc = cn[,2], data.type = 'logratio')

Alternatively, you could also specify the column types when you read the data in. Something like this:

cn=read.table("varScan.copynumber.called",header=F,colClasses=c("character","numeric","numeric","numeric","numeric","numeric"))

Written 5.7 years ago by Chris Miller

Hi Chris,

I ran into the same problem as sousuffer. As you suggested, I changed "genomdat = cn[,6]" to "genomdat = as.numeric(cn[,6])" and did the same for maploc. That solved the problem.

Thank you very much.

Written 9 months ago by jinxinhao198860

Hi Chris,

By using as.numeric I got past the "genomdat must be numeric" error, but my results now look very strange. The "seg.mean" values are all in the several hundreds, whereas in other people's results this value is around -3 to 3 or so. Do you have any suggestions for getting proper seg.mean values?

Here are my commands (mostly following the lines you provided for the VarScan website):

cn<-read.table("varScan.copynumber2.called",header=F)
CNA.object <-CNA(genomdat = as.numeric(cn[,7]), chrom = cn[,1], maploc =as.numeric(cn[,2]), data.type = 'logratio')
CNA.smoothed <- smooth.CNA(CNA.object)
segs <- segment(CNA.smoothed, verbose=0, min.width=2)
segs2 = segs$output
write.table(segs2[,2:6], file="use.logratio.cn.out", row.names=F, col.names=T, quote=F, sep="\t")

Thank you for your time.

Written 9 months ago by jinxinhao198860

Without seeing any of your data, there's really no way to tell, but one thing that jumps out is that you've replaced as.numeric(cn[,6]) with as.numeric(cn[,7]). Are you sure you're extracting the right columns?

Written 9 months ago by Chris Miller

Sorry, this image shows my DNAcopy input file. As you can see, column 7 is the log ratio: https://ibb.co/jf3sVc (preview: https://preview.ibb.co/dk7aix/TIM_20180205203444.jpg)

And this is the DNAcopy result: https://image.ibb.co/fzJy3x/TIM_20180205203947.jpg

As you can see, the values in the seg.mean column are several hundred.

Modified 9 months ago; written 9 months ago by jinxinhao198860
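
One thing worth checking when seg.mean comes out in the hundreds is whether the log-ratio column really converted to the values written in the file. A minimal sketch, assuming the column may have been read as a factor or as text; `to_numeric_checked` is a hypothetical helper written for this illustration (it is not part of DNAcopy), and the numbers are made up:

```r
# Hypothetical helper (not part of DNAcopy): convert a column to numeric via
# character, and count the entries that could not be parsed as numbers.
to_numeric_checked <- function(x) {
  v <- suppressWarnings(as.numeric(as.character(x)))
  list(values = v, n_bad = sum(is.na(v) & !is.na(x)))
}

# Made-up log ratios stored as a factor, standing in for cn[,7].
col7 <- factor(c("-0.52", "0.13", "1.04", "-1.20"))

res <- to_numeric_checked(col7)
print(summary(res$values))  # log ratios would be expected to sit near zero
print(res$n_bad)            # non-zero would point at header text or other strings
```

If the summary shows values near zero and nothing failed to parse, res$values could be passed as genomdat in place of as.numeric(cn[,7]).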

sousuffer wrote (5.7 years ago):

I ran your command and got the below output:

head(as.numeric(cn[,6]),50)
 [1] 1667  880  983  916  983  998  922  915  906  934  221  895  953  820  553
[16]  663  886  995  903  222  908 1214 1414 1427  444  884  955  229  850  822
[31]  338 1106  886 1579  498 1515  853  868  902  665  920 1189  930  757 1409
[46]  405 1372  554  827  910

I'm not sure what "if that looks good" implies, but I'm going to guess that since the values in numeric format do not equal the actual decimal values, this isn't what we want. I then tried the following as a comparison (pre-conversion):

head((cn[,6]),50)
 [1] tumordepth 21.3 31.6 24.9 31.6 33.1
 [7] 25.5 24.8 23.9 26.7 12.0 22.8
[13] 28.6 18.0 15.2 16.5 21.9 32.8
[19] 23.6 12.1 24.1 54.7 74.7 76.0
[25] 14.3 21.7 28.8 12.8 19.4 18.2
[31] 13.7 43.9 21.9 91.2 144.7 84.8
[37] 19.7 20.2 23.5 16.7 25.3 52.2
[43] 26.3 17.4 74.2 136.4 70.5 15.3
[49] 18.7 24.3
1667 Levels: 10.0 10.1 10.2 10.3 10.4 10.5 10.6 10.7 10.8 10.9 100.0 ... tumordepth

Finally, I tried the alternate code and got the following:

cn=read.table("varScan.copynumber.called",header=F,colClasses=c("character","numeric","numeric","numeric","numeric","numeric"))
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : scan() expected 'a real', got 'chr_start'

I'm guessing the first line headers are messing this up?

Thanks!

Modified 5.7 years ago; written 5.7 years ago by sousuffer
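
The mismatch between the two head() outputs above is what as.numeric() does when applied to a factor: it returns the integer code of each level rather than the number written in the label. A small sketch with made-up values (the toy vector stands in for cn[,6]; building it with factor() is an assumption made so the example behaves the same however read.table() is configured):

```r
# Toy column standing in for cn[,6]: the header string was read as data,
# so the whole column is a factor (text labels), not numbers.
x <- factor(c("tumordepth", "21.3", "31.6", "24.9"))

# as.numeric() on a factor gives the position of each label among the
# sorted levels, not the value the label spells out.
codes <- as.numeric(x)

# Going through character first recovers the real values; the header
# string cannot be parsed and becomes NA.
vals <- suppressWarnings(as.numeric(as.character(x)))

print(levels(x))
print(codes)
print(vals)
```

The header shown in the output above also labels column 6 as tumordepth, so it may be worth confirming which column actually holds the log ratio before passing it to CNA().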

In the first method, try as.numeric(as.character(cn[,6])). In the second, why did you set "header=F" if there is a header row? Change that to "header=T" and see if it works. Also make sure that the number of entries in colClasses matches the number of columns in the file you're reading.

Modified 5.7 years ago; written 5.7 years ago by Chris Miller
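
A sketch of the second suggestion: reading the file with header=TRUE so the header line is not parsed as data. The file contents here are made up for illustration, and apart from chr_start (which appears in the error message) the column names are assumptions; colClasses needs one entry per column in the real file:

```r
# Write a tiny stand-in file. Column names other than chr_start are assumed,
# and the values are invented for this example.
tmp <- tempfile()
writeLines(c(
  "chrom\tchr_start\tchr_stop\tnum_positions\tnormal_depth\ttumor_depth",
  "chr1\t100\t200\t101\t20.5\t21.3",
  "chr1\t201\t300\t100\t30.1\t31.6"
), tmp)

# header = TRUE keeps the first line out of the data, so the numeric
# colClasses no longer collide with the column names.
cn <- read.table(tmp, header = TRUE, sep = "\t",
                 colClasses = c("character", rep("numeric", 5)))

str(cn)                     # confirm the type of every column
print(is.numeric(cn[, 6]))  # the check that CNA() performs on genomdat
```

With the columns read as numeric, no as.numeric() conversion should be needed in the CNA() call.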