Question: Question regarding for sciClone when doing a 2d plot
0
gravatar for sxl919
4.6 years ago by
sxl91910
United States
sxl91910 wrote:

Hi,

I think I have ran into a problem when I was using sciClone in order to generate a 2d plot. I have my bam files which have been processed by varscan somatic and I got the snp and indel files. After doing somatic filter and processsomatic, I put the data in somatic file to sciClone. As I have 2 samples, I instruct the program to read 2 set of snv data and copynumber data.The data format for the snv file looks like this

Chromosome      position   tumor_read1   tumor_read2   vaf_precentage

Chr 1                   10011             8               4                        30

Chr 2                   12331             6               3                        25

...

Chr Y

However, when I using sciClone in order to generate the plot and cluster file, I got the following error:

Error in data.frame(..., check.names = FALSE) : 
  arguments imply differing number of rows: 7, 0
In addition: Warning message:
In max(marginalClust[[i]]$cluster.assignments, na.rm = T) :
  no non-missing arguments to max; returning -Inf

I am really new to R and if I have asked a stupid question, please forgive me.

Thanks 

2d plot sciclone R • 1.7k views
ADD COMMENTlink modified 4.6 years ago • written 4.6 years ago by sxl91910

Please provide the following info, and I'll be happy to take a look:  1) the exact R commands that you're trying to run,  2) the first few lines of your input files

ADD REPLYlink written 4.6 years ago by Chris Miller21k

Thank you Chris,

R-code:

library(sciClone)

#read in vaf data from three related tumors
#format is 5 column, tab delimited: 
#chr, pos, ref_reads, var_reads, vaf

v1 = read.table("Sample1_recal.bam-0209.snp.somaticfiltered.Somatic",header=T);
v1 =  v1[,c(1,2,9,10,24)]
v2 = read.table("Sample2_recal.bam-0209.snp.somaticfiltered.Somatic",header=T);
v2 =  v2[,c(1,2,9,10,24)]  #to get the chromosome, starting position, Reads supporting reference in tumor, Reads supporting variant in tumor, and the Variant allele frequency in tumor*100 as the origin vaf is in precentage and sciclone accept 1-100 but not 1%-100%

cn1 = read.table("Sample1_recal.bam-0209-1.copynumber",header=T);
cn1 = cn1[,c(1,2,3,7)]
cn2 = read.table("Sample2_recal.bam-0209-1.copynumber",header=T);
cn2 = cn2[,c(1,2,3,7)]  #to get the chromosome, starting position, ending position and the log2 ratio

names = c("Sample1", "Sample2")
sc = sciClone(vafs=list(v1,v2),
              copyNumberCalls=list(cn1,cn2),
              sampleNames=names[1:2],
              cnCallsAreLog2=TRUE,
              minimumDepth=10,
              maximumClusters=7)
writeClusterTable(sc, "results/clustersSample1_vs_Sample2-0210")
sc.plot1d(sc,"results/clustersSample1_vs_Sample2-0210_1d.pdf")
sc.plot2d(sc,"results/clustersSample1_vs_Sample2-0210_2d.pdf")

 

The first few lines of my input files are
SNV for sample 1:

chrom position ref var normal_reads1 normal_reads2 normal_var_freq normal_gt tumor_reads1 tumor_reads2 tumor_var_freq tumor_gt somatic_status variant_p_value somatic_p_value tumor_reads1_plus tumor_reads1_minus tumor_reads2_plus tumor_reads2_minus normal_reads1_plus normal_reads1_minus normal_reads2_plus normal_reads2_minus precentage
chr1 500630 A G 22 0 0% A 8 4 33.33% R Somatic 1 0.010674 0 8 0 4 2 20 0 0 33.33
chr1 1221916 T C 84 10 10.64% T 203 52 20.39% Y Somatic 1 0.02198 95 108 39 13 36 48 7 3 20.39
chr1 3001832 C A 20 5 20% M 36 39 52% M Somatic 1 0.004402 8 28 5 34 4 16 1 4 52
chr1 4888019 T C 22 4 15.38% T 23 22 48.89% Y Somatic 1 0.004173 1 22 1 21 1 21 0 4 48.89

and for copynumber sample1

chrom chr_start chr_stop num_positions normal_depth tumor_depth log2_ratio gc_content
chr1 10920 12948 29 10.9 11.8 0.11 48.3
chr1 11959 12977 19 10.4 9.5 -0.13 63.2
chr1 12999 13079 100 19.7 16.2 -0.278 63

 

However, I noticed that when I open the copynumber file the file cannot be completely load as it has way to many records (I use excel to open it) so will that be a possible reason to have those error messages?

Thanks

ADD REPLYlink modified 4.6 years ago • written 4.6 years ago by sxl91910

Is there any output above the error message you pasted?  Maybe something like: "ERROR: only N  points - not enough points to cluster when using 7 intialClusters. Provide more data or reduce your maximumClusters option"

ADD REPLYlink written 4.6 years ago by Chris Miller21k

 I use 10, which is the default and 7 for the cluster, it shows
ERROR: only 7  points 0 not enough points to cluster when using 10 intialClusters. Provide more data or red\nuce your maximumClusters option" (I miss that, sorry)

ADD REPLYlink written 4.6 years ago by sxl91910
0
gravatar for Chris Miller
4.6 years ago by
Chris Miller21k
Washington University in St. Louis, MO
Chris Miller21k wrote:

Okay, based on the info you've provided above, the problem is that you only have 7 cop-ynumber neutral points with adequate depth in this sample. If that's accurate, then you're not going to be able to get any kind of meaningful clonality analysis from 7 points.

You might also double check your data to see if it's a mistake. If this tumor isn't heavily CN-altered and has adequate depth for most points, then you should make sure that your CN values are correct and possibly tweak the copyNumberMargins param.

ADD COMMENTlink written 4.6 years ago by Chris Miller21k

Thank you very much. I will double check the data 

ADD REPLYlink written 4.6 years ago by sxl91910
0
gravatar for sxl919
4.6 years ago by
sxl91910
United States
sxl91910 wrote:

Hi Chris,

Sorry to bother you again. I actually found a similar post in Biostar
sciClone error for two sample

In that post, you said we have to  merge the calls and pull readcounts for every site in both samples in order to generate the 2d plot so the input file should look like this
 

Tumor 1:
    1    111   10  10   50.0
    1    999    6    6   50.0
    2    222    9    9    50.0
    2    888    5    5    50.0

    Tumor 2:
    1    111    4    4    50.0
    1    999    8    8    50.0
    2    222    3    3    50.0
    2    888    7    7    50.0


Yet, how can I do that? Since if I use compare (with the option merge) in Varscan I will just end up with 2 sample calls merge together, which look like this

 

1    111   10  10   50.0
1    999    8   8    50.0
2    222    9    9    50.0
2    888    7   7    50.0

By the way, my calls was generated by doing the comparison between normal and tumor1 + normal and tumor 2 

ADD COMMENTlink modified 4.6 years ago • written 4.6 years ago by sxl91910

You have to actually assemble a new list, by merging both of your lists, then go back to the bams and pull new readcounts, using a program like bam-readcount (https://github.com/genome/bam-readcount)

ADD REPLYlink written 4.6 years ago by Chris Miller21k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 798 users visited in the last hour