Question: Multi-allelic CNV (Beginner's level question)
11 days ago by
jimkozubek20 wrote:

I am looking at some CNV information that I did not generate myself but obtained from the MSSNG database. After reading the documentation, I still have a question.

The field CopyNumber sometimes lists just one number, such as 0 or 3, but sometimes it looks biallelic, such as 3|4 or just 0|1. But wait, there's more: sometimes the value of CopyNumber is 3|5|4|7.

How do you interpret this?

9 days ago by
Kevin Blighe39k wrote:

It is likely related to the fact that the copy number variant information for MSSNG was derived from 2 sources:

  1. Illumina platforms
  2. Complete Genomics

The Illumina data were processed with ERDS and CNVnator; these are likely the source of the single-digit copy number values that you see. They are whole integers because a Hidden Markov Model was used.

The Complete Genomics data is haplotype-resolved, so 3|4, etc., likely means copy number 3 on one allele and 4 on the other.

I cannot comment on the other ones, such as 3|5|4|7.

For a full clarification, I would encourage you to contact the team directly - email found here:


Thank you. The HMM insight sounds correct.

I am not sure these split values are due to different sequencing platforms. In this file each line is a specific sample, and each line names exactly one sequencing platform, either CG or HiSeqX.

So the 3|4 value appears on a single line with a single sequencing platform. The same is true for 3|5|4|7.
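One way to check this across the whole file is to count how many pipe-separated values each CopyNumber entry has, grouped by platform. Below is a minimal sketch, assuming the rows have already been loaded as dictionaries keyed by the header names `CopyNumber` and `Platform`. Only the SampleXYZ entry comes from this thread; the other two rows and the function names are made up for illustration.

```python
from collections import Counter


def count_values(copy_number):
    """Number of pipe-separated values in a CopyNumber entry."""
    return len(copy_number.split("|"))


def tabulate(rows):
    """Count rows per (platform, number of CopyNumber values)."""
    counts = Counter()
    for row in rows:
        counts[(row["Platform"], count_values(row["CopyNumber"]))] += 1
    return counts


# SampleXYZ is the row quoted in this thread; the other rows are hypothetical.
rows = [
    {"Sample": "SampleXYZ", "CopyNumber": "3|5|4|7", "Platform": "HISEQX"},
    {"Sample": "HypotheticalA", "CopyNumber": "3|4", "Platform": "CG"},
    {"Sample": "HypotheticalB", "CopyNumber": "0", "Platform": "HISEQX"},
]

for (platform, n_values), n_rows in sorted(tabulate(rows).items()):
    print(f"{platform}: {n_rows} row(s) with {n_values} CopyNumber value(s)")
```

If multi-value entries turn up on both platforms in the real data, that would argue against the platform explanation.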

I think it is possible that there are ALT contigs in v38 and that this could be the cause.

I agree that MSSNG is the best place to ask, since it is their nomenclature, and I will post their response if they get back to me.

Here is an example in which CopyNumber is listed as 3|5|4|7 for a single sample on a single platform, HISEQX. I have posed the question to MSSNG.

Sample  Chromosome      Start   End     CNVType CopyNumber      Size    Overlap Putative_Inheritance    GC_Content_Percent   CytobandAnn      Gene_Symbol     Gene_egID       Exon_Symbol     Exon_egID       CDS_Symbol      CDS_egID        ISCA_region  CNV_ISCA_percOverlap     ExAC_pLI        UncleanGenome_percOverlap       MPO_NervousSystem       HPO_NervousSystem       CGD  OMIM_MorbidMap   DECIPHER_region CNV_decipher_percOverlap        DGV_N_studies   DGVpercFreq_subjects_allStudies DGVpercFreq_subjects_coverageStudies  DGV_percOverlap_any     DGV_50percRecipOverlap  CGparentalPercFreq_50percRecipOverlap   erdsPercFreq_50percRecipOverlap       cnvnatorPercFreq_50percRecipOverlap     Comment Curated Platform

SampleXYZ       21      9964001 11188000        DUP     3|5|4|7 1224000 0.838|1.000     no_parent       30.7    21p11.2|21p11.1       BAGE4|TPTE|BAGE2|BAGE|BAGE3|BAGE5       7179|85318|574  BAGE4|TPTE|BAGE2|BAGE|BAGE3|BAGE5       7179|85318|574  BAGE4|TPTE|BAGE2|BAGE|BAGE3|BAGE5     7179|85318|574          0               86.357                          7179:TPTE:|85318:BAGE3:|574:BAGE:             0       0       0       0       40.041  0       0       99.822  97.724  -               HISEQX
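CopyNumber is not the only pipe-delimited field in this row, so it may help to compare how many values each field carries. The sketch below is only illustrative: it takes the first eleven fields of the row above (Sample through the cytoband column, going by header order), splits them on whitespace, and reports every token that contains a pipe. The function name `pipe_counts` is made up here.

```python
# First eleven whitespace-separated fields of the SampleXYZ row quoted above.
ROW_START = (
    "SampleXYZ 21 9964001 11188000 DUP 3|5|4|7 1224000 "
    "0.838|1.000 no_parent 30.7 21p11.2|21p11.1"
)


def pipe_counts(row_text):
    """Return (token, number of pipe-separated values) for each piped token."""
    result = []
    for token in row_text.split():
        if "|" in token:
            result.append((token, len(token.split("|"))))
    return result


for token, n_values in pipe_counts(ROW_START):
    print(f"{token}: {n_values} values")
```

In this row, CopyNumber carries four values while the Overlap and cytoband fields carry two each, so the pipes do not obviously index the same items from one column to the next. That is an observation only; what the separator means is still a question for MSSNG.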
