How Do I Interpret The Output Generated By Variationhunter?
12.4 years ago
Bioscientist ★ 1.7k

I'm using VariationHunter to analyze CNVs. Some time ago, when I used the hg18 reference genome, the output usually looked like this:

IL5_286:1:311:10:40    chr1    142188880    142188915    R    141736787    141736822    F    deletion    3    23.930555    0.00257801939733326435
IL5_286:1:317:831:353    chr16    28530262    28530297    R    28517264    28517299    F    deletion    0    26.263889    1.00000000000000000000

The second column gives the chromosome on which the potential CNV lies.

Now that I have switched to humang1kv37.fasta (one version of hg19), I get output like this:

HWI-ST150_0129:3:47:9776:140941    1 dna:chromosome chromosome:GRCh37:1:1:249250621:1    144308983    144309081    F    143689458    143689556    F    V    4    60    2.254486e-12
HWI-ST150_0129:3:4:4776:183027    GL000225.1 dna:supercontig supercontig::GL000225.1:1:211173:1    6803    6901    F    102087    102185    F    V5    64    1.502623e-12

The difference in format is that the field that used to be "chr#" is replaced by three space-separated fields, such as "1 dna:chromosome chromosome:GRCh37:1:1:249250621:1".

So what is this "1 dna:chromosome chromosome:GRCh37:1:1:249250621:1"? Is it a header indicating the chromosome number or something similar? I can grep this "header" from humang1kv37.fasta, but human_hg18.fastq doesn't seem to contain anything like it.

"GL000225.1 dna:supercontig supercontig::GL000225.1:1:211173:1" is also quite confusing to me. What does "GL000225.1" stand for? Some unusual part of a chromosome (like the mitochondrial chromosome)?

Thanks a lot.

12.4 years ago
Bert Overduin ★ 3.7k

chromosome:GRCh37:1:1:249250621:1

means that the sequence is

  • a chromosome
  • from the GRCh37 assembly
  • chromosome number 1
  • from basepair 1
  • until basepair 249250621
  • forward strand

in other words, the complete chromosome 1
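The colon-separated fields listed above can be pulled apart programmatically. The following is only a minimal sketch, assuming the location string always has exactly six colon-separated parts as in the two examples from the question; `parse_location` is a hypothetical helper name, not part of VariationHunter or any library.

```python
def parse_location(field):
    """Split a location string such as
    'chromosome:GRCh37:1:1:249250621:1' into named parts.

    Assumes exactly six colon-separated fields (hypothetical helper).
    """
    coord_system, assembly, name, start, end, strand = field.split(":")
    return {
        "coord_system": coord_system,  # e.g. 'chromosome' or 'supercontig'
        "assembly": assembly,          # e.g. 'GRCh37'; empty for the supercontig example
        "name": name,                  # e.g. '1' or 'GL000225.1'
        "start": int(start),           # first base pair
        "end": int(end),               # last base pair
        "strand": int(strand),         # 1 means forward strand
    }


chrom = parse_location("chromosome:GRCh37:1:1:249250621:1")
contig = parse_location("supercontig::GL000225.1:1:211173:1")

# Sequence length follows from the inclusive start/end coordinates.
chrom_length = chrom["end"] - chrom["start"] + 1
```

Here `chrom["name"]` is the chromosome identifier that corresponds to the old "chr1"-style column, and `contig["coord_system"]` distinguishes a supercontig from a chromosome.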

GL000225 is a supercontig: a DNA sequence whose position in the assembly is not known. See also EMBL-Bank or GenBank.

Note that there are also supercontigs that represent alternate haplotypes, e.g. GL000255.
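If the long names are inconvenient downstream, one option is to shorten the FASTA headers to their first token before running the analysis. This is a sketch under an assumption the thread does not confirm: that the tool copies the full header line of the reference FASTA into its output. `short_name` and `trim_headers` are hypothetical helper names.

```python
def short_name(header_line):
    """Return the first whitespace-delimited token of a FASTA header line."""
    return header_line.lstrip(">").split()[0]


def trim_headers(lines):
    """Yield FASTA lines with each header cut down to its short name.

    Sequence lines are passed through unchanged.
    """
    for line in lines:
        if line.startswith(">"):
            yield ">" + short_name(line) + "\n"
        else:
            yield line


example = [
    ">1 dna:chromosome chromosome:GRCh37:1:1:249250621:1\n",
    "ACGT\n",
    ">GL000225.1 dna:supercontig supercontig::GL000225.1:1:211173:1\n",
    "TTGA\n",
]
trimmed = list(trim_headers(example))
```

With trimmed headers the sequence names would be just "1" and "GL000225.1". Any index built from the original FASTA would need to be rebuilt after such a change.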
