Question: Different reference human genome
5 weeks ago by
v.shapovalova110 wrote:


I haven't worked with human reads before, so my questions are about the reference genome. I have two references:

  - GENCODE: genome sequence, primary assembly (GRCh38), chromosomes and scaffolds, 194 contigs
  - Heng Li's reference: 195 contigs; the extra one is a decoy contig with the EBV circular chromosome

  1. When should I use the reference with 639 contigs (the one with patches and haplotypes)?
  2. I mapped the same single-end reads to the GENCODE reference (194 contigs) and to Heng Li's reference (195 contigs) using bwa with the same default settings, and I get slightly different coverage for some chromosomes.

                     Number of mapped reads
       chr         Gencode reference        Heng Li reference
        1          4770                         4771
       13          2183                         2189
        X          1651                         1684

Is this normal, or should I look for a mistake? I assumed that the sequences of these assemblies were identical apart from the decoy contig. No reads mapped to the decoy contig. The total number of mapped reads differs by 40 (59647 vs. 59687).

Many thanks, Valery

These differences look pretty minor and should not affect your end results. What are you ultimately trying to do?
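To put a number on "minor", here is a minimal sketch that compares per-chromosome mapped-read counts between two alignments. It assumes the counts come from `samtools idxstats`, which prints tab-separated contig name, length, mapped and unmapped counts. The function names `parse_idxstats` and `compare_counts` are made up for this example, and the dictionaries hold only the three rows quoted in the question.

```python
def parse_idxstats(text):
    """Parse `samtools idxstats` text into {contig: mapped_read_count}."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Columns: contig name, contig length, mapped reads, unmapped reads.
        name, _length, mapped, _unmapped = line.split("\t")
        counts[name] = int(mapped)
    return counts


def compare_counts(a, b):
    """Return {contig: (count_a, count_b, relative_difference)} where counts differ."""
    diffs = {}
    for name in sorted(set(a) | set(b)):
        ca, cb = a.get(name, 0), b.get(name, 0)
        if ca != cb:
            # Relative to the larger count, which is non-zero when they differ.
            diffs[name] = (ca, cb, abs(ca - cb) / max(ca, cb))
    return diffs


# Counts copied from the table in the question (three chromosomes only).
gencode = {"1": 4770, "13": 2183, "X": 1651}
heng_li = {"1": 4771, "13": 2189, "X": 1684}

for name, (ca, cb, rel) in compare_counts(gencode, heng_li).items():
    print(f"{name}\t{ca}\t{cb}\t{rel:.2%}")
```

Applied to the totals in the question (59647 vs. 59687), the same formula gives 40/59687, which is under 0.1%.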

Thank you for the reply! This is a 'trial' experiment with a small number of reads, but I expect that the final step in the 'real' experiment will be SNP calling.

