CRAM reference registry and the GRCh38 reference genome
1
0
Entering edit mode
3.0 years ago

Hi. I'm downloading some CRAM files from 1000genomes for use in variant calling. I will convert them to BAM (since most tools can't call from CRAM files). I'm a bit confused about how to go about this, since I had assumed it would be a straightforward Samtools conversion against a given reference FASTA.

What's the deal with the reference registry, and why do I need it?

I went through the README document on the FTP site, but I'm still quite confused.

I've already downloaded the hg38 FASTA from Human Genome Resources.

CRAM BAM Samtools GRCh38 VariantCalling
3.0 years ago
h.mon 33k

I've already downloaded the hg38 FASTA from Human Genome Resources

You have to download the reference sequences identified by the checksums found in the CRAM headers. To decompress a CRAM file, you need exactly the same reference that was used for compression. To ensure the correct reference is used, the 1000genomes CRAM files record the identity of each reference contig used for compression; this identity is given by MD5 or SHA1 checksums.
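
To check whether a FASTA you already have matches what a CRAM expects, you can compare per-contig MD5 checksums yourself. Below is a minimal Python sketch, not a definitive tool. It assumes the header text comes from something like `samtools view -H file.cram`, that each `@SQ` line carries an `M5` tag, and that the checksum follows the SAM specification's convention (MD5 of the sequence with whitespace removed and letters uppercased). The function names `fasta_md5s`, `header_md5s`, and `mismatched_contigs` are made up for this illustration.

```python
import hashlib


def fasta_md5s(lines):
    """Yield (contig_name, md5_hex) for each record in an iterable of FASTA lines."""
    name, digest = None, None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, digest.hexdigest()
            # The contig name is the first whitespace-delimited token after '>'.
            name = (line[1:].split() or [""])[0]
            digest = hashlib.md5()
        elif name is not None and line:
            # Assumed M5 convention: keep printable characters only, uppercase.
            seq = "".join(c for c in line if 33 <= ord(c) <= 126).upper()
            digest.update(seq.encode("ascii"))
    if name is not None:
        yield name, digest.hexdigest()


def header_md5s(header_text):
    """Map contig name (SN) to checksum (M5) from the @SQ lines of a SAM/CRAM header."""
    result = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        if "SN" in fields and "M5" in fields:
            result[fields["SN"]] = fields["M5"].lower()
    return result


def mismatched_contigs(header_text, fasta_lines):
    """Return header contigs whose M5 is missing from, or differs in, the FASTA."""
    local = dict(fasta_md5s(fasta_lines))
    expected = header_md5s(header_text)
    return sorted(sn for sn, m5 in expected.items() if local.get(sn) != m5)
```

An empty list from `mismatched_contigs(header_text, open("ref.fa"))` (with `ref.fa` standing in for your own file) would suggest the local FASTA can be used for decoding. Any contig it lists needs the exact sequence that matches the checksum in the header.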
