Difference between the following reference genomes, both from the Broad Institute FTP?
7.0 years ago
ari.nazarian ▴ 10

I did some alignment and BQSR using the Homo_sapiens_assembly19.fasta that I found at ftp://ftp.broadinstitute.org/pub/seq/references.

I only later discovered the GATK resource bundle at ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/ and the human_g1k_v37_decoy.fasta found there.

Right now I'm trying to do indel realignment. Could I, in theory, switch over to the human_g1k_v37_decoy.fasta reference at this point? I'm asking whether the two references differ because Homo_sapiens_assembly19.fasta isn't like the hg19.fa file I downloaded from UCSC: its chromosomes are labeled "n" rather than "chrn", and it also contains the "GL" sequences. So I'm guessing it's more like b37, but I don't know whether it's actually identical to the one in the bundle.

I'll probably stick with the reference I originally used for the rest of the analysis of my current sample, since I'm still only working out my analysis pipeline, but I'm also wondering whether I should use the human_g1k_v37_decoy.fasta file in the future.
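
Whether the two FASTA files are actually identical can be checked directly rather than guessed at. Below is a minimal sketch, not a definitive tool: it reads each FASTA, computes the length and an MD5 of the uppercased sequence for every contig, and reports which contig names match, which differ, and which appear in only one file. The function names `fasta_digests` and `compare_references` and the file paths in the usage note are my own, for illustration only.

```python
import gzip
import hashlib


def fasta_digests(path):
    """Map each sequence name to (length, MD5 hex digest of the uppercased sequence)."""
    # Assumption: a ".gz" suffix means the file is gzip-compressed.
    opener = gzip.open if str(path).endswith(".gz") else open
    digests = {}
    name, md5, length = None, None, 0
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Store the finished record before starting a new one.
                if name is not None:
                    digests[name] = (length, md5.hexdigest())
                fields = line[1:].split()
                # Use only the first word of the header as the contig name.
                name = fields[0] if fields else ""
                md5, length = hashlib.md5(), 0
            elif name is not None:
                seq = line.upper()
                md5.update(seq.encode("ascii"))
                length += len(seq)
    if name is not None:
        digests[name] = (length, md5.hexdigest())
    return digests


def compare_references(path_a, path_b):
    """Group contig names by whether they match between two FASTA files."""
    a, b = fasta_digests(path_a), fasta_digests(path_b)
    shared = a.keys() & b.keys()
    return {
        "identical": sorted(n for n in shared if a[n] == b[n]),
        "different": sorted(n for n in shared if a[n] != b[n]),
        "only_in_a": sorted(a.keys() - b.keys()),
        "only_in_b": sorted(b.keys() - a.keys()),
    }
```

Calling `compare_references("Homo_sapiens_assembly19.fasta", "human_g1k_v37_decoy.fasta")` (paths assumed to be local, uncompressed copies) would show whether the shared contigs agree. If the decoy file is simply a superset, its additional contigs would appear under `only_in_b` and `different` would be empty. Any contig listed under `different` would mean that reads aligned to one reference cannot safely be processed against the other.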

reference alignment assembly