Why is there a sequence named ">hs37d5" in the reference hs37d5.fa file of the GATK resource?
10.3 years ago
Griffan ▴ 90

What is this sequence? It looks strange for a sequence to have the same name as the file that contains it.

reference gatk hg19 • 12k views
10.3 years ago

From the README file, it's "the concatenated decoy sequences," which explains why it's just one contig. BTW, it's equivalent to the hs37d5cs.fa.gz file, if you're curious. If you're wondering where all of the sequences map, have a look at the various hs37d5ss.X files. For some context on the whole decoy genome thing, have a look at this blog post (see also the PDF from Heng Li showing the benefits of using a decoy genome).
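If you want to check for yourself that the decoy shows up as a single contig, one option is to list every header in the FASTA along with its sequence length. This is only a minimal sketch using the Python standard library; the function names `contig_lengths` and `open_fasta` are made up for illustration, and the path is whatever you pass on the command line.

```python
import gzip
import sys
from typing import Dict, Iterable, Optional, TextIO


def contig_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {contig name: sequence length} for FASTA-formatted lines.

    Hypothetical helper for illustration; not part of GATK or any toolkit.
    """
    lengths: Dict[str, int] = {}
    name: Optional[str] = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # The contig name is the first whitespace-delimited token after '>'.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            # Sequence lines are summed until the next header is seen.
            lengths[name] += len(line)
    return lengths


def open_fasta(path: str) -> TextIO:
    """Open a plain or gzip-compressed FASTA file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


if __name__ == "__main__" and len(sys.argv) > 1:
    with open_fasta(sys.argv[1]) as handle:
        for contig, length in contig_lengths(handle).items():
            print(contig, length, sep="\t")
```

Saved under a name of your choosing and run with the path to hs37d5.fa as its only argument, this prints one line per contig. If the README description holds, the decoy should appear as one entry named hs37d5. If you already have samtools installed, the .fai index written by `samtools faidx` gives you the same name and length columns.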
