Creating genome alignment
6.2 years ago
dec986 ▴ 370

Hello,

I am confused about the difference between GRCh37.p13.genome.fa (767 MB, http://www.gencodegenes.org/releases/19.html) and GRCh37.primary_assembly.genome.fa (830 MB, http://www.gencodegenes.org/releases/27lift37.html), which is lifted from Gencode release 27. I want to use the most recent version of GRCh37, with the most corrections and updates. I can't use GRCh38 because its alt loci make accurate quantification difficult.

I can see pluses and minuses for each choice. Which genome should I use for STAR genome alignments?

Perhaps there is a version of GRCh37.p13.genome.fa that is lifted from, or related to, release 27?

STAR RNA-Seq • 1.2k views

Heng Li has a blog post on which human genome to use.

6.2 years ago
lshepard ▴ 470

From the Gencode website:

"Primary assembly: Nucleotide sequence of the GRCh38 (or GRCh37 if you want that) primary genome assembly (chromosomes and scaffolds). The sequence region names are the same as in the GTF/GFF3 files."

The other file, GRCh37.p13.genome.fa, contains all regions, including assembly patches and haplotypes. Most people choose the primary assembly. Note, though, that using the latest version (GRCh38) shouldn't really give you any issues with STAR and downstream analysis.
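
If you want to see concretely what differs between the two downloads, you can compare their sequence names. Here is a minimal sketch in Python, assuming both files are ordinary FASTA (plain or gzip-compressed). The function names are my own for illustration and are not part of STAR or any Gencode tooling.

```python
import gzip


def fasta_sequence_lengths(path):
    """Return {sequence_name: length} for a plain or gzip-compressed FASTA file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    lengths = {}
    name = None
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use only the first whitespace-delimited token of the header.
                parts = line[1:].split()
                name = parts[0] if parts else ""
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return lengths


def compare_fastas(path_a, path_b):
    """Report which sequence names are unique to each file and which are shared."""
    a = fasta_sequence_lengths(path_a)
    b = fasta_sequence_lengths(path_b)
    return {
        "only_in_a": sorted(set(a) - set(b)),
        "only_in_b": sorted(set(b) - set(a)),
        "shared": sorted(set(a) & set(b)),
    }


# Example call (file names taken from the question; adjust paths as needed):
# diff = compare_fastas("GRCh37.p13.genome.fa",
#                       "GRCh37.primary_assembly.genome.fa")
# print(len(diff["only_in_a"]), "sequences only in the p13 file")
```

Sequences that appear only in the all-regions file should be the patches and haplotypes, which is a quick way to confirm what you would be adding to or leaving out of the STAR index.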
