Which GTF file can I use?
13 months ago
yueli7 ▴ 210

Hello,

I am using STAR to build a human genome index.

There are four GTF files (hg38.refGene.gtf.gz, hg38.ncbiRefSeq.gtf.gz, hg38.knownGene.gtf.gz and hg38.ensGene.gtf.gz) in: http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/genes/

Which one should I use to build the human index?

Thanks in advance for any help!

Best,

Yue

The choice is often arbitrary. People tend to use whatever they stumbled over first, whatever they find most appealing in terms of file formatting and gene names, or whatever a colleague advised them to use, which in turn is probably based on what that colleague stumbled over first or found most appealing.

GENCODE is more comprehensive than RefSeq. It contains more transcripts, especially non-coding ones, so if you are interested in lncRNAs, GENCODE might be the better choice. RefSeq is more conservative and contains fewer transcripts.
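
If you want to see how much the annotations differ in size on your own downloads, a rough sketch like the one below counts the distinct gene_id and transcript_id values in each GTF. The function names (summarize_gtf, open_text) are made up for illustration, and it assumes the usual 9-column GTF layout with attributes written as key "value"; pairs.

```python
import gzip
import re
import sys

# Matches GTF attributes of the form: key "value";
_ATTR = re.compile(r'(\w+) "([^"]*)"')


def open_text(path):
    """Open a plain or gzip-compressed GTF file in text mode."""
    path = str(path)
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def summarize_gtf(lines):
    """Count distinct gene_id and transcript_id values in an iterable of GTF lines."""
    genes, transcripts = set(), set()
    for line in lines:
        # Skip comments and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        attrs = dict(_ATTR.findall(fields[8]))
        if attrs.get("gene_id"):
            genes.add(attrs["gene_id"])
        if attrs.get("transcript_id"):
            transcripts.add(attrs["transcript_id"])
    return {"genes": len(genes), "transcripts": len(transcripts)}


if __name__ == "__main__":
    # Usage (file names are whatever you downloaded):
    #   python count_gtf.py hg38.ncbiRefSeq.gtf.gz hg38.knownGene.gtf.gz
    for gtf_path in sys.argv[1:]:
        with open_text(gtf_path) as handle:
            print(gtf_path, summarize_gtf(handle))
```

Keep in mind that gene_id conventions differ between the files, so the gene counts are only roughly comparable. The transcript counts give a better feel for how comprehensive each annotation is.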

For a more in-depth comparison, see e.g. https://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-16-S8-S2
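
Whichever annotation you pick, the indexing step itself looks the same. Here is a minimal sketch that only assembles the STAR genomeGenerate command. The helper name star_index_command, the output directory and the file names hg38.fa and hg38.ncbiRefSeq.gtf are assumptions for illustration, and the .gz check reflects my understanding that STAR wants uncompressed FASTA and GTF input. The chromosome names in the GTF also have to match those in the FASTA.

```python
import shlex


def star_index_command(genome_dir, fasta, gtf, read_length=100, threads=4):
    """Build the argument list for a STAR genome index run (does not execute it)."""
    # Assumption: STAR needs plain-text inputs, so refuse gzip-compressed files.
    if str(fasta).endswith(".gz") or str(gtf).endswith(".gz"):
        raise ValueError("decompress the FASTA and GTF files before indexing")
    return [
        "STAR",
        "--runMode", "genomeGenerate",
        "--runThreadN", str(threads),
        "--genomeDir", str(genome_dir),
        "--genomeFastaFiles", str(fasta),
        "--sjdbGTFfile", str(gtf),
        # The STAR manual suggests read length minus 1 for the overhang.
        "--sjdbOverhang", str(read_length - 1),
    ]


# Hypothetical paths; replace them with your own files.
cmd = star_index_command("star_index_hg38", "hg38.fa", "hg38.ncbiRefSeq.gtf")
print(shlex.join(cmd))
```

To actually run it, pass the list to subprocess.run(cmd, check=True) on a machine where STAR is installed and on the PATH.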

Hello, ATpoint,

Thank you

Yue