Question: How to download human reference transcriptome (hg19)
21 months ago, yyuanshengliu wrote:

As the title says, I do not know how to download the human reference transcriptome. Can anyone help me?

Is http://hgdownload.soe.ucsc.edu/goldenPath/hg19/ the right website?

Tags: sequencing, rna-seq, next-gen
written 21 months ago by yyuanshengliu • modified 7 months ago by Kristin Muench

Hi! A follow-up to this thread: has anyone managed to download a matching .gtf file? I am trying to download a transcriptome so that I can create .bam files with kallisto and view them in IGV, but the .bam files I produce show no reads when I load them into IGV. I suspect this is because the .gtf and .fa files were downloaded from different places.

written 7 months ago by Kristin Muench

A matching GRCh37 human GTF file can be found here.

written 7 months ago by genomax
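
One way to test the suspicion that the .fa and .gtf files do not match is to compare the transcript IDs in the two files. The following is only a minimal sketch: the function names are made up for illustration, both files are assumed to be gzipped, the FASTA headers are assumed to start with the transcript ID (as in Ensembl cDNA files), and any ".N" version suffix is stripped before comparing.

```shell
#!/bin/sh
# Sketch: count transcript IDs shared between a gzipped FASTA and a gzipped GTF.
# Function names are hypothetical; file names are supplied by the caller.

# Unique transcript IDs from FASTA header lines, without a ".N" version suffix.
fasta_ids() {
    gzip -cd "$1" \
        | awk '/^>/ { id = substr($1, 2); sub(/\.[0-9]+$/, "", id); print id }' \
        | LC_ALL=C sort -u
}

# Unique transcript_id attribute values from a GTF, without a version suffix.
gtf_ids() {
    gzip -cd "$1" \
        | grep -o 'transcript_id "[^"]*"' \
        | cut -d '"' -f 2 \
        | sed 's/\.[0-9][0-9]*$//' \
        | LC_ALL=C sort -u
}

# Print three counts: shared IDs, IDs only in the FASTA, IDs only in the GTF.
compare_ids() {
    tmpdir=$(mktemp -d) || return 1
    fasta_ids "$1" > "$tmpdir/fa.ids"
    gtf_ids "$2" > "$tmpdir/gtf.ids"
    shared=$(LC_ALL=C comm -12 "$tmpdir/fa.ids" "$tmpdir/gtf.ids" | wc -l | tr -d ' ')
    fa_only=$(LC_ALL=C comm -23 "$tmpdir/fa.ids" "$tmpdir/gtf.ids" | wc -l | tr -d ' ')
    gtf_only=$(LC_ALL=C comm -13 "$tmpdir/fa.ids" "$tmpdir/gtf.ids" | wc -l | tr -d ' ')
    rm -rf "$tmpdir"
    echo "$shared $fa_only $gtf_only"
}
```

Calling `compare_ids transcripts.fa.gz annotation.gtf.gz` (placeholder file names) prints the three counts. A large "only in the FASTA" count would suggest the two files come from different releases or sources, although it would not by itself explain an empty IGV view.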
21 months ago, h.mon (Brazil) wrote:

There are several versions of the human transcriptome. A commonly used one is Ensembl; to get protein-coding genes and non-coding RNA:

wget ftp://ftp.ensembl.org/pub/release-89/fasta/homo_sapiens/cdna/Homo_sapiens.GRCh38.cdna.all.fa.gz
wget ftp://ftp.ensembl.org/pub/release-89/fasta/homo_sapiens/ncrna/Homo_sapiens.GRCh38.ncrna.fa.gz
written 21 months ago by h.mon
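
If the cDNA and ncRNA files are wanted as a single transcriptome, they can be joined without decompressing, because concatenated gzip members still form a valid gzip file. A minimal sketch follows; the function name and the output file name in the usage comment are made up for illustration.

```shell
#!/bin/sh
# Sketch: join several gzipped FASTA files into one gzipped FASTA.
# The first argument is the output file; the remaining arguments are inputs.
combine_fasta_gz() {
    out=$1
    shift
    cat "$@" > "$out"
}

# Example (output name is hypothetical; inputs are the two files fetched above):
# combine_fasta_gz Homo_sapiens.GRCh38.cdna_ncrna.fa.gz \
#     Homo_sapiens.GRCh38.cdna.all.fa.gz Homo_sapiens.GRCh38.ncrna.fa.gz
```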

Thanks very much.

I am reading a paper that says it used "a human reference transcriptome derived from hg19 build of human genome" and that "This transcriptome contains 214294 transcripts and occupied 96446089 bytes as a gzipped FASTA file". Maybe the version you provided is not identical to the one the paper mentions?

written 21 months ago by yyuanshengliu • modified 21 months ago

The links provided by h.mon point to GRCh38, which is indeed not the same as hg19.

written 21 months ago by WouterDeCoster

hg19 (GRCh37) transcripts from the Ensembl archive are available here, or via: wget ftp://ftp.ensembl.org/pub/release-67/fasta/homo_sapiens/cdna/Homo_sapiens.GRCh37.67.cdna.all.fa.gz

written 21 months ago by genomax • modified 21 months ago
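
To see whether a downloaded file resembles the one described in the paper, its transcript count and size in bytes can be compared with the quoted figures (214294 transcripts, 96446089 bytes). This is a rough sketch with made-up function names. Note that the gzipped size depends on how the file was compressed, so a different byte count alone does not prove a different set of transcripts.

```shell
#!/bin/sh
# Sketch: compare a gzipped FASTA against an expected transcript count and byte size.

# Number of sequences in a gzipped FASTA (one ">" header line per transcript).
count_transcripts() {
    gzip -cd "$1" | grep -c '^>'
}

# Size of a file in bytes.
file_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Usage: check_against FILE EXPECTED_COUNT EXPECTED_BYTES
# Prints the observed figures, then "match" or "differs".
check_against() {
    n=$(count_transcripts "$1")
    b=$(file_bytes "$1")
    echo "transcripts=$n bytes=$b"
    if [ "$n" -eq "$2" ] && [ "$b" -eq "$3" ]; then
        echo match
    else
        echo differs
    fi
}

# Example with the figures quoted from the paper:
# check_against Homo_sapiens.GRCh37.67.cdna.all.fa.gz 214294 96446089
```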

It almost certainly is not, but I would have to know the folder the transcriptome was saved in to be sure.

"A human reference transcriptome derived from hg19 build of human genome" and "this transcriptome contains 214294 transcripts and occupied 96446089 bytes as a gzipped FASTA file" are only moderately useful for identifying a transcriptome. But if the manuscript you are referring to is this paper, then it doesn't matter because:

1) the version of the transcriptome is unlikely to play a substantial role in their results;
2) you can use any human transcriptome to encode / decode fastq files, provided you use the same transcriptome for both encoding and decoding the same files.

written 21 months ago by h.mon
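
Since point 2 depends on using exactly the same transcriptome file on both sides, one simple safeguard is to record a checksum of the file at encoding time and compare it before decoding. The sketch below uses hypothetical function names and assumes that either `sha256sum` or `shasum` is installed.

```shell
#!/bin/sh
# Sketch: check that two transcriptome files are the same by comparing SHA-256 digests.

# Print the SHA-256 digest of a file with whichever common tool is available.
file_digest() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | awk '{ print $1 }'
    else
        shasum -a 256 "$1" | awk '{ print $1 }'
    fi
}

# Succeed (exit status 0) only if both files have the same digest.
same_transcriptome() {
    [ "$(file_digest "$1")" = "$(file_digest "$2")" ]
}

# Example (placeholder file names):
# same_transcriptome used_for_encoding.fa.gz used_for_decoding.fa.gz && echo "same file"
```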