Question: How to download human reference transcriptome (hg19)
2.5 years ago, yyuanshengliu (20) wrote:

As the title says, I do not know how to download the human reference transcriptome. Can anyone help me?

Is this the right website: http://hgdownload.soe.ucsc.edu/goldenPath/hg19/ ?

sequencing rna-seq next-gen • 8.9k views
modified 15 months ago by Kristin Muench (470) • written 2.5 years ago by yyuanshengliu (20)

Hi! A follow-up to this thread: has anyone managed to download a matching .gtf file? I am trying to download a transcriptome to create .bam files with kallisto for viewing in IGV, but the .bam files I'm producing don't show any reads when I load them into the program, and I suspect it's because the .gtf and .fa files were downloaded from different places.

written 15 months ago by Kristin Muench (470)

Matching GRCh37 human GTF file can be found here.

written 15 months ago by genomax (75k)
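
One way to test the suspicion above (that the .fa and .gtf come from different sources) is to compare the transcript IDs in the two files. The following is a minimal sketch, not a definitive check: it assumes Ensembl-style files, where each FASTA header starts with the transcript ID and the GTF carries a `transcript_id "..."` attribute, and it strips any `.N` version suffix before comparing. The function names are made up for illustration. It only checks ID overlap; other mismatches (for example chromosome naming) would not be detected.

```shell
# Transcript IDs from FASTA headers: first token after ">", version suffix removed.
fasta_ids() {
    gzip -dc "$1" \
        | awk '/^>/ { id = substr($1, 2); sub(/\.[0-9]+$/, "", id); print id }' \
        | sort -u
}

# Transcript IDs from the GTF attribute column (transcript_id "...";).
gtf_ids() {
    gzip -dc "$1" \
        | grep -v '^#' \
        | grep -o 'transcript_id "[^"]*"' \
        | cut -d'"' -f2 \
        | sed 's/\.[0-9][0-9]*$//' \
        | sort -u
}

# Print how many FASTA transcript IDs have no entry in the GTF.
# Usage: count_missing_in_gtf transcripts.fa.gz annotation.gtf.gz
count_missing_in_gtf() {
    idcheck_dir=$(mktemp -d)
    fasta_ids "$1" > "$idcheck_dir/fa.txt"
    gtf_ids "$2" > "$idcheck_dir/gtf.txt"
    # comm -23 keeps lines that appear only in the first (sorted) file.
    comm -23 "$idcheck_dir/fa.txt" "$idcheck_dir/gtf.txt" | wc -l | tr -d ' '
    rm -r "$idcheck_dir"
}
```

A result of 0 means every transcript in the FASTA is annotated in the GTF; a large number suggests the two files come from different releases or sources.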
2.5 years ago, h.mon (28k, Brazil) wrote:

There are several versions of the human transcriptome. One that is often used is Ensembl's. To get the protein-coding (cDNA) and non-coding RNA sequences:

wget ftp://ftp.ensembl.org/pub/release-89/fasta/homo_sapiens/cdna/Homo_sapiens.GRCh38.cdna.all.fa.gz
wget ftp://ftp.ensembl.org/pub/release-89/fasta/homo_sapiens/ncrna/Homo_sapiens.GRCh38.ncrna.fa.gz
written 2.5 years ago by h.mon (28k)
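
If a single transcriptome file is wanted (for instance, to build one index from both downloads), the two gzipped FASTA files can be joined without decompressing them, because concatenated gzip streams form a valid gzip file. This is a small sketch; the function name and the output file name are placeholders, not part of the answer above.

```shell
# Concatenate gzipped FASTA files into one gzipped FASTA.
# Usage: merge_fasta_gz output.fa.gz input1.fa.gz input2.fa.gz ...
merge_fasta_gz() {
    merged_out=$1
    shift
    # Multi-member gzip files decompress to the concatenation of their members.
    cat "$@" > "$merged_out"
}

# Example with the two files downloaded above (output name is arbitrary):
# merge_fasta_gz Homo_sapiens.GRCh38.cdna_ncrna.fa.gz \
#     Homo_sapiens.GRCh38.cdna.all.fa.gz Homo_sapiens.GRCh38.ncrna.fa.gz
```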

Thanks very much.

I am reading a paper. It says "a human reference transcriptome derived from hg19 build of human genome" and "This transcriptome contains 214294 transcripts and occupied 96446089 bytes as a gzipped FASTA file". Maybe the version you provided is not identical to the one the paper mentions?

modified 2.5 years ago • written 2.5 years ago by yyuanshengliu (20)

The links provided by h.mon point to GRCh38, which is indeed not the same as hg19.

written 2.5 years ago by WouterDeCoster (42k)

hg19 transcripts from the Ensembl archive are available here, or with wget ftp://ftp.ensembl.org/pub/release-67/fasta/homo_sapiens/cdna/Homo_sapiens.GRCh37.67.cdna.all.fa.gz.

modified 2.5 years ago • written 2.5 years ago by genomax (75k)
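
To compare a downloaded file against the figures quoted from the paper (214294 transcripts, 96446089 bytes gzipped), one can count the FASTA records and read off the compressed size. This is only a sketch with made-up function names, and nothing here implies that this particular release matches those numbers.

```shell
# Count FASTA records (header lines start with ">") in a gzipped FASTA file.
count_transcripts() {
    gzip -dc "$1" | grep -c '^>'
}

# Print the size of the (compressed) file in bytes.
file_size_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Example, using the file name from the wget command above:
# count_transcripts Homo_sapiens.GRCh37.67.cdna.all.fa.gz
# file_size_bytes   Homo_sapiens.GRCh37.67.cdna.all.fa.gz
```

The byte count depends on the compression settings as well as the content, so the transcript count is the more meaningful of the two numbers.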

It most certainly is not, but I would have to know the folder where the transcriptome is saved to be sure.

"A human reference transcriptome derived from hg19 build of human genome" and "this transcriptome contains 214294 transcripts and occupied 96446089 bytes as a gzipped FASTA file" are only moderately useful for identifying a transcriptome. But if the manuscript you are referring to is this paper, then it doesn't matter because:

1) the version of the transcriptome is unlikely to play a substantial role in their results;
2) you can use any human transcriptome to encode / decode fastq files, provided you use the same transcriptome for both encoding and decoding the same files.

written 2.5 years ago by h.mon (28k)