Human transcriptome download
6.7 years ago
KVC_bioinfo ▴ 590

Hello,

I am going to align RNA-seq data to the human transcriptome. However, I am not sure which database I should use: NCBI's RefSeq, RefSeqGene, or something else?

Can anyone help me with that?

Thank you in advance

RNA-Seq transcriptome • 8.5k views

You should use the whole genome for alignment and then use a GFF file to do your counting.


We plan to use the human transcriptome. I will also need the GTF file for that. Where can I get it?


While you could get that data from multiple places, Illumina has bundles that contain matched sequence, annotation, and index files for bowtie2/bwa, hosted at the iGenomes site for many genomes, including human.


Yes, but those are human genomes. I am looking to download the human transcriptome.


If you are referring to a set of transcript sequences (minus the introns/non-coding regions), then the Ensembl Human Genome page is as good a place as any. Look under "gene annotation" on the right side.


I found this

From that page, I downloaded RefSeq Transcripts. Is that correct? Also, I need an annotation file when aligning with STAR. Could you please tell me how to get that? Thank you very much.


Make this easy on yourself. Follow the directions to get pre-made indexes for STAR: C: Pre made STAR Index?


I looked into it. It does not have a pre-made index for the human transcriptome.


Have you looked at the STAR manual? It may be good to spend some time going through it.
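As a pointer to what the manual covers: STAR can build its own index from a genome FASTA plus a GTF annotation. Below is a minimal Python sketch that only assembles that command line. The helper name `star_index_command`, the file paths, and the read length of 101 are all assumptions for illustration, so check each option against the manual for your STAR version.

```python
from typing import List


def star_index_command(
    genome_fasta: str,
    gtf: str,
    out_dir: str,
    read_length: int = 101,  # assumed read length; replace with your own
    threads: int = 4,
) -> List[str]:
    """Build (but do not run) a STAR genomeGenerate command.

    All paths are hypothetical placeholders supplied by the caller.
    """
    return [
        "STAR",
        "--runMode", "genomeGenerate",
        "--genomeDir", out_dir,
        "--genomeFastaFiles", genome_fasta,
        "--sjdbGTFfile", gtf,
        # The manual suggests read length minus 1 for this option.
        "--sjdbOverhang", str(read_length - 1),
        "--runThreadN", str(threads),
    ]
```

Once STAR is installed, the returned list could be handed to `subprocess.run(cmd, check=True)`. Printing it first with `" ".join(cmd)` is an easy way to review the command before running it.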

6.5 years ago

Just use GENCODE's reference transcriptome FASTA:

https://www.gencodegenes.org/releases/current.html

[Direct link to gzipped FASTA: ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_27/gencode.v27.transcripts.fa.gz]

I have done this for hundreds of RNA-seq samples.
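If you later need to summarise transcript-level results per gene, the FASTA headers can be turned into a transcript-to-gene table. The sketch below assumes the pipe-delimited header layout I have seen in GENCODE transcript FASTA files (transcript ID first, gene ID second, gene name sixth, length seventh). That layout is an assumption, so inspect a few headers in your own download first. The function names are hypothetical.

```python
import gzip
from typing import Dict, Iterable, Iterator


def iter_fasta_headers(path: str) -> Iterator[str]:
    """Yield header lines (without the leading '>') from a FASTA file.

    Files ending in '.gz' are read as gzip; anything else as plain text.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                yield line[1:].strip()


def parse_gencode_header(header: str) -> Dict[str, str]:
    """Split one header on '|' using the ASSUMED GENCODE field order."""
    fields = header.split("|")
    return {
        "transcript_id": fields[0],
        "gene_id": fields[1],
        "gene_name": fields[5],
        "length": fields[6],
    }


def transcript_to_gene(headers: Iterable[str]) -> Dict[str, str]:
    """Map each transcript ID to its gene ID."""
    mapping = {}
    for header in headers:
        parsed = parse_gencode_header(header)
        mapping[parsed["transcript_id"]] = parsed["gene_id"]
    return mapping
```

For example, `transcript_to_gene(iter_fasta_headers("gencode.v27.transcripts.fa.gz"))` would build the table from the file linked above, provided its headers follow the assumed layout.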

6.7 years ago
Tom_L ▴ 350

I recommend using the information available in the UCSC Table Browser. Pick your genome version (hg19 or hg38), choose your annotation (Ensembl, RefSeq, etc.), and select GTF as the output format. RefSeq is a good starting point. If you need a transcriptome FASTA file, you can use the gtf_to_fasta tool available in the TopHat2 package. You will be able to use this file with many aligners (it is not restricted to TopHat2).
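To make clear what converting a GTF to a transcriptome FASTA involves, here is a rough Python sketch of the idea: collect the exon records for each transcript, splice their sequences together in genomic order, and reverse-complement transcripts on the minus strand. This is not the gtf_to_fasta implementation, only an illustration with hypothetical function names. It holds the whole genome in memory, so it suits small test inputs rather than a full human genome.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Read FASTA lines into {sequence name: sequence}."""
    seqs: Dict[str, str] = {}
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def transcripts_from_gtf(
    gtf_lines: Iterable[str], genome: Dict[str, str]
) -> Dict[str, str]:
    """Splice exon sequences per transcript_id from GTF exon records."""
    exons = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "exon":
            continue
        match = re.search(r'transcript_id "([^"]+)"', cols[8])
        if match is None:
            continue
        # GTF coordinates are 1-based and inclusive.
        exons[match.group(1)].append(
            (cols[0], int(cols[3]), int(cols[4]), cols[6])
        )
    transcripts = {}
    for tid, parts in exons.items():
        parts.sort(key=lambda e: e[1])  # genomic order
        seq = "".join(genome[c][s - 1:e] for c, s, e, _ in parts)
        strand = parts[0][3]
        transcripts[tid] = reverse_complement(seq) if strand == "-" else seq
    return transcripts
```

With real files you would pass open file handles to `read_fasta` and `transcripts_from_gtf`. The chromosome names in the GTF must match those in the genome FASTA (for example `chr1` versus `1`), which is a common source of mismatches between UCSC and Ensembl files.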
