Question: transcriptome index for kallisto
23 months ago, krushnach80 wrote:

How do I build a transcriptome index for kallisto? Is it built from the normal hg19 or hg38 genome, the way we build one for the TopHat protocol?

rna-seq • modified 21 months ago by Biostar • written 23 months ago by krushnach80

Have you read the manual?

written 23 months ago by WouterDeCoster

Yes. It says "target sequence", and that is what confuses me: is the target sequence my reference file or my read file? Pardon me for this trivial doubt, but I'm new to all of this, so please guide me. I plan to use this for my RNA-seq data from HL60, a human cell line.

written 23 months ago by krushnach80

The target sequence is supposed to be a fasta transcriptome reference. Note that you can also download common transcriptome indexes.

modified 23 months ago • written 23 months ago by WouterDeCoster
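As a rough illustration of the reply above, the index-building step could be scripted as follows. This is a minimal sketch: it only assembles the `kallisto index -i <index> <fasta>` command line, the helper names `kallisto_index_command` and `run_index` are made up for this example, and the file names are placeholders. It assumes kallisto is installed and on the PATH if you actually run the command.

```python
import shlex
import subprocess


def kallisto_index_command(transcriptome_fasta, index_path):
    """Build the argument list for `kallisto index`.

    transcriptome_fasta: FASTA file of transcript (cDNA) sequences,
                         NOT whole chromosomes and NOT the read files.
    index_path:          where kallisto should write the index.
    """
    return ["kallisto", "index", "-i", str(index_path), str(transcriptome_fasta)]


def run_index(transcriptome_fasta, index_path):
    """Run the command; raises CalledProcessError if kallisto fails."""
    subprocess.run(kallisto_index_command(transcriptome_fasta, index_path), check=True)


if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    cmd = kallisto_index_command("transcripts.fa.gz", "transcripts.idx")
    print(shlex.join(cmd))
```

Only the command is printed here; call `run_index(...)` to actually start the build.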

Thank you. I have downloaded it and I'm building the index now.

written 23 months ago by krushnach80

So can I use my own transcriptome reference? I'm using hg19, so I have created the FASTA file and now want to use it to build the index, but it seems to be taking forever: I started two hours ago and it is stuck at the k-mer sequence step. Does it take this long, or am I doing something wrong in the index building? I'm giving my input as a .fa file, not a .gz file; is that an issue?

written 23 months ago by krushnach80

You downloaded a transcriptome FASTA and are using that for building the index? It might indeed take a while; I don't remember how long it took for me. (Punctuation would make your question easier to read.)

modified 23 months ago • written 23 months ago by WouterDeCoster

Yes, I downloaded a transcriptome FASTA and am using that for building the index. I did that as a test, but I want to have my own transcriptome index built from hg19. Would that work?

written 23 months ago by krushnach80

If you used a FASTA file containing all transcripts of interest for building the index, sure, that would work.

written 23 months ago by WouterDeCoster

I'm using http://hgdownload.cse.ucsc.edu/goldenpath/hg19/chromosomes/ to download all the files, which I concatenated into a single FASTA file. Can I use that for kallisto index building? I did try, and it has been stuck for more than an hour. Am I doing the right thing? Please have a look at the link and give your suggestions.

written 23 months ago by krushnach80

You are downloading entire chromosomes, the genome. That's not the same as the transcriptome. An example would be: ftp://ftp.ensembl.org/pub/grch37/release-86/fasta/homo_sapiens/cdna/Homo_sapiens.GRCh37.cdna.all.fa.gz (from Ensembl)

written 23 months ago by WouterDeCoster
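One way to tell a genome FASTA from a transcriptome FASTA before spending hours on indexing is to count the records and look at the longest one. The sketch below is an illustration, not part of kallisto: the function names and the 1,000,000 bp cut-off are assumptions chosen for this example (chromosome records are typically far longer than any single transcript).

```python
import gzip
from pathlib import Path


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def fasta_summary(path):
    """Return (number_of_records, length_of_longest_record) for a FASTA file."""
    n_records = 0
    longest = 0
    current = 0
    with open_text(path) as handle:
        for line in handle:
            if line.startswith(">"):
                # A new header closes the previous record.
                n_records += 1
                longest = max(longest, current)
                current = 0
            else:
                current += len(line.strip())
    longest = max(longest, current)  # close the final record
    return n_records, longest


def looks_like_genome(longest, max_transcript_len=1_000_000):
    """Heuristic only: the cut-off is an assumed value, not a kallisto rule."""
    return longest > max_transcript_len
```

A file concatenated from whole chromosomes should report a few dozen records with a very large longest record, while a cDNA file should report many thousands of much shorter records.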

But for TopHat I used the whole chromosomes to align RNA-seq data. So am I doing it wrong?

written 23 months ago by krushnach80

Indeed, TopHat uses the genome for alignment, while kallisto uses the transcriptome for pseudomapping and counting.

modified 23 months ago • written 23 months ago by WouterDeCoster
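For the counting step mentioned in the reply above, a hedged sketch of how the `kallisto quant` command line might be assembled once a transcriptome index exists. The helper name `kallisto_quant_command`, the file names, and the fragment length values are placeholders invented for this example. The thread does not say whether the HL60 data are paired-end or single-end, so both cases are shown.

```python
import shlex


def kallisto_quant_command(index_path, output_dir, reads,
                           single_end=False, fragment_length=None, fragment_sd=None):
    """Build the argument list for `kallisto quant`.

    reads: list of FASTQ files (pairs listed as R1, R2 for paired-end data).
    Single-end mode needs an estimated fragment length and standard deviation.
    """
    cmd = ["kallisto", "quant", "-i", str(index_path), "-o", str(output_dir)]
    if single_end:
        if fragment_length is None or fragment_sd is None:
            raise ValueError("single-end mode needs fragment_length and fragment_sd")
        cmd += ["--single", "-l", str(fragment_length), "-s", str(fragment_sd)]
    elif len(reads) % 2 != 0:
        raise ValueError("paired-end mode expects an even number of read files")
    return cmd + [str(r) for r in reads]


if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    paired = kallisto_quant_command("transcripts.idx", "quant_out",
                                    ["sample_1.fastq.gz", "sample_2.fastq.gz"])
    print(shlex.join(paired))
```

Note that the index passed with `-i` is the transcriptome index, not a TopHat or Bowtie genome index.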

Thank you for the clarification.

written 23 months ago by krushnach80

krushnach80 wrote: "it seems to be taking forever; I started two hours ago and it is stuck at the k-mer sequence step"

I had the same problem, and in my case the cause was a corrupted fastq file. The kallisto author is aware of this problem (in fact, I sent him my files and he discovered the reason) and is trying to fix it. You can test whether your file is corrupted by running zcat on your compressed fastq files.

written 21 months ago by Antonio R. Franco
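The zcat check suggested above can also be done from Python by decompressing the whole file and seeing whether an error is raised. This is a minimal sketch that does the same job as `zcat file.fastq.gz > /dev/null`; the function name `gzip_is_readable` is made up for this example.

```python
import gzip
import zlib


def gzip_is_readable(path, chunk_size=1 << 20):
    """Return True if the whole gzip file decompresses without error."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read to the end in chunks; the data itself is discarded.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError covers bad headers and CRC mismatches,
        # EOFError covers truncated files,
        # zlib.error covers damaged compressed data.
        return False
    return True
```

This only verifies that the compression layer is intact. It does not check that the decompressed content is well-formed FASTQ or FASTA.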