Question: Build a Kallisto transcriptome index
5 weeks ago, F. Golestan wrote:

Hello,

I need to pseudo-align my paired reads to the transcriptome using Kallisto. I know that Kallisto does not use a reference genome sequence, and instead it performs pseudo-alignment to determine the compatibility of reads with targets (e.g. transcript sequences).

However, to build a Kallisto transcriptome index, how do I choose the reference transcriptomes for my targets, which are human and Cassava Brown Streak Virus?

I mean, when running the commands below to create the Kallisto index from a transcriptome, should I specify which transcriptome I want to use (e.g. human or Cassava Brown Streak Virus)? If so, how do I know which transcriptome is appropriate for my target organisms?

cd
kallisto index -i Potra01-mRNA.idx \
~/share/Day01/data/reference/fasta/Potra01-mRNA.fa.gz

Thank you so much for your advice and guidance. Best wishes

modified 5 weeks ago by Lior Pachter • written 5 weeks ago by F. Golestan

5 weeks ago, Lior Pachter (United States) wrote:

It sounds like your goal is to build an index from both human and the Cassava Brown Streak Virus at the same time. You can do this by obtaining the transcriptome for each separately, and then building a single index from both files: kallisto index -i name.idx human.fa.gz cassava_brown_streak_virus.fa.gz. You can then quantify reads against both simultaneously.
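As a rough sketch of the two steps for paired-end reads, assuming the index is built first and then passed to kallisto quant: every file and directory name below is a hypothetical placeholder, and the run helper is only an illustration that prints each command unless RUN=1 is set.

```shell
# Hypothetical names (assumptions, not from the thread); replace with your own.
index=human_plus_virus.idx
human_fa=human.fa.gz
virus_fa=virus.fa.gz
reads_1=sample_R1.fastq.gz
reads_2=sample_R2.fastq.gz
outdir=quant_sample

# Illustrative helper: print the command, and execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Step 1: build one index from both transcriptome FASTA files.
run kallisto index -i "$index" "$human_fa" "$virus_fa"

# Step 2: quantify one paired-end sample against that index.
run kallisto quant -i "$index" -o "$outdir" "$reads_1" "$reads_2"
```

Running it as-is only prints the two commands, which makes it easy to check the file names before committing to a long index build.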

written 5 weeks ago by Lior Pachter

The actual FASTA files can be downloaded from public databases such as Ensembl, as described here and here. Look for the cDNA bit in the file name: you want only the transcribed sequences, not the whole genome.
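Before indexing, it can help to confirm that a downloaded file really is a collection of transcript records. The sketch below is only an illustration: it builds a tiny made-up FASTA so it runs on its own (the headers are invented; real ones are longer), and in practice the fasta variable would point at the downloaded file.

```shell
# Create a toy gzipped FASTA with two invented records so the sketch is self-contained.
tmpdir=$(mktemp -d)
fasta="$tmpdir/toy.cdna.fa.gz"
printf '>TOY_TRANSCRIPT_1 cdna\nACGTACGT\n>TOY_TRANSCRIPT_2 cdna\nGGCCTTAA\n' | gzip -c > "$fasta"

# Every FASTA record starts with ">", so counting those lines counts the transcripts.
n_transcripts=$(gzip -dc "$fasta" | grep -c '^>')
echo "transcripts: $n_transcripts"

# Show the first few headers to see what kind of records the file contains.
gzip -dc "$fasta" | grep '^>' | head -n 3
```

A whole-genome file would typically show only a handful of very long records (chromosomes or contigs), whereas a transcriptome file has one record per transcript.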

modified 5 weeks ago • written 5 weeks ago by Friederike

Thanks a lot, Friederike, for your guidance. After downloading the transcriptome FASTA files, does the name of the downloaded FASTA file go where the fa.gz file is? And what about name.idx?

Many thanks.

written 5 weeks ago by F. Golestan

I believe Lior was just trying to indicate that you can put whatever name you want the resulting index to have following -i.

I.e., if you want two indices, one for the human, one for the virus cDNA libraries, you will run the command twice:

kallisto index -i my_human_index.idx name_of_the_fasta_file_for_the_human_cDNA_collection.gz # generates the index to be used with the human samples

kallisto index -i my_virus_index.idx name_of_the_fasta_file_for_the_virus_cDNA_collection.gz # generates the index to be used with the virus samples
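Since the index name is arbitrary, one convenient convention is to derive it from the FASTA name. This is just a sketch of that idea: index_name is a hypothetical helper and the FASTA names are placeholders, and the commands are printed rather than executed.

```shell
# Hypothetical helper: turn "path/to/name.fa.gz" into "name.idx".
index_name() {
    base=$(basename "$1" .fa.gz)
    echo "${base}.idx"
}

# Placeholder FASTA names; substitute the files you downloaded.
for fasta in human_cdna.fa.gz virus_cdna.fa.gz; do
    # Printed instead of run, so the naming can be reviewed first.
    echo "kallisto index -i $(index_name "$fasta") $fasta"
done
```

Each dataset is then quantified against its own index, so keeping the organism in the index name avoids mixing them up later.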

written 5 weeks ago by Friederike

Many thanks, Friederike. I found FASTA files for human and also for plants, and I built indices for them. However, I could not find a transcriptome FASTA file for Cassava Brown Streak Virus or its close species (TAN70 virus). I would greatly appreciate it if you could tell me where I can get it.

Many thanks.

written 4 weeks ago by F. Golestan

Sorry, I've never had to download a viral cDNA index, so I'd have to resort to the usual tools (google etc.) just like you.

written 4 weeks ago by Friederike

OK. Thank you very much, Friederike.

written 4 weeks ago by F. Golestan

Thank you very much, Lior. In fact, I want to build separate indices for human and for the Cassava Brown Streak Virus. I have two different RNA-seq datasets (one for human and another for Cassava Brown Streak Virus). How can I obtain the transcriptomes for human and the Cassava Brown Streak Virus separately?

Then, what exactly should I write for name.idx and for the fa.gz file in each case (human and the Cassava Brown Streak Virus)?

Many thanks for the help.

written 5 weeks ago by F. Golestan