RNA-Seq: Getting Started with Kallisto
3
1
3.0 years ago
arussell3483 ▴ 30

Hello,

I am relatively new to bioinformatics and RNA-sequencing and am working on developing a workflow for my sequencing project. I am planning to use Kallisto and Sleuth as part of the analysis, but I am not sure how to get started. Will the quality control and trimming take place before I run the fasta file through kallisto? If so, does this step also take place in the Linux terminal?

Thanks!

RNA-Seq • 5.0k views
3
3.0 years ago

Your reads would be in FASTQ format, not FASTA, though you will possibly have a transcriptome/genome reference in FASTA format.
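If you are ever unsure which of the two formats a file is in, the first character of the file is a quick tell: FASTA records start with `>`, FASTQ records start with `@`. Here is a minimal Python sketch of that check. The function names `guess_format` and `sniff_file` are made up for illustration, and it assumes the file is plain text or gzip-compressed with a `.gz` suffix.

```python
import gzip
from pathlib import Path


def guess_format(first_line):
    """Guess the sequence format from the first line of a file."""
    if first_line.startswith(">"):
        return "fasta"  # record layout: >name, then sequence
    if first_line.startswith("@"):
        return "fastq"  # record layout: @name, sequence, +, qualities
    return "unknown"


def sniff_file(path):
    """Read the first line of a (possibly gzipped) file and guess its format."""
    path = Path(path)
    # Assumption: compressed files carry a .gz suffix.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt") as handle:
        return guess_format(handle.readline())
```

Reads from the sequencer should come back as `fastq`, while the reference transcriptome you index should come back as `fasta`.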

A typical workflow I would say is something like:

• Start by getting your FASTQ files and putting them in separate directories per sample (this is easily done with terminal/bash commands, and is a good opportunity to get familiar with them.)
• Run a QC tool (like FastQC + MultiQC) on the raw data to determine whether you need to apply trimming
• Apply trimming, re-run QC tools
• Align your reads to a reference (giving you a BAM file) using an aligner (some quantification tools like RSEM handle this step implicitly). Kallisto (which I haven't really used) does not need a separate alignment step: it quantifies directly from the FASTQ files by pseudoaligning them to a transcriptome index, as described here (https://pachterlab.github.io/kallisto/starting.html)
• Quantify (for example, running Kallisto, or RSEM)
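For the Kallisto route, the per-sample layout and the quantification step might be scripted roughly as below. This is only a sketch under some assumptions: paired-end reads, exactly two `*.fastq.gz` files per sample directory whose names sort so that read 1 comes before read 2 (e.g. `_R1`/`_R2`), and hypothetical helper names and paths. The `kallisto index` and `kallisto quant` options used (`-i`, `-o`, `-b`) are from the kallisto command line; bootstraps (`-b`) are included because Sleuth uses them downstream.

```python
from pathlib import Path


def index_command(transcriptome_fasta, index_path):
    """Command that builds a kallisto index from a transcriptome FASTA."""
    return ["kallisto", "index", "-i", str(index_path), str(transcriptome_fasta)]


def sample_dirs(fastq_root):
    """List the per-sample subdirectories under fastq_root, sorted by name."""
    return sorted(p for p in Path(fastq_root).iterdir() if p.is_dir())


def quant_command(index_path, sample_dir, out_root, bootstraps=100):
    """Command that quantifies one paired-end sample directory."""
    sample_dir = Path(sample_dir)
    # Assumption: sorting by name puts the read 1 file before the read 2 file.
    fastqs = sorted(sample_dir.glob("*.fastq.gz"))
    if len(fastqs) != 2:
        raise ValueError(f"expected 2 FASTQ files in {sample_dir}, found {len(fastqs)}")
    out_dir = Path(out_root) / sample_dir.name  # one output folder per sample
    return [
        "kallisto", "quant",
        "-i", str(index_path),
        "-o", str(out_dir),
        "-b", str(bootstraps),
        *[str(f) for f in fastqs],
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once kallisto is installed. For single-end data, `kallisto quant` additionally needs `--single` with a fragment length (`-l`) and standard deviation (`-s`), which this sketch does not cover.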

Most likely yes, that would all happen on the terminal. I would suggest looking up some tutorials or existing pipelines using Kallisto. This one looks like a good start: https://felixeyegithubio.readthedocs.io/en/latest/rnaseq/labs/kallisto/

0

Thank you for your reply, this gives me some good jumping off points!

3

If it helps, upvote the answer!

2
3.0 years ago

People typically use FASTQ files, rather than FASTA files, to generate counts. Yes, QC and trimming would be done beforehand. Use FastQC to look at all of your FASTQ files. It will tell you whether you need to trim any adaptors and whether there are any other QC issues. It has both a simple GUI and a command-line version, which is useful if you only have access to a headless Linux server.
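On a headless server, one way to do the adaptor check without opening the HTML reports is to run FastQC with `--extract` and read the `summary.txt` it writes for each file, which lists one PASS/WARN/FAIL status per module as tab-separated lines. A rough sketch in Python follows. The helper names are hypothetical, and treating any WARN or FAIL on the "Adapter Content" module as a reason to trim is my own rule of thumb, not something FastQC prescribes.

```python
def fastqc_command(fastq_files, out_dir, threads=1):
    """Command that runs FastQC and unpacks its reports into out_dir."""
    # Note: FastQC expects out_dir to exist already.
    return ["fastqc", "--extract", "-t", str(threads), "-o", str(out_dir),
            *[str(f) for f in fastq_files]]


def flagged_modules(summary_text):
    """Map each non-PASS module in a FastQC summary.txt to its status."""
    flagged = {}
    for line in summary_text.splitlines():
        if not line.strip():
            continue
        # Each line is: status <TAB> module name <TAB> file name
        status, module, _filename = line.split("\t", 2)
        if status != "PASS":
            flagged[module] = status
    return flagged


def needs_adapter_trimming(summary_text):
    """Assumed rule of thumb: trim when Adapter Content is WARN or FAIL."""
    return "Adapter Content" in flagged_modules(summary_text)
```

After running the command, the summary for `sample_R1.fastq.gz` should be in `sample_R1_fastqc/summary.txt` inside the output directory; read it with `Path(...).read_text()` and pass the text to `needs_adapter_trimming`.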

Trimming can be done with any number of tools, but Trim Galore is pretty popular and easy to use (and made to work with FastQC). It is run from the command line.
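As an illustration of the Trim Galore step, here is a small Python sketch that builds the command for one sample. The function name and paths are made up. The options used (`-o`, `--paired`, `--fastqc`) are Trim Galore's own, with `--fastqc` asking it to re-run FastQC on the trimmed reads.

```python
def trim_galore_command(reads, out_dir, run_fastqc=True):
    """Build a Trim Galore command for one single-end or paired-end sample."""
    reads = [str(r) for r in reads]
    if len(reads) not in (1, 2):
        raise ValueError("expected 1 (single-end) or 2 (paired-end) FASTQ files")
    cmd = ["trim_galore", "-o", str(out_dir)]
    if run_fastqc:
        cmd.append("--fastqc")  # QC the trimmed output in the same run
    if len(reads) == 2:
        cmd.append("--paired")  # keep read pairs in sync while trimming
    return cmd + reads
```

The list can be handed to `subprocess.run(cmd, check=True)`, and the trimmed FASTQ files written to `out_dir` are what you would then pass to kallisto.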

0

Okay, I will do some more research into FastQC, thank you for clearing that up! I'm not familiar with Trim Galore yet, but I will definitely check it out.

2
3.0 years ago
Lior Pachter ▴ 610

See https://github.com/snakemake-workflows/rna-seq-kallisto-sleuth for a useful Snakemake workflow.

0

Thank you! With this workflow, will I be able to identify differential gene expression? I understand that kallisto quantifies transcript-level abundance. Can I use the methods described in your 2018 paper (Gene-level differential analysis at transcript-level resolution) in addition to this workflow? I will be working with an annotated reference transcriptome, but there is no sequenced genome.

0

Scripts for performing the gene-level differential analysis from the paper you cited are at https://github.com/pachterlab/aggregationDE. You should be fine with the annotated reference transcriptome, since this analysis does not need a genome.