Question: RNA-Seq: Getting Started with Kallisto
arussell3483 wrote, 10 days ago:

Hello,

I am relatively new to bioinformatics and RNA-sequencing and am working on developing a workflow for my sequencing project. I am planning to use Kallisto and Sleuth as part of the analysis, but I am not sure how to get started. Will the quality control and trimming take place before I run the fasta file through kallisto? If so, does this step also take place in the Linux terminal?

Thanks!

rna-seq • 233 views

manuel.belmadani (Canada) wrote, 10 days ago:

Your reads will be in FASTQ format, not FASTA, though you will likely also have a transcriptome or genome reference in FASTA format.
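
To make the FASTQ/FASTA distinction concrete, here is a minimal Python sketch that guesses the format from the first record marker: FASTQ records begin with `@`, FASTA records begin with `>`. The function name `sniff_sequence_format` is made up for illustration, and the check is only a heuristic, not a validator.

```python
import gzip
from pathlib import Path


def sniff_sequence_format(path):
    """Guess 'fastq' or 'fasta' from the first non-empty line of a file.

    FASTQ records start with '@'; FASTA records start with '>'.
    Files whose name ends in .gz are opened with gzip.
    This is a heuristic only; it does not validate the whole file.
    """
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt") as handle:
        for line in handle:
            if not line.strip():
                continue  # skip leading blank lines
            if line.startswith("@"):
                return "fastq"
            if line.startswith(">"):
                return "fasta"
            return "unknown"
    return "unknown"  # empty file
```

Running this on your read files and on your reference should report `fastq` for the former and `fasta` for the latter.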

A typical workflow, I would say, looks something like this:

  • Start by getting your FASTQ files and keeping them in separate directories per sample (this is easily done with terminal/bash commands, so it is a good opportunity to get familiar with them).
  • Run a QC tool (like FastQC + MultiQC) on the raw data to determine whether you need to apply trimming.
  • Apply trimming if needed, then re-run the QC tools.
  • Align your reads to a reference (giving you a BAM file) using an aligner (some quantification tools, like RSEM, can handle this step implicitly). Kallisto (which I haven't really used) can perform quantification without a separate alignment step, as described here (https://pachterlab.github.io/kallisto/starting.html).
  • Quantify (for example, by running Kallisto or RSEM).
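
As a rough illustration of the per-sample layout and the quantification step, here is a minimal Python sketch that finds FASTQ files in one directory per sample and builds the corresponding `kallisto index` and `kallisto quant` command lines. The helper names, the directory layout, and the choice of 100 bootstraps are assumptions for the example; the sketch only builds the commands and does not run them.

```python
from pathlib import Path

FASTQ_SUFFIXES = (".fastq", ".fastq.gz", ".fq", ".fq.gz")


def find_samples(fastq_root):
    """Map each sample directory name to its sorted list of FASTQ files.

    Assumes a layout like fastq_root/<sample>/<reads>.fastq.gz.
    """
    samples = {}
    for sample_dir in sorted(Path(fastq_root).iterdir()):
        if not sample_dir.is_dir():
            continue
        fastqs = sorted(
            p for p in sample_dir.iterdir() if p.name.endswith(FASTQ_SUFFIXES)
        )
        if fastqs:
            samples[sample_dir.name] = fastqs
    return samples


def kallisto_index_cmd(transcriptome_fasta, index_path):
    """Build the command that indexes a transcriptome FASTA (run once)."""
    return ["kallisto", "index", "-i", str(index_path), str(transcriptome_fasta)]


def kallisto_quant_cmd(index_path, out_dir, fastqs, bootstraps=100):
    """Build the quantification command for one paired-end sample.

    The bootstrap count is an illustrative choice, not a recommendation.
    """
    if len(fastqs) != 2:
        raise ValueError("this sketch only handles paired-end samples")
    return [
        "kallisto", "quant",
        "-i", str(index_path),
        "-o", str(out_dir),
        "-b", str(bootstraps),
        *map(str, fastqs),
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the tools are installed. Single-end reads would additionally need kallisto's `--single`, `-l` and `-s` options, which this sketch leaves out.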

Most likely yes, that would all happen in the terminal. I would suggest looking up some tutorials or existing pipelines that use Kallisto. This one looks like a good start: https://felixeyegithubio.readthedocs.io/en/latest/rnaseq/labs/kallisto/


Thank you for your reply, this gives me some good jumping off points!

written 10 days ago by arussell3483

If it helps, upvote the answer!

written 9 days ago by Lila M

jared.andrews07 (St. Louis, MO) wrote, 10 days ago:

People typically use FASTQ files, rather than FASTA files, to generate counts. Yes, QC and trimming would be done beforehand. Use FastQC to look at all of your FASTQ files. It will tell you whether you need to trim any adapters or whether there are any other QC issues. It has both a simple GUI and a command-line version, which is useful if you only have access to a headless Linux server.

Trimming can be done with any number of tools, but Trim Galore is pretty popular and easy to use (and made to work with FastQC). It is run from the command line.
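
For a sense of what the command-line side looks like, here is a minimal Python sketch that builds FastQC, MultiQC and Trim Galore command lines for one paired-end sample. The helper names, file names and output directories are placeholders for illustration, and only a few basic options are shown; check each tool's help output before relying on them.

```python
import shlex


def fastqc_cmd(fastqs, out_dir):
    """Build a FastQC command writing reports for the given files to out_dir."""
    return ["fastqc", "-o", str(out_dir), *map(str, fastqs)]


def multiqc_cmd(qc_dir, out_dir):
    """Build a MultiQC command summarising the reports found under qc_dir."""
    return ["multiqc", str(qc_dir), "-o", str(out_dir)]


def trim_galore_cmd(read1, read2, out_dir, rerun_fastqc=True):
    """Build a Trim Galore command for one paired-end sample.

    With rerun_fastqc=True, the --fastqc flag asks Trim Galore to run
    FastQC on the trimmed output.
    """
    cmd = ["trim_galore", "--paired", "-o", str(out_dir)]
    if rerun_fastqc:
        cmd.append("--fastqc")
    cmd += [str(read1), str(read2)]
    return cmd


if __name__ == "__main__":
    # Placeholder file names; print the commands instead of running them.
    reads = ["sample_A_1.fastq.gz", "sample_A_2.fastq.gz"]
    for cmd in (
        fastqc_cmd(reads, "qc_raw"),
        multiqc_cmd("qc_raw", "qc_raw_summary"),
        trim_galore_cmd(reads[0], reads[1], "trimmed"),
    ):
        print(shlex.join(cmd))
```

Printing the commands first is a cheap way to check paths before running anything. To execute them, pass each list to `subprocess.run(cmd, check=True)` after making sure the output directories exist.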


Okay, I will do some more research into FastQC, thank you for clearing that up! I'm not familiar with Trim Galore yet, but I will definitely check it out.

written 10 days ago by arussell3483

Lior Pachter (United States) wrote, 9 days ago:

See https://github.com/snakemake-workflows/rna-seq-kallisto-sleuth for a useful Snakemake workflow.


Thank you! With this workflow, will I be able to identify differential gene expression? I understand that kallisto quantifies transcript-level abundance. Can I use the methods described in your 2018 paper (Gene-level differential analysis at transcript-level resolution) in addition to this workflow? I will be working with an annotated reference transcriptome, but there is no sequenced genome.

written 4 days ago by arussell3483

Scripts for performing the gene-level differential analysis from the paper you cited are available here (https://github.com/pachterlab/aggregationDE). You should be fine with the annotated reference transcriptome, as no genome is needed for this to work.

written 4 days ago by Lior Pachter