Question: Obtaining an input matrix from kallisto for downstream analysis with DESeq2 or edgeR
7 months ago by
obuhovadarina0 wrote:

Dear All!

I am very new to bioinformatics and am trying to connect the phenotype of certain cells with the expression of certain genes (all having the same function). I downloaded FASTA sequences for all genes of interest from NCBI, along with RNA-seq data from the cells. I trimmed adapters from the RNA-seq reads and ran kallisto quant on them. Now I am trying to obtain a matrix from the kallisto output to use as input for downstream analysis.

I am looking at this tutorial. However, I am confused by this part:

files <- file.path(dir, "kallisto_boot", samples$run, "abundance.h5")

Since the authors use the tximportData package, I am not sure which files "kallisto_boot" should contain. Transcripts? The abundance.tsv and JSON files?

I cannot help with kallisto itself, as I do not use it, but the index it expects is an entire transcriptome, not just a collection of selected genes. Download a reference transcriptome, from either NCBI/RefSeq or GENCODE, then run kallisto, import the results with tximport, normalize the data with a tool of your choice (e.g. edgeR or DESeq2), and then do whatever downstream analysis you plan. Judging from the file.path call, I assume kallisto_boot is simply the folder in the example data that holds one kallisto output directory per sample, so for your own data you would point to your own kallisto output folders instead. Simply run the example code and see if it works.
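To make the file.path line more concrete, here is a minimal sketch of how it could look for your own data. The folder name kallisto_out and the sample IDs sample1 and sample2 are made up for illustration; replace them with wherever your kallisto quant output actually lives. The import step only runs if the files and the tximport package are present. tximport can also read abundance.tsv in place of abundance.h5.

```r
# Hypothetical layout: one kallisto output folder per sample under `kallisto_dir`.
kallisto_dir <- "kallisto_out"  # assumed directory name, not from the tutorial
samples <- data.frame(run = c("sample1", "sample2"),  # assumed sample IDs
                      stringsAsFactors = FALSE)

# Build one path per sample, mirroring the tutorial line but without the
# example-data folder "kallisto_boot".
files <- file.path(kallisto_dir, samples$run, "abundance.h5")
names(files) <- samples$run

# Import only when the files and the tximport package are actually available.
if (all(file.exists(files)) && requireNamespace("tximport", quietly = TRUE)) {
  # txOut = TRUE keeps transcript-level estimates; for gene-level summaries,
  # supply a tx2gene data frame (transcript ID, gene ID) instead.
  txi <- tximport::tximport(files, type = "kallisto", txOut = TRUE)
  print(head(txi$counts))
} else {
  message("Expected files: ", paste(files, collapse = ", "))
}
```

If the import succeeds, txi$counts is the estimated count matrix (rows are transcripts, columns are samples) that the downstream tools start from.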

