Hello,
I ran the alignment on my samples using kallisto. I want to run DESeq2 afterwards, and I have a question about converting the transcript IDs to gene names. I am using biomaRt and tximport for the conversion; my current code is below. Should I convert directly to gene names with tximport in the first step, or should I convert to Ensembl gene IDs in the first step and then, after running DESeq2, add another column with the external gene names to the res data frame? If the second way is correct, can I use the gene names to plot a heatmap of the differentially expressed genes? Is there any chance there will be duplicate gene names?
# summarize kallisto transcript-level estimates to gene level
txi.kallisto.tsv <- tximport(files, type = "kallisto", tx2gene = tx2gene, ignoreTxVersion = TRUE)
sampleTable <- data.frame(condition = factor(c("a","a","a","b","b","b")))
rownames(sampleTable) <- colnames(txi.kallisto.tsv$counts)
dds <- DESeqDataSetFromTximport(txi.kallisto.tsv, sampleTable, ~condition)
# set the reference level before fitting, so DESeq() only needs to run once
dds$condition <- relevel(dds$condition, ref = "b")
dds <- DESeq(dds)
res <- results(dds)
Thank you
Thank you so much for replying. Would you use paste instead of tximport? When I use biomaRt, I specify the version I want so that it matches my index. When you make volcano plots or heatmaps after DESeq2, do you use gene names or gene IDs in the figures? Most papers I have read seem to use gene names, but if there are duplicates I am not sure how to deal with them.