Dear All,
This is my first time using kallisto to quantify gene expression from RNA-seq data (a bacterial strain). I noticed that 5 genes have exactly the same estimated counts in every sample (n = 36). The same behavior was observed for other tRNA genes. The counts for the 5 genes are:
Gene_1,49,49.4,71.2,80.6,62.4,61.8,52.6,68.2,105.2,118.6,113.2,117.6,98.8,90.8,133.2,102.6,97.2,100.2,115,139,103.2,84,82,59.6,104.8,112,63,67.6,112.8,95.6,87.6,68.2,81
Gene_2,49,49.4,71.2,80.6,62.4,61.8,52.6,68.2,105.2,118.6,113.2,117.6,98.8,90.8,133.2,102.6,97.2,100.2,115,139,103.2,84,82,59.6,104.8,112,63,67.6,112.8,95.6,87.6,68.2,81
Gene_3,49,49.4,71.2,80.6,62.4,61.8,52.6,68.2,105.2,118.6,113.2,117.6,98.8,90.8,133.2,102.6,97.2,100.2,115,139,103.2,84,82,59.6,104.8,112,63,67.6,112.8,95.6,87.6,68.2,81
Gene_4,49,49.4,71.2,80.6,62.4,61.8,52.6,68.2,105.2,118.6,113.2,117.6,98.8,90.8,133.2,102.6,97.2,100.2,115,139,103.2,84,82,59.6,104.8,112,63,67.6,112.8,95.6,87.6,68.2,81
Gene_5,49,49.4,71.2,80.6,62.4,61.8,52.6,68.2,105.2,118.6,113.2,117.6,98.8,90.8,133.2,102.6,97.2,100.2,115,139,103.2,84,82,59.6,104.8,112,63,67.6,112.8,95.6,87.6,68.2,81
This is how the expression matrix was exported:
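library(tximport)  # reading abundance.h5 files also requires the rhdf5 package
# base_dir and samples (the sample table) are assumed to be defined already
files <- file.path(base_dir, "kallisto", samples$sample, "abundance.h5")
names(files) <- paste0("sample", 1:36)
# txOut = TRUE keeps one row per target sequence instead of summarizing to genes
txi.kallisto <- tximport(files, type = "kallisto", txOut = TRUE)
write.table(txi.kallisto$counts, file = "countData")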
Each gene encodes a tRNA-Glu. They are found at different chromosomal locations (I have the full genome sequence), and all share the same sequence.
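Identical rows like these are consistent with the quantifier being unable to tell identical target sequences apart, so the estimate ends up shared equally among the copies. A minimal sketch of how such groups could be listed from an exported count matrix, using base R only. The matrix below is a toy example with made-up values for `Gene_X` and only three samples, not the real data:

```r
# Toy count matrix (hypothetical, not the real data): genes in rows, samples in columns
counts <- rbind(
  Gene_1 = c(49, 49.4, 71.2),
  Gene_2 = c(49, 49.4, 71.2),
  Gene_3 = c(49, 49.4, 71.2),
  Gene_X = c(10, 12.5, 30)
)
colnames(counts) <- paste0("sample", 1:3)

# Build one text key per row from its values, then group gene names by that key
row_key <- apply(counts, 1, paste, collapse = "_")
groups <- split(rownames(counts), row_key)

# Keep only groups containing more than one gene (identical count profiles)
dup_groups <- Filter(function(g) length(g) > 1, groups)
print(unname(dup_groups))
```

Genes that land in the same group are candidates for sharing one sequence; comparing the sequences themselves would confirm it.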
Since I am building a co-expression network, what should I do?
Thank you for your time!
Andrea
Thank you Michael,
No offense taken, I am not a tRNA researcher. I think I am going to collapse identical sequences and run the network analysis again.
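A minimal sketch of what collapsing could look like on the exported count matrix, assuming the sequences are available as a named character vector. The `seqs` strings and the counts below are toy placeholders, and summing the copies is one possible choice rather than the only one:

```r
# Hypothetical sequences keyed by gene name (toy strings, not real tRNA sequences)
seqs <- c(Gene_1 = "GTCCCCTTCG", Gene_2 = "GTCCCCTTCG",
          Gene_3 = "GTCCCCTTCG", Gene_X = "ACGTTGCAAT")

# Toy count matrix: genes in rows, samples in columns
counts <- rbind(
  Gene_1 = c(49, 49.4),
  Gene_2 = c(49, 49.4),
  Gene_3 = c(49, 49.4),
  Gene_X = c(10, 12.5)
)
colnames(counts) <- c("sample1", "sample2")

# Label every gene with the first gene that carries the same sequence
representative <- names(seqs)[match(seqs, seqs)]
names(representative) <- names(seqs)

# Sum the counts of all genes sharing a sequence into one row per unique sequence
collapsed <- rowsum(counts, group = representative[rownames(counts)], reorder = FALSE)
print(collapsed)
```

Another route would be to remove duplicate sequences from the FASTA file before building the index and then quantify again; the sketch above only post-processes an existing matrix.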
Best,
Andrea