Question

Deleted: Zero TPM for many genes, including housekeeping genes, when using kallisto

15 days ago

ashkan ▴ 160

I have some paired-end RNA-seq data and am trying to use kallisto to get counts per gene. When I look at the abundance.tsv file, many genes, including housekeeping genes, have a TPM of 0, which is not normal. To test whether the data itself has a problem, I aligned the reads to the human genome with STAR and then counted them with HTSeq; there the housekeeping genes have high counts. So my conclusion is that the data is fine and the problem may come from the kallisto command or from the reference files I used (the human transcriptome). Here are the kallisto commands that I used:

kallisto index -i Homo_sapiens.GRCh38.cdna.all.idx Homo_sapiens.GRCh38.cdna.all.fasta.gz

kallisto quant -i Homo_sapiens.GRCh38.cdna.all.idx -o output -b 100 reads_1.fastq.gz reads_2.fastq.gz

I got the cDNA reference from Ensembl (Homo_sapiens.GRCh38.cdna.all.fasta) and used it to build the index file.
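Since kallisto reports abundances per transcript rather than per gene, a gene-level view needs the transcript rows summed per gene. Below is a minimal sketch of that step, assuming the Ensembl cDNA FASTA headers carry a `gene:ENSG...` field and that the results sit in `output/abundance.tsv` (from `-o output`). The function names `make_t2g` and `sum_tpm_by_gene` and the file names `t2g.tsv` and `gene_tpm.tsv` are hypothetical helpers for illustration, not part of kallisto.

```shell
# Build a transcript-to-gene table from FASTA header lines read on stdin.
# Assumes Ensembl-style headers: ">ENST... cdna ... gene:ENSG... ..."
make_t2g() {
  awk '/^>/ {
    tx = substr($1, 2)                 # transcript ID without the ">"
    for (i = 2; i <= NF; i++)
      if ($i ~ /^gene:/) { print tx "\t" substr($i, 6); break }
  }'
}

# Sum the tpm column (5th) of abundance.tsv per gene.
# $1 = transcript-to-gene table, $2 = abundance.tsv
sum_tpm_by_gene() {
  awk -F'\t' '
    NR == FNR { gene[$1] = $2; next }  # first file: load the mapping
    FNR == 1  { next }                 # second file: skip the header row
    ($1 in gene) { tpm[gene[$1]] += $5 }
    END { for (g in tpm) printf "%s\t%g\n", g, tpm[g] }
  ' "$1" "$2" | sort
}

# Example usage (paths follow the commands above):
# gzip -dc Homo_sapiens.GRCh38.cdna.all.fasta.gz | make_t2g > t2g.tsv
# sum_tpm_by_gene t2g.tsv output/abundance.tsv > gene_tpm.tsv
```

A housekeeping gene that is zero in every transcript row would also come out as zero here, while a gene whose expression sits on a different isoform would not.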

Do you know what the problem could be? Here are a few lines for problematic genes:

target_id            length  eff_length  est_counts  tpm
ENST00000415118.1    8       9           0           0
ENST00000448914.1    13      1           0           0
ENST00000434970.2    9       10          0           0
ENST00000631435.1    12      13          0           0
ENST00000632684.1    12      13          0           0
ENST00000710614.1    16      4           0           0
ENST00000605284.1    17      5           0           0
ENST00000604642.1    23      11          0           0
ENST00000603077.1    31      7.33333     0           0
ENST00000229239.10   1285    1110.69     0           0
ENST00000604102.1    31      7.33333     0           0
ENST00000603693.1    19      7           0           0
ENST00000604950.1    31      7.33333     0           0
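Most of the rows above are only 8 to 31 nt long, so it may help to look at the zero-TPM rows for longer transcripts separately. This is a minimal sketch of such a filter, assuming the tab-separated column order shown in the header (length in column 2, tpm in column 5). The function name `zero_tpm_long` and the default cutoff of 200 are hypothetical choices for illustration.

```shell
# Print zero-TPM rows of a kallisto abundance table whose length is at
# least a cutoff. $1 = abundance.tsv, $2 = minimum length
# (hypothetical cutoff, defaults to 200).
zero_tpm_long() {
  awk -F'\t' -v min="${2:-200}" '
    FNR == 1 { print; next }            # keep the header row
    $2 >= min && $5 == 0 { print }      # column 2 = length, column 5 = tpm
  ' "$1"
}

# Example usage:
# zero_tpm_long output/abundance.tsv 200
```

Applied to the lines above, only the ENST00000229239.10 row would remain after the header.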

RNAseq • 369 views
