Deleted: Zero TPM for many genes, including housekeeping genes, when using kallisto
15 days ago
ashkan ▴ 160

I have some paired-end RNA-seq data and am trying to use kallisto to get counts per gene. In the resulting abundance.tsv file, many genes, including housekeeping genes, have a TPM of 0, which is not expected. To check whether the data itself is the problem, I aligned the reads to the human genome with STAR and then counted them with HTSeq; there the housekeeping genes have high counts. So my conclusion is that the data is fine and the problem may be in the kallisto commands or in the reference file I used (the human transcriptome). Here are the kallisto commands I used:

kallisto index -i Homo_sapiens.GRCh38.cdna.all.idx Homo_sapiens.GRCh38.cdna.all.fasta.gz

kallisto quant -i Homo_sapiens.GRCh38.cdna.all.idx -o output -b 100 reads_1.fastq.gz reads_2.fastq.gz

I got the cDNA reference from Ensembl (Homo_sapiens.GRCh38.cdna.all.fasta) and used it to build the index file.
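Since the goal is counts per gene while the index is built from transcripts, a transcript-to-gene map is needed at some point. Below is a minimal sketch of one way to get it from the reference itself. It assumes the FASTA headers carry a `gene:<ID>` field after the transcript ID, as Ensembl cDNA files usually do; the function names are hypothetical and this is not part of kallisto.

```python
import gzip
import re


def tx2gene_from_headers(lines):
    """Map transcript ID -> gene ID from FASTA header lines.

    Assumption: headers look like
    '>TRANSCRIPT_ID cdna ... gene:GENE_ID gene_biotype:...'.
    Headers without a 'gene:' field are skipped.
    """
    mapping = {}
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        transcript_id = line[1:].split()[0]
        match = re.search(r"\bgene:(\S+)", line)
        if match:
            mapping[transcript_id] = match.group(1)
    return mapping


def tx2gene_from_fasta(path):
    """Read a plain or gzipped FASTA file and build the mapping."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return tx2gene_from_headers(handle)
```

For example, `tx2gene_from_fasta("Homo_sapiens.GRCh38.cdna.all.fasta.gz")` would return a dictionary keyed by the same versioned transcript IDs that appear in the `target_id` column of abundance.tsv, provided the header assumption holds.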

Do you know what the problem could be? Here are a few lines for the problematic genes:

target_id   length  eff_length  est_counts  tpm
ENST00000415118.1   8   9   0   0
ENST00000448914.1   13  1   0   0
ENST00000434970.2   9   10  0   0
ENST00000631435.1   12  13  0   0
ENST00000632684.1   12  13  0   0
ENST00000710614.1   16  4   0   0
ENST00000605284.1   17  5   0   0
ENST00000604642.1   23  11  0   0
ENST00000603077.1   31  7.33333 0   0
ENST00000229239.10  1285    1110.69 0   0
ENST00000604102.1   31  7.33333 0   0
ENST00000603693.1   19  7   0   0
ENST00000604950.1   31  7.33333 0   0
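Each row above is a transcript (`target_id`), not a gene, so a zero on one transcript does not by itself mean the gene has zero signal. A rough sketch of summing TPM per gene is below. It assumes a tab-separated abundance.tsv with the column names shown above and a transcript-to-gene dictionary supplied by the caller; `gene_level_tpm` and `tx2gene` are hypothetical names, not part of kallisto.

```python
import csv
import io
from collections import defaultdict


def gene_level_tpm(abundance_text, tx2gene):
    """Sum transcript-level TPM values per gene.

    abundance_text: contents of a tab-separated abundance.tsv
    tx2gene: dict mapping transcript ID -> gene ID (assumed to be
             provided; unmapped transcripts keep their own ID)
    """
    totals = defaultdict(float)
    reader = csv.DictReader(io.StringIO(abundance_text), delimiter="\t")
    for row in reader:
        gene = tx2gene.get(row["target_id"], row["target_id"])
        totals[gene] += float(row["tpm"])
    return dict(totals)
```

It could be called as `gene_level_tpm(open("output/abundance.tsv").read(), tx2gene)`, after which the gene-level totals for the housekeeping genes can be compared with the HTSeq counts.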
RNAseq • 369 views