Low pseudo-alignment rate with Kallisto
4.8 years ago

Hi,

I am new to bioinformatics and am trying to perform differential expression analysis on some mouse RNA-seq data. The samples were sequenced with TruSeq Strand Specific Large Insert RNA Sequencing - High Coverage (50M pairs). I am now trying to pseudoalign the reads to the mouse transcriptome using Kallisto. I ran bam2fq to obtain two FASTQ files, and also generated mouse reference transcriptome indices from both Ensembl (Mus_musculus.GRCm38.cdna.all.fa) and the UCSC Genome Browser (refMrna.fa.gz).
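One thing that may be worth ruling out at this step: FASTQ pairs converted from a BAM are only usable as paired input if the mates appear in the same order in both files, which generally requires a name-sorted or collated BAM. The sketch below is a rough sanity check, not part of Kallisto or samtools. The function names are hypothetical, and it assumes uncompressed FASTQ files with exactly four lines per record.

```python
import itertools


def read_names(fastq_path, limit=10000):
    """Yield read names from the first `limit` records, without /1 or /2 suffixes."""
    with open(fastq_path) as handle:
        for i, line in enumerate(itertools.islice(handle, 4 * limit)):
            if i % 4 == 0:  # header line of each four-line record
                name = line[1:].split()[0]
                if name.endswith(("/1", "/2")):
                    name = name[:-2]
                yield name


def mates_in_sync(fastq1, fastq2, limit=10000):
    """Return True if the first `limit` read names match, in order, in both files."""
    names1 = list(read_names(fastq1, limit))
    names2 = list(read_names(fastq2, limit))
    return bool(names1) and names1 == names2
```

Calling `mates_in_sync("pairA1.fastq", "pairA2.fastq")` on the files from the command in this post would return `False` if the mates are out of order within the first 10,000 records.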

I ran kallisto using the following command:

kallisto quant -i index -o output pairA1.fastq pairA2.fastq

For all the samples, the resulting run_info.json output looks similar to the example below:

"n_targets": 42184,
"n_bootstraps": 0,
"n_processed": 73044298,
"n_pseudoaligned": 33281349,
"n_unique": 19777682,
"p_pseudoaligned": 45.6,
"p_unique": 27.1,
"kallisto_version": "0.45.0",
"index_version": 10,
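For comparing samples, the percentages can be recomputed from the raw counts in each run_info.json. This is a minimal sketch: the helper names are hypothetical, and it assumes the underscore-separated key names that kallisto writes (n_processed, n_pseudoaligned, n_unique).

```python
import json
from pathlib import Path


def load_run_info(output_dir):
    """Read run_info.json from a kallisto output directory."""
    return json.loads((Path(output_dir) / "run_info.json").read_text())


def alignment_rates(run_info):
    """Recompute pseudoalignment percentages from the raw read counts."""
    processed = run_info["n_processed"]
    return {
        "p_pseudoaligned": round(100 * run_info["n_pseudoaligned"] / processed, 1),
        "p_unique": round(100 * run_info["n_unique"] / processed, 1),
    }


# Counts copied from the run_info.json excerpt in this post.
example = {
    "n_processed": 73044298,
    "n_pseudoaligned": 33281349,
    "n_unique": 19777682,
}
rates = alignment_rates(example)
print(rates)  # {'p_pseudoaligned': 45.6, 'p_unique': 27.1}
```

With a real run, `alignment_rates(load_run_info("output"))` would give the same summary for the output directory used in the command above.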

I would really appreciate any help in troubleshooting this issue. Is it an issue with the data quality, or should I be running Kallisto with additional arguments (strand-specific options, etc.)?
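On the strand-specific arguments: kallisto quant accepts --fr-stranded and --rf-stranded. A hedged sketch of how one might build the three variants of the command to compare them on the same reads is below. The helper name and the thread count are assumptions, and which flag matches this library should be verified rather than taken from this example (dUTP-based stranded kits are usually treated as reverse-stranded, i.e. --rf-stranded).

```python
import shlex


def kallisto_quant_cmd(index, out_dir, fastq1, fastq2, strand=None, threads=4):
    """Build a paired-end `kallisto quant` command as an argument list.

    strand: None (unstranded), "fr" (--fr-stranded) or "rf" (--rf-stranded).
    """
    cmd = ["kallisto", "quant", "-i", index, "-o", out_dir, "-t", str(threads)]
    if strand == "fr":
        cmd.append("--fr-stranded")
    elif strand == "rf":
        cmd.append("--rf-stranded")
    elif strand is not None:
        raise ValueError(f"unknown strand setting: {strand!r}")
    cmd += [fastq1, fastq2]
    return cmd


# Print one command per strand setting, each writing to its own output directory.
for strand in (None, "fr", "rf"):
    out_dir = f"output_{strand or 'unstranded'}"
    cmd = kallisto_quant_cmd("index", out_dir, "pairA1.fastq", "pairA2.fastq", strand)
    print(shlex.join(cmd))
```

The commands are only printed here; each list could be passed to `subprocess.run(cmd, check=True)` once kallisto is on the PATH. Comparing the resulting run_info.json files would show how the library orientation affects the counts.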

Thank you very much for your help and please let me know if I can provide any additional information.


First, I would check for rRNA contamination. There are several threads here discussing methods to do so (e.g. "How to screen for rRNA and gDNA contamination in RNA-seq data?"). RSeQC can also give some useful diagnostics, but you will have to map to the genome to use it.


What else could cause a low pseudoalignment rate if there is little or no contamination and the strandedness option has been set correctly?


Can you provide any follow-up, msubramanian1?
