Question: Low mapping rate in single-cell RNA-seq data?
2.5 years ago by christacaggiano (50) wrote:


Using the STAR aligner, I am getting a very low mapping percentage (5-10%) for my single-cell RNA-seq data. More than 90% of my reads are reported as unmapped because they are "too short". My current parameters are:

STAR --genomeDir --outFilterScoreMinOverLread 0.3 --outFilterMatchNminOverLread 0.3 --outReadsUnmapped Fastx --outSAMstrandField intronMotif --readFilesCommand zcat --readFilesIn *.fq.gz --runThreadN 6

I am also trimming the reads with Trim Galore as follows:

trim_galore $R2_file --trim-n -a AAAAAAAA -clip_R1 9 -o $dir_name

Are there any hypotheses for why we are getting such a low percentage of mapped reads? I am particularly interested in assessing contamination. Is there good software for quickly checking whether my samples could be contaminated? I have no good idea what they could be contaminated with.
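The mapping and "too short" percentages quoted in the question are the kind of figures STAR writes to its Log.final.out summary. As a minimal sketch for comparing those figures across many cells or samples, the Python below parses the text of such a file into a dictionary. It assumes the usual "label | value" layout of that file; the function names `parse_star_log` and `percent` are made up for illustration and are not part of STAR.

```python
def parse_star_log(text):
    """Return {label: value string} from the text of a STAR Log.final.out file.

    Assumes each statistic is written as 'label | value' on its own line.
    """
    stats = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # skip blank lines and section headers
        key, _, value = line.partition("|")
        stats[key.strip()] = value.strip()
    return stats


def percent(stats, key):
    """Convert a value such as '91.20%' to a float, or return None if the label is absent."""
    value = stats.get(key)
    if value is None:
        return None
    return float(value.rstrip("%"))
```

For example, after reading a log with `open("Log.final.out").read()`, `percent(stats, "% of reads unmapped: too short")` and `percent(stats, "Uniquely mapped reads %")` give the two numbers discussed above, assuming the labels match those in your STAR version's log.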



One should never trim reads independently if you have paired-end data. You are also not scanning for or removing Illumina adapters.

written 2.5 years ago by genomax (85k)

My presumption is that this is something like CEL-Seq2 data and the OP is trying to remove the poly-A sequence from read 2 (if it's still there, it'll get soft-clipped, so I think that's excess effort). If that's the case, read 1 is mostly polyA plus UMI/cell barcode, which I imagine is causing the mapping issues.

written 2.5 years ago by Devon Ryan (95k)
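One way to check the hypothesis in the comment above is to measure how many reads in a given FASTQ file are dominated by a long A or T homopolymer. The Python below is a rough sketch under stated assumptions: the FASTQ has exactly four lines per record, both A and T runs are counted because the orientation depends on the protocol, and the 15-base threshold is an arbitrary illustrative choice. The function names are hypothetical.

```python
import gzip


def read_sequences(path):
    """Yield sequence lines from a FASTQ file (plain or .gz), assuming 4 lines per record."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # the second line of each record holds the bases
                yield line.strip()


def longest_run(seq, bases="AT"):
    """Return the length of the longest run of one repeated base drawn from `bases`."""
    best = run = 0
    prev = None
    for base in seq.upper():
        if base in bases and base == prev:
            run += 1
        elif base in bases:
            run = 1
        else:
            run = 0
        prev = base
        best = max(best, run)
    return best


def homopolymer_fraction(seqs, min_run=15):
    """Fraction of sequences containing an A or T run of at least `min_run` bases."""
    total = hits = 0
    for seq in seqs:
        total += 1
        if longest_run(seq) >= min_run:
            hits += 1
    return hits / total if total else 0.0
```

Calling `homopolymer_fraction(read_sequences("read1.fq.gz"))` (a placeholder file name) on each mate separately would show whether one mate is mostly homopolymer, in which case aligning only the other mate may be worth trying.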

