Question: Low mappability in single-cell RNA-seq data?
christacaggiano (UCSF) wrote 15 months ago:

Hi,

Using the STAR aligner, I am getting a very low mapping percentage (5-10%) for my single-cell RNA-seq data. The majority of my reads (>90%) are reported as unmapped because they are "too short". My current parameters are: STAR --genomeDir --outFilterScoreMinOverLread 0.3 --outFilterMatchNminOverLread 0.3 --outReadsUnmapped Fastx --outSAMstrandField intronMotif --readFilesCommand zcat --readFilesIn *.fq.gz --runThreadN 6

I am also trimming the reads with trim galore as follows: trim_galore $R2_file --trim-n -a AAAAAAAA -clip_R1 9 -o $dir_name

Is there any hypothesis for why we are getting such a low percentage of mapped reads? I am particularly interested in assessing contamination. Is there good software for quickly assessing whether my samples could be contaminated? I have no good idea what they could be contaminated with.

Thanks!


One should never trim reads independently if you have paired-end data. You are also not scanning for or removing Illumina adapters.

written 15 months ago by genomax

My presumption is that this is something like CEL-Seq2 data and OP is trying to remove polyadenylation from read 2 (if it's still there then it'll get soft-clipped, so I think that's excess effort). If that's the case, read 1 is mostly polyA plus UMI/cell barcode, which I imagine is causing mapping issues.

written 15 months ago by Devon Ryan
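
One quick way to test the polyA hypothesis above is to measure what fraction of reads in a FASTQ file contain a long run of A or T. The following is only a minimal sketch, not something posted by anyone in this thread: the function names, the 15-base run threshold, and the idea of sampling the first 100,000 reads are all illustrative assumptions, and a high fraction would only suggest (not prove) that the file holds mostly polyA/barcode sequence rather than mappable cDNA.

```python
import gzip


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def iter_sequences(handle):
    """Yield the sequence line of each 4-line FASTQ record, upper-cased."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip().upper()


def longest_run(seq, base):
    """Return the length of the longest uninterrupted run of `base` in `seq`."""
    best = cur = 0
    for ch in seq:
        cur = cur + 1 if ch == base else 0
        best = max(best, cur)
    return best


def poly_fraction(seqs, bases=("A", "T"), min_run=15):
    """Fraction of sequences with a run of at least `min_run` of any base in `bases`.

    `min_run=15` is an arbitrary illustrative threshold, not a recommended value.
    """
    total = hits = 0
    for seq in seqs:
        total += 1
        if any(longest_run(seq, b) >= min_run for b in bases):
            hits += 1
    return hits / total if total else 0.0
```

For example, with a hypothetical file name, `with open_fastq("read1.fq.gz") as fh: print(poly_fraction(itertools.islice(iter_sequences(fh), 100000)))` would report the fraction for the first 100,000 reads (this needs `import itertools`). Running it separately on read 1 and read 2 should show whether one of the two files is dominated by polyA/T stretches.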