Low mappability in single-cell RNA-seq data?
6.2 years ago
ccagg ▴ 60

Hi,

Using the STAR aligner, I am getting a very low mapping rate (5-10%) for my single-cell RNA-seq data. More than 90% of the reads are reported as unmapped because they are "too short". My current command is: STAR --genomeDir --outFilterScoreMinOverLread 0.3 --outFilterMatchNminOverLread 0.3 --outReadsUnmapped Fastx --outSAMstrandField intronMotif --readFilesCommand zcat --readFilesIn *.fq.gz --runThreadN 6

I am also trimming the reads with Trim Galore as follows: trim_galore $R2_file --trim-n -a AAAAAAAA -clip_R1 9 -o $dir_name

Does anyone have a hypothesis for why we are getting such a low percentage of mapped reads? I am particularly interested in assessing contamination. Is there good software for quickly checking whether my samples could be contaminated? I have no idea what they might be contaminated with.

Thanks!
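
Since the command above already writes unmapped reads out with --outReadsUnmapped Fastx, one quick, tool-free way to get a first impression of possible contamination is to list the most frequent unmapped sequences and BLAST a handful of them. The following is only a minimal sketch: the function name top_sequences is made up for illustration, and it assumes plain 4-line FASTQ records on standard input.

```shell
# Print the N most frequent read sequences from a FASTQ stream on stdin.
# Assumption: standard 4-line FASTQ records, so the sequence is line 2 of each record.
# top_sequences is a hypothetical helper name, not part of STAR or any other tool.
top_sequences() {
    n="${1:-10}"                      # number of sequences to report (default 10)
    awk 'NR % 4 == 2' |               # keep only the sequence lines
        sort |                        # group identical sequences together
        uniq -c |                     # count each distinct sequence
        sort -k1,1nr |                # most frequent first
        head -n "$n"
}
```

Assuming STAR wrote its unmapped reads to a file named Unmapped.out.mate1 in the output directory, a call such as `top_sequences 20 < Unmapped.out.mate1` would list the 20 most common sequences. Adapter, poly(A)/poly(T), or rRNA sequences dominating that list would point away from contamination and towards a library or trimming problem.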

RNA-Seq contamination alignment • 2.4k views

One should never trim the two reads of a pair independently (if you have paired-end data). You are also not scanning for or removing Illumina adapters.
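
The reason independent trimming is a problem is that a trimmer may discard a read from one file but not its mate, after which the two files no longer line up record by record. A rough sanity check, sketched below, is to compare read counts between the two files. The function names count_reads and pairs_in_sync are hypothetical, and equal counts are a necessary but not sufficient sign that the pairs are still matched.

```shell
# Count reads in a FASTQ stream on stdin.
# Assumption: 4 lines per record. count_reads is a hypothetical helper name.
count_reads() {
    awk 'END { print NR / 4 }'
}

# Succeed (exit status 0) only if two gzipped FASTQ files hold the same number of reads.
# pairs_in_sync is a hypothetical helper name; arguments are the R1 and R2 files.
pairs_in_sync() {
    n1=$(gzip -dc "$1" | count_reads)
    n2=$(gzip -dc "$2" | count_reads)
    echo "R1: $n1 reads, R2: $n2 reads"
    [ "$n1" = "$n2" ]
}
```

For example, `pairs_in_sync sample_R1.fq.gz sample_R2.fq.gz || echo "pairs out of sync"` (file names are placeholders) would flag a pair of files whose read counts diverged after trimming.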


My presumption is that this is something like CEL-Seq2 data and OP is trying to remove polyadenylation from read 2 (if it's still there then it'll get soft-clipped, so I think that's excess effort). If that's the case, read 1 is mostly polyA plus UMI/cell barcode, which I imagine is causing mapping issues.
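
Whether read 1 really is mostly barcode plus a homopolymer stretch can be checked directly before changing the alignment. The sketch below reports the fraction of reads that contain a long run of A or T. The function name polya_fraction and the default run length of 15 are arbitrary choices for illustration, and 4-line FASTQ records on standard input are assumed.

```shell
# Report how many reads on stdin contain a run of at least min_run A's or T's.
# Assumptions: 4-line FASTQ records; polya_fraction and the default of 15 are
# illustrative choices, not values taken from any protocol.
polya_fraction() {
    min_run="${1:-15}"
    awk -v k="$min_run" '
        BEGIN {
            # Build the two homopolymer probes, e.g. AAAAA and TTTTT for k = 5
            a = sprintf("%" k "s", ""); t = a
            gsub(/ /, "A", a); gsub(/ /, "T", t)
        }
        NR % 4 == 2 {
            total++
            s = toupper($0)
            if (index(s, a) || index(s, t)) hits++
        }
        END {
            if (total) printf "%d/%d reads (%.1f%%)\n", hits, total, 100 * hits / total
            else print "no reads"
        }
    '
}
```

A call like `zcat sample_R1.fq.gz | polya_fraction 15` (file name is a placeholder) gives a quick number. If most of read 1 turns out to be homopolymer, then aligning read 2 on its own, rather than passing both files through the *.fq.gz glob, is one thing worth trying.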
