What does a 'normal' FastQC report look like for scRNA-seq data?
4.7 years ago

I'm specifically interested in the 'failures' I got on R2 (R1 being just the barcode + UMI). Note that this is 10x single-cell RNA-sequencing (scRNA-seq) data.

Test 'failures':

  • Per tile sequence quality (quality appears slightly worse at about 54 bp, the middle of the read, with a corresponding increase in Ns; I'm not sure how to interpret this, having read this)
  • Per base sequence content (at start of read only, for this reason I think)
  • Per sequence GC content (there is a shifted peak to the left, i.e. AT-rich)
  • Sequence Duplication Levels (I guess due to highly expressed genes)
  • Overrepresented sequences (no hits; up to 1%)
  • Kmer Content (again at start of read only)

Thank you for your help in interpreting this!

RNA-Seq scRNA-seq fastqc fastq 10x • 3.5k views

I have generally not bothered to run FastQC on 10x data. Once you do the analysis using the 10x software, you can start paying attention to the metrics (number of cells, reads per cell, etc.).

You could run FastQC on data 10x makes available on their site to see how your data compares.
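
One way to make that comparison concrete is to diff the per-module PASS/WARN/FAIL calls from two FastQC reports. The sketch below assumes the `summary.txt` file that FastQC writes alongside each report, with tab-separated status, module name and file name on each line. The function names and the example contents are made up for illustration.

```python
def parse_fastqc_summary(text):
    """Map module name -> status (PASS/WARN/FAIL) from FastQC summary.txt text."""
    statuses = {}
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) >= 2:
            # Column 0 is the status, column 1 is the module name.
            statuses[parts[1]] = parts[0]
    return statuses


def compare_summaries(mine, reference):
    """Return modules whose status differs, as {module: (mine, reference)}."""
    return {
        module: (status, reference.get(module))
        for module, status in mine.items()
        if reference.get(module) != status
    }


# Hypothetical summary.txt contents, not real results.
my_report = (
    "PASS\tPer base sequence quality\tsample_R2.fastq.gz\n"
    "FAIL\tPer sequence GC content\tsample_R2.fastq.gz\n"
)
ref_report = (
    "PASS\tPer base sequence quality\tref_R2.fastq.gz\n"
    "WARN\tPer sequence GC content\tref_R2.fastq.gz\n"
)

differences = compare_summaries(
    parse_fastqc_summary(my_report), parse_fastqc_summary(ref_report)
)
print(differences)
```

Modules that fail in both your data and the reference data are more likely to reflect the library type than a problem with your run.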


Thanks for the suggestion! I will do that.

Perhaps my question is too broad, but what concerns me most is that there are 1-2 bp with slightly lower quality, although not bad enough to fail the "Per base sequence quality" test. These correspond to red blocks in "Per tile sequence quality" and a small spike in "Per base N content". Is this something to be concerned about?
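
To put a number on how large such a dip is, the fraction of N calls and the mean quality at each read position can be computed directly from the FASTQ file. This is a minimal sketch, assuming four-line FASTQ records and Phred+33 quality encoding; `read_fastq` and `per_position_stats` are illustrative names and the toy reads are invented.

```python
import gzip


def read_fastq(path):
    """Yield (sequence, quality) pairs from a plain or gzipped FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        while True:
            header = handle.readline()
            if not header:
                break
            seq = handle.readline().strip()
            handle.readline()  # skip the '+' separator line
            qual = handle.readline().strip()
            yield seq, qual


def per_position_stats(records):
    """Return a list of (N fraction, mean Phred quality), one entry per position."""
    n_counts, qual_sums, depth = [], [], []
    for seq, qual in records:
        for i, (base, q) in enumerate(zip(seq, qual)):
            if i == len(depth):
                n_counts.append(0)
                qual_sums.append(0)
                depth.append(0)
            depth[i] += 1
            qual_sums[i] += ord(q) - 33  # assumes Phred+33 encoding
            if base.upper() == "N":
                n_counts[i] += 1
    return [(n / d, s / d) for n, s, d in zip(n_counts, qual_sums, depth)]


# Toy reads for illustration; replace with read_fastq("your_R2.fastq.gz").
toy_records = [("ACGT", "IIII"), ("ANGT", "I#II"), ("ACG", "III")]
stats = per_position_stats(toy_records)
for position, (n_fraction, mean_quality) in enumerate(stats, start=1):
    print(position, round(n_fraction, 3), round(mean_quality, 1))
```

If the N fraction at the affected positions is a small share of reads, the effect on alignment is likely to be minor, but that is a judgment to make on the real numbers.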

I've run CellRanger, and the downstream metrics you mention, such as reads per cell, seem okay.


If you had a few bad tiles or Ns in places, those sequences should have been taken care of by STAR, which CellRanger uses to do the alignments. If your run metrics look okay in CellRanger, then you are probably fine to proceed.


Thanks! I think the run metrics are okay, but "Reads Mapped to [the mouse] Genome" is 90% for some samples and 80% for others, which is lower than in the 10x example data you linked to, where it's over 95% (e.g. here). Would that concern you?
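
For several samples, the mapping rates can be collected side by side rather than read off each report. The sketch below assumes a per-sample metrics CSV with one header row and one value row, and a column named "Reads Mapped to Genome" holding a percentage string; check the column name against your own CellRanger output. The 85% cutoff is an arbitrary illustration, not a recommended threshold, and the sample values are invented.

```python
import csv
import io


def mapping_rate(csv_text, column="Reads Mapped to Genome"):
    """Return the mapping rate in percent from metrics CSV text (assumed layout)."""
    row = next(csv.DictReader(io.StringIO(csv_text)))
    return float(row[column].rstrip("%"))


def flag_low_mapping(rates, threshold=85.0):
    """Return sample names whose rate is below an arbitrary illustrative cutoff."""
    return sorted(name for name, rate in rates.items() if rate < threshold)


# Hypothetical CSV contents, not real results.
metrics = {
    "sample_a": 'Estimated Number of Cells,Reads Mapped to Genome\n"5,012",90.2%\n',
    "sample_b": 'Estimated Number of Cells,Reads Mapped to Genome\n"4,870",80.1%\n',
}

rates = {name: mapping_rate(text) for name, text in metrics.items()}
flagged = flag_low_mapping(rates)
print(rates, flagged)
```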


Test data provided by 10x is likely ideal, so it is not terribly surprising that yours does not look as good. A 90% alignment rate is not bad, but this is something you will need to judge for yourself. I will see if one of the other mods with more experience with user data wants to chime in.


It's a bit easier to judge the results if you provide the actual figures. From what you've been describing, neither FastQC nor CellRanger summaries are raising any red flags.
