Deleted: Guidance Needed on scRNA-seq QC Adaptive & Fixed Thresholds
11 months ago
cfa24357 ▴ 10

Hi,

I've been stuck on the QC of a 10X scRNA-seq dataset in our lab. We've previously performed adaptive thresholding on other datasets using the 'scater' package, and I generally prefer that to setting fixed, arbitrary thresholds. However, for this particular dataset, adaptive thresholding failed for two out of ten samples.
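For readers unfamiliar with the idea, adaptive thresholding is commonly done by flagging cells whose QC metric lies more than a few median absolute deviations (MADs) from the sample median. The sketch below is a plain-Python stand-in for that general idea, not the 'scater' code itself (which is R); the function name `mad_outliers`, the choice of 3 MADs, and the example values are all assumptions made for illustration.

```python
import math
from statistics import median


def mad_outliers(values, nmads=3.0, direction="lower", log=False):
    """Flag values lying more than `nmads` MADs from the median.

    Illustrative helper (hypothetical name). `nmads=3.0` is an assumed
    cutoff, not a value taken from the post.
    """
    # Optional log10 transform; assumes every value is strictly positive.
    xs = [math.log10(v) for v in values] if log else list(values)
    med = median(xs)
    # 1.4826 scales the MAD so it is comparable to a standard deviation
    # for normally distributed data.
    mad = 1.4826 * median([abs(x - med) for x in xs])
    lower = med - nmads * mad
    upper = med + nmads * mad
    flags = []
    for x in xs:
        if direction == "lower":
            flags.append(x < lower)
        elif direction == "higher":
            flags.append(x > upper)
        else:  # "both"
            flags.append(x < lower or x > upper)
    return flags


# Made-up library sizes for one sample: five typical cells, one very small.
lib_sizes = [5200, 4800, 5100, 4950, 5050, 300]
low_lib = mad_outliers(lib_sizes, log=True, direction="lower")

# Made-up mitochondrial percentages where most cells look stressed.
mito_pct = [5, 6, 7, 40, 45, 50, 55, 60]
high_mito = mad_outliers(mito_pct, direction="higher")
```

In the first toy sample only the cell with 300 counts is flagged. In the second, no cell is flagged, because the median and MAD are themselves driven by the stressed cells. This is one way a MAD-based cutoff can become uninformative when a large share of a sample is low quality, which may or may not be what happened in the two D9 samples.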

Briefly, we have 4 conditions (Control, Treatments 1-3) with 2-3 timepoints (D5, D7, D9 or D7 & D9) for each group. Cells from Treatments 2 & 3 are expected to have a phenotype between Control (i.e. healthy) and Treatment 1. Adaptive thresholding failed for two samples: Treatment-1_D9 and Treatment-3_D9. The treatments are quite stressful on the cells, so by pushing the system to D9 we expect potentially high levels of cell death in these two samples. QC plots are attached below.

[QC plots: four images, no descriptions provided]

My question is, what QC strategy would be appropriate here? Options I am considering are:

  1. Apply adaptive thresholding to all samples except the two that failed. Apply fixed, arbitrary thresholds to samples Treatment-1_D9 and Treatment-3_D9.
  2. Fixed thresholds, using different thresholds for each group.

     Control

       • Library size < 4500
       • No. of expressed genes < 2500
       • Percentage of reads mapping to mitochondrial transcripts > 10%

     Treatments 1-3

       • Library size < 500
       • No. of expressed genes < 1000
       • Percentage of reads mapping to mitochondrial transcripts > 25%

  3. Fixed thresholds, using the same thresholds for all groups.

       • Library size < 500
       • No. of expressed genes < 1000
       • Percentage of reads mapping to mitochondrial transcripts > 25%
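To make the difference between options 2 and 3 concrete, here is a minimal Python sketch of the fixed-threshold filters. The numbers are copied from the lists above. It assumes the listed conditions describe cells to be discarded, and the names `FixedThresholds`, `thresholds_for`, `is_discarded` and the dictionary keys are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedThresholds:
    min_library_size: int
    min_genes: int
    max_mito_pct: float


# Threshold values as listed in the post.
CONTROL = FixedThresholds(min_library_size=4500, min_genes=2500, max_mito_pct=10.0)
TREATMENT = FixedThresholds(min_library_size=500, min_genes=1000, max_mito_pct=25.0)


def thresholds_for(group, per_group=True):
    """Option 2 (per_group=True): Control gets its own, stricter thresholds.
    Option 3 (per_group=False): every group uses the permissive thresholds."""
    if per_group and group == "Control":
        return CONTROL
    return TREATMENT


def is_discarded(cell, thr):
    """A cell is discarded if it fails any one of the three criteria
    (assumed reading of the lists above)."""
    return (
        cell["library_size"] < thr.min_library_size
        or cell["n_genes"] < thr.min_genes
        or cell["mito_pct"] > thr.max_mito_pct
    )


# A made-up Control cell that sits between the two threshold sets.
cell = {"library_size": 3000, "n_genes": 2000, "mito_pct": 8.0}
discarded_option2 = is_discarded(cell, thresholds_for("Control", per_group=True))
discarded_option3 = is_discarded(cell, thresholds_for("Control", per_group=False))
```

The same hypothetical Control cell is removed under option 2 but kept under option 3, which is the practical trade-off between the two.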

I ran a preliminary analysis using option 3, which is what my colleagues were pushing for: they did not want to risk losing any novel rare cell types, given that we know nothing about the cell populations in these treatment conditions.

However, at the clustering stage of this preliminary analysis, I am left with a few clusters of ambiguous quality (clusters 6, 7 & 8 in the violin plots below). These clusters also had uninformative marker genes (some non-coding RNAs, some RPL- and RPS- genes, MALAT1, MT- genes), which may be due to either cell quality or the biology of the treatments.

[Violin plots by cluster: image, no description provided]
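One way to look at the ambiguous clusters numerically, alongside the violin plots, is to tabulate the QC metrics and sample composition per cluster. The following is only a sketch: the function name `summarize_clusters`, the dictionary keys, and the example cells (including the sample names other than Treatment-1_D9) are made up for illustration.

```python
from collections import defaultdict
from statistics import median


def summarize_clusters(cells):
    """Per-cluster medians of the three QC metrics, plus contributing samples.

    `cells` is a list of dicts with hypothetical keys: "cluster", "sample",
    "library_size", "n_genes", "mito_pct".
    """
    by_cluster = defaultdict(list)
    for cell in cells:
        by_cluster[cell["cluster"]].append(cell)

    summary = {}
    for cluster, members in sorted(by_cluster.items()):
        summary[cluster] = {
            "n_cells": len(members),
            "median_library_size": median([c["library_size"] for c in members]),
            "median_n_genes": median([c["n_genes"] for c in members]),
            "median_mito_pct": median([c["mito_pct"] for c in members]),
            # Which samples contribute cells to this cluster.
            "samples": sorted({c["sample"] for c in members}),
        }
    return summary


# Made-up cells for illustration only.
cells = [
    {"cluster": 1, "sample": "Control_D5", "library_size": 6000, "n_genes": 3000, "mito_pct": 4.0},
    {"cluster": 1, "sample": "Control_D7", "library_size": 8000, "n_genes": 3400, "mito_pct": 6.0},
    {"cluster": 6, "sample": "Treatment-1_D9", "library_size": 700, "n_genes": 1100, "mito_pct": 22.0},
]
summary = summarize_clusters(cells)
```

A cluster whose medians sit just inside the permissive cutoffs and whose cells come mostly from the two D9 samples would be a candidate for closer inspection, although such a table alone cannot separate low quality from treatment biology.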

I suspect some cells in these clusters are low-quality given the rather permissive thresholds we applied under option 3. However, it is unclear whether any particular cluster should be excluded completely. I am therefore reviewing the QC thresholds again and wanted to get some thoughts from the hive here on whether options 1 & 2 would be appropriate.

The literature and general recommendations suggest applying different metric thresholds to each sample individually, which I still strongly feel should be done here, hence why I am leaning towards option 1. Any feedback on this would be much appreciated :) Thanks!

scRNA-seq single-cell • 317 views