I am using MAGeCK to define essential genes from a CRISPR screen on model MPNST cell lines with the Brunello library. Counting and normalization run without problems, and I get a count table that looks like this:
sgRNA Gene ST8814_Final ST8814_Initial STS26T_Final STS26T_Initial
NEIL1_49554_TGGAAGGGGGCACTTAGCGA NEIL1 1339 1144 936 1036
TEN1_75497_GACCTATTACCTCCCCTGGG TEN1 500 803 479 593
CAND1_43519_AGTCTAGGGCTGGTCAACTG CAND1 1247 1208 1151 1356
PKD1L1_64018_CAAGAAGCAATACCATCGGG PKD1L1 867 1110 1181 874
Non-Targeting-Control_76949_CTTACGCGCCTGGTCAAAAG Non-Targeting-Control 962 597 605 576
SPANXN2_73423_CATCAATCCAGTCCAAGAGG SPANXN2 1719 1629 2208 1757
ENC1_22140_TGAAAGTCTGTCCTCCCAGA ENC1 1178 960 1702 1250
C6orf99_75339_TGTCCATCTGCTCTTCCGAG C6orf99 1421 1158 1242 1162
TMED6_61599_GGCAGGTTCAGCGGACAGTG TMED6 1167 864 1301 824
TMEM178B_75964_TATCACGAAGACCATCCGTC TMEM178B 3402 1247 1060 996
TMEM205_70361_ACTAGTCCGAAGGTATGTCG TMEM205 379 302 282 243
...
However, the mle module returns sensible results for STS26T (comparing Final to Initial) but produces the following for ST88-14:
Gene sgRNA ST88-14_final|beta ST88-14_final|z ST88-14_final|p-value ST88-14_final|fdr ST88-14_final|wald-p-value ST88-14_final|wald-fdr STS26T_final|beta STS26T_final|z STS26T_final|p-value STS26T_final|fdr STS26T_final|wald-p-value STS26T_final|wald-fdr
NEIL1 4 0 nan 0 0 nan nan -0.0091435 -0.097767 0.87909 0.98842 0.92212 0.97488
TEN1 4 0 nan 0 0 nan nan 0.070497 0.43913 0.50536 0.90936 0.66057 0.88065
CAND1 4 0 nan 0 0 nan nan 0.02579 0.21461 0.71067 0.96525 0.83007 0.9478
PKD1L1 4 0 nan 0 0 nan nan 0.16576 1.5978 0.20201 0.74051 0.11008 0.46527
Non-Targeting-Control 1000 0 0 0 0 1 3822.6 0 0 0.76262 0.97277 1 1
SPANXN2 4 0 nan 0 0 nan nan 0.021365 0.20247 0.73348 0.96835 0.83955 0.95029
ENC1 4 0 nan 0 0 nan nan 0.46622 2.9077 0.0039764 0.088681 0.0036408 0.11948
C6orf99 4 0 nan 0 0 nan nan 0.029127 0.62869 0.69434 0.96222 0.52955 0.81888
TMED6 4 0 nan 0 0 nan nan -0.055822 -0.35504 0.89248 0.98933 0.72256 0.9063
TMEM178B 4 0 nan 0 0 nan nan -0.14222 -0.8204 0.52938 0.91757 0.41199 0.74758
TMEM205 4 0 nan 0 0 nan nan 0.27052 0.93014 0.053785 0.42638 0.3523 0.70946
RASA4 4 0 nan 0 0 nan nan -0.19407 -0.95469 0.36954 0.86055 0.33973 0.69843
CD72 4 0 nan 0 0 nan nan 0.1884 2.1184 0.15351 0.66773 0.034139 0.30462
RERGL 4 0 nan 0 0 nan nan 0.094596 0.53689 0.41082 0.87566 0.59135 0.85102
GNG7 4 0 nan 0 0 nan nan -0.17606 -1.1091 0.41935 0.88024 0.26738 0.64323
MAGEA10-MAGEA5 4 0 nan 0 0 nan nan -0.22891 -1.7663 0.28536 0.81425 0.077347 0.41225
...
Does anyone know why this would be the case? The QC metrics for the sequencing, evenness of counts, and dropout of sgRNAs look great, so I'm confused as to what is going wrong on my end. For reference, the command and design matrix for the mle run are shown below:
mageck mle -k MPNST.count.txt -d MPNST_design.txt -n MPNST_mle
Samples baseline ST8814_final STS26T_final
ST8814_Final 1 1 0
ST8814_Initial 1 0 0
STS26T_Final 1 0 1
STS26T_Initial 1 0 0
Any help is appreciated. Thank you!
Do you have controls such as non-targeting guides and true killing controls? Make a plot with the groups on the x-axis and the estimated β value on the y-axis, colored by guide category: one color each for non-targeting guides, killing controls, and all other guides. As a first troubleshooting step, check whether the NTCs are neutral and the killing controls actually drop out. It could be a normalization issue. Also try MAGeCK RRA to get individual per-guide logFCs for this diagnostic.
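If it helps, here is a rough sketch of that per-guide diagnostic done by hand from the count table, independent of MAGeCK. It is not what MAGeCK computes internally: it assumes simple total-count scaling and a pseudocount of 1, and the function names and the `killing_controls` set are placeholders I made up for illustration, so substitute your own list of known essential genes.

```python
import math
from statistics import median


def log2_fold_changes(rows, final_col, initial_col, pseudocount=1.0):
    """Per-guide log2(final/initial) after scaling both samples to a common total.

    rows: list of dicts with 'sgRNA', 'Gene' and the two count columns.
    Total-count scaling is an assumption of this sketch, not MAGeCK's method.
    """
    total_final = sum(r[final_col] for r in rows)
    total_initial = sum(r[initial_col] for r in rows)
    mean_total = (total_final + total_initial) / 2
    result = []
    for r in rows:
        final = r[final_col] * mean_total / total_final
        initial = r[initial_col] * mean_total / total_initial
        lfc = math.log2((final + pseudocount) / (initial + pseudocount))
        result.append((r["sgRNA"], r["Gene"], lfc))
    return result


def category(gene, killing_controls):
    """Assign a guide to one of the three groups used for coloring the plot."""
    if gene == "Non-Targeting-Control":
        return "non-targeting"
    if gene in killing_controls:
        return "killing"
    return "other"


def median_lfc_by_category(lfcs, killing_controls):
    """Median log2 fold change per guide category."""
    groups = {}
    for _, gene, lfc in lfcs:
        groups.setdefault(category(gene, killing_controls), []).append(lfc)
    return {name: median(values) for name, values in groups.items()}


if __name__ == "__main__":
    # Three rows copied from the count table in the question (ST8814 columns only).
    rows = [
        {"sgRNA": "NEIL1_49554_TGGAAGGGGGCACTTAGCGA", "Gene": "NEIL1",
         "ST8814_Final": 1339, "ST8814_Initial": 1144},
        {"sgRNA": "TEN1_75497_GACCTATTACCTCCCCTGGG", "Gene": "TEN1",
         "ST8814_Final": 500, "ST8814_Initial": 803},
        {"sgRNA": "Non-Targeting-Control_76949_CTTACGCGCCTGGTCAAAAG",
         "Gene": "Non-Targeting-Control",
         "ST8814_Final": 962, "ST8814_Initial": 597},
    ]
    killing_controls = set()  # placeholder: fill in your own essential genes
    lfcs = log2_fold_changes(rows, "ST8814_Final", "ST8814_Initial")
    print(median_lfc_by_category(lfcs, killing_controls))
```

On the full table you would expect the non-targeting median to sit near zero and the killing-control median to be clearly negative; if the ST8814 columns do not show that pattern while the STS26T columns do, that points at the input rather than at the mle step.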