Seeking Explanation for the "Secondbest" Method in Gene log2FC Calculations with CRISPR Data in MAGeCK
8 months ago
yanbinwan • 0

I am currently analyzing CRISPR screen data and have a question about how gene-level log2FC values are calculated. MAGeCK provides three methods to calculate a gene's log2FC from its multiple sgRNAs: mean, median, and secondbest. My confusion lies with secondbest: what is the rationale for using it, and what makes it a valid summary of the guides?

CRISPR log2FC analysis mageck gene • 663 views
8 months ago
ATpoint 89k

CRISPR (and shRNA) screen data are very noisy, and an individual hit (that is, a single guide) can show a very large effect size because of some bias or off-target activity. That is why we use multiple guides per gene. The challenge is then to aggregate all guides for a gene into a single logFC. The obvious choices are the mean and the median.

The second-best option avoids reporting the best hit, which, as said above, can be very large for whatever reason. Using the second-best guide is therefore a somewhat conservative way to avoid excessively large logFCs. If the second-best hit is still very large, you have increased confidence that the effect size is real and not due to some sort of bias, for example, in the most extreme case, an insertion into some sort of tumor or growth suppressor, or some sort of amplification bias.
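To make the difference concrete, here is a minimal Python sketch of the three summaries applied to one gene. It is not MAGeCK's code: the function names (`second_best`, `summarize_gene`) and the guide values are made up for illustration, and ranking guides by absolute logFC is an assumption about what "second best" means here.

```python
from statistics import mean, median


def second_best(lfcs):
    """Return the guide logFC with the second-largest absolute value.

    Assumption: "best" means the strongest effect in either direction.
    With a single guide there is no second-best, so that guide is returned.
    """
    if not lfcs:
        raise ValueError("need at least one guide logFC")
    ranked = sorted(lfcs, key=abs, reverse=True)
    return ranked[1] if len(ranked) > 1 else ranked[0]


def summarize_gene(lfcs):
    """Collapse per-guide logFCs into one value per aggregation method."""
    return {
        "mean": mean(lfcs),
        "median": median(lfcs),
        "secondbest": second_best(lfcs),
    }


# Hypothetical gene: one extreme guide, three modest ones.
outlier_gene = [-6.0, -1.2, -1.0, -0.8]
# Hypothetical gene: three of four guides agree on a strong effect.
consistent_gene = [-5.5, -5.0, -4.8, -0.5]

outlier_summary = summarize_gene(outlier_gene)
consistent_summary = summarize_gene(consistent_gene)

print(outlier_summary)
print(consistent_summary)
```

In the first gene the single extreme guide drags the mean well away from the other three guides, while the second-best value stays close to them. In the second gene the second-best value remains large, which is the situation where you can have more confidence in the effect.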
