How does Cuffdiff calculate the p-values from Jensen-Shannon divergence - sqrt(JS) ?
0
1
Entering edit mode
8.2 years ago
clausndh ▴ 50

Dear all Biostars users,

I updated the question

we try to understand the way how Cuffdiff calculate the p-values from Jensen–Shannon divergence between isoforms. Therefore I attached the 'critical' part of the cuffdiff code.

As far as I understand the the code, they sample FPKM values from all FPKM values of the first condition ('PREV')

For example, we have 2 isoforms with 3 replicates each

      Prev1 Prev2 Prev3  Curr1 Curr2 Curr3
Iso1   p11  p12    p13     c11  c12   c13
Iso2   p21  p22    p23     c21  c22  c23

and if I got the comment

// Draw k values, from prev's fpkm_samples to make the first half of the null, where k is the number of replicates in curr, and compute their average

correct. Then they sample k=3 times one of the columns Prev1, Prev2 or Prev3 . Is this correct ?

Further, I'm not sure that they are doing in this row :

curr_set_sample += prev_abundance.member_fpkm_samples()[next_sample_idx] / (double)curr_abundance.rg_props().size();

Afterwards, they just calculate the mean of sampled data. And out of mean they calculate the relative amount of the isoform at the total gene expression.

Thanks a lot for your help.


Github Code of JSD method of Cuffdiff [ https://github.com/cole-trapnell-lab/cufflinks/blob/753c109e31818dcf7aa8a2c8ecdac4fa43d2ab9b/src/differential.cpp#L558 ] Line 619 to 638

vector<double> null_js_samples;

static const size_t num_null_js_samples = 10000;

boost::random::mt19937 rng(random_seed);

boost::random::uniform_int_distribution<> prev_sampler(0, prev_abundance.member_fpkm_samples().size()-1);

boost::random::uniform_int_distribution<> curr_sampler(0, curr_abundance.member_fpkm_samples().size()-1);

for (size_t i = 0; i < num_null_js_samples; ++i){

    // Draw k values, from prev's fpkm_samples to make the first half of the null, where k is the number of replicates in *curr*, and compute their average

    Eigen::VectorXd curr_set_sample = Eigen::VectorXd::Zero(curr_abundance.abundances().size());

    for (size_t k = 0; k < curr_abundance.rg_props().size(); ++k){

        int next_sample_idx = prev_sampler(rng);

        if (next_sample_idx >= 0 && next_sample_idx < prev_abundance.member_fpkm_samples().size())

            curr_set_sample += prev_abundance.member_fpkm_samples()[next_sample_idx] / (double)curr_abundance.rg_props().size();

    }

    double total_curr_set_sample_fpkm = curr_set_sample.sum();

    if (total_curr_set_sample_fpkm > 0.0)
        curr_set_sample /= total_curr_set_sample_fpkm;
RNA-Seq Cuffdiff Cufflinks sqrt(JS) • 2.5k views
ADD COMMENT

Login before adding your answer.

Traffic: 2941 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6