Question: <Cuffdiff> Huge difference between compatible_count and total_count
5 months ago, jacklee64.tw wrote:

Hi,

I use HISAT2 + Cuffdiff to process my 150 bp paired-end mouse RNA-seq data.

Recently, I noticed huge differences between compatible_count and total_count in my results. Many genes are underestimated because their compatible_count is zero.

I've checked that the XS:A:(+/-) tag exists in my SAM/BAM files (example below). I also inspected the alignments visually in IGV and nothing looks strange.

I have also tried different Cuffdiff parameters, such as --total-hits-norm and --poisson-dispersion, to see whether anything improved, but they did not help. The only progress is that with --total-hits-norm Cuffdiff recognized the correct number of total counts (see the log below).

My questions are:

  1. What features does Cuffdiff use to decide whether a read pair is compatible?
  2. Are there any parameters that increase compatible_count?

Thank you very much for your help.

SAM example:

A00123:18:H3MHFD:1:2162:4182:19413  419 1   3054721 1   137M    =   3054721 -137    CTTAGGGGCTTGAGAAAGTTCTCGCCCTCTCACCTGGGGCCTAAGATTGTATCAAGATAACTATGACAATGGCCTGACCTTTAAGGTTCCGCTTCTAACAATCATAAAGCATCCATAGGACTTCCAGGTACCCGCCC   FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF::FFFFF   AS:i:-5 ZS:i:-5 XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:1  MD:Z:5A131  YS:i:-5 YT:Z:CP XS:A:-  NH:i:3
A00123:18:H3MHFD:1:2162:4182:19413  339 1   3054721 1   137M    =   3054721 -137    CTTAGGGGCTTGAGAAAGTTCTCGCCCTCTCACCTGGGGCCTAAGATTGTATCAAGATAACTATGACAATGGCCTGACCTTTAAGGTTCCGCTTCTAACAATCATAAAGCATCCATAGGACTTCCAGGTACCCGCCC   FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF   AS:i:-5 ZS:i:-5 XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:1  MD:Z:5A131  YS:i:-5 YT:Z:CP XS:A:-  NH:i:3
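
For readers following along, the FLAG values in the two records above (419 and 339) can be decoded with a short Python sketch. The bit meanings come from the SAM specification; the names `SAM_FLAGS` and `decode_flag` are made up for this illustration and are not part of HISAT2 or Cuffdiff.

```python
# Flag bits as defined in the SAM specification.
# The dictionary and function names are illustrative only.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "read_reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all flag bits that are set, lowest bit first."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


# The two FLAG values from the SAM example in the question.
for flag in (419, 339):
    print(flag, decode_flag(flag))
```

Both records decode as secondary alignments of a properly paired fragment, which fits the NH:i:3 tag: 419 is the second mate on the forward strand and 339 is the first mate on the reverse strand.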

Default Cuffdiff message:

[12:34:11] Modeling fragment count overdispersion.
> Map Properties:
>   Normalized Map Mass: 749021.00
>   Raw Map Mass: 752500.47
>   Fragment Length Distribution: Empirical (learned)
>                 Estimated Mean: 272.39
>              Estimated Std Dev: 138.57
> Map Properties:
>   Normalized Map Mass: 749021.00
>   Raw Map Mass: 746990.97
>   Fragment Length Distribution: Empirical (learned)
>                 Estimated Mean: 274.52
>              Estimated Std Dev: 131.43
[12:35:12] Calculating preliminary abundance estimates
[12:35:12] Testing for differential expression and regulation in locus.

Cuffdiff message with --total-hits-norm:

[15:24:31] Modeling fragment count overdispersion.
> Map Properties:
>   Normalized Map Mass: 52162345.85
>   Raw Map Mass: 55210687.49
>   Fragment Length Distribution: Empirical (learned)
>                 Estimated Mean: 273.32
>              Estimated Std Dev: 140.81
> Map Properties:
>   Normalized Map Mass: 52162345.85
>   Raw Map Mass: 49297576.52
>   Fragment Length Distribution: Empirical (learned)
>                 Estimated Mean: 273.60
>              Estimated Std Dev: 131.36
[15:25:33] Calculating preliminary abundance estimates
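
The two logs above allow a rough estimate of how small the compatible fraction is. By default Cuffdiff normalizes by compatible hits only, while --total-hits-norm counts all fragments, so the ratio of the "Raw Map Mass" values gives an approximate share of fragments considered compatible. The sketch below is plain arithmetic on the numbers copied from the logs; it assumes the two "Map Properties" blocks appear in the same sample order in both runs, which the logs do not confirm.

```python
# Raw Map Mass values copied from the two Cuffdiff logs in the question.
# Assumption: block 1 and block 2 refer to the same samples in both runs.
raw_mass_default = [752500.47, 746990.97]          # default (compatible hits)
raw_mass_total_hits = [55210687.49, 49297576.52]   # --total-hits-norm


def compatible_fraction(compatible_mass, total_mass):
    """Approximate share of the total map mass that is counted as compatible."""
    return compatible_mass / total_mass


fractions = [
    compatible_fraction(c, t)
    for c, t in zip(raw_mass_default, raw_mass_total_hits)
]

for i, frac in enumerate(fractions, start=1):
    print(f"sample block {i}: {frac:.2%} of the map mass is compatible")
```

Under that assumption, only about 1 to 2 percent of the map mass is counted as compatible in each sample, which points to a systematic problem rather than a tuning parameter.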
5 months ago, jacklee64.tw wrote:

Sorry, I found that the problem was caused by a wrong strandness setting at the HISAT2 mapping step.

After changing to the correct strandness setting, Cuffdiff reports the expected compatible_count values.
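
As a way to check this kind of mismatch, the following Python sketch derives the transcript strand that a paired-end alignment implies under each HISAT2 `--rna-strandness` value (FR or RF) and compares it with the XS:A tag. It is a sketch based on the HISAT2 manual's description of that option, not HISAT2's actual code; the function name and the mapping to Cuffdiff `--library-type` values are written from memory of the two manuals and should be confirmed against them for your library prep.

```python
# Flag bits from the SAM specification.
READ_REVERSE = 0x10
FIRST_IN_PAIR = 0x40

# Assumed correspondence between HISAT2 --rna-strandness and
# Cuffdiff --library-type; verify against both manuals before relying on it.
HISAT2_TO_CUFFDIFF = {"FR": "fr-secondstrand", "RF": "fr-firststrand"}


def expected_xs(flag, strandness):
    """Transcript strand ('+' or '-') implied by a paired-end alignment.

    Under FR, read 1 is in transcript orientation and read 2 is reversed;
    under RF it is the other way round.
    """
    read_forward = not (flag & READ_REVERSE)
    is_first = bool(flag & FIRST_IN_PAIR)
    same_as_transcript = is_first if strandness == "FR" else not is_first
    return "+" if read_forward == same_as_transcript else "-"


# FLAG values and XS:A tags from the two SAM records in the question.
records = [(419, "-"), (339, "-")]

for strandness in ("FR", "RF"):
    matches = all(expected_xs(flag, strandness) == xs for flag, xs in records)
    print(strandness, HISAT2_TO_CUFFDIFF[strandness], "matches XS tags:", matches)
```

For the two records posted in the question, the XS:A:- tags agree with what this sketch yields for FR, which suggests (but does not prove) that the original mapping used FR. Whichever setting is correct for the library, the Cuffdiff library type has to describe the same orientation.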
