Hello, everyone
I have a question about the read counts in my deduplication metrics versus my recalibration metrics.
First, I ran GATK MarkDuplicatesSpark, and the workflow was dedup.bam -> recal.bam. But when I check the metrics, something looks very strange: the dedup metrics show, for example, about 700,000 reads, while the recal metrics show about 1,900,000. What could cause the read count to increase?
I can think of two possible causes (there may be more):
- a problem with the original experimental data,
or
- a problem with the MarkDuplicatesSpark options (should I check --optical-duplicate-pixel-distance, or is the default value appropriate?)
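For reference, the workflow above is roughly the following (file names, the reference, and the known-sites VCF are placeholders, and the pixel-distance value is only an example, not necessarily what I used):

```shell
# Mark duplicates (dedup.bam) - placeholder file names.
gatk MarkDuplicatesSpark \
    -I sorted.bam \
    -O dedup.bam \
    -M dedup_metrics.txt \
    --optical-duplicate-pixel-distance 2500   # example value; default is 100

# Base quality score recalibration (recal.bam).
gatk BaseRecalibrator \
    -I dedup.bam \
    -R reference.fasta \
    --known-sites known_sites.vcf \
    -O recal.table
gatk ApplyBQSR \
    -I dedup.bam \
    -R reference.fasta \
    --bqsr-recal-file recal.table \
    -O recal.bam

# Compare raw read counts of the two BAMs directly,
# instead of comparing two different metrics files:
samtools flagstat dedup.bam
samtools flagstat recal.bam
```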
====================================================== Example data:
- dedup.bam metrics (duplication metrics):

LIBRARY                       Unknown Library
UNPAIRED_READS_EXAMINED       380347
READ_PAIRS_EXAMINED           72110005
UNMAPPED_READS                327484
UNPAIRED_READ_DUPLICATES      615657
READ_PAIR_DUPLICATES          257747
READ_PAIR_OPTICAL_DUPLICATES  4243867
PERCENT_DUPLICATION           ...
ESTIMATED_LIBRARY_SIZE        ...
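If it helps, my understanding (from the Picard duplication-metrics documentation; please correct me if I have the formula wrong) is that PERCENT_DUPLICATION is derived from the fields above like this, counting each duplicate read pair as two reads:

```python
# Sketch of how PERCENT_DUPLICATION relates to the other duplication metrics
# (my assumption based on Picard's documented formula; values are from the
# dedup.bam metrics above).

def percent_duplication(unpaired_examined: int,
                        pairs_examined: int,
                        unpaired_dups: int,
                        pair_dups: int) -> float:
    """Fraction of examined mapped reads that were flagged as duplicates."""
    dup_reads = unpaired_dups + 2 * pair_dups            # each duplicate pair = 2 reads
    examined_reads = unpaired_examined + 2 * pairs_examined
    return dup_reads / examined_reads

pct = percent_duplication(unpaired_examined=380347,
                          pairs_examined=72110005,
                          unpaired_dups=615657,
                          pair_dups=257747)
print(f"{pct:.6f}")  # note: a fraction (0..1), not a percentage
```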
- recal.bam metrics (these look like hybrid-selection / HsMetrics fields; the BAIT_SET value was blank in the output):

BAIT_SET
BAIT_TERRITORY                36632661
BAIT_DESIGN_EFFICIENCY        1
ON_BAIT_BASES                 11542540300
NEAR_BAIT_BASES               4893421367
OFF_BAIT_BASES                0
PCT_SELECTED_BASES            1
PCT_OFF_BAIT                  0
ON_BAIT_VS_SELECTED           0.702273
MEAN_BAIT_COVERAGE            315.088776
PCT_USABLE_BASES_ON_BAIT      0.700896
PCT_USABLE_BASES_ON_TARGET    0.403819
FOLD_ENRICHMENT               59.46704
HS_LIBRARY_SIZE               471059915
HS_PENALTY_10                 ...
Best regards.