                    2.6 years ago
Sebastien_Vigneau
What would be a good tool or collection of tools to calculate a confidence score for each nucleotide in a genome assembly using short-read data? Ideally, it would take into account both the read pile-up and each read's base quality scores, and would be able to handle SNPs and INDELs.
Hi, I do not really know what you mean by "confidence" score. Would it be a measure of the probability of misassembly at a specific position? Assemblies are often assessed as a whole and not in a per-base manner.
It may not be exactly what you want, but Pilon has a --vcf parameter to produce a .vcf file listing detailed information about base and indel evidence at every base position in the genome. See: https://github.com/broadinstitute/pilon/wiki/Output-File-Descriptions#vcf
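
If it helps, here is a minimal Python sketch of how the per-position evidence could be pulled out of such a file. It only relies on the generic VCF layout (tab-separated columns, with INFO as semicolon-separated key=value pairs). The sample lines and the DP and BQ keys are made up for illustration and are assumptions on my part, so check the ## header lines of your own Pilon output for the fields it actually reports.

```python
def parse_info(info):
    """Split a VCF INFO column into a dict; flags without '=' map to True."""
    fields = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def per_base_records(lines):
    """Yield one dict per data line of a VCF, skipping header lines."""
    for line in lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:
            continue  # not a complete VCF record
        chrom, pos, _id, ref, alt, qual, flt, info = cols[:8]
        yield {
            "chrom": chrom,
            "pos": int(pos),
            "ref": ref,
            "alt": alt,
            "qual": qual,
            "filter": flt,
            "info": parse_info(info),
        }


# Hypothetical sample lines, NOT real Pilon output; the INFO keys are assumed.
sample = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "contig1\t1\t.\tA\t.\t.\tPASS\tDP=30;BQ=35\n",
    "contig1\t2\t.\tC\tT\t.\tPASS\tDP=28;BQ=33\n",
]

records = list(per_base_records(sample))
for rec in records:
    print(rec["chrom"], rec["pos"], rec["ref"], rec["alt"], rec["info"])
```

For a real file you would pass an open file handle instead of the sample list, e.g. per_base_records(open("pilon.vcf")), where the file name is again just a placeholder. From there you could combine whichever depth and quality fields your file provides into a per-base score of your own design.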