5.5 years ago

I am interested in calling errors (especially indels) in my PacBio alignments. I want ALL errors to be reported, not only high-frequency ones. Ideally, I would like a PacBio counterpart to what mpileup provides for Illumina reads. I am considering two options:

1) using PacBio-specific software, something along the lines of:

variantCaller.py --algorithm=quiver chr22_P6.cmp.h5 -r hg19.fa -o variants.gff \
    -o consensus.fasta -o consensus.fastq


2) Tweaking mpileup by turning off all the probabilistic realignments, something along the lines of:

samtools mpileup sam/forward/chr21_P6.cmp.h5.bam.forward.bam -f reference.fa \
    -l feature.bed -uv --no-BAQ --open-prob 15 -Q 0 -D -t INFO/DPR
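
For what it's worth, if the goal is simply to see every raw error at every position, one rough way is to take the plain-text pileup and tally the events yourself. The sketch below is only an illustration under some assumptions: it expects the text pileup format (the -u/-v flags request BCF/VCF output, so they would have to be dropped), it reads only the read-bases column (column 5), and the function name `count_pileup_events` is made up for this example.

```python
from collections import Counter


def count_pileup_events(bases):
    """Tally matches, mismatches, insertions and deletions in one
    read-bases string (column 5 of text pileup output)."""
    counts = Counter()
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            # Start-of-read marker; the next character encodes mapping quality.
            i += 2
        elif c in "+-":
            # Indel: sign, decimal length, then that many sequence characters.
            j = i + 1
            while j < len(bases) and bases[j].isdigit():
                j += 1
            length = int(bases[i + 1:j])
            counts["ins" if c == "+" else "del"] += 1
            i = j + length
        elif c in ".,":
            counts["match"] += 1
            i += 1
        elif c.upper() in "ACGTN":
            counts["mismatch"] += 1
            i += 1
        else:
            # '$' (end of read), '*' (deleted base placeholder), '<' / '>' (ref skip).
            i += 1
    return counts


# Hypothetical read-bases string, not taken from real data.
example = ".,+2AG.-1t,^I.$A"
events = count_pileup_events(example)
```

Because nothing is filtered by frequency here, a single read carrying an indel is enough for it to show up in the tally.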


I have both .cmp.h5 and bam files with alignments.
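
Since the BAM files are already available, a third, caller-free route would be to read the indels straight from the CIGAR strings. The following is a minimal sketch, assuming SAM text lines (for example from `samtools view`) as input; the helper names `indel_events` and `tally_indels` are hypothetical. It reports only insertions and deletions, because a plain `M` operation does not distinguish matches from mismatches (that would need the MD tag or `=`/`X` operations).

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def indel_events(sam_line):
    """Return (chrom, ref_pos, kind, length) for each indel in one SAM record.

    Positions are 1-based. An insertion is anchored at the reference base
    preceding it; a deletion at its first deleted reference base.
    """
    fields = sam_line.rstrip("\n").split("\t")
    chrom, ref_pos, cigar = fields[2], int(fields[3]), fields[5]
    events = []
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "I":
            events.append((chrom, ref_pos - 1, "ins", length))
        elif op == "D":
            events.append((chrom, ref_pos, "del", length))
        if op in "MDN=X":
            # Only these operations consume reference bases.
            ref_pos += length
    return events


def tally_indels(sam_lines):
    """Count how many reads support each indel event, with no frequency cutoff."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        counts.update(indel_events(line))
    return counts


# Hypothetical records, not taken from real data.
records = [
    "@HD\tVN:1.6",
    "read1\t0\tchr21\t100\t60\t5M2I3M1D4M\t*\t0\t0\tACGTACGTACGTAC\t*",
    "read2\t0\tchr21\t103\t60\t2M2I10M\t*\t0\t0\tACGTACGTACGTAC\t*",
]
indel_counts = tally_indels(records)
```

Every event is kept even when only one read supports it, which is the opposite of what a consensus caller does.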

PacBio mpileup variants calling quiver • 2.1k views
4.8 years ago
tjduncan ▴ 270

I think that taking a look at the recent bioRxiv paper may answer some questions, but it may also significantly complicate things.

Unfortunately, I don't know of a great method to reliably call "true" errors in PacBio alignments that are mapped back to the currently used reference genomes and truth sets (NA12878, PlatGen, GIAB, HG19, HG38, etc.). This is because the variants annotated as "true" in these resources were generated primarily with short-read assemblers and the consensus of many different short-read variant callers. Thus they are biased towards the error profile that Illumina short reads can characterize, and they will not play well with novel variant-calling methods and sequencing technologies (PacBio and ONT) whose error profiles differ from those of short-read sequencing.