PacBio amplicon reads partially aligned using minimap2 – library or analysis issue?
9 days ago

Hi all,

I'm currently working on a PacBio amplicon sequencing project. The libraries were constructed using M13-barcoded primers, and our goal is to detect indels or large deletions within specific regions of the amplicons.

After demultiplexing, I performed additional trimming to ensure that all reads in the final dataset were full-length amplicons. Specifically, I filtered for reads containing both the first and last 30 bp of the target sequence (allowing up to 5 mismatches), so the expectation is that:

The reads may contain indels or large internal deletions, but both the 5′ and 3′ ends should match the reference and therefore align properly.
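For reference, the end filter described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the exact script that was run: it assumes reads are already oriented on the reference strand, that the 30 bp anchors sit at the very ends of the read, and that the 5-mismatch limit applies to each end separately. The names `hamming` and `is_full_length` are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def is_full_length(read: str, ref: str, k: int = 30, max_mm: int = 5) -> bool:
    """True if the read's first and last k bases match the reference ends.

    Assumption: substitutions only (no indels inside the k-base anchors),
    and the read is on the same strand as the reference.
    """
    if len(read) < 2 * k:
        return False
    return (hamming(read[:k], ref[:k]) <= max_mm
            and hamming(read[-k:], ref[-k:]) <= max_mm)
```

Note that a filter like this only confirms the outermost 30 bp; it says nothing about whether the sequence just inside those anchors still resembles the reference.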

However, when I aligned these reads to the reference using minimap2 and visualized the results in IGV, I noticed that many reads were only partially aligned, including in the negative control library, where no deletions are expected. (See figure: alignment result using minimap2.)

Here is another example where I am expecting some of the reads to have indels or large deletions. (See second figure.)

Here's the minimap2 command I used:

minimap2 -x map-pb -a --secondary=no --end-bonus=100 --score-N=0 "$ref_file" "$fastq_file" | samtools view -Sb - | samtools sort -o "$output_bam" && samtools index "$output_bam"

To validate this, I also tried aligning the same reads using pbmm2 with the CCS preset:

pbmm2 align /path/to/MYC_PCR.fa yll504.bam /path/to/yll504.aligned.bam --preset CCS --sort --min-length 50 --best-n 1 --log-level DEBUG --num-threads 4

Unfortunately, the result was similar—many reads still showed only partial alignment, even though they should cover the full amplicon ends.

My questions:

1. Could this be caused by an issue during library construction (e.g., degradation, incomplete extension)?
2. Could the alignment settings be suboptimal for detecting reads with large internal deletions while preserving full-end alignment?
3. Are there better aligners or parameters suited for this type of amplicon with variable internal structure?
4. Has anyone successfully aligned PacBio amplicons like this with consistent end-to-end mapping?

Any suggestions or insights would be very helpful—thank you in advance!

PacBio minimap2 alignment variants pbmm2 • 550 views

Only including this as an alternative option to try (if your reads are HiFi): https://github.com/PacificBiosciences/pbAA

"but both the 5′ and 3′ ends should match the reference and therefore align properly."

How long are these amplicons? What size of deletions are you expecting? What is the length of the two ends that should match? If the screenshots show the full length, the 3′ ends appear to reach the end of the amplicon, so it must be the 5′ end that may be getting soft-clipped. IGV will not display soft-clipped bases unless that option is turned on, so you may want to try that.
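Independently of IGV, the amount of soft-clipping can be read directly from the CIGAR column of the SAM/BAM records. A minimal sketch, assuming plain CIGAR strings as input; `soft_clips` is a hypothetical helper name, not part of samtools or minimap2:

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM spec
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(cigar: str) -> tuple[int, int]:
    """Return (left, right) soft-clipped lengths from a SAM CIGAR string.

    Left/right are relative to the reference forward strand, because that
    is how SAM stores alignments. An unavailable CIGAR ("*") gives (0, 0).
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) can sit outside soft clips, so drop them first
    ops = [o for o in ops if o[1] != "H"]
    left = ops[0][0] if ops and ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return left, right
```

The CIGAR strings are field 6 of each record in `samtools view` output. If most reads show a large value on one side, the reads are full length but only part of each read is being placed on the reference.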

It is certainly possible that something unexpected happened during the experimental phase and that is reflected here.


GenoMax has a good point about trying pbaa. The tool will output consensus sequences of the different alleles. You can then use MSA to align the consensus sequences and the reference sequence. I suspect that the mapping looks funky because there's an insertion that is causing the mappings to terminate (via clipping).


Hi GenoMax,

Thanks so much for your suggestion! Those amplicons are around 4.4 kb, and there should be at least 1 kb from the left and 800 bp from the right that match the reference.
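With those numbers, one way to quantify the problem is to compute the reference interval each alignment covers (from POS and CIGAR) and check whether it reaches both ends of the amplicon. A rough sketch; the function names, the 10 bp tolerance, and the 4400 bp reference length used in the example are illustrative assumptions only (the real length should come from the reference FASTA):

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def ref_span(pos: int, cigar: str) -> tuple[int, int]:
    """Return the 1-based inclusive reference interval covered by an alignment."""
    length = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    return pos, pos + length - 1


def classify(pos: int, cigar: str, ref_len: int, tol: int = 10) -> str:
    """Label an alignment by which amplicon ends it reaches (within tol bp)."""
    start, end = ref_span(pos, cigar)
    left_ok = start <= 1 + tol
    right_ok = end >= ref_len - tol
    if left_ok and right_ok:
        return "end_to_end"
    if right_ok:
        return "left_truncated"
    if left_ok:
        return "right_truncated"
    return "both_truncated"


# Example with an assumed 4400 bp reference: a read carrying a 2.6 kb deletion
# reported inside the CIGAR still counts as end-to-end.
print(classify(1, "1000M2600D800M", 4400))
```

One caveat: an aligner may report a large deletion as a primary plus a supplementary alignment rather than as a single record with a long `D` operation, so records that look truncated on their own may be two halves of the same read.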
