PacBio LAA barcoding problems
3.3 years ago
a.m.dekker • 0

Dear all,

I have a question about the PacBio SMRT Link (v6.0.0.47841) Long Amplicon Analysis (LAA) tool, which I am running on data from an externally performed PacBio Sequel sequencing run (for which, unfortunately, we have no upstream information). Our research involves pooled samples that were barcoded symmetrically. I would like to run laa to demultiplex the reads and obtain amplicon consensus clusters. According to the documentation, the input should be the subreads.bam and, for barcoded data (as in our case), a FASTA file with the barcode sequences should be provided as well. When I run the tool as shown below, however, it does not perform a barcoded amplicon analysis.

~/smrtlink/smrtcmds/bin/laa -b ../barcodes_kg.fasta --minLength 2500 --noPhasing --Clustering ../data/pacbiodata/m54031_181005_130744.subreads.bam

It prints the messages below, and the output files clearly did not come from a barcoded LAA run. Based on this output, I now assume that either my input BAM does not contain barcode information fields (the log warns that it could not read the total subreads from the input metadata) or that some other parameter is missing.
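One way to test the first hypothesis is to look for barcode tags on the records themselves. The sketch below assumes the PacBio BAM convention of storing barcode calls in a per-record `bc` tag (with `bq` as the barcode quality), and that the records have been dumped to SAM text, for example with `samtools view`. The function name `count_barcoded` and the sample records are made up for illustration and are not taken from the run above.

```python
def count_barcoded(sam_lines):
    """Return (barcoded, total) record counts for SAM text lines.

    A record counts as barcoded when one of its optional fields
    (column 12 onward) is a `bc` tag. Assumption: barcode calls are
    stored in `bc`, as in the PacBio BAM format.
    """
    barcoded = total = 0
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("@"):
            continue  # skip blank lines and header lines
        total += 1
        optional_fields = line.split("\t")[11:]
        if any(field.startswith("bc:") for field in optional_fields):
            barcoded += 1
    return barcoded, total


# Hypothetical records: 11 mandatory SAM columns, then optional tags.
mandatory = "\t".join(
    ["movie/1/0_10", "4", "*", "0", "255", "*", "*", "0", "0",
     "ACGTACGTAC", "!!!!!!!!!!"]
)
example = [
    "@HD\tVN:1.5\tSO:unknown",
    mandatory + "\tzm:i:1\tbc:B:S,0,0\tbq:i:45",  # carries a barcode call
    mandatory + "\tzm:i:2",                        # no barcode tags
]
print(count_barcoded(example))
```

On real data the lines could come from the first few thousand records of the subreads.bam. A barcoded count of zero would be consistent with the `found 0 post-filter barcodes` line in the log further down.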

I really hope there is someone out there with experience in this kind of analysis who is willing to help me out. Thanks a lot!

~/smrtlink/smrtcmds/bin/laa -b ../barcodes_kg.fasta --minLength 2500 --noPhasing --Clustering ../data/pacbiodata/m54031_181005_130744.subreads.bam

|> 20190114 07:57:03.497 -|- INFO -|- AmpliconAnalysis -|- 0x7f091e236c00|| -|- found consensus models for: (P6-C4, S/P1-C1.1, S/P1-C1.2, S/P1-C1.3, S/P1-C1/beta, S/P2-C2, S/P2-C2/5.0, S/P3-C3/5.0)

|> 20190114 07:57:03.498 -|- INFO -|- AmpliconAnalysis -|- 0x7f091e236c00|| -|- using consensus models for: (S/P2-C2/5.0)

|> 20190114 07:57:03.527 -|- INFO -|- UsePacBioIndices -|- 0x7f091e236c00|| -|- found .pbi index for file: /data/btr/bulk1/tmp/2018-10-22-fs-pacbio-analysis/from_bam/190110 _annabel/../data/pacbiodata/m54031_181005_130744.subreads.bam

|> 20190114 07:57:03.527 -|- INFO -|- UsePacBioIndices -|- 0x7f091e236c00|| -|- .pbi indices found or generated for all bam files - fast indexing enabled

|> 20190114 07:57:04.670 -|- INFO -|- GetBarcodePairs -|- 0x7f091e236c00|| -|- preprocessing of data files found 0 post-filter barcodes and took 1.143s

|> 20190114 08:14:27.613 -|- WARN -|- Fill -|- 0x7f091e236c00|| -|- Could not read total subreads from input metadata!

|> 20190114 08:14:27.613 -|- INFO -|- Fill -|- 0x7f091e236c00|| -|- filtering took 1042.94s and found 5267998 post-filter subreads

|> 20190114 08:14:27.613 -|- INFO -|- DoAmpliconAnalysis -|- 0x7f091e236c00|| -|- analyzing subset 'All' with 2000 subreads

|> 20190114 08:14:27.714 -|- INFO -|- GeneCluster -|- 0x7f091e236c00|| -|- aligning 400 of 2000 subreads

|> 20190114 08:14:30.254 -|- INFO -|- GeneCluster -|- 0x7f091e236c00|| -|- subread alignment took 2.54s

|> 20190114 08:14:30.538 -|- INFO -|- GeneCluster -|- 0x7f091e236c00|| -|- coarse clustering generated 5 clusters and took 0.283s

|> 20190114 08:14:30.914 -|- INFO -|- GeneCluster -|- 0x7f091e236c00|| -|- subread ranking took 0.376s

|> 20190114 08:14:35.389 -|- INFO -|- PoaCluster -|- 0x7f091e236c00|| -|- poa alignments and ranking took 0.339s

|> 20190114 08:14:35.389 -|- INFO -|- FinePhaser -|- 0x7f091e236c00|| -|- cluster #0, phasing/consensus of 500 reads

|> 20190114 08:14:48.910 -|- INFO -|- FinePhaser -|- 0x7f091e236c00|| -|- cluster #0, skipping phasing by user request

|> 20190114 08:15:34.443 -|- INFO -|- FinePhaser -|- 0x7f091e236c00|| -|- cluster #0, 1 of 1 models converged

|> 20190114 08:15:45.018 -|- INFO -|- FinePhaser -|- 0x7f091e236c00|| -|- cluster #1, phasing/consensus of 484 reads

|> 20190114 08:15:57.201 -|- INFO -|- FinePhaser -|- 0x7f091e236c00|| -|- cluster #1, skipping phasing by user request

|> 20190114 08:17:24.081 -|- INFO -|- FinePhaser -|- 0x7f091e236c00|| -|- cluster #1, 1 of 1 models converged

|> 20190114 08:17:33.054 -|- INFO -|- FinePhaser -|- 0x7f091e236c00|| -|- cluster #2, phasing/consensus of 420 reads

|> 20190114 08:17:43.827 -|- INFO -|- FinePhaser -|- 0x7f091e236c00|| -|- cluster #2, skipping phasing by user request

|> 20190114 08:19:00.667 -|- INFO -|- FinePhaser -|- 0x7f091e236c00|| -|- cluster #2, 1 of 1 models converged

|> 20190114 08:19:10.088 -|- INFO -|- FinePhaser -|- 0x7f091e236c00|| -|- cluster #3, phasing/consensus of 253 reads

|> 20190114 08:19:16.652 -|- INFO -|- FinePhaser -|- 0x7f091e236c00|| -|- cluster #3, skipping phasing by user request

|> 20190114 08:19:43.854 -|- INFO -|- FinePhaser -|- 0x7f091e236c00|| -|- cluster #3, 1 of 1 models converged

|> 20190114 08:19:49.095 -|- INFO -|- FinePhaser -|- 0x7f091e236c00|| -|- cluster #4, phasing/consensus of 213 reads

|> 20190114 08:19:54.930 -|- INFO -|- FinePhaser -|- 0x7f091e236c00|| -|- cluster #4, skipping phasing by user request

|> 20190114 08:21:05.114 -|- INFO -|- FinePhaser -|- 0x7f091e236c00|| -|- cluster #4, 0 of 1 models converged
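The key diagnostic in this log is easy to miss among the per-cluster messages, so here is a minimal sketch that pulls it out automatically. It assumes the ` -|- ` separated layout seen in the lines above (timestamp, level, component, address, message); the function names are hypothetical helpers, not part of laa.

```python
import re

# Matches the message that reports how many barcodes were detected.
BARCODE_RE = re.compile(r"found (\d+) post-filter barcodes")


def parse_log_line(line):
    """Split one log line into (timestamp, level, component, message).

    Returns None for lines that do not follow the '|> ... -|- ...' layout.
    """
    if not line.startswith("|>"):
        return None
    parts = [p.strip() for p in line[2:].split("-|-")]
    if len(parts) < 5:
        return None
    return parts[0], parts[1], parts[2], parts[-1]


def barcodes_found(lines):
    """Return the reported post-filter barcode count, or None if absent."""
    for line in lines:
        parsed = parse_log_line(line)
        if parsed is None:
            continue
        match = BARCODE_RE.search(parsed[3])
        if match:
            return int(match.group(1))
    return None


def warnings(lines):
    """Return the messages of all WARN-level lines."""
    parsed = (parse_log_line(line) for line in lines)
    return [p[3] for p in parsed if p is not None and p[1] == "WARN"]


# Two lines copied from the log above.
sample = [
    "|> 20190114 07:57:04.670 -|- INFO -|- GetBarcodePairs -|- 0x7f091e236c00|| -|- "
    "preprocessing of data files found 0 post-filter barcodes and took 1.143s",
    "|> 20190114 08:14:27.613 -|- WARN -|- Fill -|- 0x7f091e236c00|| -|- "
    "Could not read total subreads from input metadata!",
]
print(barcodes_found(sample))
print(warnings(sample))
```

A count of 0 here means no barcodes were detected in the input, which matches the run falling back to a single subset named 'All'.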

pacbio long amplicon analysis sequencing smrtlink • 950 views
3.3 years ago
a.m.dekker • 0

What I noticed is that I was using SMRT Link v6, whereas the header of the subreads.bam file said it had been created with SMRT Link v5.
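A version mismatch like this can be spotted by listing the programs recorded in the BAM header. The sketch below assumes the header has been dumped as text (for example with `samtools view -H`) and reads the standard SAM `@PG` fields `ID`, `PN` and `VN`. The sample header is hypothetical, including the program name and the placeholder version string; it is not the header of the file discussed here.

```python
def program_versions(header_lines):
    """Return (program, version) pairs from the @PG lines of a SAM header."""
    result = []
    for line in header_lines:
        if not line.startswith("@PG"):
            continue
        fields = {}
        for item in line.rstrip("\n").split("\t")[1:]:
            # Each field looks like KEY:value; split on the first colon only.
            key, _, value = item.partition(":")
            fields[key] = value
        program = fields.get("PN", fields.get("ID", "?"))
        result.append((program, fields.get("VN", "?")))
    return result


# Hypothetical header text; names and versions are placeholders.
header = [
    "@HD\tVN:1.5\tSO:unknown",
    "@PG\tID:baz2bam\tPN:baz2bam\tVN:5.x.y\tCL:baz2bam ...",
]
for program, version in program_versions(header):
    print(program, version)
```

Comparing the versions listed there with the installed SMRT Link version is a quick first check before running LAA.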

SMRT Link v6 does not include a tool for adding barcode information to subreads.bam files, so I suspect that step is not needed in that version: v6 seems to have a built-in step that runs before LAA, so the subreads.bam is always barcoded by then. I solved this by installing SMRT Link v5 and running bam2bam on the scraps.bam and subreads.bam. I then passed the resulting output to LAA, which worked perfectly fine.
