Question: PacBio Reads subreads and scraps files
10 months ago by
bio_d0 wrote:

I have a PacBio library that I have converted from BAM to fastq.gz files. Now that I want to correct, trim, and assemble the reads, I cannot figure out whether I should use both the subreads and scraps files or only the former. My rationale for leaving out the latter comes from the description of the subreads file in the SMRT Link User Guide (from the PacBio website). It says "Subreads that contain information such as double-adapter inserts or single-molecule artifacts are not used in secondary analysis, and are excluded from this file and placed in scraps.bam."

So if I leave the scraps file out when running an assembler (e.g. Canu) or a PacBio correction tool (e.g. LoRDEC), am I doing it correctly, or should I use both the subreads and scraps files? Please let me know if I am doing it wrong.

Thanks in advance.

10 months ago by
Indianapolis, IN
tjduncan190 wrote:

For assembly you only need the subreads.bam file.

You would only need the scraps.bam file if your dataset contained multiplexed samples whose barcodes had to be identified so that they could be demultiplexed via CCS.

See "Brief primer and lexicon for PacBio SMRT sequencing" for more info on file formats:

"Unaligned BAM files representing the subreads will be produced natively by the PacBio instrument. The subreads.bam will be the starting point for secondary analysis. In addition, the scraps arising from cutting out adapter and barcode sequences will be retained in a scraps.bam file, to enable reconstruction of HQ regions of the ZMW reads, in case the customer needs to rerun barcode finding with a different option."
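In practice, "use only the subreads" comes down to which files you hand to the assembler or correction tool. Here is a minimal Python sketch of that selection step. It assumes your converted files kept the `.subreads.` / `.scraps.` part of the original BAM names; the function name `select_assembly_inputs` and the example paths are made up for illustration, so adjust them to however you named your fastq.gz files.

```python
import os


def select_assembly_inputs(paths):
    """Keep only subreads files; skip anything named like a scraps file.

    Assumption: file names still contain '.subreads.' or '.scraps.',
    as the original PacBio BAM names do.
    """
    selected = []
    for path in paths:
        name = os.path.basename(path).lower()
        if ".scraps." in name:
            # Scraps hold adapter/barcode cut-outs and are not assembly input.
            continue
        if ".subreads." in name:
            selected.append(path)
    return sorted(selected)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    files = [
        "run1/movie1.subreads.fastq.gz",
        "run1/movie1.scraps.fastq.gz",
        "run2/movie2.subreads.fastq.gz",
    ]
    for path in select_assembly_inputs(files):
        print(path)
```

The resulting list is what you would pass on to Canu or LoRDEC; the scraps files simply stay on disk in case barcode finding ever needs to be rerun.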


Thank you for the explanation and the link.

Reply written 10 months ago by bio_d0
10 months ago by
United States
genomax55k wrote:

See this comment from Dr. Richard Hall (he is with PacBio) over at SeqAnswers.


Thank you very much.

Reply written 10 months ago by bio_d0
