I'm new to PacBio sequencing and I'm trying to understand what exactly is in my FASTQ files... (They were generated from the original bax/bas/metadata files by other people.)
Based on this figure,
are the sequences in my FASTQ files polymerase reads or subreads? How can I check if adapters are still there?
Thanks!
PS.: I'm already aware that the sequencing mode was CLR (Continuous Long Read), but I can't really understand how the reads are generated in this case...
The same DNA template will be read several times as the image indicates. Since the adapters are known, these will be recognized by the software and will not be part of the reported sequence.
You will get an unaligned BAM file and not a FASTQ file as an output. Various tags on each sequence indicate which waveguide they originate from.
A consensus caller can combine multiple measurements from the same waveguide into a single high-quality sequence.
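If you only have FASTQ files, the read names themselves usually tell you what you are looking at. Here is a minimal sketch, assuming the names follow the usual PacBio convention: `movie/holeNumber` for a polymerase read, `movie/holeNumber/start_end` for a subread, and `movie/holeNumber/ccs` for a consensus read. The function names are my own, and I am also assuming plain four-line FASTQ records; if whoever converted your files renamed the reads, this will simply report them as "unknown".

```python
import gzip
from collections import Counter


def read_names(fastq_path):
    """Yield read names from a FASTQ file (assumes 4 lines per record)."""
    opener = gzip.open if fastq_path.endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 0:  # header line of each record
                fields = line[1:].split()
                yield fields[0] if fields else ""


def classify(name):
    """Return (movie, hole_number, kind) guessed from a PacBio-style read name."""
    parts = name.split("/")
    if len(parts) == 2 and parts[1].isdigit():
        return (parts[0], int(parts[1]), "polymerase")
    if len(parts) == 3 and parts[1].isdigit():
        if parts[2] == "ccs":
            return (parts[0], int(parts[1]), "ccs")
        coords = parts[2].split("_")
        if len(coords) == 2 and all(c.isdigit() for c in coords):
            return (parts[0], int(parts[1]), "subread")
    return (None, None, "unknown")


def summarize(names):
    """Count read kinds, and count subreads per (movie, hole number)."""
    kinds = Counter()
    subreads_per_zmw = Counter()
    for name in names:
        movie, hole, kind = classify(name)
        kinds[kind] += 1
        if kind == "subread":
            subreads_per_zmw[(movie, hole)] += 1
    return kinds, subreads_per_zmw


if __name__ == "__main__":
    import sys

    kinds, per_zmw = summarize(read_names(sys.argv[1]))
    print("read kinds:", dict(kinds))
    multi = sum(1 for n in per_zmw.values() if n > 1)
    print("waveguides with more than one subread:", multi)
```

Run it with the FASTQ path as the only argument. If nearly all names carry a `start_end` suffix, the file holds subreads, meaning the adapters have already been cut out, and subreads that share a hole number come from the same waveguide.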
I only have access to FASTA/FASTQ and bax/bas/metadata files (just to let you know, these data were generated on an RS sequencer, and I don't know whether the output format differs from that of the newer PacBio sequencers).
But then, should I try to figure out whether there are sequences coming from the same waveguide (with some indication in the sequence ID)?