Question: SMRT Analysis General Questions
4.9 years ago
wrote:

A couple of questions I'm unsure about and can't get a definite answer for (yes, I've looked over the documentation, wikis, etc.):

  1. SMRT Analysis software installed on a local server is self-contained, correct? That is, it does not connect to or transfer any data to outside servers, such as PacBio's or others.
  2. Do reference and de novo assemblies through the SMRT Portal yield structural variant reports? It doesn't make sense to me that I need to run another job to get variant calls (RS_Minor_Variant)... maybe I'm missing something here. Note: I also see a 'Corrections' menu showing indels, but are these "corrected" (changed) in the final assembly sequence?
  3. Where is the final assembled sequence, and how do I get it?

Overall, how do I get a de novo assembly with variant call information? What is the correct protocol (or protocols) to follow?

1 - AFAIK the answer is no. PacBio monitors most instruments remotely (so in theory they have access to data on the instrument), unless your instrument is set up with no external network access.
3 - You can find the sequence files in the "DATA" section (lower left corner of the SMRT Portal main summary page for a job). On the command line you should find that data in /path_to/smrtanalysis/userdata/jobs/NNN/NNNNNN/data, where NNNNNN is the job ID from SMRT Portal.
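
If you want to build that path in a script, here is a minimal Python sketch. It rests on an assumption not stated above: that NNNNNN is the job ID zero-padded to six digits and NNN is its first three digits (e.g. job 16437 would sit under 016/016437). The function name `job_data_dir` is hypothetical, and `/path_to/smrtanalysis` is the same placeholder used above. Check against your own install before relying on it.

```python
from pathlib import PurePosixPath


def job_data_dir(smrt_root, job_id):
    """Return the expected data directory for a SMRT Portal job.

    ASSUMPTION: the job folder is the job ID zero-padded to six digits
    (NNNNNN) and its parent folder is the first three of those digits (NNN).
    """
    padded = f"{int(job_id):06d}"
    return (PurePosixPath(smrt_root) / "userdata" / "jobs"
            / padded[:3] / padded / "data")


print(job_data_dir("/path_to/smrtanalysis", 16437))
```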

Thank you @genomax2 for your answers. I had gone through that directory before, and I believe polished_assembly.fasta.gz is the file that contains the final assembly, correct?

Correct. You also get the fastq formatted file if you prefer that.
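
For a quick sanity check of polished_assembly.fasta.gz without unpacking it, something like the sketch below may help. It uses only the Python standard library and assumes the file is an ordinary gzip-compressed FASTA; the function name `contig_lengths` is just an illustration, not part of SMRT Analysis.

```python
import gzip


def contig_lengths(fasta_gz_path):
    """Return {contig_name: length} for a gzip-compressed FASTA file."""
    lengths = {}
    name = None
    with gzip.open(fasta_gz_path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first whitespace-delimited token as the contig name.
                parts = line[1:].split()
                name = parts[0] if parts else ""
                lengths[name] = 0
            elif name is not None:
                # Sequence lines may be wrapped, so accumulate the length.
                lengths[name] += len(line)
    return lengths
```

Calling `contig_lengths("polished_assembly.fasta.gz")` from inside the job's data directory should give one entry per contig, which is an easy way to confirm you have the assembled sequence and not an intermediate file.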

4.9 years ago
wrote:
  1. SMRT Analysis secondary analysis software is completely self-contained and does not connect to or transfer any data to outside servers. It is possible to have the RS II sequencing machine remotely monitored by PacBio for tech support purposes, but no sequencing data is transferred, and it is entirely optional.
  2. When you de novo assemble a dataset, it is not possible to call variants: the pipeline has no concept of a reference, so by definition the assembly has zero variants. To call variants, the data has to be aligned against a reference; the RS_resequencing pipeline will call SNPs and small indels. For structural variation I would recommend looking into something like PBHoney.
