Difference between 'raw reads' and 'spots' in PacBio and SRA
4.9 years ago
MattP ▴ 20

I have been working with some PacBio RSII data in SMRT Portal version 2.3.0, and when looking through the report files for the P_Filter step of an assembly, I get a pre-filter read total of 150,292 (465.0 Mbp). However, upon uploading the data to SRA (https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR5811295), I see that the number of 'spots' is 163,482 (950.0 Mbp). As far as I can tell from various genome announcement articles, the number of spots should be equivalent to the number of raw reads, but I can find no mention of the number 163,482 anywhere in the files of the associated SMRT Portal job [note: see EDIT below]. Apologies if there's a simple answer to this that I'm overlooking, but can anybody please help me to figure out why this apparent discrepancy exists, and why the number of 'spots' reported at SRA appears nowhere in the SMRT Portal reports?

EDIT: One additional piece of information: the only place I can find the number 163,482 in the job files is as the number of lines in the data/filtered_summary.csv file. Grepping for a 1 in the PassedFilter column gives the expected post-filter read number of 68,619. However, I can't currently figure out which criterion produces the 'pre-filter' read number of 150,292...
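
For anyone wanting to repeat the check described in the EDIT, here is a minimal sketch of one way to tally the rows of data/filtered_summary.csv. It assumes the file is comma-separated, that its first line is a header containing a column literally named PassedFilter, and that the column holds 0 or 1. The function name summarise_passed_filter is made up for illustration and is not part of SMRT Portal.

```shell
# Hypothetical helper: tally rows of a filtered_summary.csv by PassedFilter.
# Assumptions: comma-separated, header on line 1, PassedFilter is 0 or 1.
summarise_passed_filter() {
    awk -F',' '
        # Strip a trailing carriage return in case the file has CRLF endings.
        { sub(/\r$/, "") }
        NR == 1 {
            # Find the PassedFilter column by name rather than by position.
            for (i = 1; i <= NF; i++) {
                gsub(/"/, "", $i)
                if ($i == "PassedFilter") col = i
            }
            if (!col) {
                print "PassedFilter column not found" > "/dev/stderr"
                exit 1
            }
            next
        }
        # Every non-header row counts once; rows with PassedFilter == 1 passed.
        { total++; if ($col == 1) passed++ }
        END {
            if (col) printf "total=%d passed=%d failed=%d\n", total, passed, total - passed
        }
    ' "$1"
}
```

Running summarise_passed_filter data/filtered_summary.csv prints the row total next to the passed and failed counts, which makes it easy to see whether a header line is included in a raw line count of the file.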

PacBio SMRT • 1.2k views

Just based on your post, I wonder if the P_Filter figure is the number of reads passing filter (I'm not sure what the criteria are, if that is the case), whilst the number of spots is all reads. Did you upload a BAM file or a FASTQ file to the SRA? If it was a FASTQ file, can you run cat input.fastq | paste - - - - | wc -l to see how many sequences are in the file you uploaded?
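
A slightly more defensive version of that count, as a rough sketch: it assumes a plain, uncompressed FASTQ with exactly four lines per record (no wrapped sequence lines), and the function name count_fastq_records is just an illustrative choice.

```shell
# Hypothetical helper: count records in an uncompressed four-line FASTQ file.
count_fastq_records() {
    awk '
        END {
            # A well-formed four-line FASTQ has a line count divisible by 4.
            if (NR % 4 != 0) {
                print "warning: line count is not a multiple of 4" > "/dev/stderr"
            }
            print int(NR / 4)
        }
    ' "$1"
}
```

Calling count_fastq_records input.fastq prints the number of complete records, and warns on stderr if the file looks truncated or is not in four-line format.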

I uploaded .bax.h5 and .bas.h5 files to SRA, rather than a .fastq file

Well perhaps it still has to do with reads passing filter...not sure

I don't know whether the edit at the bottom of my original post is of any more help? It looks almost as if there's an additional 'pre-pre-filter', but I can't find any mention of it in the output, hence my confusion...
