Hi,
After isoseq refine, the number of "Full-Length Non-Chimeric Reads" is less than half the number of "Full-Length Reads". Is this normal?
I started my isoseq pipeline with the raw subreads data and have run the first few steps:
(1) CCS calling:
ccs m54067_190809_002958.subreads.bam m54067_190809_002958.ccs.1.bam --log-level INFO --report-json example.report.json --hifi-summary-json example.hifi_summary.json --log-file example.ccs.log --report-file example.report.txt --metrics-json example.zmw_metrics.json.gz --chunk 1/10
(2) Primer removal and demultiplexing:
lima --isoseq m54067_190809_002958.ccs.bam TeloPrime_V2_primer.fasta
(3) Refine:
isoseq refine m54067_190809_002958.fl.TeloPrimeModified_5p--TeloPrimeModified_3p.bam TeloPrime_V2_primer.fasta m54067_190809_002958.flnc.bam --require-polya
The library was built with the TeloPrime Full-Length cDNA Amplification Kit V2. Here are the contents of my primer FASTA file:
>TeloPrimeModified_5p
TGGATTGATATGTAATACGACTCACTATAG
>TeloPrimeModified_3p
CGCCTGAGA
The following is the summary report after running isoseq refine:
{
  "_comment": "Created by pbcopper v2.3.99",
  "attributes": [
    { "id": "sample_name", "name": "Sample Name", "value": "" },
    { "id": "num_reads_fl", "name": "Full-Length Reads", "value": 284393 },
    { "id": "num_reads_flnc", "name": "Full-Length Non-Chimeric Reads", "value": 135263 },
    { "id": "num_reads_flnc_polya", "name": "Full-Length Non-Chimeric Reads with Poly-A Tail", "value": 123504 }
  ],
  "dataset_uuids": [],
  "id": "isoseq_refine",
  "plotGroups": [],
  "tables": [],
  "title": "Iso-Seq Refine Report",
  "uuid": "8f1243dd-7d26-425e-8031-236ec6aecd6e",
  "version": "1.0.1"
}
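To put numbers on the drop, here is a minimal Python sketch that computes the retention ratios from the counts in the report above. The `report` dictionary is a trimmed, hand-copied stand-in for the real JSON file, and the helper name `counts_by_id` is my own, not part of any PacBio tool.

```python
# Trimmed copy of the isoseq refine summary report shown above.
# Only the three count attributes are kept; values are copied verbatim.
report = {
    "attributes": [
        {"id": "num_reads_fl", "name": "Full-Length Reads", "value": 284393},
        {"id": "num_reads_flnc", "name": "Full-Length Non-Chimeric Reads", "value": 135263},
        {"id": "num_reads_flnc_polya",
         "name": "Full-Length Non-Chimeric Reads with Poly-A Tail", "value": 123504},
    ]
}


def counts_by_id(report_dict):
    """Map each attribute id in the report to its value (hypothetical helper)."""
    return {attr["id"]: attr["value"] for attr in report_dict["attributes"]}


counts = counts_by_id(report)

# Fraction of FL reads that are kept as FLNC reads.
flnc_of_fl = counts["num_reads_flnc"] / counts["num_reads_fl"]
# Fraction of FLNC reads that also carry a poly-A tail.
polya_of_flnc = counts["num_reads_flnc_polya"] / counts["num_reads_flnc"]

print(f"FLNC / FL: {flnc_of_fl:.1%}")
print(f"FLNC with poly-A / FLNC: {polya_of_flnc:.1%}")
```

With these counts, about 47.6% of FL reads are kept as FLNC reads, and about 91.3% of those FLNC reads have a poly-A tail.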
Only 135263 of the 284393 FL reads (about 48%) are kept as FLNC reads, which seems like too large a loss. Is there something wrong with my commands or primer file? I would appreciate any suggestions. Thanks!