A way to get a list of determined and undetermined indexes using illumina interop library?
8 weeks ago

Hi, I would like to get the determined and undetermined indexes from the Illumina interop files. I can only seem to get the determined ones this way:

from interop import py_interop_run_metrics, py_interop_run, py_interop_summary

run_folder = "/path/to/run/folder"

# Load only the InterOp files needed for the index summary
run_metrics = py_interop_run_metrics.run_metrics()
valid_to_load = py_interop_run.uchar_vector(py_interop_run.MetricCount, 0)
py_interop_run_metrics.list_index_metrics_to_load(valid_to_load)
run_metrics.read(run_folder, valid_to_load)

summary = py_interop_summary.index_flowcell_summary()
py_interop_summary.summarize_index_metrics(run_metrics, summary)

# The flowcell summary holds one lane summary per lane
for lane_index in range(summary.size()):
    lane_summary = summary.at(lane_index)
    lane_data = []
    # Each lane summary holds one entry per known index
    for i in range(lane_summary.size()):
        entry = lane_summary.at(i)
        lane_data.append(
            {
                'id':               entry.id(),
                'project_name':     entry.project_name(),
                'sample_id':        entry.sample_id(),
                'index1':           entry.index1(),
                'index2':           entry.index2(),
                'fraction_mapped':  entry.fraction_mapped(),
            }
        )


I believe the undetermined ones are in the Stats.json file, so I could parse that manually, but I am curious whether there is a way to get them through the interop lib.

Any help or clarification would be great!

interop python illumina • 606 views
7 weeks ago

It turns out that InterOp only includes known barcodes. bcl2fastq and bclconvert produce files that show unknown barcodes.
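For the bcl2fastq case, here is a minimal sketch of reading the unknown barcodes out of Stats.json. It assumes the usual bcl2fastq layout, where a top-level "UnknownBarcodes" list has one entry per lane with a "Lane" number and a "Barcodes" mapping of barcode to read count. The function names are just illustrative, and bclconvert reports are laid out differently, so this does not cover them.

```python
import json
from collections import Counter


def unknown_barcodes_by_lane(stats):
    """Map lane number -> Counter of unknown barcode -> read count.

    `stats` is the parsed content of a bcl2fastq Stats.json file
    (assumed layout: stats["UnknownBarcodes"] is a list of
    {"Lane": int, "Barcodes": {barcode: count}} entries).
    """
    result = {}
    for entry in stats.get("UnknownBarcodes", []):
        result[entry["Lane"]] = Counter(entry.get("Barcodes", {}))
    return result


def load_unknown_barcodes(stats_path):
    """Read Stats.json from disk and return the per-lane counters."""
    with open(stats_path) as handle:
        return unknown_barcodes_by_lane(json.load(handle))
```

Something like `load_unknown_barcodes("/path/to/Stats/Stats.json")[1].most_common(10)` would then give the ten most frequent unknown barcodes on lane 1 (the path is a placeholder).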

7 weeks ago

A follow-up question: do you know how to get ALL the unknown barcodes? The bcl2fastq Stats.json only has the top 1000 unknown barcodes. Is there a way to get the rest? If anyone knows, I would be much obliged!


How about pulling them straight from I1+I2 of the Undetermined fastq.gz files, if you're running bcl2fastq anyway (and using --create-fastq-for-index-reads)?
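A rough sketch of that idea, assuming the Undetermined I1 and I2 FASTQ files for a lane list the same reads in the same order, so the two can be read in lockstep. The function names and the file names in the usage note are placeholders, not anything bcl2fastq or interop provides.

```python
import gzip
from collections import Counter
from itertools import islice


def fastq_sequences(lines):
    """Yield the sequence line (2nd of every 4 lines) from FASTQ text."""
    for seq in islice(lines, 1, None, 4):
        yield seq.strip()


def count_index_pairs(i1_lines, i2_lines):
    """Count "index1+index2" combinations from paired I1/I2 FASTQ lines."""
    pairs = zip(fastq_sequences(i1_lines), fastq_sequences(i2_lines))
    return Counter(f"{i1}+{i2}" for i1, i2 in pairs)


def count_index_pairs_from_files(i1_path, i2_path):
    """Same as count_index_pairs, reading two gzipped FASTQ files."""
    with gzip.open(i1_path, "rt") as i1, gzip.open(i2_path, "rt") as i2:
        return count_index_pairs(i1, i2)
```

For example, `count_index_pairs_from_files("Undetermined_S0_L001_I1_001.fastq.gz", "Undetermined_S0_L001_I2_001.fastq.gz")` returns a Counter with every barcode pair seen, with no top-1000 cutoff. For single-index runs the same `fastq_sequences` helper can be fed straight into a Counter on the I1 file alone.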


That is by design. You could do what Jesse suggested.