Question

How to handle more than 1 SRA Run per Experiment?

0

16 days ago

Janmajay • 0

While downloading some raw WGBS data from the Roadmap Epigenomics project, I noticed that multiple SRA Runs were associated with each Experiment.

From reading the FAQs here, I was under the impression that only 1 Run can be associated with each Experiment. The Runs within an Experiment differ only in the "Bases" and "Bytes" columns.

For example, BioSample SAMN00857854 (GEO Accession: GSM916051) was sequenced with Illumina HiSeq 2000 and has one Experiment SRX142783 with the following SRA Runs:

Run : ['SRR1143696' , 'SRR1143697' , 'SRR1143700' , 'SRR1143702' , 'SRR1143704']

which correspond to:

Bases : [48905968410 , 49852911810 , 34303272200 , 18904063200 , 34950365000]

Bytes : [33485451056 , 32870075947 , 24289423164 , 13323536868 , 24641285065]

as the only metadata values that differ across Runs.

How do I handle these Runs? Is the correct way to:

Concatenate FastQ files from multiple Runs into one file before preprocessing? OR
Preprocess FastQ files from each Run separately and treat as technical replicates?

SRA Preprocessing • 266 views

score 1 · Answer 1 · 2024-04-13

1

16 days ago

ATpoint 82k

Concatenate FastQ files from multiple Runs into one file before preprocessing?

Yes, that is the common thing to do. All of these Runs come from the exact same library, which was simply sequenced over multiple lanes.
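As a rough illustration of the concatenation step, here is a minimal Python sketch. It assumes the Runs have already been downloaded and converted to FASTQ, and that all input files are of the same kind (all plain text or all gzip-compressed; gzip files can be appended byte for byte and still form a valid gzip file). The function names `concatenate_fastq` and `count_reads` and the file names in the comments are hypothetical, not part of any SRA tool. For paired-end data, merge the first-mate files and the second-mate files separately, listing the Runs in the same order both times so that mates stay in sync.

```python
import gzip
import shutil
from pathlib import Path


def concatenate_fastq(run_files, merged_path):
    """Append the FASTQ file of each Run, in the given order, to one merged file.

    Files are copied as raw bytes, so the inputs must all be plain FASTQ
    or all be gzip-compressed FASTQ (an assumption of this sketch).
    """
    merged_path = Path(merged_path)
    with open(merged_path, "wb") as merged:
        for run_file in run_files:
            with open(run_file, "rb") as handle:
                shutil.copyfileobj(handle, merged)
    return merged_path


def count_reads(fastq_path):
    """Count reads, assuming standard 4-line FASTQ records."""
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        return sum(1 for _ in handle) // 4


# Hypothetical usage for the Experiment in the question (file names assumed):
# runs = ["SRR1143696", "SRR1143697", "SRR1143700", "SRR1143702", "SRR1143704"]
# concatenate_fastq([f"{r}_1.fastq.gz" for r in runs], "SRX142783_1.fastq.gz")
# concatenate_fastq([f"{r}_2.fastq.gz" for r in runs], "SRX142783_2.fastq.gz")
```

A simple sanity check afterwards is that `count_reads` on the merged file equals the sum of `count_reads` over the individual Run files.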