How to handle more than 1 SRA Run per Experiment?
6 weeks ago
Janmajay • 0

While downloading some raw WGBS data from the Roadmap Epigenomics project, I noticed that multiple SRA Runs were associated with each Experiment.

Reading the FAQs here, I was under the impression that only one Run can be associated with each Experiment. I saw that the Runs differed only in the "Bases" and "Bytes" columns.

For example, BioSample SAMN00857854 (GEO Accession: GSM916051) was sequenced with Illumina HiSeq 2000 and has one Experiment, SRX142783, with the following SRA Runs:

Run : ['SRR1143696' , 'SRR1143697' , 'SRR1143700' , 'SRR1143702' , 'SRR1143704']

which correspond to:

Bases : [48905968410 , 49852911810 , 34303272200 , 18904063200 , 34950365000]

Bytes : [33485451056 , 32870075947 , 24289423164 , 13323536868 , 24641285065]

as the only metadata values that differ across Runs.

How do I handle these Runs? Is the correct way to:

  1. Concatenate FastQ files from multiple Runs into one file before preprocessing? OR
  2. Preprocess FastQ files from each Run separately and treat as technical replicates?
6 weeks ago
ATpoint 82k

Concatenate FastQ files from multiple Runs into one file before preprocessing?

Yes, that is the usual approach. The Runs come from the exact same library, just sequenced over multiple lanes, so merging them adds depth to one sample rather than creating replicates.
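As a rough illustration, the concatenation could be sketched in Python as below. This is only a sketch under assumptions: the function names `concatenate_runs` and `count_reads` and the file names in the usage comment are hypothetical, and it assumes every Run has already been dumped to gzip-compressed FASTQ. It relies on the fact that gzip files joined byte-for-byte still form a valid gzip stream. If the data are paired-end, the mate 1 and mate 2 files need to be merged separately with the Runs in the same order, so that read pairs stay in sync.

```python
import gzip
import shutil
from pathlib import Path


def concatenate_runs(run_files, merged_path):
    """Append each per-Run FASTQ file to merged_path, in the order given.

    Assumes all inputs are gzip-compressed (or all are plain text);
    mixing the two would produce an unreadable file.
    """
    merged_path = Path(merged_path)
    with open(merged_path, "wb") as merged:
        for run_file in run_files:
            with open(run_file, "rb") as source:
                # Byte-level copy: no decompression or re-compression needed.
                shutil.copyfileobj(source, merged)
    return merged_path


def count_reads(fastq_gz):
    """Count records in a gzip-compressed FASTQ file (4 lines per read)."""
    with gzip.open(fastq_gz, "rt") as handle:
        return sum(1 for _ in handle) // 4


# Usage (hypothetical file names; adjust to however the Runs were dumped):
#   runs = ["SRR1143696", "SRR1143697", "SRR1143700", "SRR1143702", "SRR1143704"]
#   concatenate_runs([f"{r}_1.fastq.gz" for r in runs], "SRX142783_1.fastq.gz")
#   concatenate_runs([f"{r}_2.fastq.gz" for r in runs], "SRX142783_2.fastq.gz")
```

A quick sanity check afterwards is that `count_reads` on the merged file equals the sum over the individual Runs, and that the mate 1 and mate 2 files report the same number.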
