Question

Questions about sample_accession_id and file_accession_id from EGA files

11 weeks ago

askif4 ▴ 20

I recently downloaded some data from the EGA, including .cram files and the associated metadata. However, I am having some difficulty understanding how to match samples to the files correctly.

As shown in the attached screenshot, a single sample (e.g., sample_accession_id EGAN00001380723) is associated with multiple file_accession_ids (e.g., several EGAFxxxxx entries).

[Screenshot: metadata table showing one sample_accession_id listed against several file_accession_ids]

Does this mean that the sequencing data for this sample was uploaded in multiple .cram files, and I should merge these files to reconstruct the full dataset for that sample?

Unfortunately, I have not been able to reach either the first or corresponding author of the original publication, so I’m reaching out to see if you might have encountered a similar situation.

EGA open WGS data • 377 views

updated 11 weeks ago by GenoMax 153k • written 11 weeks ago by askif4 ▴ 20

According to the EGA help page, you can download the metadata for the dataset, which should give you more information:

Registered EGA users can download metadata of an authorised dataset by logging into the EGA webpage and navigating to the dataset of choice. Approximately two thirds down the page you will find the option to download the metadata as a zip file.
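Once the metadata is unpacked, one way to see how files map to samples is to group the file accessions by sample. The following is only a minimal sketch: it assumes the zip contains a CSV table with the two columns named in the question (sample_accession_id and file_accession_id). The inline table, the function name group_files_by_sample, and the EGAF/EGAN placeholder values are hypothetical; only EGAN00001380723 comes from the question.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-in for the sample-to-file mapping table in the metadata zip.
# Only the two column names and EGAN00001380723 come from the question; every
# other value is a placeholder, not a real accession.
EXAMPLE_CSV = """sample_accession_id,file_accession_id
EGAN00001380723,EGAF_PLACEHOLDER_1
EGAN00001380723,EGAF_PLACEHOLDER_2
EGAN_PLACEHOLDER_X,EGAF_PLACEHOLDER_3
"""


def group_files_by_sample(handle):
    """Return a dict mapping each sample accession to its list of file accessions."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle):
        groups[row["sample_accession_id"]].append(row["file_accession_id"])
    return dict(groups)


# With a real download, replace io.StringIO(...) with open("<your mapping file>.csv").
files_by_sample = group_files_by_sample(io.StringIO(EXAMPLE_CSV))
for sample, files in files_by_sample.items():
    print(sample, len(files), files)
```

A sample listed against several files does not by itself show that the files should be merged. The other metadata tables (for example run or file-type information, if present) should indicate whether they are separate runs or lanes of the same sample or something else entirely.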

11 weeks ago by GenoMax 153k