Question: SRA from NCBI?
8 days ago by
star130
Netherlands
star130 wrote:

I would like to download some data and reanalyze it, so I started fetching SRA runs from NCBI, but I have a few questions:

  • For example, SRR1067431 is one of 5 runs listed under a single GEO accession, while SRR5331265 is the only run under its GEO accession. Does that mean there are 4 technical replicates for SRR1067431 that I can merge at the very first step (by merging their FASTQ files)?

  • Is there any way to check the md5sum of a FASTQ file, or should I only use vdb-validate from the SRA Toolkit?

  • Why does the md5sum change when a FASTQ file is compressed to FASTQ.GZ?

rna-seq fastq sra ncbi • 98 views
modified 8 days ago • written 8 days ago by star (130)

You will need to look through the metadata to see what the actual samples represent.

Additional download help: see "Fast download of FASTQ files from the European Nucleotide Archive (ENA)" and SRA Explorer (https://ewels.github.io/sra-explorer/).

modified 8 days ago • written 8 days ago by genomax (64k)
8 days ago by
ATpoint14k
Germany
ATpoint14k wrote:

The first one looks like a lane/sequencing replicate. As always with these, the runs can probably be merged right away into one file prior to alignment. Still, for quality control you could keep them separate and merge later on.

An MD5 checksum is computed from the exact bytes of a file, so of course (de)compression changes it. Is there any need to check that? Do you have indications that some files are corrupted? I have never felt the need to do that and have never experienced any problems with SRA files, provided the download completed properly.
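As a rough illustration of merging lane replicates before alignment, here is a minimal Python sketch. It assumes the runs are already gzip-compressed FASTQ files (for example as downloaded from ENA) and relies on the fact that concatenated gzip streams form a valid gzip file. The function names `merge_fastq_gz` and `count_reads` and all file names are hypothetical, chosen only for this example. For paired-end data, the R1 and R2 files would need to be merged in the same run order.

```python
import gzip
import shutil
from pathlib import Path


def merge_fastq_gz(inputs, output):
    """Concatenate gzip-compressed FASTQ files into one multi-member gzip file."""
    with open(output, "wb") as out_handle:
        for path in inputs:
            with open(path, "rb") as in_handle:
                # Raw byte copy: no decompression or recompression is needed,
                # because back-to-back gzip members are still a valid gzip file.
                shutil.copyfileobj(in_handle, out_handle)
    return Path(output)


def count_reads(fastq_gz):
    """Count reads, assuming the usual four lines per FASTQ record."""
    with gzip.open(fastq_gz, "rt") as handle:
        return sum(1 for _ in handle) // 4
```

A call such as `merge_fastq_gz(["run1.fastq.gz", "run2.fastq.gz"], "sample_merged.fastq.gz")` (hypothetical file names) would produce the merged file, and comparing `count_reads` on the inputs and the output is a cheap sanity check that nothing was lost.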

modified 8 days ago • written 8 days ago by ATpoint (14k)

Thanks. No, they all downloaded completely.

written 8 days ago by star (130)

Should be fine then. fastq-dump will throw an error anyway if something is corrupted, so don't waste your time on md5 ;-)
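For readers who still want to compare checksums, the following is a minimal Python sketch of why the MD5 of a FASTQ file and of its FASTQ.GZ differ, and how the two can still be compared. The helper names `md5_of_file` and `md5_of_gz_content` are hypothetical, not part of any SRA tool. The idea is simply that the checksum covers the bytes on disk, so the compressed file must be decompressed (on the fly) before its content can be matched against the plain FASTQ.

```python
import gzip
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """MD5 of a file's bytes exactly as they are stored on disk."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large FASTQ files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_of_gz_content(path, chunk_size=1 << 20):
    """MD5 of the decompressed content of a gzip file."""
    digest = hashlib.md5()
    with gzip.open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

With these, `md5_of_file("reads.fastq")` and `md5_of_file("reads.fastq.gz")` would be expected to differ, while `md5_of_gz_content("reads.fastq.gz")` should match the plain file (file names are placeholders). Note also that a gzip header can embed a timestamp and the original file name, so two separate compressions of the same FASTQ do not necessarily share an MD5 either.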

written 8 days ago by ATpoint (14k)