How to get checksums for files on BaseSpace?
6.4 years ago
jbherrick • 0

A sequencing service has shared Illumina files with my BaseSpace account (I'm new to BaseSpace). I'd like to verify the integrity of the FASTQ files I have downloaded against the copies on BaseSpace. How can I find the checksums for the files on BaseSpace? (I know how to compute them for my downloaded files, just not for those in BaseSpace.) Also, can or should checksums be verified against the original files that the sequencing center uploaded to BaseSpace?

Thank you.

md5 • 4.0k views

Have you asked Illumina/BaseSpace tech support? At least for the public data in BaseSpace, I don't see any checksums.

I have not been able to find a tech support link on BaseSpace. Any idea where it might be found?

Start with techsupport at illumina.com.

6.4 years ago
h.mon 35k

bscp, from the BaseSpace CLI, can download directly from BaseSpace and write the md5sums at the same time:

bscp --write-md5 ...

Maybe some BaseSpace app will calculate md5 on the server side (or you could create an app for that, but I have never created a BaseSpace app, so I can't help there). Otherwise, you will have to download the data again.

FastQC is really sensitive to data corruption and will raise an error or warning (e.g. for a sequence and quality string of different lengths, or for a sequence with missing quality), even if just one record is at fault.
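As a rough illustration of the kind of per-record check described above, here is a minimal Python sketch. It is not FastQC's implementation; it assumes the common four-lines-per-record FASTQ layout, and the function names (open_fastq, check_fastq) are made up for this example. It flags records whose sequence and quality lengths differ, and reports a truncated or corrupt gzip stream instead of crashing.

```python
import gzip
import zlib
from pathlib import Path


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def check_fastq(path):
    """Return (record_count, problems) for a 4-line-per-record FASTQ file.

    problems is a list of (record_number, message) tuples.
    """
    problems = []
    count = 0
    try:
        with open_fastq(path) as handle:
            while True:
                header = handle.readline()
                if not header:  # clean end of file
                    break
                seq = handle.readline().rstrip("\r\n")
                plus = handle.readline()
                qual = handle.readline().rstrip("\r\n")
                count += 1
                if not header.startswith("@"):
                    problems.append((count, "header does not start with '@'"))
                if not plus.startswith("+"):
                    problems.append((count, "separator line does not start with '+'"))
                if len(seq) != len(qual):
                    problems.append((count, "sequence and quality lengths differ"))
    except (EOFError, gzip.BadGzipFile, zlib.error) as exc:
        # A truncated or corrupt gzip stream surfaces here while reading.
        problems.append((count + 1, f"compressed stream is damaged: {exc}"))
    return count, problems
```

Calling check_fastq("sample_R1.fastq.gz") (a hypothetical file name) returns the number of records read and a list of problems; an empty list means this particular check found nothing wrong, which is weaker evidence than a matching checksum.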

I am updating this thread because the BaseSpace CLI seems to have deprecated bscp. I am using bs download project --name $PROJECTID to download reads; however, there is no --write-md5 flag available for this command or for any other BaseSpace CLI command.

With BaseMount, users can navigate the BaseSpace files and run md5sum in-place, but BaseMount is no longer supported (NS2000 reads can't be accessed using BaseMount). Illumina Tech Support refers users to BaseSpace CLI for functions previously fulfilled by BaseMount. My current solution is to download the data twice and compare md5sums of the two copies. Hopefully someone knows a better way.
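The download-twice workaround can be scripted. The sketch below is one possible way to do it in Python, assuming the two copies were downloaded into two separate local directories with the same internal layout; the function names (md5sum, md5_manifest, compare_downloads) and the directory names in the usage note are hypothetical. Note that matching checksums only show that the two downloads agree with each other; they do not prove that either copy matches what the sequencing center originally uploaded.

```python
import hashlib
from pathlib import Path


def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_manifest(root):
    """Map each file's path relative to root to its MD5 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): md5sum(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_downloads(first_dir, second_dir):
    """Compare two download directories file by file.

    Returns (mismatched, only_in_first, only_in_second), each a sorted
    list of relative paths.
    """
    first = md5_manifest(first_dir)
    second = md5_manifest(second_dir)
    mismatched = sorted(
        name for name in first.keys() & second.keys()
        if first[name] != second[name]
    )
    only_in_first = sorted(first.keys() - second.keys())
    only_in_second = sorted(second.keys() - first.keys())
    return mismatched, only_in_first, only_in_second
```

For example, compare_downloads("download_1", "download_2") returning three empty lists means every file is present in both copies with identical MD5 digests.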
