Question

.clc to fastq

0

6.9 years ago

Kasthuri ▴ 300

My collaborator has the processed RNA-seq data as a .clc file, but not the FASTQ files, which need to be submitted to GEO. He would have to ask the core to get the raw data back, which requires considerable effort. Is there any repository that will accept .clc data? Alternatively, if anyone can recommend a way to get the FASTQ back, that would be great as well. Thanks!

RNA-Seq clc work bench • 2.9k views

ADD COMMENT • link 6.9 years ago by Kasthuri ▴ 300

score 2 · Answer 1 · 2017-05-25

2

6.9 years ago

GenoMax 141k

Whoever generated the .clc project should have the original FASTQ files. If it was the core, they should give you the raw sequence data. Even with access to a copy of CLC Genomics Workbench (CLC GW), you may not be able to recreate the original raw data: CLC allows the sequence headers to be dropped when the sequences are imported, and if that option was chosen there is no way to recreate them.

"He needs to ask the core to get back the raw data, which requires considerable effort."

Considerable effort to ask for the data, or to actually get it? FASTQ-format sequence data is the primary deliverable of an NGS sequencing project. You (or your collaborator) paid for the sequencing and should insist on getting a copy of the raw data if none was provided to begin with.

ADD COMMENT • link 6.9 years ago by GenoMax 141k

0

Thanks genomax. I think the effort is that the core would have to dig through its data in order to retrieve it. My collaborator never got the FASTQ files from the computational biologist who processed the data and has since left the job. It is a matter of tracing the data back. Anyway, if that needs to be done, he will have to do it. But good to know that recreating the data from CLC may not be possible.

ADD REPLY • link 6.9 years ago by Kasthuri ▴ 300

0

Hi - just want to clarify this point about not being able to recreate the data.

If the data was imported into CLC and the option to remove sequence read names, header info, etc. was selected at the time of import, then yes, the exact FASTQ cannot be recreated.

However, you can always export NGS reads from CLC back into FASTQ (or other formats). If the read names and header info were retained at import (they are by default), that same data will be in the exported files as well.

This applies to both the GUI and command line tools for the CLC Genomics Server as well.
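Once reads have been exported back to FASTQ, a quick sanity check is to see whether the read names still look like original sequencer headers. Below is a minimal Python sketch of that check, not anything provided by CLC. It assumes the reads came from a recent Illumina instrument and that original names follow the `@instrument:run:flowcell:lane:tile:x:y` layout; older Illumina naming schemes and other platforms would need a different pattern. The function names `read_fastq` and `headers_look_original` are hypothetical, chosen only for this illustration.

```python
import re
from itertools import islice

# ASSUMPTION: original read names follow the Illumina (CASAVA 1.8+) layout
# @instrument:run:flowcell:lane:tile:x:y. Adjust for other platforms.
ILLUMINA_NAME = re.compile(r"^@[^:\s]+:\d+:[^:\s]+:\d+:\d+:\d+:\d+")


def read_fastq(lines):
    """Yield (header, sequence, quality) tuples from an iterable of FASTQ lines."""
    it = iter(lines)
    while True:
        record = [line.rstrip("\r\n") for line in islice(it, 4)]
        if not any(record):
            return  # end of input (or only trailing blank lines)
        if (len(record) != 4
                or not record[0].startswith("@")
                or not record[2].startswith("+")):
            raise ValueError(f"malformed FASTQ record: {record!r}")
        header, seq, _, qual = record
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield header, seq, qual


def headers_look_original(lines, sample_size=1000):
    """Return True if the first sample_size read names match the assumed pattern."""
    records = list(islice(read_fastq(lines), sample_size))
    return bool(records) and all(ILLUMINA_NAME.match(h) for h, _, _ in records)


# Tiny in-memory examples (made-up reads, for illustration only).
kept = ["@M00123:45:000000000-ABCDE:1:1101:15589:1332 1:N:0:1", "ACGT", "+", "IIII"]
dropped = ["@1", "ACGT", "+", "IIII"]
print(headers_look_original(kept))
print(headers_look_original(dropped))
```

For a real export, pass an open file handle instead of a list, for example `with open(path) as fh: headers_look_original(fh)`, where `path` points at the exported FASTQ. A failed check only suggests the names were replaced at import; it says nothing about the sequences or quality scores themselves.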

Here's a screenshot of what it looks like in CLC Genomics Workbench:

[Screenshot: Import Dialog, Illumina Import Window]

I know this is 2+ years later, but I wanted to put this here to make it clear that CLC doesn't discard data unless the user wants it to. :)

ADD REPLY • link 4.5 years ago by Jonathanjacobs ▴ 280