Converting SRR fastq data to Illumina Basespace data
2.2 years ago

Hi,

I'm trying to convert SRA FASTQ data to the format Illumina BaseSpace expects. (The BaseSpace app has certain criteria that FASTQ files must meet before they can be uploaded for analysis.)

The headers in my SRA data look like this:

@SRR071195.1 HWUSI-EAS729_105074074:2:1:1225:1032/1
ATGGATAGCAGCCCAGCAATATTCACAGTAATACTGCAGACAGGTAACATTAGCACAGAAAAATGGAGCAAATTTCCCCCCCAAACGGGACCCCTGACAT
+
?A78DD<BD8<?BDD<8BE=DG@G@GD>G=DDB:=@EGBDG:?@GE?-EEDD=??GGD8:G@D;=GDG:G==B4DC:3A#####################

I need to convert it to this format:

@SIM:1:FCX:1:15:6329:1045:GATTACT+GTCTTAAC 1:N:0:ATCCGA
TCGCACTCAACGCCCTGCATATGACAAGACAGAATC
+ 
<>;##=><9=AAAAAAAAAA9#:<#<;<<<????#=

where the colon-delimited fields are @<instrument>:<run number>:<flowcell ID>:<lane>:<tile>:<x-pos>:<y-pos>:<UMI> <read>:<is filtered>:<control number>:<index>

Kindly help!

TIA!

illumina basespace fastq_header • 647 views

You can try converting the SRA data again using the -F option of fastq-dump. This will re-create the original Illumina FASTQ headers, but only if the submitter provided the data that way (most of the necessary information seems to be there, e.g. HWUSI-EAS729_105074074:2:1:1225:1032). If not, you will need to write some custom code to change the headers to the format BaseSpace wants.
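
As a rough illustration of the custom-code route, here is a minimal Python sketch that rewrites the old-style header from the question into the colon-delimited layout described above. The old header does not carry a run number, flowcell ID, or index, so the values used for them here ("1", "FCX", "NNNNNN") are placeholders I made up, and the function names convert_header and convert_fastq are hypothetical. I have not checked that BaseSpace accepts these placeholder values, and the sketch assumes plain four-line FASTQ records.

```python
import re

# Old-style Illumina read name, as it appears after the SRR accession:
#   <instrument>:<lane>:<tile>:<x>:<y>[#<index>][/<read>]
OLD_NAME = re.compile(
    r"^(?P<instrument>[^:\s]+):(?P<lane>\d+):(?P<tile>\d+):(?P<x>\d+):(?P<y>\d+)"
    r"(?:#(?P<index>[^/\s]+))?(?:/(?P<read>[12]))?$"
)


def convert_header(line, run="1", flowcell="FCX", default_index="NNNNNN"):
    """Rewrite one SRA-style header line into the newer Illumina layout.

    run, flowcell and default_index are placeholders (assumptions): the old
    header does not contain this information.
    """
    # Take the last whitespace-separated token, i.e. the original read name.
    name = line.strip().lstrip("@").split()[-1]
    m = OLD_NAME.match(name)
    if m is None:
        raise ValueError(f"unrecognised read name: {name!r}")
    read = m["read"] or "1"            # assume read 1 if no /1 or /2 suffix
    index = m["index"] or default_index
    # "N" = not filtered, "0" = no control bits set (both assumed).
    return (
        f"@{m['instrument']}:{run}:{flowcell}:{m['lane']}:{m['tile']}:"
        f"{m['x']}:{m['y']} {read}:N:0:{index}"
    )


def convert_fastq(lines, **kwargs):
    """Yield converted lines, assuming strict four-line FASTQ records."""
    for i, line in enumerate(lines):
        if i % 4 == 0:                 # header line
            yield convert_header(line, **kwargs) + "\n"
        elif i % 4 == 2:               # separator line, drop any repeated name
            yield "+\n"
        else:                          # sequence and quality lines unchanged
            yield line
```

To use it on a file, something like `out.writelines(convert_fastq(open("SRR071195.fastq")))` should work, with the real run number and flowcell ID passed in through the keyword arguments if you know them.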
