2.2 years ago
Vaishnavi
Hi,
I'm trying to convert SRA FASTQ data to the format Illumina BaseSpace expects. (The BaseSpace app only accepts FASTQ files that meet certain criteria before they can be uploaded for analysis.)
The headers in my SRA data look something like this:
@SRR071195.1 HWUSI-EAS729_105074074:2:1:1225:1032/1
ATGGATAGCAGCCCAGCAATATTCACAGTAATACTGCAGACAGGTAACATTAGCACAGAAAAATGGAGCAAATTTCCCCCCCAAACGGGACCCCTGACAT
+
?A78DD<BD8<?BDD<8BE=DG@G@GD>G=DDB:=@EGBDG:?@GE?-EEDD=??GGD8:G@D;=GDG:G==B4DC:3A#####################
I need to convert it to this format:
@SIM:1:FCX:1:15:6329:1045:GATTACT+GTCTTAAC 1:N:0:ATCCGA
TCGCACTCAACGCCCTGCATATGACAAGACAGAATC
+
<>;##=><9=AAAAAAAAAA9#:<#<;<<<????#=
where the delimited fields are: @<instrument>:<run number>:<flowcell ID>:<lane>:<tile>:<x-pos>:<y-pos>:<UMI> <read>:<is filtered>:<control number>:<index>
Kindly help!
TIA!
You can try converting the SRA data again using the -F option for fastq-dump. This will re-create the original Illumina fastq headers, if the submitters had submitted the data that way (most of the necessary info seems to be there, e.g. HWUSI-EAS729_105074074:2:1:1225:1032). If not, you will need to write some custom code to change the headers to the format BaseSpace wants.
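If you do end up writing custom code, here is a minimal Python sketch of what it might look like. It assumes every record is exactly four lines and every header looks like the example in the question. The function names (convert_header, convert_fastq) are made up for illustration, and the run number, flowcell ID and index are placeholders copied from the target example, since the old-style header does not carry them. The UMI field is left out on the assumption that it is optional; check that against what BaseSpace actually accepts.

```python
import re

# Old-style header as it appears in the question:
#   @SRR071195.1 HWUSI-EAS729_105074074:2:1:1225:1032/1
# i.e. @<SRA id> <instrument>:<lane>:<tile>:<x>:<y>/<read>
OLD_HEADER = re.compile(
    r"^@\S+\s+(?P<instrument>[^:\s]+):(?P<lane>\d+):(?P<tile>\d+)"
    r":(?P<x>\d+):(?P<y>\d+)(?:/(?P<read>[12]))?$"
)


def convert_header(line, run="1", flowcell="FCX", index="ATCCGA"):
    """Rewrite one old-style header into the newer colon-delimited layout.

    run, flowcell and index are PLACEHOLDERS (assumptions): the old header
    does not contain them, so supply values that make sense for your data.
    """
    m = OLD_HEADER.match(line.strip())
    if m is None:
        raise ValueError("unrecognised header: %r" % line)
    read = m.group("read") or "1"  # assume read 1 if there is no /1 or /2
    return "@{ins}:{run}:{fc}:{lane}:{tile}:{x}:{y} {read}:N:0:{idx}".format(
        ins=m.group("instrument"), run=run, fc=flowcell,
        lane=m.group("lane"), tile=m.group("tile"),
        x=m.group("x"), y=m.group("y"), read=read, idx=index,
    )


def convert_fastq(lines, **placeholders):
    """Yield FASTQ lines with every 4th line (the header) rewritten.

    Assumes strict 4-line records (no wrapped sequence or quality lines).
    """
    for i, line in enumerate(lines):
        if i % 4 == 0:
            yield convert_header(line, **placeholders) + "\n"
        else:
            yield line
```

To use it, open the input and output files and pass the input handle through convert_fastq, e.g. out.writelines(convert_fastq(inp)). The "is filtered" flag is hard-coded to N and the control number to 0 here, which is also an assumption, so adjust them if your data says otherwise.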