Combine two paired-end FASTQ files into one for submission to NCBI
6.6 years ago
kk.mahsa ▴ 140

Hi everyone,

How can I combine the two FASTQ files from a paired-end sequencing run into one FASTQ file for submission to NCBI?

I tried the cat command, but its output is not what I need.

I want a single FASTQ file that follows the pattern below:

@HWUSI-EAS529:41:FC62YHFAAXX:8:1:7969:1330 1:N:0:GCCAAT
CTACGAAAGGGCACTTGGAATTCTCGGGTGCCAAGGAACTCCAGTCACGCCAATATCTCGTATGCCGTCTTCTGCT
+
IIIIIIIHIIHIIIIIIHHIIIHGIIIIEIIIIIIEIIHIIIIIIIIIIIHIIIIIBHIHIIHGIGIEGHHEGEEH


@HWUSI-EAS529:41:FC62YHFAAXX:8:1:7969:1330 2:N:0:GCCAAT
AGTGCCCTTTCGTAGGATCGTCGGACTGTAGAACTCTGAACGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAA
+
IIIIIIIIIIIIIIIIIIIIIIIDHIGIIIHIIIGHGIIIIIIIHHIHIIIIIIIIIHIIIIIIIIHIIGIIIIHI
6.6 years ago
GenoMax 143k

Use reformat.sh from the BBMap suite to interleave the R1/R2 files:

reformat.sh in1=file_R1.fq.gz in2=file_R2.fq.gz out=interleaved.fq.gz
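
If it helps to see what interleaving means, here is a minimal Python sketch of the same idea. It is not how reformat.sh works internally; it is only an illustration. The function names (`open_text`, `read_records`, `interleave`) are made up for this example, and it assumes strict four-line records with no blank lines and Casava 1.8+ style headers like the ones in the question, where both mates share the text before the first space.

```python
import gzip
from itertools import zip_longest


def open_text(path, mode="rt"):
    """Open a plain or gzip-compressed file in text mode (hypothetical helper)."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode)
    return open(path, mode)


def read_records(handle):
    """Yield FASTQ records as lists of 4 lines, assuming strict 4-line records."""
    while True:
        lines = [handle.readline() for _ in range(4)]
        if not lines[0]:
            return  # clean end of file
        if not lines[3]:
            raise ValueError("truncated FASTQ record: " + lines[0].strip())
        yield [line.rstrip("\r\n") for line in lines]


def interleave(r1_path, r2_path, out_path):
    """Write R1 and R2 records alternately; return the number of pairs written."""
    pairs = 0
    with open_text(r1_path) as r1, open_text(r2_path) as r2, \
            open_text(out_path, "wt") as out:
        for rec1, rec2 in zip_longest(read_records(r1), read_records(r2)):
            if rec1 is None or rec2 is None:
                raise ValueError("R1 and R2 contain different numbers of reads")
            # Assumption: mates share the header text before the first space.
            name1 = rec1[0].split(" ", 1)[0]
            name2 = rec2[0].split(" ", 1)[0]
            if name1 != name2:
                raise ValueError(f"read names differ: {name1} vs {name2}")
            out.write("\n".join(rec1 + rec2) + "\n")
            pairs += 1
    return pairs
```

With the file names from the command above, the call would look like `interleave("file_R1.fq.gz", "file_R2.fq.gz", "interleaved.fq.gz")`. The sketch stops with an error as soon as the two files disagree, which seems safer than writing a mispaired file.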

The run was killed by errors like these:

java.lang.AssertionError: 
Error in /media/dr/MyBook/SRA_for_NCBI/HI_R1.fastq, line 468541011, with these 4 lines:
@HWI-ST916:209:D2FKVACXX:2:2101:5128:87558 1:N:4:
AAAAGETTTAAACPCAGAGTCGCTGCCTTCCCAACTCGTTCATCTTCTCAGCTGCCTTTTCCAGTTGAGTTCAGGGCTCTATCCAGATGTGCACCAATAC
/

CCCFFFBFHHHHHJJJIJIIHJJJJJJJJJJJJJJJJJJJJJJIJIJIJICHGJJJIJJJIIJGHIIJIHGJJIJIHHHHHFFFFFFEEEEEEDDDDDDD


java.lang.AssertionError: 
Error in /media/dr/MyBook/SRA_for_NCBI/HI.R1.fastq, line 3838143, with these 4 lines:
@HWI-ST916>209:D2FKVACXX:1:1101:13731:47390 1:N:0:AAATGAGAATAAAAATGGGGAAAACCAAATCPGTTATCATTGCCTGCACCAGGAAPGGGAGACPGGCCTAGAAGAGAGGGGTTCAAAGCCACTGGAGTAG
+
=??BDDDDDFFHHIIIJIIIHIGIGGIJJGGEGCHGGGIGIHIIGIGHIDHCGHCFCGHGGHIII?HEEHCHFFFBEDCDDBBDDDDDDBCCAC@C849>
@HWI-ST916:209:D2FKVACXX:1:1101:13269:47390 1:N:0:

How can I fix the FASTQ files?


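What have you done to these files before getting to this point? It appears that they have been edited or tampered with in some way: the records above contain letters such as E and P in the sequence line, a "/" where the "+" separator should be, and a header with the sequence fused onto it. You could try to run fastqValidate to verify your FASTQ files.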
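
As a rough idea of what such a check does, here is a small Python sketch that scans a file for malformed records. It is not the validator mentioned above, only an illustration. The names `check_record` and `find_bad_records` are hypothetical, and the sketch assumes strict four-line records and that only A, C, G, T and N are valid bases (real data may legitimately contain other IUPAC codes).

```python
import gzip

# Assumption: only these bases are accepted; adjust if IUPAC codes are expected.
VALID_BASES = set("ACGTN")


def open_text(path):
    """Open a plain or gzip-compressed file for reading as text."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def check_record(header, seq, sep, qual):
    """Return a list of problems found in one 4-line FASTQ record."""
    problems = []
    if not header.startswith("@"):
        problems.append("header does not start with '@'")
    if not seq:
        problems.append("empty sequence")
    elif not set(seq.upper()) <= VALID_BASES:
        problems.append("sequence contains characters other than A, C, G, T, N")
    if not sep.startswith("+"):
        problems.append("separator line does not start with '+'")
    if len(qual) != len(seq):
        problems.append("sequence and quality lengths differ")
    return problems


def find_bad_records(path, limit=10):
    """Return up to `limit` (first_line_number, problems) tuples."""
    bad = []
    with open_text(path) as handle:
        line_no = 1
        while len(bad) < limit:
            lines = [handle.readline() for _ in range(4)]
            if not lines[0]:
                break  # end of file
            record = [line.rstrip("\r\n") for line in lines]
            problems = check_record(*record)
            if problems:
                bad.append((line_no, problems))
            line_no += 4
    return bad
```

Note that once a record has an extra or missing line, every record after it is read out of frame, so the first reported line number is the one worth looking at.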


I only renamed them by adding .fastq.

My files are in FASTQ format (four lines per read), but their names did not end in .fq or .fastq, so I renamed them to add .fastq, then ran reformat.sh on them and got the problem above.

I combined my files successfully with the cat command. Can I submit the output of cat to SRA at NCBI?


Is it possible that the files were damaged by renaming?


Just renaming the files should not have caused this problem. They may have been corrupted when you downloaded them from your sequence provider.

cat'ing the files together does not interleave the reads (which is what you showed in the original post). If you use plain cat R1.fq R2.fq > new.fq, what you will have is the entire contents of R1 followed by the entire contents of R2. If that is acceptable to SRA, then you could do it that way.
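
To tell the two layouts apart, a quick check along these lines may help. This is a sketch only: `parse_header` and `is_interleaved` are hypothetical names, and it assumes four-line records, no trailing blank lines, and headers of the form `@name 1:N:0:INDEX` as in the original post.

```python
import gzip


def open_text(path):
    """Open a plain or gzip-compressed file for reading as text."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def parse_header(header):
    """Split '@name 1:N:0:INDEX' into (name, mate), where mate is '1' or '2'."""
    name, _, comment = header.rstrip("\r\n").partition(" ")
    return name, comment.split(":", 1)[0]


def headers(handle):
    """Yield the header line (every 4th line, starting at the first)."""
    for i, line in enumerate(handle):
        if i % 4 == 0:
            yield line


def is_interleaved(path, max_pairs=1000):
    """Return True if the first `max_pairs` pairs alternate mate 1, mate 2
    with matching read names."""
    with open_text(path) as handle:
        it = headers(handle)
        checked = 0
        for first in it:
            second = next(it, None)
            if second is None:
                return False  # odd number of records
            name1, mate1 = parse_header(first)
            name2, mate2 = parse_header(second)
            if name1 != name2 or (mate1, mate2) != ("1", "2"):
                return False
            checked += 1
            if checked >= max_pairs:
                break
        return checked > 0
```

A file produced by plain cat fails this check on the first pair, because the second record is another R1 read rather than the mate of the first.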


Thanks, GenoMax, for your kind advice.
