Hi,
How can I merge two paired-end FASTQ files (R and L) into a single FASTQ file? For context, the sequencing run is 72 bp long and contains mostly small RNA (miRNA, ...), so many of the paired-end reads will overlap.
For example, here are two paired reads:
@HWUSI-EAS529:41:FC62YHFAAXX:8:1:7969:1330 1:N:0:GCCAAT
CTACGAAAGGGCACTTGGAATTCTCGGGTGCCAAGGAACTCCAGTCACGCCAATATCTCGTATGCCGTCTTCTGCT
+
IIIIIIIHIIHIIIIIIHHIIIHGIIIIEIIIIIIEIIHIIIIIIIIIIIHIIIIIBHIHIIHGIGIEGHHEGEEH
@HWUSI-EAS529:41:FC62YHFAAXX:8:1:7969:1330 2:N:0:GCCAAT
AGTGCCCTTTCGTAGGATCGTCGGACTGTAGAACTCTGAACGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAA
+
IIIIIIIIIIIIIIIIIIIIIIIDHIGIIIHIIIGHGIIIIIIIHHIHIIIIIIIIIHIIIIIIIIHIIGIIIIHI
I can find the adapter in the first read:
Code:
EMBOSS_001  1 CTACGAAAGGGCACTTGGAATTCTCGGGTGCCAAGGAACTCCAGTCACGC 50
                             |||||||||||||||||||||
EMBOSS_001  1 ---------------TGGAATTCTCGGGTGCCAAGG-------------- 21

EMBOSS_001 51 CAATATCTCGTATGCCGTCTTCTGCT 76
EMBOSS_001 22 -------------------------- 21
but not in the second one.
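The adapter lookup in the first read can also be done with a plain exact-match search. Here is a minimal Python sketch, assuming the adapter is exactly the 21-nt sequence from the alignment and occurs without mismatches. `trim_adapter` is a hypothetical helper written for this example, not a function from any library.

```python
# 3' adapter sequence as it appears in the alignment (assumed exact, no mismatches).
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"

def trim_adapter(seq, qual, adapter=ADAPTER):
    """Cut a read and its quality string at the first exact adapter hit.

    The read is returned unchanged when the adapter is not found.
    """
    pos = seq.find(adapter)
    if pos == -1:
        return seq, qual
    return seq[:pos], qual[:pos]

# Read 1 from the example pair.
read1 = "CTACGAAAGGGCACTTGGAATTCTCGGGTGCCAAGGAACTCCAGTCACGCCAATATCTCGTATGCCGTCTTCTGCT"
qual1 = "IIIIIIIHIIHIIIIIIHHIIIHGIIIIEIIIIIIEIIHIIIIIIIIIIIHIIIIIBHIHIIHGIGIEGHHEGEEH"

insert, insert_qual = trim_adapter(read1, qual1)
print(insert, len(insert))
```

An exact search like this misses adapters that contain sequencing errors or are cut off at the end of the read, so it is only a starting point.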
However, I did find the overlap between the two reads by using the reverse complement of the second read:
EMBOSS_001  1 -------------------------------------------------- 0
EMBOSS_001  1 TTTTTTAATGATACGGCGACCACCGAGATCTACACGTTCAGAGTTCTACA 50

EMBOSS_001  1 -----------CTACGAAAGGGCACTTGGAATTCTCGGGTGCCAAGGAAC 39
                         |||||||||||||||
EMBOSS_001 51 GTCCGACGATCCTACGAAAGGGCACT------------------------ 76

EMBOSS_001 40 TCCAGTCACGCCAATATCTCGTATGCCGTCTTCTGCT 76
EMBOSS_001 77 ------------------------------------- 76
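The overlap shown in this alignment can be reproduced with a short Python sketch. It assumes exact matching only and that the insert is shorter than the read length, so the end of the reverse-complemented second read lines up with the start of the first read. `reverse_complement`, `overlap_length` and the `min_overlap` value of 10 are illustrative choices for this example, not taken from any tool.

```python
# Complement table for the four bases plus N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def overlap_length(read1, read2, min_overlap=10):
    """Longest exact match between the start of read1 and the end of
    the reverse complement of read2. Returns 0 if none is found."""
    rc2 = reverse_complement(read2)
    # Try the longest possible overlap first, then shrink.
    for k in range(min(len(read1), len(rc2)), min_overlap - 1, -1):
        if rc2.endswith(read1[:k]):
            return k
    return 0

read1 = "CTACGAAAGGGCACTTGGAATTCTCGGGTGCCAAGGAACTCCAGTCACGCCAATATCTCGTATGCCGTCTTCTGCT"
read2 = "AGTGCCCTTTCGTAGGATCGTCGGACTGTAGAACTCTGAACGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAA"

k = overlap_length(read1, read2)
merged = read1[:k]  # bases covered by both reads, i.e. the insert
print(k, merged)
```

With longer inserts the geometry is different (the end of the first read overlaps the start of the reverse-complemented second read), and real data needs mismatch tolerance and quality-aware consensus, which this sketch does not attempt.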
So my question is, how can I merge the two fastq files to produce a single fastq file?
Thanks,
N.
Hi, I see that you had a similar case to mine, so perhaps you can help me :)
I always do miRNA analysis with single-end reads, so I'm confused about how to proceed with paired-end data. Can you recommend how to clean the reads and get them ready for analysis? In particular, I don't understand how the reverse-complement miRNA sequence in the R2 read relates to the R1 set.
In summary, my R1 read is 100 nt and contains: miRNA + barcode + small RNA adapter + another adapter + polyA.
My R2 read contains: miRNA (reverse complement of R1) + long adapter (or linker) + polyA.
Thanks in advance for any help!
Hi,
I would like an exact answer to this problem.
The reads in R1.fq are the forward reads and the reads in R2.fq are written in reverse-complement. So if I want to create a single file from the reads in R1.fq and R2.fq, do I have to reverse-complement the reads in R2.fq?
Am I right?
Thank you
No response to this problem?
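For what it's worth, the example pair at the top of the thread only overlaps once the second read is reverse-complemented, which supports that reading of R2. Here is a minimal Python sketch of that step, assuming plain four-line FASTQ records. `read_fastq` and `reverse_complement_fastq` are hypothetical helpers written for this example. Note that the quality string has to be reversed along with the sequence so each score stays attached to its base.

```python
import io

# Complement table; lower-case bases are handled as well.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ handle."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual

def reverse_complement_fastq(src, dst):
    """Write every record of src to dst with the sequence
    reverse-complemented and the quality string reversed."""
    for header, seq, qual in read_fastq(src):
        rc_seq = seq.translate(_COMPLEMENT)[::-1]
        dst.write(f"{header}\n{rc_seq}\n+\n{qual[::-1]}\n")

# Read 2 of the example pair, held in memory instead of a real R2.fq file.
r2_header = "@HWUSI-EAS529:41:FC62YHFAAXX:8:1:7969:1330 2:N:0:GCCAAT"
r2_seq = "AGTGCCCTTTCGTAGGATCGTCGGACTGTAGAACTCTGAACGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAA"
r2_qual = "IIIIIIIIIIIIIIIIIIIIIIIDHIGIIIHIIIGHGIIIIIIIHHIHIIIIIIIIIHIIIIIIIIHIIGIIIIHI"

r2 = io.StringIO(f"{r2_header}\n{r2_seq}\n+\n{r2_qual}\n")
out = io.StringIO()
reverse_complement_fastq(r2, out)
print(out.getvalue())
```

For real files, the `io.StringIO` objects would be replaced by handles from `open(...)`. Whether the reverse-complemented reads should then be concatenated with R1 or merged pair by pair depends on the downstream analysis.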