Question

Paired-end and mate-pair read coding


8.1 years ago

kamiljaron ▴ 220

Hello,

I recently received quite a big dataset of three paired-end libraries (insert sizes 350, 550 and 700) and two mate-pair libraries (Nextera, insert sizes 3000 and 5000). I started with Trimmomatic, which produced really nicely trimmed paired-end libraries. This procedure failed badly for the mate-pair libraries because of the way mate-pair sequencing works. So I tried the fairly recent and very nice-looking NxTrim, which does exactly what I need (cutting Nextera adapters and sorting reads), but it outputs all MP and PE reads in one file. I assume that the mate pairing is encoded in the read headers, but I am wondering how, and how to sort the reads into R1 and R2 files respectively. I still need to trim sequencing adapters from the sorted mate-pair libraries, and Trimmomatic expects paired reads in separate input files.

So, does anybody know how exactly paired reads are recognised?
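To make the question concrete, here is a minimal sketch of how a mate number could be read from a FASTQ header. It assumes the two common Illumina naming conventions (an older style where the read name ends in `/1` or `/2`, and the Casava 1.8+ style where the part after the first space starts with `1:` or `2:`). Whether NxTrim output follows either convention is an assumption I have not verified, and `split_mate` is a hypothetical helper name.

```python
def split_mate(header):
    """Return (fragment_name, mate) from a FASTQ header line.

    mate is 1 or 2, or None when no recognised mate marker is present.
    Only two common Illumina conventions are handled (an assumption).
    """
    header = header.strip()
    if header.startswith("@"):
        header = header[1:]
    # Split the read name from the optional comment after the first space.
    name, _, comment = header.partition(" ")
    # Older Illumina style: the name itself ends with /1 or /2.
    if name.endswith("/1") or name.endswith("/2"):
        return name[:-2], int(name[-1])
    # Casava 1.8+ style: the comment starts with "1:" or "2:".
    if comment[:2] in ("1:", "2:"):
        return name, int(comment[0])
    return name, None
```

Two reads would then be mates when they share the same fragment name and carry mate numbers 1 and 2.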

sequencing Assembly sequence next-gen • 2.6k views



I found that these merged FASTQ files are called interleaved FASTQ (I came across the term on the web page of another trimmer, AdapterRemoval). Using this term, I googled a bash script which probably solves my problem, but it does not really answer the question of how the pairing is encoded. Is it just the order of the sequences? The bash script simply sorts odd reads into the R1 file and even reads into the R2 file...
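For illustration, the odd/even splitting that such a script performs could be sketched in Python roughly as follows. This is not the script I found, only a sketch under the assumption that every record takes exactly four lines (no wrapped sequences) and that mates are strictly alternating; `read_records` and `deinterleave` are hypothetical names.

```python
from itertools import islice


def read_records(handle):
    """Yield FASTQ records as lists of 4 lines (assumes unwrapped records)."""
    while True:
        record = list(islice(handle, 4))
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        yield record


def deinterleave(interleaved, out_r1, out_r2):
    """Write odd records to out_r1 and even records to out_r2.

    Pairing is taken purely from record order; headers are not inspected.
    Returns the number of pairs written.
    """
    records = read_records(interleaved)
    pairs = 0
    for first in records:
        # The record right after each odd record is treated as its mate.
        second = next(records, None)
        if second is None:
            raise ValueError("odd number of records; last read has no mate")
        out_r1.writelines(first)
        out_r2.writelines(second)
        pairs += 1
    return pairs
```

The three arguments are open text file handles, for example the interleaved file opened for reading and two new files opened for writing. Because the split relies only on order, it gives wrong pairs if any single read was dropped upstream without its mate.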

8.1 years ago by kamiljaron ▴ 220


Maybe you can give us some read headers from the MP and PE files so we can see how the difference is encoded!

8.1 years ago by Phil S. ▴ 700