Synchronization Of Paired-End Reads
12.2 years ago
Plantae ▴ 390

After adapter trimming and quality filtering, I found that some reads were too short (e.g. 10 bp),
so I deleted all reads shorter than 25 bp.
However, after this step, some reads are missing from one of the PE libraries
(e.g. read1 is present in the 5' library but missing from the 3' library).
I tried to re-sync the PE libraries, but I could not find an efficient method.
I wrote a Perl script to do this, using a hash table to store the reads from each library, but the memory requirement is too high.
Does anyone have a memory-efficient method for this job?

12.2 years ago
Frenkiboy ▴ 250

Have you tried using Trimmomatic? It takes care of the read pairing.

12.2 years ago
brentp 24k

Usually, this is done by the trimmer. However, given your memory constraints, you could use a Bloom filter (in Perl) backed by a hash.

There's an example of how you'd do this here.

But the simplest way is to use a trimmer that does this for you, such as Sickle or, as @frenkiboy suggests, Trimmomatic.
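
If you would rather stay with a script, here is a minimal Python sketch of a lower-memory re-sync. It is not the Bloom filter approach, just a simpler alternative: it keeps only read IDs in memory instead of whole records, and streams each file from disk. The names `read_id`, `fastq_records` and `resync` are made up for this example. It assumes uncompressed 4-line FASTQ records, mate headers that match once a trailing `/1` or `/2` and any comment after whitespace are dropped, unique read IDs, and that both files are still in the same relative order as the original run.

```python
from itertools import islice


def read_id(header):
    """Return the mate-independent part of a FASTQ header line."""
    name = header[1:].split()[0]        # drop '@' and any comment field
    if name.endswith(("/1", "/2")):     # old-style mate suffix (assumed)
        name = name[:-2]
    return name


def fastq_records(handle):
    """Yield 4-line FASTQ records as lists of lines."""
    while True:
        record = list(islice(handle, 4))
        if len(record) < 4:
            return
        yield record


def resync(fq1, fq2, out1, out2):
    """Write only the reads present in both files; return the pair count."""
    # Pass 1: remember the IDs seen in file 1 (IDs only, no sequences).
    with open(fq1) as fh:
        ids1 = {read_id(rec[0]) for rec in fastq_records(fh)}

    # Pass 2: keep file-2 reads whose mate exists, and note the shared IDs.
    shared = set()
    with open(fq2) as fh, open(out2, "w") as out:
        for rec in fastq_records(fh):
            rid = read_id(rec[0])
            if rid in ids1:
                shared.add(rid)
                out.writelines(rec)
    del ids1  # no longer needed; free the memory before the last pass

    # Pass 3: keep file-1 reads whose mate survived in file 2.
    with open(fq1) as fh, open(out1, "w") as out:
        for rec in fastq_records(fh):
            if read_id(rec[0]) in shared:
                out.writelines(rec)
    return len(shared)
```

Calling `resync("r1.fq", "r2.fq", "r1.sync.fq", "r2.sync.fq")` (hypothetical file names) writes the two synchronized files. Memory still grows with the number of reads, so for very large runs a Bloom filter or a pairing-aware trimmer is probably the better choice.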
