Aligning Mate Pair Data
12.9 years ago
Abhi ★ 1.6k

Hi Guys

I have a long-insert Illumina mate pair library, and I am unsure which aligner would be a good choice for this kind of data. Do I need to reverse complement the reads before using BWA, and if so, why? I do not understand the rationale for converting mate pair (<--- --->) reads to PE (---> <---) orientation, or how this affects BWA mapping.
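To make the conversion concrete: going from the outward-facing (<--- --->) layout to the inward-facing (---> <---) layout amounts to reverse complementing each read and reversing its quality string. Below is a minimal Python sketch of that step only. The function names are made up for illustration, and whether the conversion is needed at all depends on which orientation your aligner expects.

```python
# Illustrative sketch only: function names are hypothetical, not from any tool.
# Translation table for DNA bases; N stays N, case is preserved.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def revcomp_fastq_record(name, seq, qual):
    """Reverse complement one FASTQ record.

    The quality string is reversed so that each score stays attached
    to the base it was called for.
    """
    return name, reverse_complement(seq), qual[::-1]
```

Applying `revcomp_fastq_record` to every record in both mate files flips each read, which turns an outward-facing pair into an inward-facing one without changing where the reads map.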

Could you share your experience with handling mate pair data?

Thanks! -Abhi

illumina paired next-gen sequencing • 6.0k views

I also wanted to raise the issue of multi-reads. Since we expect the genome to contain many repeats, which aligner natively supports multi-mapping reads and can report up to 20 locations where a read maps?

12.8 years ago

There are two ways to deal with PE mapping. You can map both sets of reads independently (with BWA, for instance) and then apply a manual filter that discards all non-paired reads, or you can use an aligner that takes the pairing information into account (novoalign, for instance). The second option does the filtering for you, so it is less error-prone, although the first is the most widely used approach as far as I know.
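As a rough illustration of the manual filter in the first option, here is a small Python sketch. It assumes the two independent alignments have already been parsed into dictionaries keyed by read-pair name. That layout, the function name `keep_paired`, and the `require_same_chrom` switch are assumptions made for the example, not part of BWA or any other tool.

```python
# Illustrative sketch only: the data layout and names are hypothetical.
def keep_paired(mate1_hits, mate2_hits, require_same_chrom=True):
    """Keep only read pairs where both mates aligned.

    mate1_hits / mate2_hits: dict mapping a read-pair name to a
    (chrom, pos, strand) tuple, containing only reads that aligned.
    require_same_chrom: an extra, assumed criterion that also drops
    pairs whose mates landed on different chromosomes.
    """
    pairs = {}
    for name, hit1 in mate1_hits.items():
        hit2 = mate2_hits.get(name)
        if hit2 is None:
            continue  # mate 2 did not align: discard the pair
        if require_same_chrom and hit1[0] != hit2[0]:
            continue  # mates on different chromosomes
        pairs[name] = (hit1, hit2)
    return pairs
```

In practice you would probably also filter on distance and orientation, but with a variable insert size those thresholds have to be chosen per library.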


@Jorge: You are right that for this problem it is worth trying the pairing manually. I am just wondering whether any aligner could do it for me, given that the insert size of the library is variable.


Although I haven't used it, I've read that novoalign does in fact support this step.

12.9 years ago

We have used novoalign with success and there is a free version (though licensing is not terribly expensive). It deals natively with mate pair libraries and can also appropriately treat the paired-end contamination inherent in such libraries.

http://www.novocraft.com/


@Sean: OK, great, I will try this. Any thoughts on why there is inherent PE contamination in mate pair libraries? I don't understand that part.


The ligation step could fail, or the biotin-based enrichment might not be foolproof. Someone probably knows the exact causes better than I do, but there is some contamination, on the order of 10-15% in some libraries I have seen.
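If you want a rough estimate of the contamination in your own library after mapping, one approach is to look at the orientation and distance of each aligned pair: paired-end contaminants tend to show up as inward-facing pairs (---> <---) that sit close together, while true mate pairs face outward (<--- --->) and sit far apart. The Python sketch below assumes both mates are on the same chromosome and that you have their leftmost positions and strands. The function names and the 1000 bp cutoff are arbitrary choices for illustration and should be tuned to your library.

```python
# Illustrative sketch only: names and the default cutoff are hypothetical.
def classify_pair(pos1, strand1, pos2, strand2):
    """Classify a same-chromosome pair by relative orientation.

    Returns "inward" (---> <---), "outward" (<--- --->) or "same_strand".
    """
    # Order the two mates by leftmost position on the reference.
    (_, left_strand), (_, right_strand) = sorted([(pos1, strand1), (pos2, strand2)])
    if left_strand == right_strand:
        return "same_strand"
    return "inward" if left_strand == "+" else "outward"


def contamination_fraction(pairs, max_pe_span=1000):
    """Fraction of pairs that look like paired-end contaminants.

    pairs: iterable of (pos1, strand1, pos2, strand2) tuples.
    A pair counts as a contaminant if it is inward-facing and its
    mates start within max_pe_span bases of each other (assumed cutoff).
    """
    pairs = list(pairs)
    if not pairs:
        return 0.0
    contaminants = sum(
        1
        for p1, s1, p2, s2 in pairs
        if classify_pair(p1, s1, p2, s2) == "inward" and abs(p2 - p1) <= max_pe_span
    )
    return contaminants / len(pairs)
```

Run this on the pairs before any reverse complementing, otherwise the two orientation labels are swapped.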
