Disparity between + and - strand read counts in Pindel long insertion identification
9.6 years ago

I'm testing Pindel's detection of long insertions using a modified bacterial genome sequence as the reference and real Illumina paired-end reads from that bacterium. To test Pindel's long insertion capability, I'm deleting 3000 bases from one of the contigs in the genomic reference. Because the reads now contain sequence that is missing from the modified reference, the missing sequence should appear to Pindel as an insertion.
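For anyone wanting to reproduce the setup, the reference modification described above can be sketched roughly as follows. This is a minimal illustration, not the exact script used: the function names, the 0-based `start` coordinate, and the convention that the contig ID is the first word of the FASTA header are all assumptions.

```python
def delete_region(sequence, start, length):
    """Return `sequence` with `length` bases removed, beginning at 0-based `start`."""
    if start < 0 or length < 0 or start + length > len(sequence):
        raise ValueError("deletion falls outside the sequence")
    return sequence[:start] + sequence[start + length:]


def read_fasta(path):
    """Read a FASTA file into a list of (header, sequence) tuples."""
    records = []
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if name is not None:
                    records.append((name, "".join(chunks)))
                name, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def write_fasta(records, path, width=60):
    """Write (header, sequence) tuples as FASTA, wrapping sequence lines at `width`."""
    with open(path, "w") as handle:
        for name, seq in records:
            handle.write(">" + name + "\n")
            for i in range(0, len(seq), width):
                handle.write(seq[i:i + width] + "\n")


def delete_from_contig(records, contig_id, start, length=3000):
    """Delete `length` bases from the contig whose header starts with `contig_id`.

    Assumes the contig ID is the first whitespace-separated word of the header.
    All other contigs are returned unchanged.
    """
    modified = []
    for name, seq in records:
        if name.split()[:1] == [contig_id]:
            seq = delete_region(seq, start, length)
        modified.append((name, seq))
    return modified
```

The modified records would then be written out with `write_fasta` and used as the reference for mapping and for Pindel, so that the deleted 3000 bases are present only in the reads.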

The current version of Pindel (master branch, downloaded from GitHub on August 26, 2014) finds the long insertion at nearly the correct position, but the number of + reads is much larger than the number of - reads (449 to 4). That doesn't seem right.

In contrast, a simulated inversion gives a nearly even split: 967 + reads to 978 - reads.
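To put a number on how lopsided the two cases are, one option is an exact two-sided binomial test on the supporting read counts. This is only a sketch under a stated assumption: that, absent any bias, a supporting read is equally likely to be reported on either strand. Whether that holds for Pindel's long insertion calls is exactly what is in question. The function name is hypothetical.

```python
from math import comb


def strand_balance_pvalue(plus, minus):
    """Exact two-sided binomial p-value for a 50/50 strand split.

    Assumes each supporting read lands on the + or - strand with
    probability 0.5. The p-value is twice the smaller tail, capped at 1.
    """
    n = plus + minus
    if n == 0:
        raise ValueError("no supporting reads")
    smaller = min(plus, minus)
    # Sum of binomial coefficients for the tail at or below the smaller count.
    tail = sum(comb(n, k) for k in range(smaller + 1))
    return min(1.0, 2 * tail / 2 ** n)


# Counts reported in this post.
print("long insertion:", strand_balance_pvalue(449, 4))
print("inversion:     ", strand_balance_pvalue(967, 978))
```

A very small p-value for the long insertion alongside an unremarkable one for the inversion would indicate that the imbalance is not sampling noise, though it would not by itself say whether the cause lies in the reads, the mapping, or how Pindel counts support for long insertions.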

Anyone have any idea what's going on?

Pindel • 1.8k views