Aligning Illumina mate-pair / long mate-pair libraries using BBMap
4.6 years ago
bio_d ▴ 20

Hi,

I am trying to align Illumina mate-pair (5.2 kb) and long mate-pair (10 kb) libraries to an assembly of contigs (obtained from CLC Workbench) for a reptilian genome (genome size comparable to human). I am using BBMap for the alignment step, after which I hope to extract the insert size distribution (average insert size and standard deviation) to create input files for ALLPATHS-LG and carry out a de novo assembly.

Code snippet:

Is this the correct way to align Illumina mate-pair (5.2 kb) / long mate-pair (10 kb) libraries?

No matter which combination of the flags rcs=t/f and rcomp=t/f I use, the standard error file shows "Processing reads in paired-ended mode.". I can't work out how to use the flags correctly: the UsageGuide suggests using requirecorrectstrand=f (rcs=f) and rcomp=t for long mate-pair libraries. However, since I am getting a mean insert size of 2611.48 when I am expecting something around 5200 bp (predicted 5.2 kb Illumina mate-pair library from the sequencing center), it is most likely that the flags are being overridden by the default values, rcs=t and rcomp=f (which is why I presume BBMap is processing the reads in paired-end mode).

Best, D

bbmap alignment • 1.3k views

I am getting mean insert size of 2611.48 when I am expecting something around 5200bp (predicted 5.2Kb Illumina mate pair library from sequencing center)

If you are interested in the full range, plot the histogram of insert sizes (ihist=<file>). Generally there will be a range of inserts, and a mean of 2.6 kb is what you have in that library. This is likely more accurate than the sequencing center's prediction, since it is based on actual alignments.
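
If you want the mean and standard deviation straight from the histogram file, something like the sketch below may help. It assumes the ihist output is plain text with header lines starting with `#` followed by two whitespace-separated columns (insert size, count); check your own file first, since that layout is my assumption here. The function name `insert_size_stats` is just made up for illustration.

```python
import math


def insert_size_stats(lines):
    """Return (mean, stdev) of insert sizes from histogram rows.

    Assumes each data row is "<insert_size> <count>" and that header or
    comment lines start with '#'. The stdev is the population standard
    deviation, weighted by the pair counts.
    """
    total = 0.0
    weighted_sum = 0.0
    weighted_sq_sum = 0.0
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split()
        size = int(fields[0])
        count = float(fields[1])
        total += count
        weighted_sum += size * count
        weighted_sq_sum += size * size * count
    if total == 0:
        raise ValueError("histogram contains no pairs")
    mean = weighted_sum / total
    # Guard against tiny negative values caused by floating-point rounding.
    variance = max(weighted_sq_sum / total - mean * mean, 0.0)
    return mean, math.sqrt(variance)


# Usage with a real file (the file name is hypothetical):
#   with open("ihist.txt") as handle:
#       mean, stdev = insert_size_stats(handle)
```

Keep in mind that a single mean can hide more than one peak in the distribution, so it is worth looking at the plotted histogram before feeding the numbers into ALLPATHS-LG.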

BTW: The processing message may be just an oversight in programming. @Brian will confirm.


Tagging: Brian Bushnell