Join mapped, overlapping, paired-end reads
6.0 years ago
Mr. Dave ▴ 50

I'd like to combine my paired-end reads that have already been mapped by a PE reference aligner. Is there an existing tool for stitching overlapping, mapped paired-end reads (presumably from SAM to SAM)?

I've taken a look at COPE, PEAR, and FLASH, but it seems that none of these will merge a SAM/BAM. I'm looking at ABySS right now, but I'm not confident that any of these are built for merging non-FASTQ input.

It seems like my only options are either to (1) stitch my FASTQ pairs prior to mapping or (2) parse the SAM fields and do the stitching myself. I'd like to keep working with a validated alignment pipeline, so I'd rather not switch the pipeline from paired-end reads to single, stitched reads.
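
For option 2, here is a rough sketch of what stitching from the SAM fields might look like. It relies on the fact that SAM stores SEQ and QUAL on the reference forward strand, so both mates can be laid out by reference position without reverse-complementing. It is only an illustration under simplifying assumptions: it handles M/=/X, S, and H CIGAR operations and refuses anything else (indels, skips), it keeps the higher-quality base where the mates disagree, and it pads any gap between non-overlapping mates with N. The names `ref_aligned_bases` and `stitch_pair` are hypothetical, not part of any existing tool.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def ref_aligned_bases(pos, cigar, seq, qual):
    """Map 1-based reference position -> (base, quality char) for one read."""
    bases = {}
    ref, query = pos, 0
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "M=X":  # consumes both reference and query
            for i in range(length):
                bases[ref + i] = (seq[query + i], qual[query + i])
            ref += length
            query += length
        elif op == "S":  # soft clip: bases are in SEQ but not aligned
            query += length
        elif op == "H":  # hard clip: bases are absent from SEQ
            continue
        else:  # I, D, N, P are outside the scope of this sketch
            raise ValueError(f"CIGAR operation {op!r} is not handled")
    return bases


def stitch_pair(mate1, mate2):
    """Merge two mates given as (pos, cigar, seq, qual) tuples.

    Returns (leftmost reference position, merged sequence, merged quality).
    """
    a = ref_aligned_bases(*mate1)
    b = ref_aligned_bases(*mate2)
    start = min(min(a), min(b))
    end = max(max(a), max(b))
    out_seq, out_qual = [], []
    for p in range(start, end + 1):
        candidates = [m[p] for m in (a, b) if p in m]
        if candidates:
            # Phred+33 characters sort in the same order as their scores.
            base, q = max(candidates, key=lambda bq: bq[1])
        else:
            base, q = "N", "!"  # gap between non-overlapping mates
        out_seq.append(base)
        out_qual.append(q)
    return start, "".join(out_seq), "".join(out_qual)
```

For example, `stitch_pair((100, "5M", "ACGTA", "IIIII"), (103, "5M", "TTCGG", "##III"))` lays both mates out over reference positions 100 to 107 and resolves the disagreement at position 104 in favour of the higher-quality base from the first mate.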

Miseq Postprocessing Paired-end • 3.2k views

Most read-merging programs expect FASTQ files as input, since people generally merge reads before aligning. You can always convert your BAM back to FASTQ and then do the read merging. Do you know for sure that these reads overlap?


All but the shortest reads overlap. I think extracting the reads of interest from the BAM, converting them to FASTQ, and merging will be the most straightforward approach. I was hoping to rely on the SAM alignment to merge the reads, but realistically the self-aligned merge is going to be fine for the regions I've targeted.


Why would you like to do that? Maybe there are other ways to reach your goal.

6.0 years ago
mark.rose ▴ 50

Are you looking to merge only reads from a pair or are you looking to derive a consensus from many aligned read pairs?

If the latter (I'm not sure what you would want the former for), there is always the old standby:

samtools mpileup -uf reference.fa alignment.bam | bcftools view -cg - | vcfutils.pl vcf2fq

Just from the pair, unfortunately, but thank you for the mpileup suggestion.

I'd like to do a follow-up alignment for only a subset of reads, selected by their mapping positions. The initial PE alignment is fairly conventional but follows a validated workflow that I won't be changing. My follow-up alignment works best with SE reads. I'm currently aligning the mates of each pair as separate SE reads, but I think my results will be much better if I join them.

If I'm not able to merge pairs within a BAM, I guess I'll extract the subset of reads in the SAM by position, convert them to FASTA (or extract them by QNAME from the FASTQ), join them by self/reference alignment, and then start the follow-up alignment. My hope was that stitching pairs within the SAM and then exporting to FASTA would be simpler.
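
The extraction step of that fallback could be sketched roughly as below. This is only an illustration, assuming plain-text SAM input and selection by the POS of either mate. The function names `sam_to_fastq_record` and `select_pairs` are hypothetical. Reads flagged as reverse-strand (0x10) are reverse-complemented so that the FASTQ carries them in their original orientation, which is what FASTQ-based mergers expect, and secondary or supplementary records (0x100, 0x800) are skipped.

```python
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def sam_to_fastq_record(line):
    """Convert one SAM alignment line to a four-line FASTQ record."""
    f = line.rstrip("\n").split("\t")
    qname, flag, seq, qual = f[0], int(f[1]), f[9], f[10]
    if flag & 0x10:  # mapped to the reverse strand: restore read orientation
        seq = seq.translate(COMPLEMENT)[::-1]
        qual = qual[::-1]
    # 0x40 marks the first mate of a pair, 0x80 the second
    mate = "/1" if flag & 0x40 else "/2" if flag & 0x80 else ""
    return f"@{qname}{mate}\n{seq}\n+\n{qual}\n"


def select_pairs(sam_lines, rname, start, end):
    """Return FASTQ records for both mates of every pair in which at
    least one mate has its POS on `rname` within [start, end]."""
    records = [l for l in sam_lines if not l.startswith("@")]  # drop header
    wanted = set()
    for l in records:
        f = l.split("\t")
        if f[2] == rname and start <= int(f[3]) <= end:
            wanted.add(f[0])
    out = []
    for l in records:
        f = l.split("\t")
        # 0x900 = secondary (0x100) or supplementary (0x800) alignment
        if f[0] in wanted and not int(f[1]) & 0x900:
            out.append(sam_to_fastq_record(l))
    return out
```

Reading the SAM with `open("subset.sam")` (a placeholder filename) and passing the file object as `sam_lines` would work, since the function only iterates over lines. For BAM input or large files, `samtools view` with a region followed by `samtools fastq` covers the same ground.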


In case your reads are non-overlapping, BBMap has an option to use a reference to merge them. You can find that thread here. You would have to start with FASTQ files, though.
