An alternative to haplotype phasing - assembly?
0
2
Entering edit mode
6 months ago
Jess ▴ 60

Hi everyone,

I am having many problems with some haplotype phasing. I have tried using variant calling followed by whatshap polyphase but it does not reliably get the correct answer. I think this is due to some issues with the fact that my PCR clearly has bias so my variant ratios are not 50:50 so when I say my ploidy is 2, it messes up.

My region of interest is quite small and as is shown in the screenshot below, I have two clear types present and yet I am struggling to find the right tools to split them up. Ultimately, I want a multi-fasta with those two types, or, as will be the case for other samples, up to 5 types might be present.

IGV example of my region of interest: enter image description here

My most recent idea has been to extract those reads from my BAM and to use a stringent assembly tool to form what I hoped would be two contigs of 275bp with each type. I used the following commands:

samtools fasta -F 4 aligned_reads.bam > bam.fasta

spades.py --only-assembler -s bam.fasta -o spades_output -k 55,77,99 --careful

This has resulted in contigs which do not specifically span my region of interest, although some of the contigs do contain parts of my region of interest. I figured something is going wrong here because my extracted reads are whole reads, not reads clipped to my reference sequence.

I have two questions:

1) Does anyone have a better idea how I can 'extract' the two haplotypes which are clearly visible in my BAM? 2) If no better ideas, does anyone know how I can export those reads as a fasta but keep the soft-clipping so that no reads contain sequence outside of my 275bp window?

Thank you in advance!

samtools assembly BAM spades phasing • 455 views
ADD COMMENT

Login before adding your answer.

Traffic: 3090 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6