Creating a personalized genome assembly for read simulation from Illumina WGS data

8 hours ago

scsc185 ▴ 80

I have whole-genome Illumina sequencing data for an individual, and I would like to generate a sample-specific genome that I can later use for read simulation. After some discussion with ChatGPT, I've outlined the following workflow:

1. Align reads and call variants against a reference genome.
2. Phase variants to distinguish maternal and paternal haplotypes.
3. Generate consensus FASTA sequences for each haplotype.
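
To make the last step concrete, here is a minimal Python sketch of what I understand "generate a consensus per haplotype" to mean: walk along the reference and substitute the ALT allele wherever the phased genotype carries it on that haplotype. The function name `haplotype_sequence`, the tuple layout of `variants`, and the toy data are all made up for illustration. In practice I assume a dedicated tool such as `bcftools consensus` would do this on a real phased VCF. The choice to skip unphased heterozygous calls is also just an assumption on my part.

```python
def haplotype_sequence(reference, variants, hap_index):
    """Apply phased variants to a reference string for one haplotype.

    variants: iterable of (pos_1based, ref, alt, gt), with gt like "0|1".
    hap_index: 0 for the allele left of "|", 1 for the allele right of it.
    Only biallelic records are handled in this sketch.
    """
    pieces = []
    cursor = 0  # 0-based position in the reference not yet copied
    for pos, ref, alt, gt in sorted(variants):
        alleles = gt.replace("/", "|").split("|")
        phased = "|" in gt
        # Assumption: unphased heterozygous calls are left as reference,
        # because we cannot tell which haplotype carries them.
        if not phased and len(set(alleles)) > 1:
            continue
        if alleles[hap_index] != "1":
            continue
        start = pos - 1
        if start < cursor:
            raise ValueError(f"variant at {pos} overlaps a previous one")
        if reference[start:start + len(ref)] != ref:
            raise ValueError(f"REF allele mismatch at {pos}")
        pieces.append(reference[cursor:start])
        pieces.append(alt)
        cursor = start + len(ref)
    pieces.append(reference[cursor:])
    return "".join(pieces)


# Toy example (hypothetical data, not from my sample)
reference = "ACGTACGTAC"
variants = [
    (2, "C", "T", "0|1"),   # SNV on the second haplotype
    (5, "AC", "A", "1|0"),  # 1 bp deletion on the first haplotype
    (7, "G", "C", "0/1"),   # unphased het, skipped in this sketch
    (9, "A", "G", "1/1"),   # homozygous ALT, applied to both haplotypes
]

hap1 = haplotype_sequence(reference, variants, 0)
hap2 = haplotype_sequence(reference, variants, 1)
print(hap1)
print(hap2)
```

Note that indels shift coordinates, so the two haplotype FASTAs will no longer share the reference coordinate system. That matters if simulated reads need to be traced back to reference positions.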

My goal is to end up with two FASTA files (one per haplotype) that approximate this individual's genome, and then use them to simulate Illumina reads. I am not familiar with this type of workflow, so I am wondering whether anyone has done something similar and could sanity-check the steps above. Any suggestions on best practices or improvements are appreciated.

illumina consensus wgs phasing simulation • 60 views
