Difference between genome assembly and genome sequence alignment to a reference to find structural variants 1kgp SGDP
3.8 years ago
m4r1n4 • 0

Hello,

I'm trying to determine the differences between, and the relative benefits of, genome assembly and genome sequence alignment when trying to identify structural variants or transposons in populations.
I've been scouring the internet but have only really come across comparisons of short vs long reads and of de novo vs reference-based assembly.

My understanding is that there are two main comparative genomic approaches to identifying structural variation within a population. The first is what the 1KGP and SGDP did: sequence the whole genome, align the reads to the reference genome, and end up with a BAM file.

The second is to assemble personal genomes and then compare or align the assemblies to each other and to the reference genome, for example using Lastz/LiftOver/ChainNets. Example: 10.1016/j.gene.2005.09.031

Thanks in advance.

Assembly alignment transposon genome • 1.3k views
3.8 years ago
shimbalama ▴ 10

Hi,

Your question is a little unclear, but I think I understand. You're talking about assembly vs mapping for SV detection, right? To be clear, you can't do genome sequence alignments, except maybe for something as small as a virus like SARS-CoV-2 (COVID-19). I was further confused because a third option exists: doing an MSA of reads that have been identified as chimeric (i.e., that map to an SV breakpoint). People use mapping, MSA and assembly (whole-genome or local) to understand SVs, and all are valid. The differences, and especially the benefits, come down to the specifics of the various algorithms. Generally, though, the main benefit of assembly is that it is reference-free, i.e., there is no a priori bias, which allows for the identification of novel SVs. However, I often map assemblies to a reference anyway (unless it is completely novel sequence), as it's useful to describe an SV in terms of known genomic loci.
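
As a rough illustration of the mapping route, here is a minimal Python sketch of how SV evidence can be read off a single alignment record. It is not how any particular caller works. The function name `sv_signals` and the 50 bp cutoff are assumptions made for this example; the only inputs are the SAM POS and CIGAR fields. Large `D`/`I` operations suggest deletions or insertions relative to the reference, and long clips mark reads that stop aligning at a possible breakpoint (the chimeric reads mentioned above).

```python
import re

# CIGAR operations as defined in the SAM specification.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def sv_signals(pos, cigar, min_len=50):
    """Return candidate SV signals from one alignment (hypothetical helper).

    pos     -- 1-based leftmost reference position (SAM POS field)
    cigar   -- CIGAR string of the alignment
    min_len -- assumed minimum event size; 50 bp is a common convention
    """
    signals = []
    ref_pos = pos  # reference coordinate of the next operation
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op == "D" and length >= min_len:
            # Bases present in the reference but missing from the read.
            signals.append(("DEL", ref_pos, length))
        elif op == "I" and length >= min_len:
            # Bases present in the read but missing from the reference.
            signals.append(("INS", ref_pos, length))
        elif op in "SH" and length >= min_len:
            # Clipped bases: the read stops matching here, a possible breakpoint.
            signals.append(("CLIP", ref_pos, length))
        if op in "MDN=X":
            # Only these operations consume reference bases.
            ref_pos += length
    return signals


if __name__ == "__main__":
    for pos, cigar in [(1000, "100M60D100M"), (500, "120M80S"), (10, "150M")]:
        print(pos, cigar, sv_signals(pos, cigar))
```

Note that an insertion longer than the read itself never shows up as an `I` operation in short-read data, only as clipped or unmapped reads, which is one concrete form of the reference bias described above.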

I hope that helps.


Thank you for your response,

Can you expand on why a reference creates an a priori bias when trying to identify novel SVs? I'm finding it difficult to grasp why a genome assembly is needed when more influential projects like the 1KGP, the Human Genome Diversity Project and the Simons Genome Diversity Project aren't generating assemblies and are just mapping reads to the reference.
