Question: alignment issue after assembling sequences
4.6 years ago by
andynkili10 wrote:

I'm having trouble understanding what it means to get overlapping HSPs after a BLAST search.

The thing is: I have a sequence from NGS (Illumina MiSeq), assembled with SPAdes 3.6. When I BLAST it, I get overlapping HSPs. The contig is indeed a bit longer (by around a hundred nucleotides) than the BLAST hit, but since the HSPs overlap, shouldn't they have been merged during assembly? I mean, is this an assembly issue (or error), or something else?

sequencing alignment assembly • 978 views
4.6 years ago by
h.mon29k wrote:

This is probably a real duplication in your genome, not an error. It does not show up in the BLAST output because:

1) it is not present in the related sequences in the BLAST database, either because the database sequences are misassembled or because it is a new duplication

2) the BLAST parameters may be preventing it from appearing in the output

You can map your reads to the assembled genome and inspect this region in IGV or another genome browser to get a feel for whether it is an assembly artifact or a real structural variation.


I thought it could be a misassembly because it happened often (18 out of 25 contigs), and in genomes that don't have much diversity (which I think rules out a real duplication).

I may have forgotten to mention that I'm working on a single viral family with circular genomes. What baffles me most is that for 18 out of 25 contigs (each the size of an entire genome from this family), the corresponding overlapping HSPs appear back to front relative to the reference genomes (the BLAST hits): the end of the contig aligns with the beginning of the reference genome, and the start of the contig aligns with the end of the reference genome.
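To picture that back-to-front pattern: if a contig is the same circular genome as the reference but linearised at a different position, a local aligner would be expected to report two HSPs, with the contig's start matching the reference's end and the contig's end matching the reference's start. Below is a minimal Python sketch with toy sequences. The helper `rotation_offset` and both sequences are made up for illustration, and it only handles exact matches on the same strand (real data would need an aligner).

```python
def rotation_offset(contig: str, reference: str):
    """Return k such that contig == reference[k:] + reference[:k], or None.

    Toy exact-match check; assumes both sequences are on the same strand.
    """
    if len(contig) != len(reference):
        return None
    # Any rotation of a circular sequence is a substring of the doubled sequence.
    idx = (reference + reference).find(contig)
    return idx if idx != -1 else None


# Hypothetical 16-nt "genome", and the same circle cut open at position 10.
reference = "ATGCCGTTAGCAAGTC"
contig = reference[10:] + reference[:10]

k = rotation_offset(contig, reference)
split = len(contig) - k

# First "HSP": contig start matches the reference end.
print(contig[:split] == reference[k:])
# Second "HSP": contig end matches the reference start.
print(contig[split:] == reference[:k])
```

If the contig also carries a short repeat at its ends, the two HSPs would additionally overlap on the reference, which might be what is being observed here.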

At first glance I thought it was all about circularity (I checked the order of the ORFs in the contigs against the original genomes, and it was the same, if I'm making myself clear). But I removed the part of the contig that makes the genome circular (the largest repeat, located at the extremities of the contig), and after re-assembling I still get this "overlapping back-to-front HSPs" situation.
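For reference, the repeat at the contig extremities can be located with a simple suffix/prefix comparison. This is only a sketch under the assumption that the repeat is exact; `terminal_overlap` is a hypothetical helper and the sequence is a toy example, not real data.

```python
def terminal_overlap(seq: str, min_len: int = 10) -> int:
    """Length of the longest exact match between the end and the start of seq.

    Only overlaps between min_len and half the sequence length are considered;
    returns 0 if none is found.
    """
    for n in range(len(seq) // 2, min_len - 1, -1):
        if seq[-n:] == seq[:n]:
            return n
    return 0


# Hypothetical 24-nt circular genome; the "contig" repeats its first 12 nt at the end.
genome = "ATGCCGTTAGCAAGTCGGATCCTA"
contig = genome + genome[:12]

n = terminal_overlap(contig)
trimmed = contig[:-n] if n else contig
print(n, trimmed == genome)
```

Trimming one copy of the repeat this way yields a single unit-length genome, but it does not change where the circle was cut open, so two back-to-front HSPs against a reference that starts elsewhere would still be expected.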

I really don't know what to make of this! I'm still torn between misassemblies and simply a circularity issue, which is why I wanted to know more about the meaning of those overlapping HSPs.


modified 4.6 years ago • written 4.6 years ago by andynkili10

It seems you could design a pair of primers to settle this question: have the primers flank the offending region, and design them so that the expected PCR product sizes (with and without the duplication) are easy to tell apart on the gel.
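The expected product sizes are simple arithmetic. Here is a small Python sketch, assuming the primer coordinates are taken on the contig (which contains the putative duplicated copy); the function name and all numbers are hypothetical.

```python
def amplicon_sizes(fwd_start: int, rev_end: int, dup_len: int):
    """Expected PCR product sizes for primers flanking a putative duplication.

    fwd_start / rev_end: 1-based, inclusive outer coordinates of the forward
    and reverse primers on the contig. dup_len: length of the extra copy.
    Returns (size_with_duplication, size_without_duplication).
    """
    with_dup = rev_end - fwd_start + 1
    without_dup = with_dup - dup_len
    return with_dup, without_dup


# Hypothetical example: primers spanning positions 2400..2899, 100-nt duplication.
print(amplicon_sizes(2400, 2899, 100))  # (500, 400)
```

Placing the primers close to the region keeps the relative size difference large, which makes the two bands easier to separate.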

written 4.6 years ago by h.mon29k