alignment issue after assembling sequences
8.7 years ago
andynkili ▴ 10

I'm having trouble understanding what overlapping HSPs mean in BLAST results.

Here is the situation: I have a sequence from NGS (Illumina MiSeq), assembled with SPAdes 3.6. When I BLAST it, I get overlapping HSPs. The contig is indeed a bit longer (by around a hundred nucleotides) than the BLAST hit, but since the HSPs overlap, shouldn't they have been merged during assembly? In other words, is this an assembly issue (or error), or something else?
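To make "overlapping HSPs" concrete, here is a minimal Python sketch that reads BLAST tabular output (`-outfmt 6` with the default twelve columns) and reports HSP pairs that overlap on either the query or the subject coordinates. The function names and the two example lines are made up for illustration; the example coordinates are not from the actual data in this post.

```python
from itertools import combinations

def parse_hsps(tabular_text):
    """Parse BLAST -outfmt 6 lines (default columns) into simple HSP records."""
    hsps = []
    for line in tabular_text.strip().splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Default columns 7-10 are qstart, qend, sstart, send (1-based, inclusive).
        qstart, qend, sstart, send = (int(x) for x in fields[6:10])
        hsps.append({
            "query": fields[0],
            "subject": fields[1],
            # Store intervals as (low, high) so minus-strand hits compare correctly.
            "q": (min(qstart, qend), max(qstart, qend)),
            "s": (min(sstart, send), max(sstart, send)),
        })
    return hsps

def overlap_length(a, b):
    """Number of positions shared by two inclusive intervals (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)

def overlapping_pairs(hsps, axis="q"):
    """Yield (hsp_a, hsp_b, n) for HSPs of the same query/subject pair that
    share n positions on the chosen axis ('q' = query, 's' = subject)."""
    for a, b in combinations(hsps, 2):
        if (a["query"], a["subject"]) != (b["query"], b["subject"]):
            continue
        n = overlap_length(a[axis], b[axis])
        if n > 0:
            yield a, b, n

# Hypothetical example: a 2900 nt contig against a 2800 nt reference.
example = (
    "contig1\tref1\t99.0\t1500\t10\t0\t1\t1500\t1301\t2800\t0.0\t2700\n"
    "contig1\tref1\t99.0\t1400\t10\t0\t1501\t2900\t1\t1400\t0.0\t2500"
)
hsps = parse_hsps(example)
for a, b, n in overlapping_pairs(hsps, axis="s"):
    print(f"subject overlap of {n} nt between {a['s']} and {b['s']}")
```

Checking both axes is useful because an overlap on the subject only (as in this made-up example) and an overlap on the query only point to different explanations.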

alignment Assembly sequencing • 1.7k views
8.7 years ago
h.mon 35k

This is probably a real duplication in your genome, not an error. It may not show up in the BLAST output because:

  1. it is not present in the related sequences in the BLAST database, either due to misassemblies in the database sequences or because it is a new duplication; or
  2. the BLAST parameters prevent it from showing up in the output.

You can map your reads to the assembled genome and inspect this region in IGV or another genome browser to get a feel for whether it is an assembly artifact or a real structural variation.


I thought it could be a misassembly because this happened often (18 out of 25 contigs), and in genomes that don't have much diversity (which I think should rule out a real duplication).

I may have forgotten to mention that I'm working on a single viral family with circular genomes. What baffles me most is that for 18 out of 25 contigs (each the size of an entire genome from this family), the corresponding overlapping HSPs appear back to front relative to the reference genomes (BLAST hits): the end of the contig aligns with the beginning of the reference genome, and the start of the contig aligns with the end of the reference genome.

At first glance I thought it was all about circularity, since I verified that the order of the ORFs in the contigs is the same as in the original genomes (I don't know if I'm making myself clear). But I removed the part of the contig that makes the genome circular (the largest repeat, located at the extremities of the contig), and after re-assembling I still get this "overlapping back-to-front HSPs" situation.
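For what it's worth, a circular genome that is linearized at a different position than the reference aligns as two blocks in swapped order even after the terminal repeat is removed, because trimming the repeat does not change where the contig starts. The Python sketch below illustrates this on a toy sequence: it trims an exact terminal repeat and then rotates the contig to start at the same position as the reference. The helper names and the 20 nt sequence are invented for the illustration, it assumes exact matches, and it ignores the reverse-complement case, so it only shows the idea and is not a diagnosis of these contigs.

```python
def terminal_overlap(seq, min_len=10):
    """Length of the longest suffix of seq that exactly equals its prefix,
    or 0 if no such overlap of at least min_len exists."""
    for n in range(len(seq) // 2, min_len - 1, -1):
        if seq[:n] == seq[-n:]:
            return n
    return 0

def rotate(seq, offset):
    """Rotate a circular sequence so that position `offset` becomes the start."""
    return seq[offset:] + seq[:offset]

def rotate_to_anchor(circular_seq, anchor):
    """Rotate circular_seq so that it starts with `anchor` (e.g. the first
    bases of the reference). Also searches across the linearization junction."""
    doubled = circular_seq + circular_seq[: len(anchor) - 1]
    pos = doubled.find(anchor)
    if pos == -1:
        raise ValueError("anchor not found; it may be on the other strand")
    return rotate(circular_seq, pos)

# Toy example (made-up sequence): the "contig" is the reference opened at
# position 8, with the first 5 bases repeated at its end.
reference = "ATGACCGTTAGCTCAAGGCT"
opened = rotate(reference, 8)
contig = opened + opened[:5]

n = terminal_overlap(contig, min_len=4)        # length of the terminal repeat
trimmed = contig[:-n] if n else contig         # one full genome unit
realigned = rotate_to_anchor(trimmed, reference[:6])

print(n, realigned == reference)
```

If a rotated contig then aligns to the reference as a single HSP, the back-to-front pattern was just a matter of the linearization point; if it still splits, that would argue for looking at the region more closely.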

I really don't know what to think about this! I'm still torn between misassemblies and just a circularity issue, which is why I wanted to know more about the meaning of those overlapping HSPs.


It seems you could design a pair of primers to settle this question: have the primers flank the offending region, and position them so that the expected PCR product sizes (with and without the duplication) are easy to tell apart on the gel.
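As a rough illustration of the size arithmetic, the Python sketch below computes the two expected product sizes from coordinates on the assembled contig. The function name and all coordinates are hypothetical, and it assumes the contig as assembled contains the duplicated copy between the two primers.

```python
def expected_amplicons(fwd_start, rev_end, dup_start, dup_end):
    """Expected PCR product sizes for primers flanking a suspected duplication.

    All coordinates are 1-based and inclusive, on the assembled contig:
      fwd_start          first base covered by the forward primer
      rev_end            last base covered by the reverse primer
      dup_start/dup_end  the suspected extra copy lying between the primers
    """
    if not (fwd_start < dup_start <= dup_end < rev_end):
        raise ValueError("primers must flank the suspected duplicated copy")
    with_dup = rev_end - fwd_start + 1      # product if the duplication is real
    dup_len = dup_end - dup_start + 1
    return {
        "with_duplication": with_dup,
        "without_duplication": with_dup - dup_len,  # product if it is an artifact
    }

# Hypothetical coordinates: a 100 nt extra copy inside a 600 nt primer span.
print(expected_amplicons(fwd_start=2500, rev_end=3099, dup_start=2750, dup_end=2849))
```

The closer the primers sit to the suspected copy, the larger the relative size difference between the two products, which makes the bands easier to distinguish.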
