Streptococcus genome alignments
7.5 years ago
skbrimer ▴ 740

Greetings Hive Brain,

I am having issues mapping reads of Streptococcus suis to the RefSeq genome. When I use it as a reference, only about 30% of the reads map to it. When I do a de novo build I get the expected number of base pairs, and when I BLAST the contigs they come back as Strep suis.

I know that Streptococcus pneumoniae has a lot of internal rearrangement, and I suspect that S. suis does as well. Has anyone had any experience with either organism and would be willing to give me any advice on assembly?

I'm using Ion Torrent single-end data; the average fragment size is 280 bp.

Thanks,

Sean

Assembly mapping • 2.1k views

What happens if you map your reads against your de novo assembly?


Pretty much the same thing that I get when I just map the reads. If I use BWA-MEM I get a lot of hard clipping. I think I need to play with the stringency. I was also going to try either the PacBio or the Nanopore settings.
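
To put a number on "a lot of clipping", here is a minimal Python sketch that computes the clipped fraction of a read from its CIGAR string. The function name `clipped_fraction` is made up for illustration; the operator letters are the standard SAM ones, and the original read length is taken as the sum of the M, I, S, =, X and H operations.

```python
import re

# One CIGAR operation: a length followed by a SAM operator letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def clipped_fraction(cigar):
    """Fraction of the original read that was soft (S) or hard (H) clipped."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Operations that account for bases of the original read.
    read_len = sum(n for n, op in ops if op in "MIS=XH")
    clipped = sum(n for n, op in ops if op in "SH")
    # An unavailable CIGAR ("*") yields no operations; report 0.0 for it.
    return clipped / read_len if read_len else 0.0

# Hypothetical 280-base read with the first 100 bases hard clipped.
example = clipped_fraction("100H180M")
```

Averaging this over all alignments in a SAM file gives a rough feel for how much of each read the aligner is discarding.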


By "pretty much the same thing", do you mean that only 30% of the reads map to the de novo assembly?


Oh. Excellent question, I didn't look at that. What I meant by "basically the same" is that when I look at the mapping in IGV I get the same areas of coverage. I will look at the total mapping and get back to you. We currently don't have power due to a storm, but when it comes back I will look.
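
For the total mapping, a minimal Python sketch of the calculation: it reads SAM text lines and reports the fraction of reads with the unmapped flag (0x4) unset. The function name and the toy records are made up for illustration. Secondary (0x100) and supplementary (0x800) records are skipped so that each read is counted once, which may differ from a summary that counts every alignment record.

```python
def mapped_fraction(sam_lines):
    """Fraction of reads that are mapped, counting each read once."""
    total = mapped = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x100 or flag & 0x800:
            continue  # secondary or supplementary alignment
        total += 1
        if not flag & 0x4:
            mapped += 1
    return mapped / total if total else 0.0

# Toy records holding only the read name and FLAG columns (hypothetical data).
toy = ["@HD\tVN:1.6", "r1\t0", "r2\t4", "r3\t16", "r4\t4", "r1\t2048"]
rate = mapped_fraction(toy)
```

Running this on the reads mapped to the reference and again on the reads mapped to the de novo contigs gives two directly comparable numbers.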


While you are at it, align the assembly you have to the reference using Mauve. That should give you an idea of what your assembly looks like at the genome scale compared to the reference.


I will download it and give it a try.


The Mauve alignment shows lots of gene shuffling. This was a great idea! Is it possible to use this alignment to order the contigs?


Yes, it is possible to do that.
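
The basic idea behind reference-guided ordering can be sketched in a few lines of Python. This is only an illustration, not how any particular tool does it: it assumes you already have, for each contig, the reference coordinate where its best alignment starts and the strand it aligns to. The contig names and coordinates below are hypothetical.

```python
def order_contigs(placements):
    """Sort contigs by where they align on the reference.

    placements: list of (contig_name, ref_start, strand) tuples.
    Returns (contig_name, strand) pairs in reference order; contigs on
    the "-" strand would need reverse-complementing before joining.
    """
    ordered = sorted(placements, key=lambda p: p[1])
    return [(name, strand) for name, _start, strand in ordered]

# Hypothetical placements taken from a whole-genome alignment.
placements = [
    ("NODE_3", 1_200_000, "-"),
    ("NODE_1", 15_000, "+"),
    ("NODE_2", 640_000, "+"),
]
layout = order_contigs(placements)
```

With heavy rearrangement, a single contig can align to several distant reference regions, so an ordering like this is a starting point rather than a finished scaffold.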


Thank you for the link!


Also, and I suspect unsurprisingly (yay, circular genomes), it looks like the start of the genome and the end of the genome are in one contig of the de novo build.
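
If that contig has to be laid out against the linear reference, one simple option is to cut it at the point where the alignment jumps from the end of the reference to the start. A minimal Python sketch follows; the breakpoint is assumed to come from the alignment, and the function name and toy sequence are made up for illustration.

```python
def split_at_origin(contig, breakpoint):
    """Split a contig that spans the reference's start/end junction.

    Returns (end_piece, start_piece): the first piece aligns to the end
    of the linear reference, the second to its start.
    """
    if not 0 < breakpoint < len(contig):
        raise ValueError("breakpoint must fall strictly inside the contig")
    return contig[:breakpoint], contig[breakpoint:]

# Toy sequence: T's stand in for the reference end, A's for its start.
end_piece, start_piece = split_at_origin("TTTTAAAA", 4)
```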


HAHA! The power is back on!

According to samtools flagstat, 82% of the contigs map back to the assembly; however, they get clipped heavily, and the output in IGV matches the short-read mapping closely.

Top: short reads. Bottom: contigs.


Also, because of the library, I know that the de novo mapping should leave gaps due to the repeat areas, but I am getting large dropouts. It's more like I have a really bad reference. However, I have been reading other Strep suis papers, and they all seem to map to this reference.


Might it be that you are using a mapper that is tuned for Illumina rather than Ion Torrent data? You did not mention which mapper you used. If you have enough coverage (~60X), then the de novo assembly should be better, provided you have performed pre-processing or used an assembler that handles Ion Torrent data, like in this case.
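
To check against the ~60X figure, expected depth can be estimated as total sequenced bases divided by genome size. A minimal Python sketch follows; the read count and the 2 Mb genome size are placeholder assumptions rather than values from this thread, and the 280 bp average fragment size is used as a stand-in for mean read length.

```python
def estimated_depth(n_reads, mean_read_len, genome_size):
    """Expected average depth: total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * mean_read_len / genome_size

# Placeholder inputs (assumptions, not data from the thread).
depth = estimated_depth(n_reads=430_000, mean_read_len=280, genome_size=2_000_000)
```

If only a fraction of the reads map, the usable depth on the reference shrinks by roughly that fraction.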


Sorry, I am using BWA as the mapper and SPAdes as the de novo assembler. TMAP is a variant of BWA, and I use both. I have found that they give only slightly different builds now that the read quality of Ion Torrent has improved. For most builds I use BWA over TMAP, mostly because of the speed. There are some cases where the default BWA settings outperform the default TMAP settings, mostly in viral assemblies; however, if you adjust the min-seed-length flag in TMAP from 11 to 19, they perform exactly the same.
