Question

Identify specific sequences in genome from Illumina Nextgen data

7.8 years ago

vaheesan.r • 0

Hi everyone, I am trying to analyse whole-genome sequences of Pseudomonas strains generated on an Illumina next-generation sequencing platform. My aim is to identify insertions of a specific family of transposons (around 10 kb) in the genome and retrieve them. I have access to CLC Genomics Workbench. Can someone suggest a method to find these sequences? P.S. I have tried to assemble the sequences, but I could only obtain contigs smaller than 100 kb. Thank you.

Assembly genome alignment illumina sequencing • 1.4k views

score 1 · Answer 1 · 2016-06-23

7.8 years ago

GenoMax 141k

If you are limited to using CLC, then you should contact CLC tech support for assistance. Assuming your libraries are of good quality and you have enough coverage, CLC should be able to assemble the genome into larger contigs than what you got. Have you tried aligning the data to an already available reference?

If you are open to using other command-line tools, then give SPAdes a try for the assembly. Once you have the genome assembled, Mauve would be useful for looking at genome-wide similarities.
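To judge whether a new assembly actually improves on the sub-100 kb contigs mentioned in the question, it can help to summarise contig lengths. The following is a minimal sketch in plain Python, not part of any tool named above; the function names `read_fasta_lengths` and `n50` and the file name in the comment are illustrative assumptions.

```python
def read_fasta_lengths(lines):
    """Return {contig_name: length} from an iterable of FASTA lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the contig name.
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def n50(lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty assembly


# Toy input. With real data, pass an open file handle instead,
# e.g. open("contigs.fasta") (hypothetical path to your assembler's output).
toy = [">c1", "ACGTACGT", ">c2", "ACGTAC", ">c3", "ACGTAC"]
contig_lengths = read_fasta_lengths(toy)
print("longest contig:", max(contig_lengths.values()))
print("N50:", n50(contig_lengths.values()))
```

Comparing the longest contig and N50 between assemblers (or between runs with different settings) gives a quick, if rough, indication of whether the assembly is becoming less fragmented.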

Hi, thanks for your reply. I have used Velvet de novo assembly and SPAdes via the Illumina BaseSpace platform. I have also used A5-miseq and Mauve, but unfortunately I could only obtain contigs smaller than 100 kb. Finally, I started using CLC Genomics Workbench, and aligning the data to a reference did not help either. As I mentioned earlier, my aim is not to sequence the genome but to identify the specific transposons. The contigs I obtained provide only partial sequences of the transposons.

ADD REPLY • link 7.8 years ago by vaheesan.r • 0
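Since the stated goal is locating the transposons rather than finishing the genome, one possible way to see which contigs carry transposon fragments is to screen them for k-mers shared with a reference transposon sequence. The sketch below assumes such a reference sequence is available (the thread does not say so), uses toy sequences, and introduces hypothetical helper names (`kmers`, `screen_contigs`). A real analysis would more likely use BLAST or a read mapper; this only illustrates the idea.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string (other characters are left as-is)."""
    return seq.translate(COMPLEMENT)[::-1]


def kmers(seq, k):
    """Set of all k-length substrings of seq, upper-cased."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def screen_contigs(contigs, transposon, k=21):
    """Return {contig_name: fraction of transposon k-mers present in the contig}.

    Both strands of each contig are checked. Contigs sharing no k-mer
    with the transposon are omitted from the result.
    """
    target = kmers(transposon, k)
    if not target:
        return {}
    hits = {}
    for name, seq in contigs.items():
        found = kmers(seq, k) | kmers(reverse_complement(seq), k)
        shared = len(target & found)
        if shared:
            hits[name] = shared / len(target)
    return hits


# Toy data: a made-up 20 bp "transposon" and three made-up contigs.
transposon = "ATGCGTACGTTAGCCTAGGA"
contigs = {
    # carries the first 12 bases of the transposon on the forward strand
    "contig_a": "CCCCC" + transposon[:12] + "GGGGG",
    # carries the last 12 bases of the transposon on the reverse strand
    "contig_b": "TT" + reverse_complement(transposon[8:]) + "TT",
    # unrelated sequence
    "contig_c": "AAAAAAAAAAAAAAAA",
}
hits = screen_contigs(contigs, transposon, k=5)
for name, fraction in sorted(hits.items()):
    print(name, round(fraction, 2))
```

Contigs that each cover only part of the transposon, as in this toy case, are consistent with the element being split across contig ends, which is a common outcome when a repeated sequence is longer than the read or insert length.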

If you have done all that, then it is possible that your libraries are of less than optimal quality. Of course, without access to your data and results it would be hard for anyone on this forum to reach a definite conclusion or provide advice.

You could try the CLC tech support route and/or consult someone locally to get a second opinion about your data. If things don't look good, then you may need to redo the experiment/libraries.