Question: How To Proceed After De Novo Contig Assembly Of Bacterial Sequence Data
kanwarjag950 wrote:

I am relatively new to de novo assembly, so please bear with me if my question is very basic or appears naive. I am working on MiSeq data (2x150 bp) from a bacterial strain. I used Velvet (k-mer 51) and it gave me contigs that look like this:

>NODE_253250_length_121_cov_1.338843
TGCCTGCTCTTCTGCTTTTCTACCATGTTATGATGCAGTATGAACGCCCTTGCCAGAAGCTGCTGC
>NODE_253255_length_105_cov_1.000000
TGGAAGCCCCACTCTCAGTATTGACGTGCAAGTTCACAGTCTGGTTCCTGCCCCCGCGGT------

I also have a reference genome for the bacterium. Now I want to pinpoint in which samples the bacterium is present. Based on my literature reading, since the genome is small, I performed de novo assembly. However, how do I find out from the above contigs which ones are the best and most useful, and show that the bacterium is present? What parameters should I use, contig length or something else, to decide which contigs are most useful? If I use the BLAST pairwise alignment against the reference, it takes a while and returns a "Bad Gateway" error message, perhaps because the contig file is large (69523word). Any suggestions or pointers would be highly appreciated.

assembly miseq denovo • 2.7k views
modified 6.3 years ago by Lee Katz • written 6.3 years ago by kanwarjag
2

It's troubling that some of the contigs in this assembly have a coverage of 1. You should try out VelvetOptimiser if you are comfortable on the command line. Alternatively, you should remove these low-coverage contigs (maybe anything with coverage < 10 and length < 150).
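For the filtering idea above, here is a minimal sketch in Python. It assumes the contig headers follow the Velvet style `NODE_<id>_length_<len>_cov_<cov>`, and it reads the thresholds as "keep only contigs with coverage >= 10 and at least 150 bases", which is one interpretation of the suggestion. The function names and the use of the actual sequence length (rather than the length field in the header, which Velvet reports in k-mers) are choices made for illustration only.

```python
import re

# Assumed Velvet-style header: NODE_<id>_length_<len>_cov_<cov>
HEADER_RE = re.compile(r"NODE_(\d+)_length_(\d+)_cov_([\d.]+)")


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_contigs(lines, min_cov=10.0, min_len=150):
    """Keep contigs with header coverage >= min_cov and sequence length >= min_len."""
    kept = []
    for header, seq in read_fasta(lines):
        match = HEADER_RE.search(header)
        if match is None:
            continue  # header is not in the assumed Velvet style; skip it
        cov = float(match.group(3))
        # Use the real sequence length in bases, not the k-mer length in the header.
        if cov >= min_cov and len(seq) >= min_len:
            kept.append((header, seq))
    return kept
```

To use it on a file, something like `filter_contigs(open("contigs.fa"))` should work, where `contigs.fa` is a placeholder for your Velvet output. The thresholds are only starting points and should be adjusted to your data.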

modified 6.3 years ago • written 6.3 years ago by Lee Katz
1

Can you be a little clearer about what your research question is? Are you sequencing a pure culture of an unknown bacterium (why are you asking "now I want to pin point in which sample bacteria is present or not")?

Sounds like you are asking two questions here: one about your methodology and another about your problem with BLAST. I think you need to clearly define your methodology and research question first. Second, we can try to figure out why you are having a BLAST error.

modified 6.3 years ago • written 6.3 years ago by Josh Herr
1

No, it is one basic question: I ran sequencing on samples and want to see whether a particular bacterium is absent or present. I have done de novo assembly of the sequence data to generate contigs. How should I handle these contigs to identify which ones are significant and correspond to the reference bacterium?

written 6.3 years ago by kanwarjag
Lee Katz2.9k wrote:

I think that you'd just

  1. Make a blast database of your assembly
  2. Create a FASTA file of the genes that you are looking for (maybe from a reference genome, or just something similar)
  3. BLAST against your database with your genes.

The assembly is done. This is a question of presence or absence, which you can get from BLAST. You should do this on a command line and not the pairwise BLAST web page.
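As a rough sketch of those three steps: the two shell commands in the comments below use standard BLAST+ options, but the file names (`contigs.fa`, `genes.fasta`, `contigs_db`, `hits.tsv`) are placeholders, and the e-value, identity, and alignment-length cutoffs are assumptions to tune for your data. The Python function (a hypothetical helper, not part of BLAST) then reads the tabular output (`-outfmt 6`) and reports which query genes have at least one hit passing the cutoffs.

```python
# Step 1 and 3 on the command line (file names are placeholders):
#   makeblastdb -in contigs.fa -dbtype nucl -out contigs_db
#   blastn -query genes.fasta -db contigs_db -outfmt 6 -evalue 1e-10 -out hits.tsv
#
# Default -outfmt 6 columns:
#   qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore


def genes_present(blast_lines, min_identity=90.0, min_aln_len=100):
    """Return {gene: best bitscore} for genes with a hit passing the cutoffs.

    The cutoffs are illustrative assumptions, not recommended values.
    """
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a complete outfmt 6 record
        gene = fields[0]
        identity = float(fields[2])
        aln_len = int(fields[3])
        bitscore = float(fields[11])
        if identity >= min_identity and aln_len >= min_aln_len:
            if bitscore > best.get(gene, 0.0):
                best[gene] = bitscore
    return best
```

Genes from your FASTA file that do not appear in the returned dictionary had no hit passing the cutoffs in that sample's assembly. Running this once per sample would give a simple presence/absence table.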

written 6.3 years ago by Lee Katz