Dear all,
I performed an RT-PCR that gives me a 1500 bp product, and I sequenced it with Illumina technology (2x250 paired-end reads). For several weeks I have been trying, without success, to assemble the reads into the full-length sequence. I have tried many of the classic assemblers (cap3, ssake, arapan, minimus2, ...), but all of them produce multiple contigs, some of which are longer than 5 kb!
I checked that all the reads map well to the reference.
I am looking for an assembler that can do the job. Does anyone have an idea?
Thank you!
Hello, genomax was right: I had too many reads. Normalization with bbnorm followed by assembly with cap3 gave me a very good result. Now I need to find the right option values to get a perfect assembly. Thank you all!
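For anyone wondering how "too many reads" plays out on a 1500 bp amplicon, here is a rough Python sketch of the depth arithmetic. The read-pair count used in the usage note is made up for illustration, and the function names are hypothetical. Note that bbnorm normalizes by k-mer depth; the random subsampling shown here is a simpler stand-in, not what bbnorm does.

```python
import random


def expected_depth(n_pairs, read_len=250, amplicon_len=1500):
    """Average per-base depth, assuming every base of both reads falls on the amplicon."""
    return n_pairs * 2 * read_len / amplicon_len


def keep_fraction(n_pairs, target_depth, read_len=250, amplicon_len=1500):
    """Fraction of read pairs to keep so that the average depth is about target_depth."""
    depth = expected_depth(n_pairs, read_len, amplicon_len)
    return min(1.0, target_depth / depth)


def subsample_pairs(pairs, fraction, seed=0):
    """Keep each read pair independently with probability `fraction` (random, not k-mer based)."""
    rng = random.Random(seed)
    return [pair for pair in pairs if rng.random() < fraction]
```

With a hypothetical 300,000 read pairs, `expected_depth(300000)` comes out at 100,000x, and `keep_fraction(300000, 100)` suggests keeping roughly one pair in a thousand to reach about 100x.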
A small educational note: if an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted.
What is the organism you're working on? You mention both aligning to a reference and assembling. Do you wish to do both, and if so, why?
Thank you for your answers. I will rework my analysis according to your advice. I remove low-quality reads (almost none were removed), trim the Illumina adapters, merge reads 1 and 2, and then proceed to assembly. I align to a reference because only part of my cDNA is known. The 3' part is unknown, so I need to perform de novo assembly of that part.
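To make the "merge reads 1 and 2" step concrete, here is a minimal Python sketch of what merging an overlapping pair means. It is only an illustration under strong assumptions: it requires an exact overlap, ignores quality scores, and the function names and `min_overlap` default are hypothetical. Real read mergers tolerate mismatches and use base qualities.

```python
# Complement table for the bases expected in the reads (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(r1, r2, min_overlap=10):
    """Merge read 1 with the reverse complement of read 2.

    Tries the longest overlap first and requires an exact match between
    the end of r1 and the start of reverse_complement(r2).
    Returns the merged sequence, or None if no overlap of at least
    min_overlap bases is found.
    """
    r2_rc = reverse_complement(r2)
    for k in range(min(len(r1), len(r2_rc)), min_overlap - 1, -1):
        if r1[-k:] == r2_rc[:k]:
            return r1 + r2_rc[k:]
    return None
```

A pair that does not overlap returns `None` and would have to go into the assembly as two separate reads.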
Is the reference you're talking about genomic or also transcriptomic?