Hello,
I will soon be receiving Epstein-Barr virus sequence data with > 800x coverage. There's a nice paper that compares de novo assembly tools (Zhang W, Chen J, Yang Y, Tang Y, Shang J, et al. (2011) A Practical Comparison of De Novo Genome Assembly Software Tools for Next-Generation Sequencing Technologies. PLoS ONE 6(3): e17915. doi:10.1371/journal.pone.0017915). Based on this paper, I'm thinking of using Edena for the de novo assembly, although other tools seem to be good as well (see Figure 6 of the paper).
I would also like to compare the de novo assembly with a reference-based one. Which reference-based aligner do you recommend for a virus?
Thank you,
Anna
Thank you, Istvan, you're the best! Do you think VIRAMP is ready enough for me to use it or should I follow the recommended steps on my own? Thanks.
The only caveat is that the server we run there may not have enough storage, though it does have a few hundred GB,
but we want people to try it. Ideally, all you need to upload are the FASTQ files and the reference genome, and it takes it from there.
Let's see what happens. Also, send feedback/questions to Yinan Wan (yzw128@psu.edu); she is in charge of making this work.
Istvan, I tried your tool (it's great!!!). The default pipeline ran without any problems!! However, any variation from the default failed for me, whether it was changing the de novo assembler from velvet, or whether it was running some of the de novo tools outside of the pipeline. Does Yinan have viewing privileges to my session? That would make it much easier for her to fix bugs if there are any since everything I did this morning is there including the tasks that failed. In addition, QUAST promised more detail in the download version and I was expecting to see the nice graphs from your example, but they were missing in the download version too.
Hi Anna,
Thanks for trying out VIRAMP. Could you make an account on VIRAMP and share your history with me? You can refer to this link to learn how to share a history (my account is just the email address listed above). I cannot identify userless datasets, and more importantly, since this is a demonstration platform with limited space, userless datasets are purged after a certain time, while datasets associated with an account are kept. If you have a problem sharing the history, please at least make an account and run everything under it so that I can try to identify the datasets from the database.
The QUAST bug has been fixed; thanks for pointing it out. You can also email me any specific questions or issues you encounter during processing.
Best,
Yinan
Thank you, Yinan, for fixing QUAST!!
I have shared my history with you. As you predicted, my work from this morning was wiped out, but I created an account as you suggested and reproduced the most important problem for me, namely that I cannot run the paired-end pipeline using VICUNA instead of velvet. As you can see from my history, I have tried the default k-mers, the highest default k-mer only (65), the lowest default k-mer only (35), and a k-mer of 20, and none of them worked with VICUNA.
You will need to scroll down in the history to see these. I found that velvet with k=20 works much better for my data than the default pipeline, so all of those successful velvet jobs are at the top, which lets me move my project forward.
Thanks a lot!
Can you please send her an email? Let's see if the two of you can work this out.