Question: How to use a single file as the input to MaSuRCA
4 months ago
United States
kmkdesilva wrote:

Hi everyone,

I generated k-mers from a certain region of my genome of interest using Jellyfish. Next, I extracted certain k-mers from the dump.fa output file. Now I need to assemble these k-mers into contigs. I read the MaSuRCA documentation. It needs two input files (e.g. R1.fa and R2.fa). I only have a single file containing the k-mers I wish to assemble. Can someone please tell me whether there is a way to run MaSuRCA with a single input file, or whether there are other assemblers I can use to assemble my list of k-mers?

written 4 months ago by kmkdesilva
4 months ago
Russian Federation
shelkmike wrote:

A quote from MaSuRCA's configuration file: "MUST HAVE Illumina paired end reads to use MaSuRCA". So, there is no way to do what you want.

Almost all other assemblers can do what you want; you just need to give them these k-mers as single-end reads. By the way, why are you assembling k-mers rather than reads? I have never heard of anyone doing so.

written 4 months ago by shelkmike
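
To illustrate the "give the k-mers as single-end reads" suggestion above, here is a minimal Python sketch. It assumes the k-mer file is a Jellyfish FASTA dump, in which each header line holds only the k-mer count (for example ">12"), so headers are not unique. Some assemblers expect unique record names, so the sketch renames each record. The function name `number_kmer_records` and the `kmer` prefix are hypothetical, chosen only for this example.

```python
from typing import Iterable, Iterator


def number_kmer_records(lines: Iterable[str], prefix: str = "kmer") -> Iterator[str]:
    """Yield FASTA lines whose headers are unique, e.g. '>kmer_1 count=12'.

    Assumes the input is a FASTA dump whose header lines hold only the
    k-mer count, with one sequence line per record.
    """
    index = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            index += 1
            count = line[1:].strip()
            # Keep the original count as a comment after the unique ID.
            yield f">{prefix}_{index} count={count}"
        else:
            yield line.upper()


# Small in-memory example; in practice `lines` would be an open file handle.
example_dump = [">3", "ACGTA", ">7", "ttgca"]
for out_line in number_kmer_records(example_dump):
    print(out_line)
```

The resulting FASTA can then be passed to an assembler as a single-end library; the exact option for that depends on the assembler and is not shown here.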

Thank you very much for the clarification. I am new to this kind of analysis. I thought I should assemble the k-mers, but now I understand that I need to find the corresponding reads. I would highly appreciate it if you could tell me how to go back to my data and find the corresponding reads.

written 4 months ago by kmkdesilva
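
One possible way to find the corresponding reads is to keep every read that contains at least one of the selected k-mers. The Python sketch below is only an illustration under stated assumptions: the k-mers are in a FASTA file with one sequence line per record, all k-mers have the same length k, and a read counts as a match if it contains a k-mer on either strand (relevant if the k-mers were counted in canonical form). The names `load_kmers` and `reads_with_kmers` are hypothetical, and reads are represented as simple (name, sequence) pairs rather than parsed from FASTQ.

```python
from typing import Iterable, Iterator, Set, Tuple

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def load_kmers(fasta_lines: Iterable[str]) -> Set[str]:
    """Collect k-mers (and their reverse complements) from FASTA lines."""
    kmers: Set[str] = set()
    for raw in fasta_lines:
        line = raw.strip().upper()
        if line and not line.startswith(">"):
            kmers.add(line)
            kmers.add(reverse_complement(line))
    return kmers


def reads_with_kmers(
    reads: Iterable[Tuple[str, str]], kmers: Set[str], k: int
) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs for reads containing any selected k-mer."""
    for name, seq in reads:
        upper = seq.upper()
        # Slide a window of length k along the read and test set membership.
        if any(upper[i:i + k] in kmers for i in range(len(upper) - k + 1)):
            yield name, seq


# Toy example with k = 5.
selected = load_kmers([">1", "ACGTT"])
toy_reads = [("r1", "GGACGTTGG"), ("r2", "CCAACGTCC"), ("r3", "TTTTTTTTT")]
kept = [name for name, _ in reads_with_kmers(toy_reads, selected, k=5)]
print(kept)
```

For real data sets, the reads would be streamed from the original FASTQ files and the kept reads written out for assembly. For paired-end data, both mates of a pair should be kept together if either one matches.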

Frankly speaking, I don't understand the purpose of what you are doing. You have the sequence of a genome, you split it into k-mers, and then you assemble these k-mers to produce contigs of the same genome. Why do that if you already have the genome sequence? Could you explain the purpose of your analysis?

written 4 months ago by shelkmike

I am sorry for the confusion. I extracted the k-mers that are shared between my genome of interest and several genomes from closely related species. I am going to do an introgression study.

written 4 months ago by kmkdesilva
