Question: How to split a whole genome sequence file?
4.8 years ago, fashiondesignrussian (50) wrote:

How can I split a whole genome sequence file (see also here)? I have 8 GB of RAM, and my computer chokes on large whole genome sequence .gb and FASTA files. Please tell me how to split the file, and recommend software that can do alignment quickly with low memory requirements. I have tried Geneious Pro, CLC Genomics, DNASTAR, and SeqSphere; all of them need at least 16-32 GB of RAM.

sequencing next-gen wgs • 1.8k views
modified 4.8 years ago by Devon Ryan (93k) • written 4.8 years ago by fashiondesignrussian (50)

Split them according to what?

written 4.8 years ago by Devon Ryan (93k)

Splitting the file should not be a problem, depending on how you want to split it. The simplest way is to pick a sliding window size and write the contents of each window to a separate file. (Let me know if you need a simple parser to do this, or whether you can manage on your own.) As for the software, these tools are not designed to run with low memory; they usually rely on in-memory algorithms. Depending on what you are trying to achieve, you may be able to get away with splitting the input file, but I would need more information before I could give any advice.
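
For illustration, here is a minimal sketch of the window-based split described above, written in Python. It makes several assumptions that are not in the original post: the input is a plain-text FASTA file, the windows are non-overlapping (a fixed number of bases per output file), and the function name `split_fasta`, the `window` parameter, and the `_partN.fasta` naming scheme are all made up for this example. The file is read line by line, so memory use stays small regardless of genome size.

```python
import os
import re


def split_fasta(in_path, out_dir, window=1_000_000):
    """Stream a FASTA file and write consecutive, non-overlapping windows
    of at most `window` bases to separate files. Returns the output paths."""
    os.makedirs(out_dir, exist_ok=True)
    written = []   # paths of the files created so far
    out = None     # currently open output file
    name = None    # sanitized ID of the current record
    part = 0       # window counter within the current record
    filled = 0     # bases already written to the current window

    def next_chunk():
        # Close the current window file (if any) and open the next one.
        nonlocal out, part, filled
        if out is not None:
            out.close()
        part += 1
        filled = 0
        path = os.path.join(out_dir, f"{name}_part{part}.fasta")
        out = open(path, "w")
        out.write(f">{name}_part{part}\n")
        written.append(path)

    try:
        with open(in_path) as fh:
            for line in fh:
                line = line.strip()
                if not line:
                    continue
                if line.startswith(">"):
                    # New record: use the first word of the header as its ID,
                    # replacing characters that are awkward in file names.
                    first = (line[1:].split() or ["seq"])[0]
                    name = re.sub(r"[^A-Za-z0-9_.-]", "_", first)
                    part = 0
                    next_chunk()
                    continue
                if name is None:
                    raise ValueError("sequence data found before any FASTA header")
                # A sequence line may straddle a window boundary, so consume it piecewise.
                while line:
                    if filled == window:
                        next_chunk()
                    take = line[: window - filled]
                    out.write(take + "\n")
                    filled += len(take)
                    line = line[len(take):]
    finally:
        if out is not None:
            out.close()
    return written
```

Calling `split_fasta("genome.fasta", "chunks", window=5_000_000)` (hypothetical file names) would write one file per 5 Mb window of each record. Whether the pieces are useful for alignment depends on the downstream tool, since features that span a window boundary are cut in two.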



written 4.8 years ago by mxs (530)