4.0 years ago
lokeshp14cs24
Hello everyone, my name is Lokesh, and for my university project I have to build a DNA database and analysis tool using Hadoop. Can anyone tell me the best way to store complete DNA sequences (approx. 3 billion base pairs) in the Hadoop ecosystem? Any help will be greatly appreciated.
Dear Lokesh,
This has been discussed here before. Two earlier threads, among others, that you may find interesting:
Why is Hadoop not used a lot in bio-informatics?
Hadoop InputFormat for FASTA files?
etc.