Forum: Developing a genomics ML model using bytes?
15 hours ago
Pranava ▴ 10

I was looking into training a machine learning / deep learning model using bytes. Recently I was working on a way to decrease the size of a .fasta file using _bit shifting_ (i.e., encoding each nucleotide, which normally takes 8 bits as a text character, in 4 bits, so that two nucleotides fit in one byte).
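A minimal sketch of the bit-shifting idea, assuming each base gets a 4-bit code so that two bases share one byte. The `CODES` table and the `pack`/`unpack` names are hypothetical and chosen only for illustration; the post does not say which mapping is used, and 2 bits per base would be enough if only A, C, G and T occur.

```python
# Hypothetical 4-bit codes (an assumption, not the poster's actual table).
# Any mapping with values below 16 would work.
CODES = {"A": 0, "C": 1, "G": 2, "T": 3, "N": 4}
BASES = {value: base for base, value in CODES.items()}


def pack(seq: str) -> bytes:
    """Pack two bases per byte: first base in the high nibble, second in the low."""
    out = bytearray()
    for i in range(0, len(seq), 2):
        high = CODES[seq[i]]
        # Pad an odd-length sequence with code 0; unpack() trims it again.
        low = CODES[seq[i + 1]] if i + 1 < len(seq) else 0
        out.append((high << 4) | low)
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Reverse pack(); `length` is the original number of bases."""
    bases = []
    for byte in data:
        bases.append(BASES[byte >> 4])    # high nibble
        bases.append(BASES[byte & 0x0F])  # low nibble
    return "".join(bases[:length])


packed = pack("ACGTN")
restored = unpack(packed, 5)
```

The original length has to be stored alongside the packed bytes, because the padding nibble is otherwise indistinguishable from a real base.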

Now that machine learning and artificial intelligence are dominating the industry, or at least trending that way, it got me thinking: what if we can use the bytes to develop a model? The problem I can currently think of is that it might not be biologically relevant? I am not sure; this is where I started getting confused and wanted to reach out here.

Learning Machine Genomics Bytes • 1.2k views

how about fastq.gz?


So essentially you are working on a compression algorithm, is that right? If so, be sure to benchmark your idea against the hundreds of existing, fast compression methods, whether general-purpose tools such as gzip, bzip2, and zstd, or genomics-specific formats such as CRAM.
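A rough sketch of what such a baseline could look like, using only compressors from the Python standard library. Note the substitutions: `lzma` stands in here because zstd and CRAM are not in the standard library of most Python versions, and the input is a random synthetic sequence (an assumption for illustration), so real genomes will give different ratios.

```python
import bz2
import gzip
import lzma
import random


def compressed_sizes(data: bytes) -> dict:
    """Return the size in bytes of `data` raw and under each compressor."""
    return {
        "raw": len(data),
        "gzip": len(gzip.compress(data)),
        "bz2": len(bz2.compress(data)),
        "lzma": len(lzma.compress(data)),
    }


# Synthetic input: 100,000 random bases as ASCII text, one byte per base.
random.seed(0)
sequence = "".join(random.choice("ACGT") for _ in range(100_000)).encode("ascii")

sizes = compressed_sizes(sequence)
for name, size in sizes.items():
    print(f"{name:>5}: {size} bytes")
```

A custom bit-packed encoding would be added as one more entry in the dictionary and compared on the same input, ideally together with compression and decompression time.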

might not be biologically relevant?

What does compression have to do with biology? Please explain further if I am missing the point.


what if we can use the bytes to develop a model?

My understanding is that the OP is considering building a model using the sequences in fastq files, so the compression is just an intermediate step. However, it is not clear to me what model s/he has in mind... Some sort of LLM using fastq data...?
