Machine learning algorithms on DNA sequences
0
0
22 months ago

I am working on a machine learning project. I have 4 different human sequences from different regions and want to develop a machine learning model.

DNA ML R • 612 views
1

That's great and all, but it isn't a question.

A model for what?

What sequences?

0

These are not normal patient sequences, so there must be a common mutation (due to disease) shared by all of them, and I want to develop a model on that basis. I am confused about where I should start. Any example or other pointer would be really helpful.

0

Right, so if there's a mutation - why do you need to arbitrarily apply ML to this problem? Variant calling pipelines are well established.

You also still haven't told us what the data is. Have you got reads? Whole genomes? What state is the data in? Has it been QC'd?

0

Whole genomes of patients. The data is in raw form (simple FASTA format). I want an ML model that predicts the same type of sequence from raw sequences.

0

What will the model predict that non-ML methods do not already do?

0

I don't see how ML fits with this question.

You have a dataset of sequences with mutation X. You receive new data, and need to check if it has mutation X or not. Why guess this with ML, when you can literally just look at the base-pair position in the new data and see if it has mutation X or not?
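
As a rough illustration of that direct check, here is a minimal Python sketch. It assumes the new sequence has already been aligned to the same reference coordinates as the known mutation. The function name `has_mutation`, the 1-based position, and the toy sequences are hypothetical and not taken from the data discussed in this thread.

```python
def has_mutation(sequence: str, position: int, alt_base: str) -> bool:
    """Return True if `sequence` carries `alt_base` at the 1-based `position`.

    Assumes `sequence` is already in reference coordinates (i.e. aligned),
    so that `position` means the same thing in every sample.
    """
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    return sequence[position - 1].upper() == alt_base.upper()


# Toy data (hypothetical): the reference has G at position 5, mutation X is G>T.
reference = "ACGTGACCTA"
new_sample = "ACGTTACCTA"

print(has_mutation(reference, 5, "T"))   # False
print(has_mutation(new_sample, 5, "T"))  # True
```

For real whole-genome data, an established alignment and variant calling pipeline would do this at scale. The sketch only shows why no prediction is needed once the position of mutation X is known.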

0

With respect, if you do not even know where to start, or whether this will even be helpful, then maybe a more well-defined project would make sense. Most importantly, an experienced supervisor is required.

1

What do you want the machine to learn? A model is supposed to learn to predict something tangible based on other tangible input. You only have sequence data, what do you wish to predict from that?
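
To make "tangible input" and "tangible output" concrete, here is a minimal Python sketch of what a supervised setup would need: a numeric feature vector per sequence and a label per sequence. The k-mer counting scheme, the function name `kmer_counts`, the labels, and the toy sequences are all assumptions for illustration. Nothing in this thread specifies any of them.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence: str, k: int = 2) -> list[int]:
    """Count every overlapping k-mer over the alphabet ACGT, in a fixed order."""
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    vocabulary = ["".join(p) for p in product("ACGT", repeat=k)]
    return [counts[kmer] for kmer in vocabulary]


# Hypothetical labelled examples: (sequence, label).
samples = [
    ("ACGTACGT", "disease"),
    ("TTTTACGA", "control"),
]

X = [kmer_counts(seq) for seq, _ in samples]  # inputs: one vector per sequence
y = [label for _, label in samples]           # outputs: what the model predicts

print(len(X[0]))  # 16 possible 2-mers
print(y)
```

Without something like the `y` column, sequence data alone gives a model nothing to predict.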

0

These are not normal patient sequences, so there must be a common mutation (due to disease) shared by all of them, and I want to develop a model on that basis. I am confused about where I should start. Any example or other pointer would be really helpful.

0

Please don't copy/paste the same content. You don't need to reply to each comment if you don't have anything specific to say for that comment.