Question: Resources For Extracting Information From A Sequence
Will (4.5k), United States, wrote 9.0 years ago:

I'm running a competition at Kaggle.com on HIV-1 progression. Check it out if you're interested; there's a 500 USD prize for the winner! A number of machine-learning researchers with no biology background have been looking for a resource that can extract information about a nucleotide (NT) sequence, or a batch of sequences, that they can use as feature sets for their machine-learning algorithms.

So far I've suggested k-mers, multiple alignments, and known resistance mutations. I've even provided code for counting all k-mers in a sequence. Does anyone have other suggestions, especially tools that can do the feature extraction?

Thanks a bunch, Will
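
For readers without a biology background, here is a minimal sketch of the k-mer idea mentioned above. It is not the code provided for the competition; the function name `kmer_counts`, the default `k=3`, and the choice to skip windows containing ambiguous bases (such as N) are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq, k=3, alphabet="ACGT"):
    """Count overlapping k-mers in a nucleotide sequence.

    Returns a dict with one entry per possible k-mer, in a fixed order,
    so every sequence maps to a feature vector of the same length.
    Windows containing characters outside `alphabet` (e.g. N) are skipped.
    """
    seq = seq.upper()
    allowed = set(alphabet)
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= allowed
    )
    all_kmers = ("".join(p) for p in product(alphabet, repeat=k))
    return {kmer: counts.get(kmer, 0) for kmer in all_kmers}


features = kmer_counts("ATGCGATATGC", k=2)
print(features["AT"], features["GC"], features["TT"])  # prints: 3 2 0
```

Returning a count for every possible k-mer, including zeros, keeps the columns aligned across sequences, which is usually what a machine-learning library expects. Note that the vector length grows as 4^k, so small values of k are the practical choice.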

sequence prediction tool • 1.7k views

Interesting competition.

written 9.0 years ago by Istvan Albert (80k)

Link to the competition: http://kaggle.com/hivprogression

written 9.0 years ago by Giovanni M Dall'Olio (26k)
Simon Cockell (7.3k), Newcastle, wrote 9.0 years ago:

You can do a lot of feature extraction with EMBOSS tools, from computing GC content to finding palindromic sequences. Plenty of these tools could be used to build feature sets.
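
To make those two features concrete for non-biologists, here is a rough Python sketch of what they measure. It is not a reimplementation of any EMBOSS program and will not reproduce EMBOSS output formats; the function names and the default window length of 6 are hypothetical choices for illustration. "Palindromic" here means a stretch of DNA that equals its own reverse complement, such as GAATTC.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def palindromic_sites(seq, length=6):
    """0-based start positions of windows equal to their own reverse complement.

    Windows containing anything other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - length + 1):
        window = seq[i:i + length]
        if set(window) <= set("ACGT") and window == reverse_complement(window):
            hits.append(i)
    return hits


print(gc_content("ATGC"))               # prints: 0.5
print(palindromic_sites("AAGAATTCAA"))  # prints: [2]
```

The GC fraction is already a usable numeric feature; the list of palindromic sites would typically be reduced to a count or a per-length count before being fed to a learning algorithm.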
