Question: How else can one quantify a DNA sequence besides GC-content and length?
0
3.8 years ago by
jolespin120
United States
jolespin120 wrote:

I'm looking for ways to quantify sequences other than GC-content and length. I could look at coverage and so on, but what can be computed from the raw sequence alone? The sequences range from gene-length to contig-length in good assemblies (1,000 to 200,000 nt).

seq sequence gc gene genome • 1.5k views
ADD COMMENTlink modified 3.8 years ago by Lars Juhl Jensen11k • written 3.8 years ago by jolespin120

The phrasing of your question implies you are looking for quantification related to physical properties, and probably of a relatively short DNA sequence. Is that correct?

If not, there are TONS of ways. Please disambiguate.

ADD REPLYlink written 3.8 years ago by Vincent Laufer1.0k
5
3.8 years ago by
5heikki8.4k
Finland
5heikki8.4k wrote:

kmer content

ADD COMMENTlink written 3.8 years ago by 5heikki8.4k

Can you use this as a single value? I know I could get the percentage of each kmer in the sequence, but then I would have multiple values. Maybe the standard deviation of those?

ADD REPLYlink written 3.8 years ago by jolespin120

Well, you could get a single value of e.g. how many unique tetramers there are in a given sequence. I'm not so sure how useful that would be though.

ADD REPLYlink written 3.8 years ago by 5heikki8.4k
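
A minimal Python sketch of the kmer idea discussed above, for readers who want to try it. It counts overlapping kmers and collapses the profile into one number: the fraction of distinct kmers observed out of the maximum that could have been observed. The function names and the normalisation are illustrative assumptions, not something proposed in the thread, and windows containing N or other ambiguity codes are simply skipped.

```python
from collections import Counter


def kmer_counts(seq, k=4):
    """Count overlapping k-mers, skipping windows with non-ACGT characters."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def distinct_kmer_fraction(seq, k=4):
    """Distinct k-mers seen, divided by the most that could have been seen.

    The denominator is min(4**k, number of valid windows), so short
    sequences are not penalised for having few windows. This
    normalisation is an assumption made for illustration.
    """
    counts = kmer_counts(seq, k)
    windows = sum(counts.values())
    if windows == 0:
        return 0.0
    return len(counts) / min(4 ** k, windows)
```

For sequences in the 1,000 to 200,000 nt range, the raw count of distinct tetramers saturates quickly at 256, so a larger k is likely to separate sequences better than k=4.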
3
3.8 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum120k wrote:

Shannon entropy https://books.google.fr/books?id=MRdpjtDbUcQC&lpg=PA56&dq=Shannon%20entropy%20blast&pg=PA55#v=onepage&q=Shannon%20entropy%20blast&f=false

size after compression,

...

 

ADD COMMENTlink written 3.8 years ago by Pierre Lindenbaum120k
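
Both suggestions above give a single value per sequence. The sketch below is one possible reading of them, using only the Python standard library: Shannon entropy of the single-base frequencies (in bits per base) and the compressed-to-raw size ratio from zlib. The function names, the choice of zlib, and the use of single-base rather than windowed or kmer frequencies are assumptions made for illustration.

```python
import math
import zlib
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy (bits per symbol) of the symbol frequencies in seq."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0
    counts = Counter(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def compression_ratio(seq, level=9):
    """Compressed size divided by raw size; lower means more repetitive."""
    raw = seq.upper().encode("ascii")
    if not raw:
        return 0.0
    return len(zlib.compress(raw, level)) / len(raw)
```

Single-base entropy is bounded by 2 bits for an ACGT alphabet and mostly restates base composition, whereas the compression ratio also responds to repeats and low-complexity stretches. The ratio depends on sequence length because of the fixed zlib overhead, so it is safest to compare sequences of similar size.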
2
3.8 years ago by
JC7.8k
Mexico
JC7.8k wrote:

Some time ago I wrote some code to quantify different DNA properties (composition, bias, complexity): https://github.com/caballero/SeqComplex

ADD COMMENTlink written 3.8 years ago by JC7.8k
1
3.8 years ago by
Matt Shirley9.0k
Cambridge, MA
Matt Shirley9.0k wrote:

Composition vectors have been used to build phylogenetic trees from sequences: http://www.aporc.org/LNOR/12/ISORA2010F02.pdf

These are somewhat related to kmers though.

ADD COMMENTlink written 3.8 years ago by Matt Shirley9.0k
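
To make the link to kmers concrete, here is a simplified Python sketch: each sequence becomes a fixed-length vector of kmer frequencies, and two sequences are compared by cosine distance. This is not the method of the linked paper, which may apply further corrections such as subtracting an expected background; the function names, k=3, and the distance measure are all assumptions made for illustration.

```python
import math
from itertools import product


def composition_vector(seq, k=3):
    """Return k-mer frequencies in a fixed lexicographic order (length 4**k)."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # windows with N or other symbols are ignored
            counts[kmer] += 1
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in kmers]


def cosine_distance(u, v):
    """1 minus the cosine similarity; defined as 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0
```

A pairwise distance matrix built this way could be passed to a clustering or tree-building routine, which is the general idea behind alignment-free phylogeny.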
1
3.8 years ago by
Copenhagen, Denmark
Lars Juhl Jensen11k wrote:

I hate to bang my own drum, but you can find plenty of DNA structural parameters and other relevant metrics in my work on DNA atlases from 15 years ago (Jensen et al., 1999; Pedersen et al., 2000). This includes parameters such as predicted DNA curvature, flexibility and stability.

ADD COMMENTlink written 3.8 years ago by Lars Juhl Jensen11k