Question: Clustering overlapping BLAST alignments
chefarov (130) wrote, 4.9 years ago:

Hello all,

I have assembled a C. elegans genome from raw DNA-seq reads and ended up with a repeat-masked FASTA file of scaffolds. I aligned a random EST sequence onto the scaffolds using BLAST, so I now have the alignments as a plain-text or XML file.

I want to continue following "A beginner’s guide to eukaryotic genome annotation" by Mark Yandell and Daniel Ence, which says the following about processing BLAST results:

"... the remaining data are sometimes clustered to identify overlapping alignments and predictions. Clustering has two purposes. First, it groups diverse computational results into a single cluster of data, all supporting the same gene. Second, it identifies and purges redundant evidence; highly expressed genes, for example, may be supported by hundreds if not thousands of identical ESTs."

I can only imagine the two aforementioned cases as the same case. As I see it, multiple ESTs aligned onto a specific gene are overlapping results that could be clustered together. What else could the first case ("diverse results all supporting the same gene") refer to? Isn't it the same thing?
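To make the question concrete, the two purposes in the quoted passage can be sketched as two separate steps. This is only an illustration under my own assumptions, not the procedure from the guide: the hit tuple layout `(scaffold, start, end, source, query_id)`, the function names `cluster_overlapping` and `purge_redundant`, and the example records (including the protein hit) are all hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
from collections import defaultdict

# Hypothetical hit record: (scaffold, start, end, source, query_id)

def cluster_overlapping(hits):
    """Group hits on the same scaffold whose intervals overlap."""
    by_scaffold = defaultdict(list)
    for hit in hits:
        by_scaffold[hit[0]].append(hit)

    clusters = []
    for scaffold_hits in by_scaffold.values():
        # Sort by start so one left-to-right sweep finds all overlaps.
        scaffold_hits.sort(key=lambda h: (h[1], h[2]))
        current, current_end = [], None
        for hit in scaffold_hits:
            if current and hit[1] <= current_end:
                # Overlaps (or touches) the running cluster: extend it.
                current.append(hit)
                current_end = max(current_end, hit[2])
            else:
                if current:
                    clusters.append(current)
                current, current_end = [hit], hit[2]
        if current:
            clusters.append(current)
    return clusters


def purge_redundant(cluster):
    """Keep one hit per identical (start, end, source) inside a cluster."""
    seen = set()
    kept = []
    for hit in cluster:
        key = (hit[1], hit[2], hit[3])
        if key not in seen:
            seen.add(key)
            kept.append(hit)
    return kept


hits = [
    ("scaf1", 100, 500, "EST", "est_a"),
    ("scaf1", 100, 500, "EST", "est_b"),       # identical span to est_a
    ("scaf1", 450, 900, "protein", "prot_x"),  # different evidence type
    ("scaf1", 2000, 2400, "EST", "est_c"),     # separate locus
]
clusters = cluster_overlapping(hits)
deduplicated = [purge_redundant(c) for c in clusters]
```

In this sketch the first step puts `est_a`, `est_b`, and `prot_x` into one cluster because they overlap, whatever kind of evidence they are, while the second step drops `est_b` only because it duplicates `est_a` exactly. Whether that is the distinction the authors intend is exactly what I am asking.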


