Question: clustering best choice
6.1 years ago by
mina20 wrote:

Hi friends,

After assembly, we sometimes get more than one transcript per locus. I think this is because of (1) alternative splicing, (2) several genes from the same family that share a domain, or (3) reads from different parts of a gene that did not overlap enough to be joined into one fragment. Am I right?

That is why, when I ran BLAST against one specific gene, I got several hits. So how can I choose the best hit for one gene? I mean, which transcript of a locus is best for further study, for example as a template for primer design?

Reciprocal BLAST?


Thank you in advance


rna-seq assembly cluster • 1.7k views

What is the purpose of the primers? Real-time qPCR? Amplifying genomic DNA? A diagnostic for one particular gene or transcript? Genomic copy-number estimation for a family of genes? For primer design, the best transcript is the one that covers your region of interest and avoids unwanted similarity to other sequences.

Reply written 5.8 years ago by h.mon
5.8 years ago by
pengchy430 wrote:

Indeed, de novo assembly produces redundant or false transcripts for the reasons you mentioned. To filter out the redundancy or noise, one usually selects one representative transcript per transcript cluster. This can be done with TGICL or cd-hit-est. The former clusters transcripts by sequence similarity and assembles each cluster separately using the CAP3 assembler. The latter uses a short-word algorithm to cluster the sequences according to predefined criteria, keeps one representative (the longest) per cluster, and discards the others. You could give them a try.
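To make the "keep the longest as representative" idea concrete, here is a minimal Python sketch of greedy longest-first clustering. It is only a toy illustration, not how cd-hit-est or TGICL are actually implemented: the function names, the shared k-mer fraction used as a stand-in for sequence identity, and the `k` and `min_shared` defaults are all assumptions made for this example, and reverse-complement matches are ignored.

```python
def kmers(seq, k):
    """Return the set of all length-k words in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def greedy_cluster(transcripts, k=8, min_shared=0.9):
    """Greedy longest-first clustering (toy sketch, hypothetical helper).

    transcripts: dict mapping transcript id -> nucleotide sequence.
    Returns a dict mapping representative id -> list of member ids.
    """
    # Process the longest sequences first, so that the representative
    # of every cluster is its longest member.
    order = sorted(transcripts, key=lambda t: (-len(transcripts[t]), t))
    rep_words = {}  # representative id -> its k-mer set
    clusters = {}   # representative id -> member ids
    for tid in order:
        words = kmers(transcripts[tid].upper(), k)
        for rep, other in rep_words.items():
            # Assumed similarity measure: fraction of this transcript's
            # k-mers that also occur in the representative.
            if words and len(words & other) / len(words) >= min_shared:
                clusters[rep].append(tid)
                break
        else:
            # No representative is similar enough (or the sequence is
            # shorter than k): start a new cluster.
            rep_words[tid] = words
            clusters[tid] = [tid]
    return clusters


# Made-up example data: t2 is a fragment of t1, locus2 is unrelated.
t1 = "ACGTACGTTAGCTAGCTAGGATCCGATCGATTACGCTAGCATCGA"
example = {
    "locus1_t1": t1,
    "locus1_t2": t1[10:35],
    "locus2_t1": "TTTTGGGGCCCCAAAATTTTGGGGCCCCAAAA",
}
for rep, members in greedy_cluster(example).items():
    print(rep, members)
```

For real data, the tools named above are the better choice. A typical cd-hit-est call looks like `cd-hit-est -i transcripts.fa -o representatives.fa -c 0.95 -n 10`, where the file names and the 0.95 identity threshold are placeholders to adjust for your data.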


