Question: clustering best choice
6.1 years ago
mina20 wrote:

Hi friends,

After assembly, we sometimes get more than one transcript per locus. I think this is because of (1) alternative splicing, (2) several genes from the same family sharing the same domain, or (3) reads from different parts of a gene that did not overlap and so could not be joined with the other fragments. Am I right?

That is why, when I ran BLAST against one specific gene, I got several hits. So how can I choose the best hit for one gene? I mean, which transcript of a locus is the best one for further study, for example as a template for primer design?

Reciprocal BLAST?


Thank you in advance


rna-seq assembly cluster
written 6.1 years ago by mina20

What is the purpose of the primers? Real-time qPCR? Amplifying genomic DNA? A diagnostic for one particular gene or transcript? Genomic copy-number estimation for a family of genes? For primer design, the best transcript is the one that covers your region of interest and avoids unwanted similarity to other sequences.

written 5.8 years ago by h.mon32k
5.8 years ago
pengchy430 wrote:

Indeed, de novo assembly produces false or redundant transcripts for the reasons you mentioned. To filter out the redundancy or noise, one usually selects one representative transcript per transcript cluster. This can be done with TGICL or cd-hit-est. The former clusters transcripts based on sequence similarity and assembles each cluster separately using the CAP3 assembler. The latter uses a short-word algorithm to cluster the sequences according to predefined criteria, keeps one representative (the longest) for each cluster, and discards the others. You can give them a try.
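
To make the "keep the longest transcript per cluster" idea concrete, here is a minimal Python sketch. It only illustrates the selection step, not the short-word clustering that cd-hit-est performs. The function name `longest_per_cluster`, the transcript IDs, and the toy sequences are all hypothetical, and it assumes you already have a mapping from each transcript to its cluster (for example, taken from your clustering tool's output).

```python
from collections import defaultdict


def longest_per_cluster(sequences, cluster_of):
    """Pick one representative per cluster: the longest transcript.

    sequences  : dict mapping transcript ID -> nucleotide sequence
    cluster_of : dict mapping transcript ID -> cluster (locus) ID
    Returns a dict mapping cluster ID -> representative transcript ID.
    (Hypothetical helper for illustration; not part of any tool.)
    """
    members = defaultdict(list)
    for transcript_id, cluster_id in cluster_of.items():
        members[cluster_id].append(transcript_id)

    # Longest sequence wins; equal lengths are broken by transcript ID
    # so that the result is deterministic.
    return {
        cluster_id: max(ids, key=lambda t: (len(sequences[t]), t))
        for cluster_id, ids in members.items()
    }


# Toy data (made up for this example).
sequences = {
    "locus1_t1": "ATGGCGTACGTTAGC",
    "locus1_t2": "ATGGCGTACG",
    "locus2_t1": "ATGCCC",
}
cluster_of = {
    "locus1_t1": "locus1",
    "locus1_t2": "locus1",
    "locus2_t1": "locus2",
}

representatives = longest_per_cluster(sequences, cluster_of)
print(representatives)
```

Note that the longest transcript is not automatically the best primer template: as the comment above says, it still has to cover your region of interest, so treat this as a first filter only.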

written 5.8 years ago by pengchy430
