Finding De Novo Protein Domains In A Set Of Transcript Sequences
10.9 years ago

I have around 1,000 transcript sequences without any protein domain annotations (no hits in Pfam, SMART, PANTHER...) and I want to see whether any amino acid motifs are enriched among them.

I was thinking maybe:

1) Perform an all-vs-all tblastx.

2) Gather all non-overlapping HSPs, excluding alignments to self.

3) Extract the HSP sequence from each alignment.

4) BLAST each extracted HSP sequence against the transcripts.

5) Count each transcript that hits the HSP sequence as containing the protein motif.
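
For what it's worth, steps 1 and 2 above could be sketched roughly as follows. This is only a minimal sketch and it assumes the all-vs-all tblastx was written in the standard 12-column tabular format (`-outfmt 6`: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The function names `parse_hsps` and `non_overlapping` are made up for illustration, and the greedy "keep the best-scoring HSP first" rule is just one possible way to define "non-overlapping".

```python
from collections import defaultdict


def parse_hsps(lines):
    """Parse BLAST tabular (-outfmt 6) lines and drop self hits.

    Returns tuples (query_id, subject_id, q_lo, q_hi, bitscore).
    Query coordinates are normalised so q_lo <= q_hi, because tblastx
    reports reverse-frame hits with qstart > qend.
    """
    hsps = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query_id, subject_id = fields[0], fields[1]
        if query_id == subject_id:
            continue  # exclude alignments to self
        q_start, q_end = int(fields[6]), int(fields[7])
        q_lo, q_hi = min(q_start, q_end), max(q_start, q_end)
        hsps.append((query_id, subject_id, q_lo, q_hi, float(fields[11])))
    return hsps


def non_overlapping(hsps):
    """Greedy selection per query: visit HSPs by descending bitscore and
    keep one only if its query interval overlaps no HSP already kept."""
    by_query = defaultdict(list)
    for hsp in hsps:
        by_query[hsp[0]].append(hsp)

    kept = []
    for group in by_query.values():
        chosen = []
        for hsp in sorted(group, key=lambda h: -h[4]):
            if all(hsp[3] < c[2] or hsp[2] > c[3] for c in chosen):
                chosen.append(hsp)
        kept.extend(chosen)
    return kept
```

The kept query intervals are what you would then cut out of the transcripts (step 3) and search back against the full set (steps 4 and 5). Note the coordinates are nucleotide positions on the query, so the extracted pieces would still need translating in the matching frame.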

This seems a bit over-complicated to me. Is there any software or package that already does this?

transcriptome motif domain • 2.2k views