To compare assembled transcripts of two species
8.4 years ago
suxiaopei ▴ 10

Hi guys,

I am comparing the assembled transcript datasets of two parasites to see how similar they are, i.e., the percentage of match between the two species. I tried BLASTN, but I can only see matches for each query, not an overall match percentage. Is there any way to get one?

Thanks very much for any suggestion!

Xiaopei

RNA-Seq blast • 3.4k views
8.4 years ago
Chris Cole ▴ 800

BLAST only ever works with single queries, so it will report results for each one separately. If you want a single, 'overall' match percentage, you can calculate it by averaging over all the individual queries.

However, a single overall measure is rarely very meaningful. A boxplot or histogram of the distribution of transcript similarities will be a more informative way to represent the data.
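A minimal sketch of that summary, assuming the BLASTN results were saved in the default tabular layout (`-outfmt 6`, where column 3 is percent identity and column 12 is bit score) and that the highest-scoring hit is taken as the representative match for each query. The function names and the example rows are made up for illustration.

```python
from collections import Counter
from statistics import mean

def best_hit_identities(blast_tab_lines):
    """Map each query ID to the percent identity of its highest-bitscore hit."""
    best = {}  # query_id -> (bitscore, pident)
    for line in blast_tab_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        qseqid = fields[0]
        pident = float(fields[2])     # column 3: percent identity
        bitscore = float(fields[11])  # column 12: bit score
        if qseqid not in best or bitscore > best[qseqid][0]:
            best[qseqid] = (bitscore, pident)
    return {q: pident for q, (_, pident) in best.items()}

def identity_histogram(identities, bin_width=10):
    """Count values per bin; each key is the lower edge of its bin."""
    counts = Counter()
    for value in identities:
        # Put 100% into the top bin instead of opening a bin of its own.
        lower = min(int(value // bin_width) * bin_width, 100 - bin_width)
        counts[lower] += 1
    return dict(sorted(counts.items()))

# Hypothetical rows in -outfmt 6 column order (IDs and values are invented).
example = [
    "t1\ts9\t98.5\t500\t7\t0\t1\t500\t1\t500\t1e-100\t900",
    "t1\ts3\t80.0\t200\t40\t0\t1\t200\t1\t200\t1e-20\t150",
    "t2\ts4\t72.0\t300\t84\t0\t1\t300\t1\t300\t1e-30\t200",
    "t3\ts7\t91.5\t400\t34\t0\t1\t400\t1\t400\t1e-80\t600",
]
ids = best_hit_identities(example)
print("mean identity:", mean(ids.values()))
print("histogram:", identity_histogram(ids.values()))
```

Queries with no hit at all do not appear in the BLAST table, so they are silently left out of this average; it is worth reporting how many there are alongside the distribution.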

8.4 years ago
h.mon 35k

You could use cd-hit-est-2d to cluster the two transcriptomes at several thresholds, starting from higher similarities and then decreasing.
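As a rough sketch of such a threshold sweep, the snippet below only builds the command lines and prints them; it does not run anything. The file names are placeholders, and the word sizes (`-n`) follow the ranges given in the CD-HIT user guide as I recall them, so please check them against the documentation of your installed version.

```python
import shlex

# (minimum identity, word size) pairs for cd-hit-est, as recalled from the
# CD-HIT user guide. Treat these as an assumption and verify before use.
WORD_SIZE_BY_MIN_IDENTITY = [
    (0.95, 10), (0.90, 8), (0.88, 7), (0.85, 6), (0.80, 5), (0.75, 4),
]

def word_size(identity):
    """Pick a word size compatible with the requested identity threshold."""
    for min_identity, n in WORD_SIZE_BY_MIN_IDENTITY:
        if identity >= min_identity:
            return n
    raise ValueError("identity threshold below 0.75 is not covered here")

def build_command(db1, db2, identity):
    """Build one cd-hit-est-2d call comparing db2 (-i2) against db1 (-i)."""
    out = f"sp2_vs_sp1_c{int(round(identity * 100))}.fa"  # placeholder name
    return [
        "cd-hit-est-2d",
        "-i", db1,
        "-i2", db2,
        "-o", out,
        "-c", f"{identity:.2f}",
        "-n", str(word_size(identity)),
    ]

# Start strict and relax the threshold step by step.
for c in (0.95, 0.90, 0.85, 0.80):
    print(shlex.join(build_command("species1.fa", "species2.fa", c)))
```

At each threshold, the output FASTA should hold the sequences from the second file that did not match anything in the first, and the accompanying `.clstr` file lists the matches, so the fraction of matched transcripts can be tracked as the threshold drops.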

edit: BLAST is a local search algorithm and is not suitable for estimating match percentage: it will generally inflate it, because it leaves out of the alignment the portions of the transcript that do not pass a certain score threshold. See here for a discussion of this (in another context, though).
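To make the inflation concrete, here is a small worked example with invented numbers: a 1000 bp transcript whose only local alignment covers 300 bp at 95% identity. Counting the unaligned bases as non-matching (an assumption about how one would want to score them), the full-length figure drops to about 28.5%. The calculation is approximate, since the alignment length reported by BLAST also counts gap columns.

```python
def local_vs_full_length_identity(pident, align_len, query_len):
    """Return (local identity, identity over the whole query), both in percent.

    pident    -- percent identity inside the aligned region
    align_len -- length of the local alignment
    query_len -- full length of the query transcript
    Unaligned query bases are counted as non-matching.
    """
    matches = pident / 100 * align_len
    return pident, 100 * matches / query_len

local, overall = local_vs_full_length_identity(95.0, 300, 1000)
print(f"local identity: {local:.1f}%, over full transcript: {overall:.1f}%")
```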

8.4 years ago

Try Mauve. It is a nice program for comparing genomes and transcriptomes.

8.2 years ago
Adrian Pelin ★ 2.6k

I think you need to rethink your question. You have transcriptomes from non-model species, which means you only have a rough idea of the gene number in these organisms. What I suggest is a comparative approach. Why use blastn? You are comparing transcriptomes here, so mostly coding sequences. Try tblastx of species 1 vs species 2; sequences that return no hits are unique to species 1. Then do the same thing for species 2 vs species 1 to find which transcripts are unique to species 2. Finally, blast the unique transcripts with blastx vs nr to find out what is unique to each species.
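The "no hits means unique" step can be sketched as a set difference between the transcript IDs in the FASTA file and the query IDs that appear in the search results. This assumes the tblastx output was saved in tabular form (`-outfmt 6`, where column 11 is the e-value); the `1e-5` cutoff, the function names, and the example records are illustrative assumptions, not part of the original suggestion.

```python
def fasta_ids(fasta_lines):
    """Collect sequence IDs (first word after '>') from FASTA header lines."""
    return {
        line[1:].split()[0]
        for line in fasta_lines
        if line.startswith(">") and line[1:].strip()
    }

def queries_with_hits(blast_tab_lines, max_evalue=1e-5):
    """Collect query IDs that have at least one hit at or below max_evalue."""
    hits = set()
    for line in blast_tab_lines:
        fields = line.rstrip("\n").split("\t")
        # Column 11 (index 10) of -outfmt 6 is the e-value.
        if len(fields) >= 12 and float(fields[10]) <= max_evalue:
            hits.add(fields[0])
    return hits

def unique_transcripts(fasta_lines, blast_tab_lines, max_evalue=1e-5):
    """Transcripts in the FASTA with no acceptable hit in the other species."""
    return fasta_ids(fasta_lines) - queries_with_hits(blast_tab_lines, max_evalue)

# Hypothetical data: three species 1 transcripts searched against species 2.
sp1_fasta = [">sp1_t1 len=900", "ATGC", ">sp1_t2", "GGCC", ">sp1_t3", "TTAA"]
sp1_vs_sp2 = [
    "sp1_t1\tsp2_t7\t65.0\t250\t80\t2\t1\t750\t4\t753\t1e-40\t160",
    "sp1_t3\tsp2_t2\t40.0\t30\t18\t0\t1\t90\t1\t90\t0.5\t25",  # weak hit
]
print(sorted(unique_transcripts(sp1_fasta, sp1_vs_sp2)))
```

Running the same function with the species 2 FASTA and the species 2 vs species 1 results gives the other direction. The resulting ID lists can then be used to pull out the sequences for the blastx search against nr.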


Hi,

I am interested in doing what you said. I would like to compare the human and mouse genomes to see which genes are absent from each of them compared to the other. However, blastx has only one box for entering the genome sequences. How should I do this?


Search an ortholog database. For example, OrthoDB and OMA have these genes pre-calculated for you.


Thank you for your reply. But wouldn't this give me genes that are similar? I want to find genes that are different. I was able to blast the two genomes using the command line, but I am not sure how to use that information to find the actual gene differences.
