I ran blastx, requesting 10 results per input sequence, and noticed two issues with the results:
1) For most input sequences, all of the results were for the same gene / protein, unlike blastn, where each input sequence got several different results. For instance, one input sequence was roughly 48.5k base pairs long, and all 10 results were for the same gene, which was only 377 base pairs long. It's possible for such a long sequence to contain only one short gene, but this happens with every input sequence. Wouldn't it be reasonable to expect more genes, at least in some of these sequences?
2) The results also report the input sequences as much shorter than they actually are (again unlike the blastn results). For instance, one input sequence was roughly 21.5k base pairs long, but the results indicated that it was only 4655 base pairs.
Taken together, these two issues suggest that blastx aligns only a small part of each sequence. I tried increasing the number of results but got similar output.
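One way to check this suspicion would be to measure how much of each input sequence the hits actually cover. Below is a minimal sketch, not something I have run on my real data. It assumes the search is rerun with tabular output in a custom column order, `-outfmt "6 qseqid sseqid qstart qend qlen"`; the function name `query_coverage` and the sample rows are made up for illustration. Since blastx can report hits on the reverse strand, the sketch orders each pair of query coordinates before merging overlapping intervals.

```python
from collections import defaultdict


def query_coverage(rows):
    """Return {qseqid: (covered_bases, query_length, covered_fraction)}.

    `rows` is an iterable of tab-separated lines whose first five columns
    are assumed to be: qseqid, sseqid, qstart, qend, qlen.
    """
    intervals = defaultdict(list)
    qlens = {}
    for row in rows:
        row = row.strip()
        if not row or row.startswith("#"):
            continue  # skip blank lines and comment lines
        qseqid, _sseqid, qstart, qend, qlen = row.split("\t")[:5]
        # Order the coordinates so reverse-strand hits are handled too.
        start, end = sorted((int(qstart), int(qend)))
        intervals[qseqid].append((start, end))
        qlens[qseqid] = int(qlen)

    result = {}
    for qseqid, spans in intervals.items():
        spans.sort()
        covered = 0
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end + 1:
                # Overlapping or adjacent hit: extend the current interval.
                cur_end = max(cur_end, end)
            else:
                # Gap between hits: close the current interval, start a new one.
                covered += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        covered += cur_end - cur_start + 1
        result[qseqid] = (covered, qlens[qseqid], covered / qlens[qseqid])
    return result


if __name__ == "__main__":
    # Made-up rows for illustration only, not real blastx output.
    sample = [
        "q1\tp1\t100\t400\t1000",
        "q1\tp1\t900\t700\t1000",
        "q1\tp2\t350\t500\t1000",
    ]
    for qseqid, (covered, qlen, frac) in query_coverage(sample).items():
        print(f"{qseqid}: {covered} of {qlen} bases covered ({frac:.1%})")
```

If the covered fraction comes out small for every input sequence, that would support the idea that only a small region of each sequence is being aligned.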
Any insights from your experience would be welcome.
Here's a sample of the output: