reformat.sh (from the BBTools/BBMap package) reports a number of metrics that may be of interest to you:
Histograms for sam files only (requires sam format 1.4 or higher):
ehist=<file> Errors-per-read histogram.
qahist=<file> Quality accuracy histogram of error rates versus quality score.
indelhist=<file> Indel length histogram.
mhist=<file> Histogram of match, sub, del, and ins rates by read location.
ihist=<file> Insert size histograms. Requires paired reads interleaved in sam file.
idhist=<file> Histogram of read count versus percent identity.
idbins=100 Number idhist bins. Set to 'auto' to use read length.
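To make "percent identity" concrete, here is a minimal Python sketch of one common per-read definition: matching bases divided by alignment columns, computed from the CIGAR string and the NM (edit distance) tag of a SAM record. This is an assumption about what you might mean, not a description of how reformat.sh computes idhist internally, and the function names are made up for illustration. It also assumes the aligner wrote an NM tag.

```python
import re

# CIGAR operations as defined in the SAM specification.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def percent_identity(cigar, nm):
    """Percent identity = matches / (aligned + inserted + deleted bases).

    Assumes NM counts mismatches plus inserted and deleted bases.
    Returns None when the CIGAR has no alignment columns.
    """
    aligned = ins = dele = 0
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "M=X":      # aligned columns (match or mismatch)
            aligned += n
        elif op == "I":      # bases inserted relative to the reference
            ins += n
        elif op == "D":      # bases deleted relative to the reference
            dele += n
        # N, S, H and P do not contribute alignment columns here
    columns = aligned + ins + dele
    if columns == 0:
        return None
    mismatches = nm - ins - dele
    matches = aligned - mismatches
    return 100.0 * matches / columns

def identity_from_sam_line(line):
    """Parse one SAM alignment line; None if unmapped or NM is missing."""
    fields = line.rstrip("\n").split("\t")
    cigar = fields[5]
    if cigar == "*":
        return None
    for tag in fields[11:]:
        if tag.startswith("NM:i:"):
            return percent_identity(cigar, int(tag[5:]))
    return None
```

Other definitions exist (for example, ignoring indels, or counting each gap once regardless of length), which is why it helps to state which one you need.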
Do you have a citation for what you are aiming to do? Can you elaborate on what you mean by 'percent identity'?
Using BLAST, you should be able to determine the percentage of a particular gene that is covered by your input reads. For example, blastn can take RNA-seq reads in FASTA format and align them to mRNA transcripts for the purpose of identifying the genes covered by your reads (blastx does the equivalent against a protein database, translating the reads first).
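As a rough illustration of the coverage calculation, the sketch below merges the subject intervals of all hits to one transcript and divides by its length. It assumes BLAST tabular output (-outfmt 6) with the default twelve columns, where columns 9 and 10 are sstart and send; the function name and the idea of passing the transcript length by hand are my own simplifications.

```python
def covered_fraction(blast_lines, subject_id, subject_length):
    """Fraction of a subject sequence covered by at least one hit.

    blast_lines: iterable of tab-separated lines in default outfmt 6 order
    (qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore). Coordinates are 1-based and inclusive.
    """
    intervals = []
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if fields[1] != subject_id:
            continue
        start, end = int(fields[8]), int(fields[9])
        # Minus-strand hits report sstart > send, so normalise the order.
        intervals.append((min(start, end), max(start, end)))

    intervals.sort()
    covered = 0
    cur_start = cur_end = None
    for start, end in intervals:
        if cur_end is None or start > cur_end + 1:
            # Gap before this interval: close the previous merged block.
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return covered / subject_length
```

Note that with blastx the subject coordinates are in amino acids, so the length you pass in would have to be the protein length.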
tarek.mohamed: You say that you have SAM/BAM files for the samples, so one way to do this would be to generate a consensus sequence across the gene boundaries and then make dot plots with those sequences. That should give you an idea of percent identity.
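If it helps to see what a dot plot is doing under the hood, here is a toy Python sketch that marks every position where two sequences share an exact k-mer. It is only meant to show the idea; the function names and the default k are arbitrary choices of mine, and a real tool will handle ambiguity codes, reverse complements and plotting for you.

```python
def dotplot_matches(seq_a, seq_b, k=3):
    """Return (i, j) pairs where seq_a[i:i+k] == seq_b[j:j+k]."""
    # Index every k-mer of seq_b by its start position.
    index = {}
    for j in range(len(seq_b) - k + 1):
        index.setdefault(seq_b[j:j + k], []).append(j)

    hits = []
    for i in range(len(seq_a) - k + 1):
        for j in index.get(seq_a[i:i + k], []):
            hits.append((i, j))
    return hits

def render_dotplot(seq_a, seq_b, k=3):
    """Text rendering: rows follow seq_a, columns follow seq_b."""
    hits = set(dotplot_matches(seq_a, seq_b, k))
    rows = []
    for i in range(len(seq_a) - k + 1):
        rows.append("".join(
            "*" if (i, j) in hits else "."
            for j in range(len(seq_b) - k + 1)
        ))
    return "\n".join(rows)
```

Two near-identical consensus sequences give an unbroken main diagonal; gaps or shifts in the diagonal point to substitutions and indels.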
This recent package (FlexiDot: highly customizable, ambiguity-aware dotplots) may be of interest.