Percent identity plot
4.6 years ago
tarek.mohamed ▴ 350

Dear All

What tools are available for plotting the percent identity of sequence reads over a particular genetic locus in multiple samples?

I have SAM/BAM files for several samples, and I want to compare and plot percent identity for one gene across all samples.

precent identitiy alignment plot • 1.7k views

Do you have a citation for what you are aiming to do? Can you elaborate on what you mean by 'percent identity'?

Using BLAST, you should be able to determine the percentage of a particular gene that is covered by your input reads. For example, blastn can take RNA-seq reads in FASTA format and align them to mRNA transcripts for the purpose of identifying the genes covered by your reads (blastx would instead translate the reads and align them to protein sequences).


tarek.mohamed: You say that you have SAM/BAM files for the samples, so one way to do this would be to generate a consensus sequence for each sample across the gene boundaries and then make dot plots with those sequences. That should give you an idea of percent identity.

This recent package (FlexiDot: highly customizable, ambiguity-aware dotplots) may be of interest.
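If a single number per sample pair is enough, the consensus-based comparison suggested above can also be reduced to a pairwise percent identity. Below is a minimal sketch, assuming the consensus sequences were all built against the same reference over the same coordinates, so they have equal length and can be compared position by position. The function name `consensus_identity` and the choice to skip `N` and `-` are illustrative assumptions, not part of FlexiDot or any other tool.

```python
def consensus_identity(seq_a, seq_b, ignore="N-"):
    """Percent identity between two equal-length consensus sequences.

    Assumption: both sequences cover the same reference coordinates,
    so position i in seq_a corresponds to position i in seq_b.
    Positions where either sequence has a character in `ignore`
    (uncalled or gap) are left out of the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("consensus sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in ignore or b in ignore:
            continue
        compared += 1
        if a == b:
            matches += 1
    # Return None when no position could be compared
    return 100.0 * matches / compared if compared else None


if __name__ == "__main__":
    # Toy sequences, for illustration only
    print(consensus_identity("ACGTNACGT", "ACGTAACGA"))
```

Computing this for every pair of samples gives a matrix that can be plotted as a heatmap with whatever plotting library you prefer.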

4.6 years ago
h.mon 34k

reformat.sh (from the BBTools / BBMap package) has a number of metrics of interest for you:

Histograms for sam files only (requires sam format 1.4 or higher):

qahist=<file>           Quality accuracy histogram of error rates versus quality score.
indelhist=<file>        Indel length histogram.
mhist=<file>            Histogram of match, sub, del, and ins rates by read location.
ihist=<file>            Insert size histograms.  Requires paired reads interleaved in sam file.
idhist=<file>           Histogram of read count versus percent identity.
idbins=100              Number idhist bins.  Set to 'auto' to use read length.
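
To restrict a read-count-versus-identity histogram to one gene, a similar tally can be computed directly from SAM text. The sketch below is not how reformat.sh computes idhist; it assumes one common definition of identity, 100 * (alignment columns - NM) / alignment columns, where alignment columns are the M/=/X, I and D lengths from the CIGAR string, and it assumes the aligner wrote the NM tag. The function names and the whole-percent binning are illustrative choices.

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def percent_identity(cigar, nm):
    """BLAST-like identity: 100 * (columns - NM) / columns."""
    columns = sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in "M=XID")
    if columns == 0:
        return None
    return 100.0 * (columns - nm) / columns


def reference_span(cigar):
    """Number of reference bases covered by the alignment."""
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in "M=XDN")


def identity_histogram(sam_lines, chrom, start, end):
    """Count reads per whole-percent identity bin for reads overlapping
    chrom:start-end (1-based, inclusive). Reads without an NM tag are skipped."""
    hist = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue
        flag, rname, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
        if flag & 0x4 or cigar == "*" or rname != chrom:
            continue  # unmapped, or mapped to another reference sequence
        read_end = pos + reference_span(cigar) - 1
        if read_end < start or pos > end:
            continue  # no overlap with the locus
        nm = next((int(t[5:]) for t in fields[11:] if t.startswith("NM:i:")), None)
        if nm is None:
            continue
        pid = percent_identity(cigar, nm)
        if pid is not None:
            hist[int(pid)] += 1  # floor to a whole-percent bin
    return dict(sorted(hist.items()))


if __name__ == "__main__":
    # Toy records, for illustration only
    demo = [
        "@HD\tVN:1.6",
        "r1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNM:i:1",
        "r2\t0\tchr1\t105\t60\t8M2I\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNM:i:2",
    ]
    print(identity_histogram(demo, "chr1", 100, 200))
```

For BAM input, the same function can read the text output of `samtools view -h`. Running it once per sample with the same gene coordinates gives one histogram per sample, which can then be overlaid in a single plot.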