Plotting .tab/.bam file
4.7 years ago

Hey all,

I am trying to find a way to create a scatter plot and a histogram using matplotlib for an alignment I generated. I aligned my reads to a bacterial genome, then converted, sorted, and indexed the file and ran samtools depth using:

samtools view -b s_oneidensis_alignemnt_sensitive.sam > alignment.bam
samtools sort alignment.bam > alignment.sorted.bam
samtools index alignment.sorted.bam
samtools depth -a alignment.sorted.bam >

Now I'd like to generate a scatter plot with x-axis = position in genome and y-axis = depth of coverage, and then a histogram with x-axis = depth of coverage and y-axis = read count. I'm still new to Python and am trying to figure out a method using the .tab file. Or should I use the .bam file? Any help or nudges in the right direction would be greatly appreciated. Thanks!

matplotlib tab python bam • 2.6k views
A similar topic has been discussed in "How to plot coverage and depth statistics of a bam file". A tab file (.tab) is just a text file whose columns are tab-separated, so read the file using pandas and plot it using pyplot.
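As a rough sketch of that suggestion, assuming the .tab file has the three headerless columns that `samtools depth` writes (reference name, position, depth): the column names `ref`, `pos`, and `depth`, the helper `load_depth`, the output file name, and the tiny inline sample are all made up for illustration, not taken from the original data.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample in the three-column layout assumed above (not real data).
sample = io.StringIO("ref1\t1\t0\nref1\t2\t3\nref1\t3\t5\nref1\t4\t5\n")


def load_depth(source):
    """Read a headerless, tab-separated depth table into a DataFrame."""
    return pd.read_csv(source, sep="\t", header=None,
                       names=["ref", "pos", "depth"])


# Pass the path to the real .tab file here instead of the sample.
depth = load_depth(sample)

fig, ax = plt.subplots()
ax.scatter(depth["pos"], depth["depth"], s=4, color="red")
ax.set_xlabel("Position in Genome")
ax.set_ylabel("Depth of Coverage")
fig.savefig("coverage_scatter.png")
```

Passing `header=None` matters here: without it pandas would treat the first data row as column names.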

So the issue I'm having with this is extracting the columns of the alignment .tab file into lists. My current code indexes the first strings in a row (from the NCBI accession ID for the genome), such as A and E, rather than the column. This is the code:

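    %matplotlib inline
    import matplotlib.pyplot as plt
    import pandas as pd

    # The file path was left blank in the original post.
    # The depth file has no header row, so name the columns explicitly
    # (these names are labels chosen here: reference ID, position, depth).
    table = pd.read_csv('', sep='\t', header=None, names=['ref', 'pos', 'depth'])

    # Select whole columns by name; looping over a DataFrame only yields
    # the column labels, not the values.
    x = table['pos']
    y = table['depth']

    plt.plot(x, y, 'ro')
    plt.xlabel('Position in Genome')
    plt.ylabel('Depth of Coverage')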
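
For the histogram asked about in the original question, one possible sketch is below. It assumes the same three headerless columns (reference, position, depth), uses a made-up inline sample instead of the real file, and the column names and output file name are illustrative only. Note that a histogram of the depth column counts genome positions at each depth, which may or may not be what "read count" was meant to be.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample (not real data); replace with the path to the .tab file.
sample = io.StringIO("ref1\t1\t0\nref1\t2\t3\nref1\t3\t5\nref1\t4\t5\n")
depth = pd.read_csv(sample, sep="\t", header=None,
                    names=["ref", "pos", "depth"])

# Number of positions observed at each depth value, as a quick check.
counts = depth["depth"].value_counts().sort_index()

# One bin per integer depth: edges at -0.5, 0.5, ..., max_depth + 0.5.
max_depth = int(depth["depth"].max())
bins = [d - 0.5 for d in range(max_depth + 2)]

fig, ax = plt.subplots()
ax.hist(depth["depth"], bins=bins, color="red")
ax.set_xlabel("Depth of Coverage")
ax.set_ylabel("Number of Positions")
fig.savefig("coverage_histogram.png")
```

For a real genome with a wide depth range, a fixed number of bins (for example `bins=50`) may be easier to read than one bin per depth value.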
