Question: Plotting Reads Around Tss
6
gravatar for ChIP
6.4 years ago by
ChIP550
Netherlands
ChIP550 wrote:

Hi!

I am almost sure, guys who are actively involved in ChIP-Seq data analysis have plotted the mapped reads around the TSS in a certain window(say 10 Kb).

What I want to do is to plot reads of my histone marks (in bam file) around TSS with CpG and TSS without CpG (Essentially a coverage profile).

It would be very kind of you, if you could share the script that you had used for the same.

Kindly help.

chip-seq • 10k views
ADD COMMENTlink modified 4.7 years ago by Ram130 • written 6.4 years ago by ChIP550

I would not recommend to plot raw read counts directly from BAM as this may give you false impressions due to biases. For example, promoters by default have more reads do to GC and open chromatin bias. Thus, it is better to use log2 ratios or the difference between treatment and input. 

ADD REPLYlink written 4.7 years ago by Fidel1.9k
5
gravatar for Ryan Dale
6.4 years ago by
Ryan Dale4.9k
Bethesda, MD
Ryan Dale4.9k wrote:

The Python package metaseq (docs) will do this directly from BAM files. Here's an example gist showing how you'd use it:

(edit: fix embedded gist)

Also, you may want to see this worked example from the docs: https://pythonhosted.org/metaseq/example_session.html

ADD COMMENTlink modified 10 weeks ago by RamRS25k • written 6.4 years ago by Ryan Dale4.9k

Thank you for the script. Could you please, tell me how to install this package metaseq.

ADD REPLYlink written 6.4 years ago by ChIP550

I updated the script to have some installation instructions. It depends on the scientific Python stack, which can be troublesome to install -- see http://www.scipy.org/install.html if you run into issues.

ADD REPLYlink written 6.4 years ago by Ryan Dale4.9k

Hi! I have a unique error with this line

tss = tss.window(w=5000)

and as soon as I mark this line as commented, the whole script works fine.

The error says

Command was:

bedtools window -a tss.bed -w 5000

Error message was:

*****ERROR: Need -a and -b files.

I think, the command should be:

tss = tss.window(cpg,w=5000)

Could you be kind enough to suggest an edit in your script.

Thank you

ADD REPLYlink modified 10 weeks ago by RamRS25k • written 6.4 years ago by ChIP550

Sorry, I had used the wrong BEDTools program -- it should be "slop" instead of "window". The intent was to extend single-bp TSS feature out to a total size of 10kb. Thanks for catching this; I edited the script.

ADD REPLYlink written 6.4 years ago by Ryan Dale4.9k

Hi. I am a bit of a newbie, so I apologise in advance, however I haven't found any answers elsewhere. I have tried to run the script above, but when I do

>>> y1 = bam.array(tss_with_cpg, bins=bins, processes=processes, fragment_size=fragment_size)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/metaseq/_genomic_signal.py", line 122, in array
    chunksize=chunksize, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/metaseq/array_helpers.py", line 382, in _array_parallel
    itertools.repeat(kwargs)))
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 227, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 528, in get
    raise self._value
TypeError: 'NoneType' object is not iterable

I have tried with several bam files. Why is the bam metaseq._genomic_signal.BamSignal object not iterable? Or have I missed something?

Thank you very much in advance.

ADD REPLYlink modified 10 weeks ago by RamRS25k • written 5.5 years ago by adira.mollari0
1

Lots of improvements have been made to metaseq since this was first posted, and the git branch referenced in the script is no longer valid.

I've just updated the script with 1) a reference to the latest installation instructions and 2) instructions for getting example data, so you don't need your own data to run this.  Please report back (either here or on the github issues page) if this still doesn't work for you.

ADD REPLYlink written 5.5 years ago by Ryan Dale4.9k

Will definitely try this now, works directly from BAM!

BTW, if I want to subtract the forward-reverse reads on BAM, especially for histone Chip-seq, how do I get the subtracted read from BAM. Then this subtracted BAM file could be used for plotting.

ADD REPLYlink modified 10 weeks ago by RamRS25k • written 5.5 years ago by Chirag Nepal2.2k

Not sure what you mean by "subtract the forward-reverse reads".  But you can manipulate your BAM using other means (e.g., samtools) and then read the new BAM with metaseq as in the example.

Also, see the BamSignal.local_coverage docs for other arguments that might be helpful for you -- specifically fragment_size, shift_width, read_strand, and stranded.

ADD REPLYlink written 5.5 years ago by Ryan Dale4.9k
1
gravatar for Chirag Nepal
6.4 years ago by
Chirag Nepal2.2k
Copenhagen
Chirag Nepal2.2k wrote:

You can have a look here http://crazyhottommy.blogspot.no/2013/04/how-to-make-tss-plot-using-rna-seq-and.html

cheers

ADD COMMENTlink written 6.4 years ago by Chirag Nepal2.2k

I essentially learned that from the HTSeq package document http://www-huber.embl.de/users/anders/HTSeq/doc/tss.html#tss note that, the y axis is not normalized to count per million (cpm), for that purpose, one needs to count the total reads number (you can do it with HTSeq,but it is rather slow, samtools will be better), and divide the numpy array by it.

ADD REPLYlink written 6.4 years ago by Ming Tang2.5k
1
gravatar for KCC
6.4 years ago by
KCC4.0k
Cambridge, MA
KCC4.0k wrote:

Assuming you have your TSS and CpG files in BED format. You can use either a command from bedtools like intersectBED to get TSS which have CpG regions and ones that don't. Then you use an online tool like Cistrome or CEAS to get a profile across your BED regions.

Sorry, I can't be more specific as you haven't mentioned any of the file formats that you are using. I can adjust my answer if you do.

ADD COMMENTlink written 6.4 years ago by KCC4.0k

It is a BAM file.

ADD REPLYlink written 6.4 years ago by ChIP550

Seqminer should do what you are looking for

ADD REPLYlink written 6.4 years ago by kanwarjag1.0k
0
gravatar for Ian
6.4 years ago by
Ian5.6k
University of Manchester, UK
Ian5.6k wrote:

I recently discovered NGS PLOT on biostars. It is R-based and can work from the command line or via GALAXY. https://code.google.com/p/ngsplot/

ADD COMMENTlink written 6.4 years ago by Ian5.6k
0
gravatar for bede.portz
6.4 years ago by
bede.portz490
United States
bede.portz490 wrote:

I think HOMER may be able to do what you are asking.

HOMER manual for annotatePeaks.pl

If you carry out the detailed annotation, the output should contain information about the distance of peaks to many known genomic features, including CpG islands. I think you could parse the information you want from this file.

I provided a link to the manual for the command that may give you what you need so you can try to ascertain whether or not HOMER can provide the functionality you desire. I would suggest reading the preceding parts of the manual in its entirety, as HOMER will do some things like normalization that you may or may not want, and there are preceding steps you will need to carry out before using annotatePeaks.pl including downloaded the version of your genome of interest via HOMER, making tag directories, etc.

I wish I could provide more detailed and expert advice, but I am new to this analysis myself.

Someone suggested SeqMiner. My concern with SeqMiner is the poor documentation. It can generate heatmaps from data VERY quickly, which is attractive, but seems to be more of a black box than HOMER. I think SeqMiner give you less control over the analysis, and/or the control you do have is less intuitive, at least to me.

ADD COMMENTlink written 6.4 years ago by bede.portz490
0
gravatar for Ram
4.7 years ago by
Ram130
Germany
Ram130 wrote:

Dear Ryan,

Is it possible to know by using this script why there is shift of +2kb for particular histone mark but rather it has to be on TSS ?

Thanks a lot.

ADD COMMENTlink written 4.7 years ago by Ram130
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1428 users visited in the last hour