Question: Quality metrics of ChIP-seq BAM files
nash.claire (Canada) wrote, 2.3 years ago:

Hi all,

I'm a real beginner in the field of ChIP-seq analysis and bioinformatics in general (so please be patient) and wondered if you could help. I am trying to analyse my TFBS ChIP-seq data with Galaxy using the guidelines in the following paper:

Bailey et al. "Practical guidelines for the comprehensive analysis of ChIP-seq Data", PLOS Computational Biology 2013

I have so far managed to do FastQC, trimming and grooming of my FASTQ data before aligning with Bowtie2. I now want to look at the quality metrics of my sequence reads, specifically my library complexity. I can see from my BAM files that around 68% of my reads are uniquely mapped, so maybe a little below ideal, but I would like to proceed and look at the library complexity, i.e. the number of distinct genomic locations that my uniquely aligned reads map to.

Where/how can I find this information and generate scores/ratios for this? Is there a tool in Galaxy that will do this for me? I tried the "Estimate Library Complexity" function but I didn't seem to get anything useful back from it. There seems to be a "Collect RNA-seq Metrics" function that also didn't give me what I'm looking for.

Is there something I'm missing in my BAM file, or is the tool I need simply not available in Galaxy? I have absolutely no experience with R and it would take me a long time to get up and running with it, so any non-R solutions would be greatly appreciated!

For what it's worth, our group (we do a LOT of ChIP-seq) doesn't bother calculating library complexity, though we do use CollectAlignmentSummaryMetrics from Picard (that's in Galaxy). You might find the deepTools suite quite useful (it's available in Galaxy, and we also have a dedicated public Galaxy server). It is primarily intended for QC and normalization of ChIP-seq data and can give you a nice graphical depiction of things like "Did my IP work and, if so, what sort of signal/peaks can I expect?"

If you were doing this a month or two from now I'd just point you to the Galaxy workflow we're putting together for ChIP-seq, but sadly that isn't done yet.

written 2.3 years ago by Devon Ryan

Hi Devon,

Thanks for the help, that's great. For what it's worth, I'll probably still be working on this ChIP-seq analysis 2 months from now, so it would be great if you could point me in the direction of your workflow when it's complete!! We have more samples being sequenced as we speak, so I'll have more analyses to do.

I'll check out Picard and the deepTools suite as you recommend. Thank you!!!

written 2.2 years ago by nash.claire
tangming2005 (Houston/MD Anderson Cancer Center) wrote, 2.3 years ago:

See my post here: https://github.com/crazyhottommy/ChIP-seq-analysis/blob/master/part0_quality_control.md

And if you prefer a GUI, check out CHANCE: https://github.com/songlab/chance/wiki/CHANCE-Manual
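
For anyone who just wants to see what a library-complexity score boils down to, here is a minimal Python sketch (no R needed). It is not taken from the linked post. It assumes you have already pulled one (chromosome, start, strand) tuple per uniquely mapped read out of your BAM file by some other means; the function name `library_complexity` and the toy reads are made up for illustration. It computes two ratios in the style of the ENCODE metrics: NRF (distinct positions divided by total uniquely mapped reads) and PBC1 (positions covered by exactly one read divided by distinct positions).

```python
from collections import Counter


def library_complexity(read_positions):
    """Return (NRF, PBC1) for an iterable of (chrom, start, strand) tuples.

    Each tuple is assumed to describe one uniquely mapped read.
    """
    # Count how many reads fall on each distinct position.
    counts = Counter(read_positions)
    total_reads = sum(counts.values())
    if total_reads == 0:
        raise ValueError("no reads supplied")

    distinct_positions = len(counts)
    # Positions that are covered by exactly one read.
    single_read_positions = sum(1 for n in counts.values() if n == 1)

    nrf = distinct_positions / total_reads
    pbc1 = single_read_positions / distinct_positions
    return nrf, pbc1


# Toy input, invented for illustration only (not real data).
reads = [
    ("chr1", 100, "+"),
    ("chr1", 100, "+"),
    ("chr1", 250, "-"),
    ("chr2", 75, "+"),
    ("chr2", 75, "+"),
    ("chr2", 75, "+"),
    ("chr2", 900, "-"),
]

nrf, pbc1 = library_complexity(reads)
print(f"NRF={nrf:.2f} PBC1={pbc1:.2f}")  # prints "NRF=0.57 PBC1=0.50"
```

Values close to 1 mean most reads land on their own position (a complex library); values well below 1 point to many duplicate reads. On a real BAM file the counting would have to stream over millions of reads, so treat this as a description of the arithmetic rather than a production tool.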

written 2.3 years ago by tangming2005

Hi Tangming,

Thank you for the link to the post. It looks pretty informative for a newbie like me!

written 2.2 years ago by nash.claire