Question: Which Aligner Is Best Suited For Clip-Seq Data?
Mathew Bunj (40) wrote 6.9 years ago:

I have RNA-seq data from a CLIP-seq experiment and want to identify the RNAs that bind my proteins of interest. The reads are paired-end, 2x50 bp. Which aligner is most suitable for detecting the RNAs in my sequencing data?

Thanks.

alignment rna sequencing • 5.0k views
Modified 6.8 years ago by 14134125465346445 (3.5k) • written 6.9 years ago by Mathew Bunj (40)

Hi @Mathew Bunj, is your CLIP-seq dataset of the kind that generates mutations (e.g. UV light protocol) with respect to the reference genome?

Reply written 6.8 years ago by 14134125465346445 (3.5k)

The procedure includes UV cross-linking, and yes, it may generate some mutations, particularly T to C. Do you have any suggestions?

Reply written 6.8 years ago by Mathew Bunj (40)

Ryan Dale (4.8k), Bethesda, MD, wrote 6.9 years ago:

I don't see why a spliced aligner (e.g. TopHat, but see the recent question "Is tophat the only mapper to consider for RNA-seq data?") wouldn't work for CLIP-seq/HITS-CLIP. A quick search for papers using the technique (1, 2, 3) shows no consensus: they use BLAT, a custom aligner, or MosaikAligner. It can't hurt to try different aligners, I suppose.

There may be other things to be careful about besides aligner choice. For example, from this protocol, it looks like there's a digestion step involved. I'm not sure whether this means the fragments you end up sequencing are necessarily small, but depending on the experimental protocol you may need to be careful about insert sizes for the PE reads and/or about trimming adapter sequence.
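To illustrate the adapter concern: when a fragment is shorter than the read length, the read runs into the 3' adapter. The sketch below is only a minimal illustration, not a replacement for a dedicated trimmer. It assumes the library carries the common Illumina TruSeq adapter prefix (an assumption; check your own protocol), matches exactly with no mismatches allowed, and uses a hypothetical helper name.

```python
# Assumption: the library uses the common Illumina TruSeq adapter prefix.
# Replace with the adapter from your own protocol.
ADAPTER = "AGATCGGAAGAGC"


def trim_3prime_adapter(read: str, adapter: str = ADAPTER, min_overlap: int = 3) -> str:
    """Remove a full or partial (prefix) adapter match from the 3' end of a read.

    Exact matching only; real trimmers also tolerate sequencing errors.
    """
    for i in range(len(read)):
        tail = read[i:]
        n = min(len(tail), len(adapter))
        if n < min_overlap:
            break  # remaining tail is too short to call an adapter match
        if tail[:n] == adapter[:n]:
            return read[:i]  # keep only the bases before the adapter
    return read


read = "ACGTACGTAGATCGGAAGAGCTT"  # 8 bp insert followed by adapter
trimmed = trim_3prime_adapter(read)
print(trimmed, len(trimmed))
```

If many reads come out much shorter than 50 bp after trimming, the inserts are short and the two mates of a pair will largely overlap. Note that a small `min_overlap` can also trim a few genuine bases that happen to look like the start of the adapter.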

Written 6.9 years ago by Ryan Dale (4.8k)

I wonder whether TopHat will identify the RNA?

Reply modified 6.9 years ago • written 6.9 years ago by kanwarjag (1.0k)

14134125465346445 (3.5k), United Kingdom, wrote 6.8 years ago:

If your protocol includes UV cross-linking and can generate T-to-C mutations in some, but not all, of the reads, one possibility is to assemble the read clusters first and then align the region under each read peak to the reference genome. Pinball does just that:

Pinball is an alignment-free ChIP-seq and HITS-CLIP analysis tool:
https://github.com/avilella/pinball/blob/master/INSTALL

If you want to skip installation and set up, you can try the virtual machine here:
ftp://ftp.ebi.ac.uk/pub/databases/ensembl/avilella/pinball/PinballVM.1.0.4.ova
The installation procedure of the virtual machine is the same as described here:
http://www.ensembl.org/info/data/virtual_machine.html

Depending on your read length, you may want to tweak the --error-rate parameter so that reads with T/C or other mutations can still align with mismatches. For example, if you have 36 bp reads, require an overlap of 2/3 of the read length (24 bp), and want to allow 1 mismatch per 24 bp, you can set --error-rate=0.042 (just above 1/24 ≈ 0.0417).
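The arithmetic above can be reproduced for other read lengths with a short sketch like the one below. The helper name and the rounding to three decimals are my own choices for illustration; how Pinball applies the rate internally is not shown here, so treat the result as a starting value to test rather than a definitive setting.

```python
import math


def min_error_rate(overlap_bp: int, mismatches: int = 1, decimals: int = 3) -> float:
    """Smallest rate, rounded UP to `decimals` places, that is at least
    mismatches / overlap_bp (rounding down would disallow the mismatch)."""
    scale = 10 ** decimals
    return math.ceil(mismatches / overlap_bp * scale) / scale


read_length = 36
overlap = read_length * 2 // 3      # 2/3 of the read length -> 24 bp
rate = min_error_rate(overlap)      # 1 mismatch per 24 bp, rounded up
print(f"overlap={overlap}bp --error-rate={rate}")
```

For the 2x50 bp reads in the question, the same calculation gives a 33 bp overlap and a rate of 0.031 for one mismatch.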

Hope it helps.

Modified 6.8 years ago • written 6.8 years ago by 14134125465346445 (3.5k)

Could you maybe check the permissions on your VM download link? I can't download it.

Reply written 6.8 years ago by Ido Tamir (5.0k)

Thanks for the heads up, I chmod'ed the files now.

Reply written 6.8 years ago by 14134125465346445 (3.5k)

I installed the VM, but it gives me two errors: "missing the checkout Variation" and "Missing the checkout Funcgen".

Reply written 6.8 years ago by kanwarjag (1.0k)

UnivStudent (380), Canada, wrote 6.8 years ago:

I would take a look at this paper, which explains the data analysis and some considerations for finding binding sites at single-bp resolution: "Mapping in vivo protein-RNA interactions at single-nucleotide resolution from HITS-CLIP data" by Zhang & Darnell.

Written 6.8 years ago by UnivStudent (380)