Question

shRNA libraries design from large datasets

7.9 years ago

bioinformatics21 • 0

Hi everyone, I would like to create a pool of shRNA sequences (20 nt each) for a specific set of 1000 genes, up to 30,000 sequences in total (about 30 per gene). The gene sequences are available in GenBank, but I don't have a single file containing all of them. There are online tools that design shRNAs for single genes, but I can't figure out how to do this for 1000 sequences. Although I understand the task clearly, I'm new to bioinformatics and don't immediately see how to design such a pipeline.

I would appreciate answers to the following questions:

1. Where do I start?
2. What software/tool or combination of tools should I use for datasets this large?
3. What is the format of the input file? Is there a tool to assemble such a file automatically, or does it have to be done manually?

Thank you very much!!!

gene sequence • 1.3k views

ADD COMMENT • link updated 7.9 years ago by Jean-Karim Heriche 27k • written 7.9 years ago by bioinformatics21 • 0

Which species are you targeting? For such a large number of genes, it might be cheaper to use an existing library if one is available.

ADD REPLY • link 7.9 years ago by Jean-Karim Heriche 27k

Thank you for your reply. I'm planning to target diatom algae.

ADD REPLY • link 7.9 years ago by bioinformatics21 • 0

score 2 · Answer 1 · 2016-05-29

First, define a reference genome. I would suggest working with Ensembl if possible, because the data is well integrated and organized, and the API makes it easy to handle all aspects of a large screen. For the design of the library itself, you have several options:

- ask the authors for the software behind one of the web tools;
- script one of the online tools, i.e. send it your list of target genes one at a time or in batch if it supports that (without hammering their server);
- write your own script implementing publicly available rules, e.g. here.

You could also use a tool for siRNA design and post-process the results to get shRNAs.
A list of tools for RNAi reagent design is available here.
You may also find this review useful.
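
To give an idea of what the "write your own script" option could look like, here is a minimal Python sketch that slides a 20 nt window along a transcript and keeps the sites that pass a few simple filters. The function names, the GC bounds (30-60%) and the rule of rejecting runs of four A or T are illustrative assumptions, not the rules of any particular published tool; replace them with whichever published rules you decide to follow.

```python
import re

def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def candidate_targets(transcript, length=20, gc_min=0.30, gc_max=0.60,
                      max_per_gene=30):
    """Return up to max_per_gene (start, site) tuples from one transcript.

    All thresholds are assumed placeholder values, not validated rules.
    """
    seq = transcript.upper().replace("U", "T")
    candidates = []
    for start in range(len(seq) - length + 1):
        site = seq[start:start + length]
        # Skip windows containing ambiguous bases such as N.
        if re.search(r"[^ACGT]", site):
            continue
        # A run of four T can terminate Pol III transcription; a run of
        # four A in the sense strand gives TTTT in the antisense strand.
        if "TTTT" in site or "AAAA" in site:
            continue
        if not gc_min <= gc_fraction(site) <= gc_max:
            continue
        candidates.append((start, site))
    # Naive selection: keep the first sites found, in transcript order.
    return candidates[:max_per_gene]
```

Calling `candidate_targets` once per gene and collecting the results gives a first pool of candidates. Taking the first 30 passing sites is the simplest possible choice; in practice you would probably rank the candidates by a design score and spread them along the transcript.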

If you're going to order the shRNAs from a company, most of them have their own design pipeline, so you usually only need to send them your target genes. However, before buying, I would review the proposed sequences and check that they indeed match the target genes in your reference genome (e.g. the genome of the cell line you're going to work with, or the genes in Ensembl if you are working with Ensembl) and that they have no off-target genes.
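
As a rough illustration of that review step, the sketch below checks each proposed sequence against a set of reference transcripts and reports whether it hits its intended gene and which other genes it also matches. The function name and the input layout are hypothetical, and it assumes the sequences are given as sense-strand target sites. It only finds exact matches, so a real off-target screen would also need a mismatch-tolerant search with an alignment tool.

```python
def check_library(shrnas, transcripts):
    """Check proposed target sites against reference transcripts.

    shrnas:      dict of shRNA name -> (intended gene, target-site sequence)
    transcripts: dict of gene name -> transcript sequence
    Only exact, sense-strand matches are detected (a simplifying assumption).
    """
    report = {}
    for name, (gene, site) in shrnas.items():
        site = site.upper()
        # Every gene whose transcript contains the site exactly.
        hits = {g for g, seq in transcripts.items() if site in seq.upper()}
        report[name] = {
            "on_target": gene in hits,
            "off_targets": sorted(hits - {gene}),
        }
    return report
```

Any entry with `on_target` set to `False`, or with a non-empty `off_targets` list, is worth raising with the supplier before ordering.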