Question: Extract gRNA sequence using cutadapt
12 months ago by
Swimming bird20 wrote:

I want to analyse data from a CRISPR/Cas9 screen (control vs. treatment) and I'm using MAGeCK (https://sourceforge.net/projects/mageck/). MAGeCK determines the trim length of the reads automatically, but when the trim length is variable the program recommends trimming with cutadapt instead.

I used cutadapt to remove the 5' sequence in front of the gRNA of interest (20 nt; there are 12 different guides), but some reads that contain the complete gRNA are not trimmed because they carry large deletions. I tried increasing cutadapt's maximum error rate, but the MAGeCK gRNA count obtained after trimming is still slightly lower than the count I get by searching for each gRNA with grep.

Is it possible to improve the trimming so that only the gRNA sequences remain? (The 3' end does not need to be removed.)

This topic is similar to my problem: https://github.com/marcelm/cutadapt/issues/261

ADD COMMENTlink written 12 months ago by Swimming bird20
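For readers who want to reproduce the grep-based comparison mentioned in the question, the following is a minimal Python sketch of that kind of exact-substring count. The guide and read sequences are made up for illustration, and `count_guides` is a hypothetical helper, not part of MAGeCK or cutadapt.

```python
from collections import Counter


def count_guides(reads, guides):
    """Count, for each guide, the reads that contain it as an exact substring.

    This mirrors running `grep -c GUIDE` once per guide on the read sequences.
    It is only practical for a small guide list, such as the 12 guides here.
    """
    counts = Counter({guide: 0 for guide in guides})
    for read in reads:
        for guide in guides:
            if guide in read:
                counts[guide] += 1
    return counts


# Made-up 20-nt guides and reads, for illustration only.
guides = ["ACGTTGCAAGCTTGACCTGA", "TTGACCGGTAACCGTTAGCA"]
reads = [
    "GGGACCG" + guides[0] + "GTTTTAGAGC",  # full 5' flank
    "CG" + guides[0] + "GTTT",             # 5' flank shortened by a deletion
    "GGGACCG" + guides[1] + "GTTTTAG",
    "NNNNNNNNNNNNNNNNNNNNNNNNNN",          # no guide present
]

guide_counts = count_guides(reads, guides)
for guide, n in guide_counts.items():
    print(guide, n)
```

Because the count does not depend on the 5' flank being intact, reads with a deleted or shortened flank are still counted, which may explain why such a count comes out higher than a count made after adapter-based trimming.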

If I am understanding this right, you may be able to use bbduk.sh from the BBMap suite. Find the gRNA sequence using literal=gRNA_sequence ktrim=l. This will find the gRNA and then remove everything 5' of that hit.

ADD REPLYlink modified 12 months ago • written 12 months ago by genomax92k

The problem is that I have to analyse data generated with 60,000 gRNAs (all of them have to be searched for in each FASTQ file). Is it possible to do this with that many gRNAs?

ADD REPLYlink written 12 months ago by Swimming bird20

Should be possible. Put the sequences in a multi-FASTA file and then feed that file to bbduk.sh with ref=gRNA_file.fa. Bump the memory you assign by setting a higher limit with -Xmx20g (for example).

ADD REPLYlink modified 12 months ago • written 12 months ago by genomax92k
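As an illustration of the multi-FASTA suggestion above, here is a small Python sketch that turns a guide library into FASTA text. The guide names and sequences are invented, and `guides_to_fasta` is a hypothetical helper; only the output file name `gRNA_file.fa` comes from the reply.

```python
def guides_to_fasta(guides):
    """Build multi-FASTA text from a mapping of guide name -> sequence.

    Each guide becomes one '>' header line followed by one sequence line.
    """
    lines = []
    for name, seq in guides.items():
        lines.append(f">{name}")
        lines.append(seq.upper())
    return "\n".join(lines) + "\n"


# Made-up library entries; a real library would hold all 60,000 guides.
library = {
    "guide_001": "ACGTTGCAAGCTTGACCTGA",
    "guide_002": "ttgaccggtaaccgttagca",
}

fasta_text = guides_to_fasta(library)
print(fasta_text, end="")

# To write the file referred to in the reply (left commented out here):
# with open("gRNA_file.fa", "w") as handle:
#     handle.write(fasta_text)
```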

This works (it finds the sequences), but it also removes the gRNA sequence itself. I would be grateful for any further suggestions.

ADD REPLYlink written 12 months ago by Swimming bird20

Take a look at the BBDuk guide to see if you can make it work for your use case. It sounds like you want to keep the gRNA sequence, so this is not quite the intended use of bbduk in trim mode.

ADD REPLYlink written 12 months ago by genomax92k

Yes, that's the problem, but I looked through the guide and didn't find any option that does what I want.

ADD REPLYlink written 12 months ago by Swimming bird20
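The thread ends without a resolution. One possible alternative, sketched below in Python, is to skip adapter trimming altogether and look up every 20-nt window of a read in a dictionary built from the guide library, keeping only the matching window. This is an assumption-laden sketch, not a feature of cutadapt, bbduk or MAGeCK: `build_index` and `extract_guide` are hypothetical helpers, the sequences are made up, all guides are assumed to be exactly 20 nt, and only exact matches are found (no tolerance for sequencing errors inside the guide).

```python
def build_index(guides):
    """Map each guide sequence (upper case) to its name for O(1) lookup."""
    return {seq.upper(): name for name, seq in guides.items()}


def extract_guide(read, index, k=20):
    """Return (guide_name, guide_seq, start) for the first k-nt window of
    `read` that exactly matches a guide in `index`, or None if none matches.

    The 5' flank is never inspected, so deletions there do not matter.
    """
    read = read.upper()
    for start in range(len(read) - k + 1):
        window = read[start:start + k]
        if window in index:
            return index[window], window, start
    return None


# Made-up library and reads, for illustration only.
library = {
    "guide_001": "ACGTTGCAAGCTTGACCTGA",
    "guide_002": "TTGACCGGTAACCGTTAGCA",
}
index = build_index(library)

full_flank = "GGGACCG" + library["guide_001"] + "GTTTTAG"
short_flank = "CG" + library["guide_001"] + "GT"   # flank shortened by a deletion
no_guide = "GGGACCGNNNNNNNNNNNNNNNNNNNNGTTTTAG"

hit_full = extract_guide(full_flank, index)
hit_short = extract_guide(short_flank, index)
hit_none = extract_guide(no_guide, index)

print(hit_full)
print(hit_short)
print(hit_none)
```

Each read costs one dictionary lookup per window regardless of library size, so the approach should scale to a 60,000-guide library; the extracted 20-nt sequences could then be tallied directly or written out as trimmed reads.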