Question: Extract gRNA sequence using cutadapt
0
gravatar for Swimming bird
25 days ago by
Swimming bird20 wrote:

I want to analyse data from a CRISPR/Cas9 screen (control vs. treatment) and I'm using Mageck (https://sourceforge.net/projects/mageck/). Mageck calculates the trim length of reads automatically but in case the trim is variable the program recommends the user to use cutadapt.

I used cutadapt to remove the 5' sequence in front of the gRNA (20 nt) of interest (there are 12 different guides) but some reads with the complete gRNA are not trimmed because there are big deletions. I tried to increase the maximum error rate of cutadapt but the Mageck gRNA count obtained after the trimming is sligthly smaller than the obtained searching each gRNA with grep.

Is it possible to improve the trimming in order to only obtain the gRNAs sequences (the 3' is not necessary to be eliminated)?

This topic is similar to my problem: https://github.com/marcelm/cutadapt/issues/261

ADD COMMENTlink written 25 days ago by Swimming bird20

If I am understanding this right you may be able to use bbduk.sh from BBMap suite. Find the gRNA sequence using literal=gRNA_sequence ktrim=l. This will find the gRNA and then remove everything 5' to that hit.

ADD REPLYlink modified 25 days ago • written 25 days ago by genomax75k

The problem is that I have to analyse data generated by 60,000 gRNAs (all of them have to be searched in each fastq). Is it possible to do it with that number of gRNAs?

ADD REPLYlink written 25 days ago by Swimming bird20

Should be possible. Put the sequences as multi-fasta in a file and then feed the file to bbduk.sh with ref=gRNA_file.fa. Bump the memory you assign by setting a higher number with -Xms20g (for example).

ADD REPLYlink modified 25 days ago • written 25 days ago by genomax75k

This code works (it finds the sequences) but it removes the gRNA sequence... I would be grateful if you could answer again.

ADD REPLYlink written 17 days ago by Swimming bird20
1

Take a look at BBduk guide here to see if you can make it work for your use case. Sounds like you want to keep the gDNA sequence so this is not quite the intended use for bbduk in trim mode.

ADD REPLYlink written 17 days ago by genomax75k

Yes. that's the problem but I took a look to the guide and I didn't find any option to do what I want to do.

ADD REPLYlink written 17 days ago by Swimming bird20
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1006 users visited in the last hour