Script a regions file for crispressoWGS
1
0
Entering edit mode
2.2 years ago

Hello,

has anyone created a script to make a regions file for CRISPressoWGS from the covered regions of a BAM file ?

It would be great if something is available, else I'll write one myself.

Background : this tool needs a regions file (extended, tab separated BED file with unique region IDs). Regions must be shorter than read length (or perhaps PE read length, so roughly fragment length, to be counted).

crispr crispressoWGS crispresso • 483 views
ADD COMMENT
0
Entering edit mode
2.2 years ago

So I solved this myself, it wasn't too tricky after all. Why not reinvent the wheel eh ?

Sambamba bash script

#!/bin/bash
# Check window coverage on Wochenende sorted dup.bam output

# Actually run for each BAM file
for i in *dup.bam; do
        input=$i
        sec_input=${input%%.bam}
        #sec_in_bam=${input%%.bam}

        window=150
        overlap=75
        covMax=999999999

        # Get coverage depth in windows
        # x threads, Windows 100000, overlap 50000, -c minimum coverage.
        # SLURM
        $scheduler $path_sambamba depth window -t $THREADS --max-coverage=$covMax --window-size=$window --overlap $overlap -c 10 ${sec_input}.bam > ${sec_input}_cov_window.txt &

done
wait

Followup awk script

cat awk_crispressoWGS_regions.sh

#/bin/bash
# Feb 2022, make regions for https://github.com/pinellolab/CRISPResso2 wgs

#example target format tab separated
#chr6 51002798 51002820 R2 NA NA NA
awk -F "\t" '{print $1 "\t" $2 "\t" $3 "\tR" FNR "\tNA" "\tNA" "\tNA"}' yourBAMfilename.trm.ns.fix.s.dup_cov_window.txt > tmp.txt

# remove header on line 1
grep -v "#" tmp.txt > WGS_regions.txt
ADD COMMENT

Login before adding your answer.

Traffic: 3093 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6