MethylDackel Error running on HPC server
1
0
Entering edit mode
2.5 years ago
magicridge • 0

Hello,

I am trying to analyze data for RRBS (reduced representation bisulfite sequencing) and want to use BWA-METH for alignment. I also ran Bismark, but bismark output only shows mapping efficiency of 33.8% while BWA-METH shows 99.8% mapping efficiency (paired-end). So, I converted .sam to .bam with samtools and tried to run MehtylDackel. However, when I tried to run MethylDackel on the HPC server I got the following error.

[E::idx_find_and_load] Could not retrieve index file for 'sample.bam' Couldn't load the index for sample.bam, will attempt to build it. [E::hts_idx_push] Unsorted positions on sequence #25: 19733692 followed by 19733689 [E::sam_index] Read 'VH00301:6:AAAHKFLHV:1:1102:43283:1057' with ref_name='chr19', ref_length=58617616, flags=147, pos=19733689 cannot be indexed Couldn't build the index for sample.bam! File corrupted?

The script I submitted was

#!/bin/bash
#SBATCH --partition=batch                   # Partition name 
#SBATCH --job-name=MethyDackel                 # Job Name
#SBATCH --ntasks=1                              # Run a single task
#SBATCH --time=7-00:00:00                          # Time limit day-hrs:min:sec
#SBATCH --mem=64G                            # Memory per node (4GB); by default using M as unit
#SBATCH --cpus-per-task=4                       # Number of CPU cores per task
#SBATCH --output=%x.%j.out                      # Standard output log
#SBATCH --error=%x.%j.err                       # Standard error log
#SBATCH --export=NONE                           # Do not export any user's expliciti environment varialbes to compute node

cd $SLURM_SUBMIT_DIR                            # Change directory to job submission director

module load MethylDackel/0.6.0-foss-2019b

MethylDackel extract --keepDupes /DNAfa/hg38.fa /workDir/sample.bam

I tried to ran samtools to build index for the bam file and got

[E::hts_idx_push] Unsorted positions on sequence #25: 19733692 followed by 19733689 [E::sam_index] Read 'VH00301:6:AAAHKFLHV:1:1102:43283:1057' with ref_name='chr19', ref_length=58617616, flags=147, pos=19733689 cannot be indexed samtools index: failed to create index for "sample.bam": No such file or directory

At this point, I don't even know what I am doing wrong. Did the mapping fail? Should I put my trimmed fa file and/or reference genome in the same workdir (folder) as .sam/.bam file?

MethylDackel RRBS • 1.2k views
ADD COMMENT
1
Entering edit mode
2.5 years ago
ATpoint 81k

The file is unsorted as the error clearly indicates. Indexing requires coordinate-sorted file. Run samtools sort first.

ADD COMMENT
0
Entering edit mode

Wow. Thank you for the quick response. Obviously, I don't even know what I don't know. I will run Samtools and let you know how it goes.

Thanks!

ADD REPLY
0
Entering edit mode

It worked like a charm. I think I will still need to do a lot of optimizations with MethylDackel, but at least now I know I need to run sort and index before running the actual MethylDackel. Thank you!

ADD REPLY

Login before adding your answer.

Traffic: 2870 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6