Biscuit Alignment and Index failed - WGBS - Bisulfite analysis
2 days ago

For tool installation, I tried:

conda create -n biscuit biscuit

For my model's reference genome, I downloaded the FASTA file and used the command below to build the index:

biscuit index GCF_001704415.2_ARS1.2_genomic.fasta

I keep getting the same error when I try the alignment.

Command:

biscuit align -@ 30 -R "@RG\\tID:PK\\tSM:Yoda\\tPL:MGI\\tPU:Lane1\\tLB:MGI" SRR17521734_1.fastq SRR17521734_2.fastq /mnt/NGS4/Tools-testing/BS-projects/goat/srafiles/Goat_genome/GCF_001704415.2_ARS1.2_genomic.fasta | dupsifter GCF_001704415.2_ARS1.2_genomic.fasta -o SRR17521734_1.normal.bam

Error:

[E::bwa_idx_load_from_disk] fail to locate the index files
Segmentation fault (core dumped)
  • I have also built the other index files that might be relevant, using bwa index, samtools index, samtools faidx, and gatk CreateSequenceDictionary.
  • However, the issue persists even with all these new index files in place and with the tool's own index command rerun. I have also tried giving the full path, pointing to the .fai file, and repeating the entire process. Nothing has worked so far.

Could you suggest a solution, or explain the reason behind the issue?

BWA Biscuit Bisulfite • 254 views
ADD COMMENT

Can you show us the output of

ls -lh /mnt/NGS4/Tools-testing/BS-projects/goat/srafiles/Goat_genome/

If your index was correctly made (looks like biscuit uses bwa under the covers to create the reference index) then there should be several files there that start with GCF_001704415.2_ARS1.2_genomic.fasta.
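
A quick way to check this, as a minimal sketch: the function below (the name `check_biscuit_index` is made up for illustration) looks for the index files next to the reference. The suffix list is an assumption based on the files a `biscuit index` run typically leaves behind (`.bis.*`, `.par.*`, `.dau.*`); adjust it if your BISCUIT version names them differently.

```shell
# Hypothetical helper: report any expected BISCUIT index file that is
# missing or empty. The suffix list is an assumption, not read from BISCUIT.
check_biscuit_index() {
    ref=$1
    missing=0
    for suffix in bis.amb bis.ann bis.pac par.bwt par.sa dau.bwt dau.sa; do
        if [ ! -s "$ref.$suffix" ]; then
            echo "missing or empty: $ref.$suffix" >&2
            missing=1
        fi
    done
    # Exit status 0 means every listed file exists and is non-empty.
    return "$missing"
}

# Usage:
#   check_biscuit_index /path/to/reference.fasta && echo "index looks complete"
```

A zero exit status only says the files are there and non-empty; it does not prove the index was built without errors.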

ADD REPLY

Sure, thanks for the reply.

-rw------- 1 folder folder 101M Feb 17  2024 cds_from_genomic.fna
-rw------- 1 folder folder 2.8G Feb 17  2024 GCF_001704415.2_ARS1.2_genomic.fasta
-rw-rw-r-- 1 folder folder  12K Mar 13 10:26 GCF_001704415.2_ARS1.2_genomic.fasta.amb
-rw-rw-r-- 1 folder folder 4.0M Mar 13 10:26 GCF_001704415.2_ARS1.2_genomic.fasta.ann
-rw-rw-r-- 1 folder folder  12K Mar 12 17:27 GCF_001704415.2_ARS1.2_genomic.fasta.bis.amb
-rw-rw-r-- 1 folder folder 4.0M Mar 12 17:27 GCF_001704415.2_ARS1.2_genomic.fasta.bis.ann
-rw-rw-r-- 1 folder folder 697M Mar 12 18:35 GCF_001704415.2_ARS1.2_genomic.fasta.bis.pac
-rw-rw-r-- 1 folder folder 5.6G Feb 17  2024 GCF_001704415.2_ARS1.2_genomic.fasta.bwameth.c2t
-rw-rw-r-- 1 folder folder  11G Feb 17  2024 GCF_001704415.2_ARS1.2_genomic.fasta.bwameth.c2t.0123
-rw-rw-r-- 1 folder folder  25K Feb 17  2024 GCF_001704415.2_ARS1.2_genomic.fasta.bwameth.c2t.amb
-rw-rw-r-- 1 folder folder 7.9M Feb 17  2024 GCF_001704415.2_ARS1.2_genomic.fasta.bwameth.c2t.ann
-rw-rw-r-- 1 folder folder 1.4G Feb 17  2024 GCF_001704415.2_ARS1.2_genomic.fasta.bwameth.c2t.pac
-rw-rw-r-- 1 folder folder 2.8G Mar 13 10:25 GCF_001704415.2_ARS1.2_genomic.fasta.bwt
-rw-rw-r-- 1 folder folder 2.8G Mar 12 18:34 GCF_001704415.2_ARS1.2_genomic.fasta.dau.bwt
-rw-rw-r-- 1 folder folder 1.4G Mar 12 19:03 GCF_001704415.2_ARS1.2_genomic.fasta.dau.sa
-rw-rw-r-- 1 folder folder 5.0M Mar 13 10:56 GCF_001704415.2_ARS1.2_genomic.fasta.dict
-rw-rw-r-- 1 folder folder 1.1M Mar 13 10:53 GCF_001704415.2_ARS1.2_genomic.fasta.fai
-rw-rw-r-- 1 folder folder 697M Mar 13 10:26 GCF_001704415.2_ARS1.2_genomic.fasta.pac
-rw-rw-r-- 1 folder folder 2.8G Mar 12 18:33 GCF_001704415.2_ARS1.2_genomic.fasta.par.bwt
-rw-rw-r-- 1 folder folder 1.4G Mar 12 18:49 GCF_001704415.2_ARS1.2_genomic.fasta.par.sa
-rw-rw-r-- 1 folder folder 1.4G Mar 13 10:40 GCF_001704415.2_ARS1.2_genomic.fasta.sa
-rw------- 1 folder folder 3.7G Feb 17  2024 genomic.gbff
-rw------- 1 folder folder 317M Feb 17  2024 genomic.gff
-rw------- 1 folder folder 429M Feb 17  2024 genomic.gtf
drwxrwxr-x 2 folder folder 4.0K Mar 13 15:27 new_ref
-rw------- 1 folder folder  32M Feb 17  2024 protein.faa
-rw------- 1 folder folder 157M Feb 17  2024 rna.fna
-rw-rw-r-- 1 folder folder 7.7G Feb 16  2024 SRR17521734_1.fastq
-rw-rw-r-- 1 folder folder 7.7G Feb 16  2024 SRR17521734_2.fastq

I have created almost every index and auxiliary file that might be needed, now or later.

ADD REPLY

The same files, listed individually:

cds_from_genomic.fna
GCF_001704415.2_ARS1.2_genomic.fasta
GCF_001704415.2_ARS1.2_genomic.fasta.amb
GCF_001704415.2_ARS1.2_genomic.fasta.ann
GCF_001704415.2_ARS1.2_genomic.fasta.bis.amb
GCF_001704415.2_ARS1.2_genomic.fasta.bis.ann
GCF_001704415.2_ARS1.2_genomic.fasta.bis.pac
GCF_001704415.2_ARS1.2_genomic.fasta.bwameth.c2t
GCF_001704415.2_ARS1.2_genomic.fasta.bwameth.c2t.0123
GCF_001704415.2_ARS1.2_genomic.fasta.bwameth.c2t.amb
GCF_001704415.2_ARS1.2_genomic.fasta.bwameth.c2t.ann
GCF_001704415.2_ARS1.2_genomic.fasta.bwameth.c2t.pac
GCF_001704415.2_ARS1.2_genomic.fasta.bwt
GCF_001704415.2_ARS1.2_genomic.fasta.dau.bwt
GCF_001704415.2_ARS1.2_genomic.fasta.dau.sa
GCF_001704415.2_ARS1.2_genomic.fasta.dict
GCF_001704415.2_ARS1.2_genomic.fasta.fai
GCF_001704415.2_ARS1.2_genomic.fasta.pac
GCF_001704415.2_ARS1.2_genomic.fasta.par.bwt
GCF_001704415.2_ARS1.2_genomic.fasta.par.sa
GCF_001704415.2_ARS1.2_genomic.fasta.sa
genomic.gbff
genomic.gff
genomic.gtf
new_ref
protein.faa
rna.fna
SRR17521734_1.fastq
SRR17521734_2.fastq

ADD REPLY

At first glance all necessary files seem to be present.

A couple of suggestions. Since you appear to be using an external (USB?) drive I would suggest dropping the number of cores down significantly. That external drive is likely not going to provide enough file I/O to support 30 threads/cores.

As for the actual command, are you in the directory where these files are located when you run it? I ask because you seem to be mixing full and relative paths.

You may want to see whether the program runs with the following command line:

biscuit align -@ 4 -R "@RG\\tID:PK\\tSM:Yoda\\tPL:MGI\\tPU:Lane1\\tLB:MGI" /mnt/NGS4/Tools-testing/BS-projects/goat/srafiles/Goat_genome/SRR17521734_1.fastq /mnt/NGS4/Tools-testing/BS-projects/goat/srafiles/Goat_genome/SRR17521734_2.fastq /mnt/NGS4/Tools-testing/BS-projects/goat/srafiles/Goat_genome/GCF_001704415.2_ARS1.2_genomic.fasta | dupsifter /mnt/NGS4/Tools-testing/BS-projects/goat/srafiles/Goat_genome/GCF_001704415.2_ARS1.2_genomic.fasta -o /mnt/NGS4/Tools-testing/BS-projects/goat/srafiles/Goat_genome/SRR17521734_1.normal.bam
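
It may also be worth checking the argument order. As far as I can tell from BISCUIT's usage text, `biscuit align` expects the reference (index base) first and the FASTQ files after it, so a command that lists the reads first would make it look for index files named after the first FASTQ. The sketch below is a sanity check under that assumption, not part of BISCUIT: the hypothetical `seq_format` helper guesses a file's type from its first byte, so you can confirm what is being passed in the reference position.

```shell
# Hypothetical helper (not part of BISCUIT): guess a sequence file's format
# from its first byte. '>' starts a FASTA record, '@' starts a FASTQ record.
# Compressed files will be reported as "unknown".
seq_format() {
    case "$(head -c 1 "$1")" in
        '>') echo fasta ;;
        '@') echo fastq ;;
        *)   echo unknown ;;
    esac
}

# Usage, with the reference-first order assumed above (read group omitted):
#   [ "$(seq_format "$ref")" = fasta ] || echo "first argument is not FASTA" >&2
#   biscuit align -@ 4 "$ref" "$fq1" "$fq2" | dupsifter "$ref" -o out.bam
```

If the order turns out to be the problem, it would also explain why creating extra index files made no difference.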
ADD REPLY

Thanks again for your effort and reply.

As mentioned earlier, I have tried multiple approaches, but none of them worked.

I had already tried the same command you suggested, and I still get the same error.

I am a bit confused, since this is a simple, direct method and yet it keeps failing for me.

ADD REPLY

Segmentation fault (core dumped)

The next obvious question is how much free memory is available on the machine where you are running this. Seg faults are sometimes a symptom of running out of memory. While bwa has lower memory requirements in general, you may need at least 20-30 GB of RAM even with 4 threads.
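
To put a number on that, here is a minimal sketch for a Linux machine. The helper name `mem_available_gib` is made up for illustration; it reads the `MemAvailable` line (reported in kB) from `/proc/meminfo` and prints it in whole GiB.

```shell
# Hypothetical helper: print available memory in whole GiB (Linux only).
# Reads the MemAvailable line, given in kB, from /proc/meminfo;
# a different file can be passed as the first argument.
mem_available_gib() {
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

# Usage:
#   mem_available_gib     # run just before starting the alignment
#   free -g               # human-readable summary of the same information
```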

ADD REPLY

Well, that is the weird part. I am running it on a server with 252 GB of RAM, 80 threads, and 1 TB of HDD space.

So disk space, RAM, and threads should not be the issue, and I have also provided the full file paths. I suspect the problem lies elsewhere.

Anyway, I will try another version of Biscuit or another genome for testing purposes.

ADD REPLY

Is the /mnt path referring to an external drive, or is it just a mount point for storage? If the files are on an external drive, I suggest you move them to internal storage and try again.
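
One way to answer that, sketched under the assumption of a standard `df`: the hypothetical `mount_of` helper below prints the device and mount point backing a path. A source of the form `host:/export` usually indicates a network mount, while a `/dev/...` device is a block device that may be internal or external.

```shell
# Hypothetical helper: print the device and mount point backing a path.
# `df -P` emits POSIX-format output; the last line is the data row for the path.
# Assumes the mount point contains no spaces.
mount_of() {
    df -P "$1" | awk 'END { print $1, $NF }'
}

# Usage:
#   mount_of /mnt/NGS4/Tools-testing/BS-projects/goat/srafiles/Goat_genome
#   lsblk -o NAME,TRAN,MOUNTPOINT   # TRAN shows the transport (e.g. usb, sata)
```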

ADD REPLY
