Problem with bowtie2 alignment
6 weeks ago
Ivan ▴ 10

In a nutshell, I have 44 folders of different samples/species, each containing the paired reads for that sample/species. I'm aligning them with bowtie2 against the same reference genome, then converting the output to BAM and sorting it with samtools. Since alignment takes a while, I've written a script and submitted it to the Sun Grid Engine job manager with parallel (array) job options. The code for that is:

#$ -N bowtiejob
#$ -V
#$ -t 1-44   #this creates task IDs, numeric indexes

read -a samples  <<< $(cut -d , -f 3 linker.csv | tail -n +2)   #list of all sample folders
false_index=$SGE_TASK_ID   #this is just for ease of writing
ref_path="ref_genome/ref_genome_btindex" #this is my Bowtie index ref genome

With the above code, I ensure that every sample folder is assigned a unique SGE_TASK_ID so that the parallel jobs don't interact with one another.
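The snippet above does not show how $folder is derived from the task ID. A minimal sketch of one possible mapping, assuming linker.csv has a header line and the folder name in its third column (as the cut/tail call suggests); `sample_for_task` is a hypothetical helper name, not part of the original script. Picking the N-th line with sed sidesteps the off-by-one between SGE task IDs (starting at 1) and bash array indexes (starting at 0):

```shell
# Hypothetical helper: print the sample folder that belongs to a given task ID.
# Assumes a CSV with one header line and the folder name in column 3.
sample_for_task() {
    csv="$1"
    task_id="$2"
    # Drop the header, keep column 3, then print only line number $task_id.
    cut -d , -f 3 "$csv" | tail -n +2 | sed -n "${task_id}p"
}

# Possible use inside the job script:
# folder=$(sample_for_task linker.csv "$SGE_TASK_ID")
```

With this mapping, task 1 gets the first data row, task 44 the last one.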

The code for bowtie and samtools is as follows:

bowtie2 -x $ref_path -1 $folder/*_1.fastq -2 $folder/*_2.fastq | samtools view -bS - > "bowtie/$folder.sat_ref.bam"

samtools sort bowtie/$folder.sat_ref.bam -o bowtie/$folder.sorted.bam2

The paired reads are found in separate folders, and the reference genome index built with bowtie2-build is specified in $ref_path. The first line of code pipes the bowtie2 result (SAM) to samtools, which converts it to BAM. The second line of code creates a sorted BAM file from that BAM file.

The script is literally the same and is executed the same way; the only difference is $SGE_TASK_ID. However, some samples run as expected, while others don't. For example, the BAM file for one sample is 0 KB, while the one for another sample is 13M Kb.

For the failed sample, the output is empty and the error output is:

/usr/local/bin/bowtie2-align-s: error while loading shared libraries: cannot open shared object file: No such file or directory
(ERR): Description of arguments failed!
Exiting now ...
samtools: /lib64/ version `GLIBC_2.14' not found (required by samtools)
samtools: /lib64/ version `GLIBC_2.14' not found (required by samtools)

Likewise, the error output for the good sample is empty, while the output for it is:

64055526 reads; of these:
  64055526 (100.00%) were paired; of these:
    28974485 (45.23%) aligned concordantly 0 times
    27752984 (43.33%) aligned concordantly exactly 1 time
    7328057 (11.44%) aligned concordantly >1 times

My guess is that the job failed because of whatever is causing this error:

 /usr/local/bin/bowtie2-align-s: error while loading shared libraries: cannot open shared object file: No such file or directory
(ERR): Description of arguments failed!

But I have no idea why it failed for some samples and succeeded for others.

6 weeks ago
GenoMax 111k

why it failed for some samples, and succeeded for other.

Are you using a private copy of bowtie2 from your own directories? The only explanation I can think of is that your cluster has nodes that are set up differently. Some nodes may be running older versions of the OS and are missing libraries or links to them. Can you check whether there is any pattern in the failed jobs and the nodes they ran on? Does resubmitting the failed jobs make them work (on different nodes)?
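One way to look for such a pattern, sketched under the assumption that ldd is available on the compute nodes: have each task print the node it runs on and any shared libraries its binary cannot resolve. `report_node_libs` is a hypothetical helper name, and the binary passed to it is up to you.

```shell
# Hypothetical helper: print the node name, the task ID, and any shared
# libraries that the given binary cannot resolve on this node.
report_node_libs() {
    bin="$1"
    echo "host=$(uname -n) task=${SGE_TASK_ID:-NA}"
    if command -v ldd >/dev/null 2>&1; then
        # ldd marks unresolved dependencies with "not found".
        ldd "$bin" 2>&1 | grep 'not found' || echo "no unresolved libraries reported for $bin"
    else
        echo "ldd is not available on this node"
    fi
}

# Possible use at the top of the job script:
# report_node_libs "$(command -v bowtie2-align-s)"
```

Comparing these lines across the good and failed tasks should show whether the failures cluster on particular nodes. For jobs that have already finished, `qacct -j <job_id>` also lists the hostname of each task, provided accounting is enabled on the cluster.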

I'm using a copy from my own directory. I actually tested it locally first, and I've just written a script to test whether there is any difference between the files. Code:

bowtie2 -x $ref_genome -1 $folder/*_1.fastq -2 $folder/*_2.fastq

This works. When I replace "GoodSample" with "BadSample" (the one that failed in job), I also get good results, so I'm guessing you're right.

However, using the -V flag when submitting the job exports all of my environment variables, and my cluster doesn't specify that it has bowtie installed. Is there any way to force the cluster to use my own bowtie copy? It would be quite tedious if I had to recheck what ran and what didn't.
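Rechecking what ran need not be manual. A minimal sketch, assuming failed tasks leave a zero-byte bowtie/<sample>.sat_ref.bam behind, as in the example above; `list_empty_bams` is a hypothetical helper name. A truncated but non-empty BAM would not be caught by this check.

```shell
# Hypothetical helper: print the sample names whose unsorted BAM is empty.
list_empty_bams() {
    dir="$1"
    for f in "$dir"/*.sat_ref.bam; do
        [ -e "$f" ] || continue                    # the glob matched nothing
        [ -s "$f" ] || basename "$f" .sat_ref.bam  # -s is true only for non-empty files
    done
}

# Possible use from the submission directory:
# list_empty_bams bowtie > failed_samples.txt
```

The resulting list can then be matched back to task IDs so that only those tasks are resubmitted.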

These jobs must indeed be using your own copy. It is the interaction of that executable with the OS that is causing this issue. You can try and restrict your jobs to nodes that seem to have the right system libraries or see if the local admins can help by ensuring that all nodes have the right libraries.
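A rough sketch of the first option, assuming your Grid Engine installation accepts a hostname resource request with `|` between alternatives (worth confirming with the local admins). `build_host_request` is a hypothetical helper, and node01/node02 and bowtie_job.sh are placeholder names.

```shell
# Hypothetical helper: join node names into a "hostname=a|b|c" resource request.
# The function body runs in a subshell so the IFS change does not leak out.
build_host_request() (
    IFS='|'
    echo "hostname=$*"
)

# Possible use when submitting (placeholder node and script names):
# qsub -l "$(build_host_request node01 node02)" bowtie_job.sh
```

The list of nodes would come from whichever hosts the successful tasks ran on.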

