Entering edit mode
2.4 years ago
anikcropscience
▴
230
Hello, I am running now a local blast pipeline using MacOs. The goal here is to take interval of the 5 best hits and then extract the SNP variants from multiple vcf.gz files. But I am facing an error which I cannot solve. Below are the codes that I used:
#!/bin/bash
##Functions to automate if start position is bigger than stop position, then switch the interval into stop first then start.
#1 hit blast n
REF=/Users/path/to/the/reference/genome
DBBLASTN=/Users/path/to/the/respective folder/
QUERY=/Users/path/to/the/folder/containing/gene/fasta/of/interest
OUTPUT=/Users/path/to/output
OUTPUTBLASTN=/Users/path/to/output/blastn/result
FASTAOUTPUT=/Users/path/to/output/blastn/result/Output.blastn.fasta
blast_n_one_hit () {
#1st hit blastn result fasta file extraction
if [ $START1 -gt $STOP1 ]
then
samtools faidx $REF $SEQ1:$STOP1-$START1 > $FASTAOUTPUT/1.hit.$OUTPUTNAME
else
samtools faidx $REF $SEQ1:$START1-$STOP1 > $FASTAOUTPUT/1.hit.$OUTPUTNAME
fi
}
#2 hit blast n
blast_n_two_hit () {
blast_n_one_hit
#2nd hit blastn result fasta file extraction
if [ $START2 -gt $STOP2 ]
then
samtools faidx $REF $SEQ2:$STOP2-$START2 > $FASTAOUTPUT/2.hit.$OUTPUTNAME
else
samtools faidx $REF $SEQ2:$START2-$STOP2 > $FASTAOUTPUT/2.hit.$OUTPUTNAME
fi
}
#How to extract the fasta file from the reference file (S.chilense) within the specific interval.
#Specific interval is the interval from the best 5 hits blastn output
create_output() {
ONAME=$(basename "$1")
echo $ONAME
echo "$OUTPUTNAME"
echo "--*****--"
}
## end function declaration
#We need to cut the column 2,9 and 10 from the tabular output from blastn results. Col 2: Sequence ID, Col 9: Sequence star, Col10: Sequence stop
#The following command is to make the variables, which are needed for making the interval to extract the sequence from the reference
#Variables for the first top hit
for file in $OUTPUTBLASTN/*.new.vs.ref; do #Change here
OUTPUTNAME=${ONAME%.new.vs.ref}.fasta #Change here for the new files
lineNumber = wc -l $file
create_output $file
echo "First top hit"
SEQ1=$(cut -f 2 "$file" | head -1 | tail -1)
echo "Sequence name:$SEQ1"
START1=$(cut -f 9 "$file" |head -1 | tail -1)
echo "Seqstart:$START1"
STOP1=$(cut -f 10 "$file" | head -1 | tail -1)
echo "Seqstop:$STOP1"
echo "--*****--"
if [[lineNumber <= 2]] then
blast_n_one_hit
continue
fi
#Variables for the second top hit
echo "Second top hit"
SEQ2=$(cut -f 2 "$file" | head -2 | tail -1)
echo "Sequence name:$SEQ2"
START2=$(cut -f 9 "$file" | head -2 | tail -1)
echo "Seqstart:$START2"
STOP2=$(cut -f 10 "$file" | head -2 | tail -1)
echo "Seqstop:$STOP2"
echo "--*****--"
if [[lineNumber <= 3]] then
blast_n_two_hit
continue
fi
#Variables for the third top hit
echo "Third top hit"
SEQ3=$(cut -f 2 "$file" | head -3 | tail -1)
echo "Sequence name:$SEQ3"
START3=$(cut -f 9 "$file" | head -3 | tail -1)
echo "Seqstart:$START3"
STOP3=$(cut -f 10 "$file" | head -3 | tail -1)
echo "Seqstop:$STOP3"
echo "--*****--"
if [[lineNumber <= 4]] then
blast_n_three_hit
continue
fi
echo "Extracting the fasta files from the reference is finished!"
done
So, I run this code and then end up with the following error:
zsh: lineNumber command not found
zsh: create_output command not found
Can anybody please tell me how can I correct that?
zsh: lineNumber command not found
... looks like your command was executed using zsh, not bash
you want
you want
(...)
Thank you very much for the reply. But what about the other problem:
I cannot see what is the problem here.
your functions are not aligned to the left.