How can I use the bwa aln function and get the output in .bam format? I know bwasw gives the output in .sam, so I just used SAMTOOLS to do the conversion, but I don't really know how to do it in this case, as the output is in .sai.
What I have been trying to do is this:
1. Create the indexes for all the .FNA files.
2. Align them using the aln command.
3. Take this output and convert it to .sam.
4. The .sam files are very large to store, hence convert them to .bam.
5. Data abstraction, like counting the ranges of values, can't be performed on the binary (.bam) file, hence I converted that to a .txt file using SAMTOOLS.
6. Perform the desired data abstraction on the .txt file and get the results.
The tricky part is the placement of the files as I have to do this for about 1500 files. My code looks like this:
.././bwa index NC_008765.fna
.././bwa aln NC_008765.fna ../QUERY/updated_635_25bp.fasta > out.sai
../bwa samse NC_008765.fna out.sai ../QUERY/updated_635_25bp.fasta > aln.sam
../samtools-0.1.18/samtools-0.1.18/samtools view -bS -F 4 aln.sam > output.bam
../samtools-0.1.18/samtools-0.1.18/samtools view output.bam > seemee.txt
awk '{ print $1 }' seemee.txt > comns.txt
sed 's/.\?.$//;s/$/00/;s/^00$/0/' comns.txt | sort | uniq -c | sort -nr > final.txt
I need to do this for all the 1000+ input files and 6 different query files. Do the general steps make sense? Can it be optimized? The code has been running for 2 hours now and it's only about 1/10th of the way through. Thanks for your help!
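One possible saving, sketched under assumptions: the index that bwa index builds depends only on the reference, so it can be built once per .fna file and reused for all six query files. The helper name index_all_references and the BWA variable are hypothetical, and the check relies on bwa index writing a .bwt file next to the reference.

```shell
# Path to the bwa binary as used in the question; override BWA if yours differs.
BWA=${BWA:-../bwa}

# index_all_references DIR  (hypothetical helper, not part of bwa)
# Builds a BWA index for every .fna file in DIR, skipping references
# that already have one, so repeated runs do not redo the work.
index_all_references() {
    for ref in "$1"/*.fna; do
        # The glob stays unexpanded when DIR holds no .fna files.
        if [ ! -e "$ref" ]; then
            continue
        fi
        # bwa index writes REF.bwt (among other files); use it as the marker.
        if [ -e "$ref.bwt" ]; then
            continue
        fi
        "$BWA" index "$ref"
    done
}
```

Calling index_all_references . once before the alignment loop would then leave only aln and samse to run per query file.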
Dear dawnoflife, you do not need to convert to BAM only to convert it back to a text file in the end. SAM is already a text format, so do what you want to do with your seemee.txt file directly with your aln.sam file.
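As a rough sketch of working on the SAM text directly: the filter below skips header lines (which start with "@") and reads flagged as unmapped (bit 4 in column 2, the same thing -F 4 removes), then prints column 1. The function name mapped_read_names is made up for illustration.

```shell
# mapped_read_names  (hypothetical helper)
# Reads SAM text on standard input and prints the read name (column 1)
# of every mapped read. Header lines start with "@"; bit 4 of the FLAG
# field (column 2) is set for unmapped reads.
mapped_read_names() {
    awk -F '\t' '!/^@/ && int($2 / 4) % 2 == 0 { print $1 }'
}

# Usage with the file names from the question:
#   mapped_read_names < aln.sam > comns.txt
```

The output should match what the samtools view -F 4 step followed by awk '{ print $1 }' produces, without the intermediate BAM and seemee.txt files.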
SAM files are too large. They take up 36 MB, compared to < 1 MB for the text files, unless I can perform the -F 4 operation when receiving the SAM file itself.
So just do not use "-bS -F 4" but "-S -F 4"; your output will still be in SAM format (text), and then use it like your seemee.txt file.
You could then convert to BAM when you're done, for persistent storage.
Is there any way I can pipe the SAM file to BAM directly and do the data analysis in between? I don't want to store the SAM file.
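A minimal sketch of such a pipe, assuming the tool paths from the question: bwa samse writes SAM to standard output, and samtools view accepts "-" as the input name to read from standard input, so the SAM never has to touch the disk. The .sai file is still written, because samse reads it as a file argument. The helper name align_to_bam and the BWA and SAMTOOLS variables are hypothetical.

```shell
# Tool paths as in the question; override BWA / SAMTOOLS if yours differ.
BWA=${BWA:-../bwa}
SAMTOOLS=${SAMTOOLS:-../samtools-0.1.18/samtools-0.1.18/samtools}

# align_to_bam REF QUERY OUT_BAM  (hypothetical helper)
# Aligns QUERY against REF and stores only the mapped reads as BAM.
align_to_bam() {
    ref=$1
    query=$2
    out_bam=$3
    sai=${out_bam%.bam}.sai    # e.g. output.bam -> output.sai

    "$BWA" aln "$ref" "$query" > "$sai" &&
    # samse emits SAM on stdout; "-" makes samtools read it from stdin.
    "$BWA" samse "$ref" "$sai" "$query" |
        "$SAMTOOLS" view -bS -F 4 - > "$out_bam"
}

# Usage with the file names from the question:
#   align_to_bam NC_008765.fna ../QUERY/updated_635_25bp.fasta output.bam
#   "$SAMTOOLS" view output.bam | awk '{ print $1 }' > comns.txt
```

If the BAM is not needed at all, replacing -bS with -S in the same pipe and sending the output straight into awk would skip the stored file entirely.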