When using bwa, I usually do the following:
bwa aln -t 4 genome_index reads.txt > reads.txt.bwa
bwa samse genome_index reads.txt.bwa reads.txt > reads.sam
So, everything is default. I would like to take the step of making more customized decisions. However, I need some heuristics for making my choice or at least some criteria that I should be thinking about when deciding which parameters to use.
If this question is too broad, perhaps we can just concentrate on the -n option. I think the default is 0.04. How do I decide what value to set it to?
Edit: My downstream analysis would be ChIP-seq analysis or potentially DNase-seq.
Hi, if it is just about the -n parameter, you can have a look at this question: http://www.biostars.org/post/show/16221/what-does-bwas-n-parameter-mean/ If you have constant read length, you can set it to an integer value, which defines the maximum edit distance (roughly, the number of mismatches and gaps) you are willing to allow. However, you might still have variable read lengths due to clipping. In any case, the choice of parameters depends on the downstream analysis you wish to undertake (variant calling, DE analysis, etc.) and on the features of your input data. If you could provide a little more information about your application, that would help a lot in finding a good answer.
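To get a feel for what a fractional -n means for a given read length, here is a rough sketch. It assumes the model described in the bwa manual: a 2% uniform base error rate, with the differences per read treated as Poisson distributed, and the smallest edit distance chosen so that fewer than the given fraction of reads is missed. The helper name max_diff is made up for illustration and is not part of bwa, so treat the numbers as an approximation and compare them with what bwa aln itself reports on stderr.

```shell
# Hypothetical helper (not part of bwa): estimate the maximum edit distance
# that a fractional -n value implies for a given read length.
# Assumption: differences per read follow a Poisson distribution with a
# 2% uniform per-base error rate, as described in the bwa manual.
# Usage: max_diff READ_LENGTH [FRACTION]   (FRACTION defaults to 0.04)
max_diff() {
  awk -v len="$1" -v thres="${2:-0.04}" 'BEGIN {
    err = 0.02
    lambda = len * err          # expected number of differences per read
    term = exp(-lambda)         # P(0 differences)
    sum = term
    for (k = 1; k < 1000; k++) {
      term *= lambda / k        # P(exactly k differences)
      sum += term               # P(at most k differences)
      if (1 - sum < thres) { print k; exit }
    }
    print 2                     # fallback if no k satisfies the threshold
  }'
}

# Print the implied maximum edit distance for a few read lengths.
for len in 36 50 100; do
  echo "${len} bp reads: max edit distance $(max_diff "$len")"
done
```

Under these assumptions, the default of 0.04 works out to about 2 differences for 36 bp reads and 5 for 100 bp reads. If you would rather fix the value, you can pass an integer instead, for example bwa aln -n 2 -t 4 genome_index reads.txt > reads.txt.bwa.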
Thanks for the question. I modified it to say I am interested in ChIP-seq and DNase-seq.