My understanding is that hard clipping is used when the clipped bases align elsewhere in the reference genome, i.e., for chimeric reads. At least, that appears to be when bwa uses hard clipping; I'm not sure about other aligners.
bwa-mem 0.7.5 release notes from http://seqanswers.com/forums/showthread.php?t=31237:
Changed the way a chimeric alignment is reported (conforming to the upcoming
SAM spec v1.5). With 0.7.5, if the read has a chimeric alignment, the paired
or the top hit uses soft clipping and is marked with neither 0x800 nor 0x100
bits. All the other hits part of the chimeric alignment will use hard
clipping and be marked with 0x800 if option "-M" is not in use, or marked
with 0x100 otherwise.
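To make the flag logic in those release notes concrete, here is a minimal Python sketch that tests the 0x100 (secondary) and 0x800 (supplementary) bits of a SAM FLAG value. The function name `classify_alignment` is hypothetical, and the flags 97, 353 and 145 are the ones from the example read pair in this answer.

```python
# SAM FLAG bits relevant to chimeric alignments (values from the SAM spec).
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def classify_alignment(flag: int) -> str:
    """Return 'supplementary', 'secondary' or 'primary' for a SAM FLAG value.

    Hypothetical helper for illustration; it only inspects the two bits
    mentioned in the bwa-mem 0.7.5 release notes.
    """
    if flag & SUPPLEMENTARY:
        return "supplementary"
    if flag & SECONDARY:
        return "secondary"
    return "primary"


# Flags taken from the example records in this answer.
for flag in (97, 353, 145):
    print(flag, hex(flag), classify_alignment(flag))
# 97 0x61 primary
# 353 0x161 secondary
# 145 0x91 primary
```

Flag 353 has the 0x100 bit set and not 0x800, which is what the release notes describe for the extra hits of a chimeric alignment when "-M" is in use.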
As an example, here's part of a BAM file with a read pair containing a chimeric read. The top hit is soft-clipped (69M32S), and the second hit is hard-clipped (66H35M) and marked as secondary by BWA (flag 353 has the 0x100 bit set):
20692128 97 viral_genome 21417 60 69M32S chr7 101141242 0 TACATCTTCTCCCTCTCTCACGACACAAGAATTAGTCACATAGGGATGTTCTCGTAAATCTACATTATCTTACAAAAACATTTTTTAAAAATTTGCTAGGT GGGGGGGGGGGGGGEGGEGGGGGGGGGFGGGGGGGGGGGGGEGFFGGGGGGGFGGFGGGGEGGGGGGGGGGGEGEFFGGGFEGGGGGFGCGGGFBGGGBG@ NM:i:4 MD:Z:6G34G6C5C14 AS:i:49 XS:i:0 SA:Z:chr7,101141091,+,66S35M,60,0;
20692128 353 chr7 101141091 60 66H35M = 101141242 252 ATCTTACAAAAACATTTTTTAAAAATTTGCTAGGT GGGGGGEGEFFGGGFEGGGGGFGCGGGFBGGGBG@ NM:i:0 MD:Z:35 AS:i:35 XS:i:23 SA:Z:gi|224020395|ref|NC_001664.2|,21417,+,69M32S,60,4;
20692128 145 chr7 101141242 60 101M gi|224020395|ref|NC_001664.2| 21417 0 GCAACAGAGCGAGACCCTATATTCATGAGTGTTGCAATGAGCCAAGTAGTGGAGGTTGGCTTTTGAAGGCAGAAAAGGACTGAGAAAAGCTAACACAGAGA FEGCGGGGGCGEFCDEEEEGGGGGGGGGGGGGGGEGGGGGGFGGGEGGG
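The practical difference between the two records can be checked from the CIGAR strings alone: soft-clipped bases stay in SEQ, hard-clipped bases are removed from it. The following is a small sketch using only the standard library; `cigar_lengths` is a hypothetical helper name, not part of any SAM toolkit.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations whose bases are stored in the SEQ field (hard clips are not).
IN_SEQ = set("MIS=X")


def cigar_lengths(cigar: str) -> dict:
    """Summarise a CIGAR string: bases present in SEQ, soft- and hard-clipped."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    return {
        "seq_len": sum(n for n, op in ops if op in IN_SEQ),
        "soft": sum(n for n, op in ops if op == "S"),
        "hard": sum(n for n, op in ops if op == "H"),
    }


print(cigar_lengths("69M32S"))  # {'seq_len': 101, 'soft': 32, 'hard': 0}
print(cigar_lengths("66H35M"))  # {'seq_len': 35, 'soft': 0, 'hard': 66}
```

In both cases `seq_len + hard` is 101, the original read length. The soft-clipped record keeps all 101 bases in SEQ, while the hard-clipped record stores only the 35 aligned bases.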
A '*' as the CIGAR string means the read is not aligned, so there is no way to show it relative to the reference.