I have a SAM file that looks something like this:
SRR959756.117 272 RNU6-170P 141 9 23S21M * 0 0 TAGAGACGGGGTCTCGCTATGTTGCTCAGGCTGGAGTGCAGTGG
SRR959756.117 272 RNU6-171P 139 9 24S20M * 0 0 TAGAGACGGGGTCTCGCTATGTTGCTCAGGCTGGAGTGCAGTGG
SRR959756.1740 272 RNU2-65P 349 14 2S23M19S * 0 0 CAGCTGGGATTACAGGCATGAGCCACCACGCCTGGCACCCAGCT
SRR959756.117 16 RNU6-1295P 110 9 19S25M * 0 0 TAGAGACGGGGTCTCGCTATGTTGCTCAGGCTGGAGTGCAGTGG
SRR959756.212 256 RNU2-41P 38 1 25M19S * 0 0 ATCTGTTCTTATCAGTTTAATATCTGATACGTCCTCTATCCGAG
(I am only showing the first 10 columns.) I want to extract the soft-clipped sequence from each record and write a FASTA file whose header is >column1_column2_column3_column6, i.e. read name, flag, reference name, and CIGAR string. How can I do this? I am a little confused by the strand/flag column, so I am having difficulty working out how to extract the soft-clipped sequence. Also, where the CIGAR string has two soft clips, such as 2S23M19S (the third record), I only want to extract the 19S part of the sequence. How can I do this?
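To make the desired output concrete, here is a rough Python sketch of the extraction I have in mind. It is only a sketch under a few assumptions that are mine: when a read is clipped at both ends, the longer clip is the one to keep (which gives the 19S part for 2S23M19S); columns are split on tabs, or on whitespace if there are no tabs as in the excerpt above; and the function names (soft_clips, longest_soft_clip, sam_line_to_fasta) are made up for illustration. As far as I understand the SAM format, SEQ is stored in the same orientation as the reference alignment and the CIGAR describes SEQ as stored, so the clip can be sliced out of column 10 directly whatever the flag is. Bit 16 of the flag only matters if the clip is wanted in the original read orientation, in which case it would need to be reverse-complemented.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(cigar, seq):
    """Return the soft-clipped pieces of seq (leading first, then trailing)."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) can sit outside soft clips but are not present in SEQ.
    ops = [(n, op) for n, op in ops if op != "H"]
    clips = []
    if ops and ops[0][1] == "S":
        clips.append(seq[:ops[0][0]])
    if len(ops) > 1 and ops[-1][1] == "S":
        clips.append(seq[len(seq) - ops[-1][0]:])
    return clips


def longest_soft_clip(cigar, seq):
    """Assumption: when both ends are clipped, keep only the longer clip."""
    clips = soft_clips(cigar, seq)
    return max(clips, key=len) if clips else ""


def sam_line_to_fasta(line):
    """Return a FASTA record for one alignment line, or None to skip it."""
    if not line.strip() or line.startswith("@"):
        return None  # blank line or SAM header line
    line = line.rstrip("\n")
    fields = line.split("\t") if "\t" in line else line.split()
    qname, flag, rname, cigar, seq = (
        fields[0], fields[1], fields[2], fields[5], fields[9]
    )
    clip = longest_soft_clip(cigar, seq)
    if not clip:
        return None  # no soft clipping in this record
    # Header layout: >column1_column2_column3_column6
    return ">{}_{}_{}_{}\n{}".format(qname, flag, rname, cigar, clip)


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py < input.sam > clipped.fasta
    for sam_line in sys.stdin:
        record = sam_line_to_fasta(sam_line)
        if record is not None:
            print(record)
```

With this sketch the third record above would produce the header >SRR959756.1740_272_RNU2-65P_2S23M19S followed by the last 19 bases of its sequence.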
Hope to hear back soon.