interpolation y vector
6.3 years ago
Hi, I have a data set of x and y values, where x lies on a contiguous range of integers. I do not have y-values for every integer value of x, and I would like to assign 0 to every integer x that does not have a y-value. Is there a tool or command in bash or MATLAB to automate this?

Specifically, I have a list of chromosome positions with samflag values for each base position. Because I am only looking for specific samflags, I do not have values for every base position. I want to plot values for every base position and then compute a smoothed average, so I need 0s at the positions that have no samflag value. A way to do this in samtools would also work. I tried mpileup, but its output does not include samflags (my output also doesn't match the man documentation for mpileup).



6.3 years ago

In R, the approx() function will do what you want; it interpolates linearly between the points you do have. In MATLAB, see the documentation on interpolation (interp1, for example).

More generally, I have no idea what you mean by "samflags". My only guess is that you have a SAM/BAM/CRAM file and you're interested in the flag field of the alignments that start at or cover each position. Whether you want to replace "NA" with 0 is situation-dependent, but I would guess that this isn't what you really want to do.
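If zero-filling really is what's needed, here is a minimal Python sketch of it. Note that this is plain gap-filling followed by a moving average, not interpolation (interpolation would estimate values between the known points rather than insert 0). The function names `zero_fill` and `moving_average`, the window size, and the sample positions are made up for illustration.

```python
def zero_fill(positions, values, start=None, stop=None):
    """Return (xs, ys) covering every integer from start to stop inclusive,
    with 0 wherever no value was observed."""
    observed = dict(zip(positions, values))
    if not observed:
        return [], []
    lo = min(observed) if start is None else start
    hi = max(observed) if stop is None else stop
    xs = list(range(lo, hi + 1))
    ys = [observed.get(x, 0) for x in xs]  # 0 for positions with no value
    return xs, ys


def moving_average(ys, window=3):
    """Centered moving average; windows are truncated at the edges."""
    half = window // 2
    out = []
    for i in range(len(ys)):
        chunk = ys[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Illustrative data: positions 102 and 103 have no value.
xs, ys = zero_fill([100, 101, 104], [2, 5, 1])
print(xs)  # [100, 101, 102, 103, 104]
print(ys)  # [2, 5, 0, 0, 1]
smoothed = moving_average(ys, window=3)
```

Pass `start` and `stop` explicitly if the region of interest extends beyond the first and last observed positions.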

Samtools mpileup never outputs the flag field of an alignment. Yes, the orientation of an alignment can be conveyed, but that's it. If you need anything beyond that, just script something in pysam (or use the HTSlib C API) using the internal pileup functions. You can then do whatever you want.
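A rough sketch of the pysam route follows, assuming a coordinate-sorted, indexed BAM file. The helper names `pileup_flags` and `count_flag_hits`, the file name, and the contig are hypothetical; only `pysam.AlignmentFile`, its `pileup()` method, and the `reference_pos`, `pileups`, and `alignment.flag` attributes come from pysam itself. Be aware that `pileup()` applies its own default read filtering, so check its options if you need every read.

```python
def count_flag_hits(columns, wanted_flag):
    """columns: iterable of (position, [flag, ...]) pairs.
    Returns {position: number of reads with all bits of wanted_flag set}."""
    counts = {}
    for pos, flags in columns:
        counts[pos] = sum(1 for f in flags if (f & wanted_flag) == wanted_flag)
    return counts


def pileup_flags(bam_path, contig, start, stop):
    """Yield (0-based position, list of alignment flags) per pileup column."""
    import pysam  # imported lazily so count_flag_hits works without pysam

    with pysam.AlignmentFile(bam_path, "rb") as bam:
        # truncate=True restricts columns to the requested [start, stop) region
        for column in bam.pileup(contig, start, stop, truncate=True):
            yield column.reference_pos, [p.alignment.flag for p in column.pileups]


# Hypothetical usage (file name and contig are placeholders):
# hits = count_flag_hits(pileup_flags("example.bam", "chr1", 0, 1000), 16)
```

Positions with no coverage produce no pileup column at all, so they will be missing from the result rather than reported as 0.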


