Daniel Marsh-Patrick
Sep 9, 2019

Hi Jaydeep, and thanks!

The sampling resolution (in conjunction with bandwidth) will either expose or smooth out features in your data, depending on the values you specify. A higher resolution combined with a lower bandwidth will trend towards making most of the individual data points visible within the KDE plot.
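To make the interplay concrete, here is a minimal sketch of a Gaussian KDE sampled on an evenly spaced grid. It is not the visual's actual code: the function names (`kde`, `countPeaks`), the sample data and the parameter values are all made up for illustration.

```typescript
// Illustrative sketch only; names and values are assumptions, not the visual's implementation.

// Standard Gaussian kernel.
const gaussian = (u: number): number =>
  Math.exp(-0.5 * u * u) / Math.sqrt(2 * Math.PI);

// Evaluate a Gaussian KDE at `resolution` evenly spaced points across [min, max].
function kde(
  data: number[],
  bandwidth: number,
  resolution: number,
  min: number,
  max: number
): { x: number; density: number }[] {
  const step = (max - min) / (resolution - 1);
  return Array.from({ length: resolution }, (_, i) => {
    const x = min + i * step;
    const sum = data.reduce((acc, d) => acc + gaussian((x - d) / bandwidth), 0);
    return { x, density: sum / (data.length * bandwidth) };
  });
}

// Count local maxima in the sampled curve as a rough measure of visible features.
function countPeaks(curve: { density: number }[]): number {
  let peaks = 0;
  for (let i = 1; i < curve.length - 1; i++) {
    const here = curve[i].density;
    if (here > curve[i - 1].density && here > curve[i + 1].density) peaks++;
  }
  return peaks;
}

const data = [1, 2, 3, 7, 8, 9];
const smooth = kde(data, 2.0, 20, 0, 10);    // wide bandwidth, coarse sampling
const detailed = kde(data, 0.2, 200, 0, 10); // narrow bandwidth, fine sampling

console.log("peaks (smooth):", countPeaks(smooth));
console.log("peaks (detailed):", countPeaks(detailed));
```

With the wide bandwidth the two clusters blur into a couple of broad humps, whereas the narrow bandwidth and finer grid give each data point its own bump in the curve.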

This is unpacked a little more in the visual documentation, under the Tuning Your Violin heading, alongside the KDE kernel types. The Wikipedia article linked in the doc (plus their page on kernel density estimation, which can also be reached from that link) is a good place to start if you want to understand more about the parameters that go into KDE and their kernel functions. I did most of the initial work on the visual from these articles.

For the most part, you’re likely to see negligible difference between the kernel function results, although this will depend on data distribution/modality. I included these other functions because most other plotting libraries just use the Gaussian kernel (which is a great general-purpose function), and I wanted to give anyone wishing to work with other kernels the option of doing so (plus it was a personal challenge to learn to implement them in JS).
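As a rough illustration of why the kernel choice often matters little, the sketch below evaluates the same estimate with three textbook kernels. The kernels shown (Gaussian, Epanechnikov, triangular) are the standard definitions and are my assumption for the example. This comment does not list which kernels the visual offers, and the helper names and sample data are hypothetical.

```typescript
// Illustrative sketch only; kernel selection, names and data are assumptions.
type Kernel = (u: number) => number;

// Textbook kernel definitions; each integrates to 1 over the real line.
const kernels: Record<string, Kernel> = {
  gaussian: (u) => Math.exp(-0.5 * u * u) / Math.sqrt(2 * Math.PI),
  epanechnikov: (u) => (Math.abs(u) <= 1 ? 0.75 * (1 - u * u) : 0),
  triangular: (u) => (Math.abs(u) <= 1 ? 1 - Math.abs(u) : 0),
};

// Kernel density estimate at a single point x.
function densityAt(x: number, data: number[], bandwidth: number, kernel: Kernel): number {
  const sum = data.reduce((acc, d) => acc + kernel((x - d) / bandwidth), 0);
  return sum / (data.length * bandwidth);
}

// Midpoint-rule integration, used here to sanity-check that a kernel has unit area.
function integrate(kernel: Kernel, lo = -6, hi = 6, n = 12000): number {
  const step = (hi - lo) / n;
  let total = 0;
  for (let i = 0; i < n; i++) {
    total += kernel(lo + (i + 0.5) * step) * step;
  }
  return total;
}

const sample = [4.1, 4.8, 5.0, 5.3, 5.9, 6.4];
for (const [name, kernel] of Object.entries(kernels)) {
  const estimate = densityAt(5, sample, 1.0, kernel);
  console.log(name, "density at x=5:", estimate.toFixed(3), "area:", integrate(kernel).toFixed(3));
}
```

The estimates differ in the detail but describe the same overall shape; the bandwidth usually has a much larger effect than the kernel.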

Regarding adding a histogram plot: I have considered it, but I’m not looking to do so at this time, given that the KDE plot is effectively a smoothed histogram. I also develop this visual in my spare time and am a little short on that at present, so my current focus is on fixing any bugs that come in; fortunately the visual is pretty stable, although I’ve probably jinxed it now :P

I will, however, add an issue to my backlog and promise to revisit if/when I make another update in future.

Written by Daniel Marsh-Patrick

Full-stack developer and BI aficionado, based in Auckland, NZ | I seem to enjoy writing about Power BI a lot | @the_d_mp
