This was discussed in our original Spark Summit presentation (see the [22 min mark](https://youtu.be/LTeZoo6kEBQ?t=1319)). _Listening to myself is awful, btw._ It is inspired by the nice visualization provided by [Facets Overview](https://pair-code.github.io/facets/), but leverages Spark to handle large distributed data sets.