Introducing scatterplot.online

In this article I would like to introduce scatterplot.online, a tool for quickly creating scatter plots from data in the browser.

There are a lot of data visualization tools out there. Most of them allow you to create scatter plots. Some call them bubble charts, some are simple and some are more complex. You can find a good comparison of 12 such web apps in this blog post.

So how is scatterplot.online different? It aims to provide a really fast way to create a scatter diagram. The page should load fast. No account or login is required. Just drag and drop a CSV file, or copy and paste its content into the page, and you should see the first version of your diagram in a matter of seconds. Your data never leaves your computer. Everything happens in your browser, and it even works without an internet connection, which means you can create scatter plots while on the road. I know that every geek out there has always wanted to do that.

The rest of this article will present examples of scatter plots that were created with scatterplot.online. Each plot is a link to an editable project.

Custom shapes

If you have taken any machine learning course, there is a good chance you are familiar with the Iris data set. It contains 50 samples from each of three species of Iris flowers, along with their measurements. Let's represent the species using different shapes and colors.

iris data set

Usually, complicated shapes do more harm than good, but I think that, used carefully, they can be beneficial. It all depends on the context and the purpose.

Emoji

With scatterplot.online, Unicode symbols can be used as points on the diagram, and emojis are Unicode symbols. Fun as it may be, plotting 10 000 owls or burgers is probably not the best idea. However, I think there are certain situations where emojis could improve the message the plot is trying to convey.

One example could be a scatter plot with nutrition values of fruits. There is only one point for each category, and therefore, in my opinion, the cognitive effort required to read it is smaller than if the points were simply colored circles with a legend on the side. I cannot support that hypothesis with any research though. The idea is not new either: people have done it already (fashion, fruits or pets). Raw data for the plot below.

nutrition values of fruits

Serious Scientific Stuff

Ok, back to more standard scatter plots. Logarithmic scales are often useful when your data spans several orders of magnitude. The first example here is the data set provided by Gapminder. This is the scatter plot that was used to compare 12 different scatter plot visualization apps in the blog post mentioned in the second paragraph. It compares GDP and life expectancy across 187 different countries, where the size of each bubble represents the population of the country.

the gapminder data set
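To make the logarithmic axis and the bubble sizing a bit more concrete, here is a minimal Python sketch. It is not the code scatterplot.online uses (the tool is written in JavaScript), and the function names, pixel ranges and sample numbers are all assumptions made for illustration.

```python
import math


def log_scale(value, domain_min, domain_max, range_min, range_max):
    """Map a positive value to a pixel position on a base-10 logarithmic axis."""
    if value <= 0 or domain_min <= 0 or domain_max <= 0:
        raise ValueError("logarithmic scales are only defined for positive values")
    # Fraction of the way between the domain bounds, measured in log space.
    t = (math.log10(value) - math.log10(domain_min)) / (
        math.log10(domain_max) - math.log10(domain_min)
    )
    return range_min + t * (range_max - range_min)


def bubble_radius(population, max_population, max_radius):
    """Scale the radius with the square root so bubble AREA tracks population."""
    return max_radius * math.sqrt(population / max_population)


# Hypothetical values, not taken from the Gapminder data set:
# an axis from 100 to 10 000 drawn across 400 pixels.
x = log_scale(1000, 100, 10000, 0, 400)
r = bubble_radius(25, 100, 40)
```

On such an axis every factor of ten takes up the same number of pixels, which is why points that would be squeezed into a corner on a linear axis spread out. The square root in `bubble_radius` is a common design choice: scaling the radius directly would exaggerate large values, because the eye tends to compare areas.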

Another example is a power law distribution. The scatter plot below shows the frequency of occurrence of unique words in the novel Moby Dick by Herman Melville. Find out more about the data and power laws. There are almost 19 000 points plotted here, and it is still quite comfortable to work with in terms of performance.

power law distribution
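If you would like to build a similar data set from any text, the sketch below counts word occurrences and pairs each count with its rank. This is only one common way to prepare such data. It is an assumption on my part that a rank versus frequency layout is a useful illustration here, and the tokenizer (lowercase runs of the letters a to z) is deliberately crude.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, frequency) pairs, with the most frequent word at rank 1."""
    # Crude tokenizer: lowercase the text and keep runs of ASCII letters.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    ranked = sorted(counts.values(), reverse=True)
    return list(enumerate(ranked, start=1))


# Tiny made-up sample; a real novel would yield thousands of pairs.
pairs = rank_frequency("the whale the sea the whale")
```

Plotted with logarithmic scales on both axes, data that follows a power law falls roughly along a straight line.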

Technology

As far as technology is concerned, scatterplot.online uses SVG to draw the visuals. The D3.js library is used to facilitate that, and Papa Parse helps with parsing data into the formats used internally. Everything is written in JavaScript, makes use of the language's newest features and is not transpiled into ECMAScript 5.
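As a rough illustration of the parsing step, here is a Python sketch that turns pasted CSV text into a list of numeric points. It uses Python's standard `csv` module as a stand-in for Papa Parse, and the `parse_points` function, the column names and the list-of-pairs output are assumptions, not the internal format of scatterplot.online.

```python
import csv
import io


def parse_points(csv_text, x_column, y_column):
    """Return (x, y) float pairs, skipping rows with missing or non-numeric values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    points = []
    for row in reader:
        try:
            points.append((float(row[x_column]), float(row[y_column])))
        except (KeyError, TypeError, ValueError):
            # Unknown column, short row or a value that is not a number.
            continue
    return points


# Made-up sample: the second row has no x value and is skipped.
sample = "name,x,y\na,1,2\nb,,3\nc,4.5,6\n"
points = parse_points(sample, "x", "y")
```

Skipping bad rows silently keeps the sketch short. A real tool would probably want to tell the user how many rows were dropped and why.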

As a result of the choices mentioned above, there are some limitations. It may not work in every browser, especially older ones, because new JavaScript features often require your browser to be up to date. I am trying to test it and make sure it works in Chrome, Firefox and Safari. Oh, and for the time being, scatterplot.online is not accessible on small screens. Another limitation is the size of the data. All the work is done in the browser on your machine, so how much data it can handle depends on both your browser and your hardware.

The whole software life cycle is managed on gitlab.com and I must say they did a great job providing everything a web developer may need.

Future

I am planning to update this article as new features are released. There is a lot on the roadmap: support for a date and time data type, various UI improvements, more sharing and export options, tooltips, hexagonal binning, 3D, best fit lines, more style customizations, predefined themes and layouts, and of course performance improvements and bug fixes.

In the meantime, I would love to hear your feedback. Please share your thoughts (and scatter plots) or see my other data viz projects.