Dataviz of the week, 10/5/17

Font Map is an interactive website by designers Ideo which aims to represent typefaces in 2 dimensions so you can eyeball similar ones. They make a big deal out of “leveraging AI and convolutional neural networks to draw higher-vision pattern recognition”. I’m not sure what that sentence means, though I conclude they got a thrill out of it. (I refer to the opaque boardroom talk; I know perfectly well what these techniques are.) What we see on the screen is a classic horseshoe shape of dimension reduction that happens when you have an underlying continuum that mostly lies along one axis. You see this with principal components analysis, multiple correspondence analysis, multidimensional scaling, whatever. t-SNE screws around with it (read: anisotropically transforms the projected space) to straighten out that hoof.
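
If you want to see that arch appear for yourself, here is a minimal sketch in Python with NumPy. It uses made-up toy data, not Ideo's fonts or their method: I assume 50 items strung along one hidden continuum, and a set of hypothetical binary features that each switch on once an item passes a given point on it. The function name and the data layout are my own inventions for illustration. PCA is done by hand through the SVD of the column-centred matrix.

```python
import numpy as np


def horseshoe_demo(n_items=50):
    """PCA of toy items that differ along a single hidden continuum."""
    # Hypothetical data: feature j is 0 for items up to position j and 1
    # for every item further along, so item i has exactly i features on.
    positions = np.arange(n_items)
    thresholds = np.arange(n_items - 1) + 0.5
    X = (positions[:, None] > thresholds[None, :]).astype(float)

    # PCA via the SVD of the column-centred data matrix.
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U * S  # coordinates of each item on the principal components
    return positions, scores[:, 0], scores[:, 1]


positions, pc1, pc2 = horseshoe_demo()

# PC1 recovers the ordering along the continuum (its sign is arbitrary).
r = np.corrcoef(positions, pc1)[0, 1]

# PC2 carries no new ordering: it is a quadratic function of PC1,
# which is what bends the straight line into an arch on the 2-D plot.
coeffs = np.polyfit(pc1, pc2, 2)
residual = pc2 - np.polyval(coeffs, pc1)
r_squared = 1 - residual.var() / pc2.var()

print(f"|corr(position, PC1)| = {abs(r):.3f}")
print(f"R^2 of quadratic fit of PC2 on PC1 = {r_squared:.3f}")
```

Plot pc1 against pc2 and you get a parabola: the second component is spent re-describing the first rather than finding anything new. Real data with messier features will give a rougher curve, but the logic is the same.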

[Screenshot of Font Map, taken 2017-05-09]

On this basis, we seem to have one overarching scale from italic to bold. That’s not much of a breakthrough, and although there certainly is merit in a list of similar fonts, you don’t need a whizzy graphic for it. It would also be better done by humans, as some of the fonts are misplaced to my eye. But that’s CNNs for ya; I’d also like some exploration of what features are detected. In a blog post, Ideo’s project lead Kevin Ho explains the method. I don’t know to what extent the number of training images mattered, but that is something to think about if you are doing this sort of thing. Then there’s an image of “early results” through t-SNE that, to my mind, looks better than the final results, because more clusters emerge that way. It’s not clear how he then got from there to the final result, though it looks like maybe he just spared the t-SNE special sauce, or took the k-D (k>2) projection and then smacked it down further to 2-D through PCA (ML people love PCA, they think it has magical powers). I don’t know. (Once you understand the principle, you should check out this page on t-SNE by those ninjas of interactivity Viégas & Wattenberg, plus Ian Johnson of Google Cloud.)

All in all, you know, it’s fun, and it’s important to experiment (as my grandad said about tasting his own urine), but if you talk up the AI angle too much, people who know about it will start to doubt the quality of your work. That’s a pity, but it can be guarded against by providing lots of details of your method and viewing it as an ongoing exploration, not a done deal. I say this as advice to young people, not as criticism of Kevin Ho’s work, because I just don’t know what he did.

Filed under machine learning, Visualization
