I first heard “word embeddings” explained in an upper division computer science class at Stanford taught by a rock star computer scientist who lectured with his mountain bike propped against the far end of the stage. For me, it was a revelation in an underground auditorium. Two hundred graduate students, middle-aged tech workers, and undergrads seemed just as rapt. Here’s why.
Imagine every word lives somewhere on a map. If you place them just right, a delicious logic appears. Start at “King” and move northeast exactly 5.2 centimeters to get “man.” Then, instead, start at “Queen.” If you make the same move — northeast exactly 5.2 centimeters — you get to “woman.” It’s not just a one-time trick: lots of relationships work like this. With good but imperfect accuracy, you can find countries’ capitals or companies’ CEOs. You just have to add the right distance in the right direction.
What I describe above is a violently simplified version of how a new school of computer scientists writes code to “embed” words into columns of numbers a computer can understand. Linguists have been looking for this kind of structure for decades. Yet some work by “deep learning” specialists turned up these properties almost by accident. And the trick with the words is really just a prelude to a longer process that might help computers make more human deductions.
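The map metaphor can be sketched in a few lines of code. The coordinates below are invented two-dimensional toys chosen so the arithmetic works out; real embeddings (word2vec and its cousins) have hundreds of dimensions learned automatically from mountains of text.

```python
import numpy as np

# Invented 2-D "embeddings" for illustration only -- real systems
# learn these coordinates from text rather than writing them by hand.
embeddings = {
    "king":  np.array([8.0, 6.0]),
    "man":   np.array([3.0, 2.0]),
    "queen": np.array([9.0, 1.0]),
    "woman": np.array([4.0, -3.0]),
    "apple": np.array([0.0, 0.0]),
    "paris": np.array([1.0, 7.0]),
}

def nearest(vector, exclude=()):
    """Return the vocabulary word whose embedding lies closest to `vector`."""
    return min(
        (w for w in embeddings if w not in exclude),
        key=lambda w: np.linalg.norm(embeddings[w] - vector),
    )

# The move from "man" to "king" is the same as from "woman" to "queen",
# so king - man + woman should land near "queen".
target = embeddings["king"] - embeddings["man"] + embeddings["woman"]
print(nearest(target, exclude=("king", "man", "woman")))  # -> queen
```

The “move northeast 5.2 centimeters” of the metaphor is the subtraction `king - man`: a direction in the space that, added to any starting word, encodes one relationship.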
Seeing the rise of “machine learning” techniques from the classroom isn’t just an amazing insight into possible software futures; it’s a window into how other disciplines are exploiting new technologies. A collaboration of Stanford academics from the computational journalism laboratory and the computer science department just won a Knight Foundation grant to study 100 million highway patrol stops throughout the country. I was lucky enough to attend a class Prof. Justin Grimmer taught for the first time in 2016 on machine learning for social scientists; one class project had us predicting the results of the Iowa Caucus. A Stanford project called DeepDive has brought machine learning within reach of law enforcement, who’ve used it to track down sex traffickers.
Learning firsthand the intricacies of a rapidly evolving technical field is invigorating — but also sometimes completely overwhelming. I’ve dealt with complete brain meltdown with trips to the climbing wall (Stanford has two) and long sweaty slogs over the hills. One day JSK Fellow Aaron Glantz and I rode our bikes almost 90 miles over the Santa Cruz Mountains, through the redwoods and back down into the increasingly acrid farmland of Corralitos, Freedom and Watsonville on into Monterey. But mostly I’ve jogged up and down the hills closer to home — on the Stanford campus, around the Dish Loop, on the hills on Matadero Creek Trail and around the southern ridges of Montara Mountain behind Half Moon Bay.
I added miles to my morning jogs in the spring, and it was hard not to see my muddy morning runs as a metaphor for afternoons I spent caffeinated, brain rumbling slowly along in front of my computer. By the time the Santa Cruz Marathon rolled around, I had a rule worked out for the race that I realized could just as well apply to those classes: Let the youngsters get there first. The important thing is to just keep going. It was a very long, wet race, but I beat my previous time by more than a minute.
Fenton was director of computer-assisted reporting at the Investigative Reporting Workshop at American University and then editorial engineer at The Sunlight Foundation, where he worked on campaign finance, television ad disclosure, and House and Senate expenditure reporting. He came to Stanford to work on building free and open-source tools to capture structured data from repetitive scanned-in forms.