Alex Kauffmann r+d | creative

  • Email: alex [at] this domain

  • Phone: +1.617.515.8838

  • LinkedIn
Google - Direct Objects @ ATAP
Technologies to Make Computers Disappear
Google - PhotoLab
Machine Learning and Mobile Photography
Google - Interaction R+D
Airbender
ACM MobiCom '20 Keynote
Research to Product: Stories from the Frontlines of R+D
Hardware consulting
Electronics for fun (and profit)
The ITP & IMA Journal of Emerging Media
Fakes, Deep and Otherwise
Historical Hacking
The Oracle of Random
The ITP & IMA Journal of Emerging Media
What's Wrong with VR
Samsung Mixed Talents
Brainstorming concussion sensing
Talking about making
Prototyping as idea generation
Sharing made simple
Google Tone
Virtual Reality for the masses
Google Cardboard
Audi Hackathon
Steering wheel beats
Talking about hand waving
I Hate Tom Cruise
First-person recruiter
Advance!
Paywalls
Value Propositions
Laugh Track
Sound of White Noise
Acute entropy
Black Hole Box
A very short film
Eggsistence
Private browsing
Censor Me
Multimedia
Scratch-n-Sniff Television
A language game that bites
Don't Bug Me!
Text munging for people
Al-gorithm
Playing with food
Mr. Meathead

Google - Direct Objects @ ATAP

Technologies to Make Computers Disappear


ATAP logo

I lead Direct Objects, a hardware and software R+D team at Google's Advanced Technology and Projects group devoted to imbuing everyday objects with Google's smarts.

I work with a brilliant group of hardware and software engineers and prototypers to shape how people will interact with computers of the future by designing the technical infrastructure on which those computers will be built. Bare-metal user-centered design.


Our current research areas include:

  • Low-power sensing
  • Wireless power
  • Radio backscatter
  • EM tracking
  • Device reuse
  • Body area networks
  • Alternatives to Bluetooth
  • Gesture interfaces
  • Ad-hoc recommendation systems
  • Authentication
  • Precise localization

(And I'm always looking for talented engineers, researchers, prototypers and designers to work with.)

A close-up image of the innards of an electronic device

Our hypothesis is that despite its theoretical appeal, ubiquitous computing and its various avatars have repeatedly failed in practice because they transformed simple objects into computers rather than making those computers disappear behind simple physical interfaces (much as electricity disappeared behind outlets and light switches).

Pullquote from a FastCo article about ATAP

I talked through some of our projects in a long Fast Company piece about ATAP (for some reason, I'm rendered speaking with a distinct valley girl vibe, but the ideas are all in there, superfluous "likes" notwithstanding).

There's obviously a lot more going on than I can talk about here, so get in touch; we frequently collaborate with external partners on both research and product development.


  • When: January 2018-present
  • File under: Hardware R+D Google

ACM MobiCom '20 Keynote

From Research to Product:
Stories from the Frontlines of R+D


Wouldn't you know that the year I get invited to give a keynote at MobiCom (slated to be held in London) all travel is cancelled because of the pandemic and I end up delivering it from my kitchen table.

In this session, I discuss four common non-technical obstacles that prevent a new technology from making it out of the research lab and into the market, and illustrate them with stories from ATAP.


  • When: September 2020
  • File under: Technology Transfer Talks

Fakes, Deep and Otherwise

Mechanical Reproduction in the Age of Algorithmic Generation


"Synthetic images produced by algorithmic techniques have become sufficiently difficult to differentiate from those captured by a camera that certain academics, congresspeople, and journalists have begun to, for lack of a better term, freak the fuck out."

I contributed this meditation on synthetic media to ITP's journal of emerging media after rereading Walter Benjamin's 1936 essay "The Work of Art in the Age of Mechanical Reproduction."

In the essay, Benjamin argues that mechanical processes like photography obliterate an image's specific physical existence so that it operates purely in the abstract plane of meaning. That, he worries, makes it particularly susceptible to being co-opted for political ends.

We are now at the point where learning algorithms can synthesize photorealistic images and videos by statistically modeling previous images, no longer requiring any mechanical capture or reproduction. Critics who have come to regard these mechanical processes as stand-ins for objectivity worry that reality itself is at risk. I'm not as worried.

Illustration by Chengtao Yi.


  • When: April 2019
  • File under: Criticism Machine Learning
Illustration showing photographic imagery overlayed on a sculptural metallic 3D blob

Google - PhotoLab

Machine Learning and Mobile Photography


What happens when it's as easy to generate a photorealistic image as it is to capture it?

To find out, I built an eight-person hybrid design and engineering team at Google Research to work alongside ML researchers and explore the possible photographic applications of on-device machine intelligence. We've built and launched a handful of "appsperiments" to give people outside of Google a taste of photography's potential futures and start a conversation about what they might mean.


Nat & Friends featured Storyboard and Selfissimo! in this episode about Google's photography apps


  • When: August 2016-January 2018
  • File under: Interaction Design Machine Learning Software Google

Storyboard is an Android app that transforms a video into infinite comics! It runs a bunch of ML models directly on the device to select video frames, crop, and stylize them. Fun fact: I drew the various seamlessly tiling hatch patterns by hand.


Selfissimo! (Android | iOS) is a stylish black and white AI photographer. Start a photoshoot and Selfissimo! takes a picture each time you pose. The app will offer occasional words of encouragement. Tap the screen when you're done and save the whole contact sheet or individual images.


Scrubbies is an iOS-only app that's kind of hard to explain, but it's a lovely interaction worth trying out. Remix your videos like a DJ scratches records.

Google - Interaction R+D

Airbender


Airbender was the advanced physical interaction R+D team I ran with Christian Plagemann and Boris Smus at Google. Though Airbender is sadly no longer, its mission of injecting novelty and whimsy into technical research remains central to all of my work.

Airbender started from the premise that computing doesn't take sufficient advantage of our bodies. We built and deployed technology that combined physical intuitions with clever uses of sensors to solve interaction problems elegantly and across platforms.

Though much of the work is not public, a number of projects have launched, including Google Tone, Cardboard, Expeditions, and PhotoLab.


  • When: 2012-2018
  • File under: Interaction Design Hardware Software Google

Historical Hacking

The Oracle of Random


I built the Oracle of Random for Howard Rheingold. The idea was to modernize one of the early "apps" on the WELL, where an email asking for a fortune would result in a three-word response pulled at random from a bag filled with words cut out of a dictionary.

When you visit the Oracle on one of Howard's psychedelic dodecahedral magic 8 balls, it displays (and speaks) cryptic verb-noun-adverb or verb-adjective-noun fortunes (I've reproduced it here; click the fortune to generate another). To make the Oracle's wisdom universally accessible, I also signed it up for a Twitter account (tweet @OracleofRandom for a fortune).
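The fortune generation itself is simple to sketch. Here's a minimal Python version of the idea; the word banks below are my own invention (the original's vocabulary came from words cut out of a dictionary), not the Oracle's actual lists:

```python
import random

# Hypothetical word banks; the real Oracle drew from a cut-up dictionary.
VERBS = ["embrace", "forget", "question", "devour"]
NOUNS = ["silence", "fortune", "entropy", "beginnings"]
ADVERBS = ["slowly", "twice", "sideways", "gladly"]
ADJECTIVES = ["hollow", "radiant", "uncertain", "borrowed"]

def fortune(rng=random):
    """Return a verb-noun-adverb or verb-adjective-noun fortune."""
    if rng.random() < 0.5:
        words = (rng.choice(VERBS), rng.choice(NOUNS), rng.choice(ADVERBS))
    else:
        words = (rng.choice(VERBS), rng.choice(ADJECTIVES), rng.choice(NOUNS))
    return " ".join(words)
```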


  • When: May 2017
  • File under: Interaction Design Software Code

The ITP & IMA Journal of Emerging Media

What's Wrong with VR


"The first thing you need to know is that I am not a Virtual Reality (VR) enthusiast.... My dislike isn’t of VR per se, but stems rather from the messianic zeal with which big corporations, technologists, futurists, entrepreneurs, advertisers, venture capitalists, academics, journalists—pretty much everyone—have heralded its newest incarnation."

-The Emperor's New Headset: What's Wrong with VR

I contributed an essay on virtual reality to the inaugural issue of Adjacent, an online journal of emerging media from NYU's ITP.

My main critique of the (already flagging) enthusiasm for VR is that it considers 360-degree, 3D media an evolved version of 2D media. I don't think that's right. If I were building an evolutionary tree of media, I would trace film's descent from storytelling, while VR's progenitors are more experiential forms of expression such as music and dance.

In this essay, I argue that VR acts primarily on a limbic and subconscious level and is thus better suited to metaphor than to narrative. I offer practical approaches for moving beyond simulation and towards a deeper exploration of immersive media's unique potential. And I get in a jab or two.

Painting by Molly Lowe.


  • When: October 2017
  • File under: Criticism Virtual Reality
Detail from a painting by Molly Lowe of a woman wearing a VR headset while nursing a baby

Samsung Mixed Talents

Brainstorming concussion sensing


I was a guest sensing expert on an episode of Samsung's Australian Mixed Talents campaign to address concussions in sports.

In this season, an industrial designer and a neuroscientist team up to figure out the best way to sense potentially dangerous impacts. Braden and I spoke for about an hour, during which I suggested all sorts of crazy ideas. Most didn't make it into the final video (there was a long discussion on how much I detest Bluetooth), but you still get an overall brainstormy sense even though the conversation was conducted across continents and over video chat!


  • When: November 2015
  • File under: Sensors and sensing Talks

Hardware Consulting

Electronics for fun (and profit)


I don't get to do side projects nearly as often as I'd like. My favorite in recent memory was working with my friends Mike and Maaike on the electronics for a prototype piece of furniture with built-in speakers.

The sitter uses three large buttons in the side panel to navigate various tracks in three distinct sound banks. There are no controls other than the buttons. Pressing a button moves to the next track in the corresponding bank. Holding down any button stops the playback.
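The control logic is simple enough to sketch in a few lines. This Python model is my own reconstruction of the behavior described above (the real thing ran on an Arduino clone, and all names here are mine):

```python
class SoundBankController:
    """Model of the three-button furniture controller: each button
    advances its own sound bank; holding any button stops playback."""

    def __init__(self, tracks_per_bank):
        self.tracks_per_bank = tracks_per_bank  # e.g. [5, 3, 4]
        self.position = [0] * len(tracks_per_bank)
        self.playing = None  # (bank, track) or None

    def press(self, bank):
        # Short press: play the next track in that bank, wrapping around.
        track = self.position[bank]
        self.position[bank] = (track + 1) % self.tracks_per_bank[bank]
        self.playing = (bank, track)
        return self.playing

    def hold(self, bank):
        # Holding down any button stops playback.
        self.playing = None
```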


  • When: November 2019
  • File under: Hardware Interaction Design Side Gigs
Small project box with various development boards

I used a Robertsonics WAV Trigger to manage the audio via SD card and put all the button logic on a little Arduino clone.

Close-up of RCA jacks protruding from a plastic project box

RCA jacks made it easy to connect the existing button wiring in the furniture frame.

Project box installed and connected in the seat of the meditation pod

Here's the box installed under the seat. Speakers, separate volume controller, and electronics are all powered from a single wall wart. All the guts fit perfectly inside a cream cheese container from the Container Store.

Talking about making

Prototyping as idea generation


One:one:one is an ideation methodology my team at Google Research developed to play to the strengths of engineers and builders for whom traditional "design thinking" doesn't really work.

This is a talk I gave about it at Designers + Geeks. One:one:one played a big role in the creation of Google Tone and Cardboard.

[My talk starts at 20:30]

  • When: August 2015
  • File under: Talks Prototyping

Sharing made simple

Google Tone


Google Tone is a Chrome extension that transmits the URL of the current browser tab to computers within earshot over audio.

When you press the blue megaphone button in your browser bar, the computer's speakers emit a combination of audible and inaudible sounds that any nearby machine with a microphone can pick up. The signal triggers a Chrome notification. Clicking it opens the transmitted URL in a new browser tab.
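Tone's actual encoding isn't something I can reproduce here, but the basic idea of shuttling a short string over audio can be sketched with a toy one-tone-per-character scheme. The frequencies and mapping below are entirely my own invention, not Tone's real protocol:

```python
# Toy audio-symbol scheme: each character maps to one tone frequency.
# BASE_HZ and STEP_HZ are arbitrary illustrative values.
BASE_HZ = 1000.0
STEP_HZ = 50.0

def char_to_freq(ch):
    """Map a character to a symbol frequency (toy scheme)."""
    return BASE_HZ + ord(ch) * STEP_HZ

def freq_to_char(freq):
    """Invert the mapping on the receiving side."""
    return chr(round((freq - BASE_HZ) / STEP_HZ))

def encode(url):
    """Return the sequence of tone frequencies a sender would play."""
    return [char_to_freq(ch) for ch in url]

def decode(freqs):
    """Recover the string from detected tone frequencies."""
    return "".join(freq_to_char(f) for f in freqs)
```

A real implementation would add framing, error correction, and tolerance for noisy frequency estimates; this just shows the round trip.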

The idea was to create a digital interaction that resembled talking—you don't have to know a person's name to address him, you can control how much you're overheard by modulating the volume of your voice, and you can address a crowd as easily as a lone interlocutor.

You can read more about it on the Google Research blog.


  • When: May 2015
  • File under: Product Design Interaction Design Google

Virtual Reality for the masses

Google Cardboard


Google Cardboard is an award-winning, inexpensive virtual reality viewer for smartphones.

I led the design of all aspects of the first version of Cardboard (viewer, branding, graphics, app look and feel) and the "cardware" design for the much improved second version a year later. The initial run was 10,000 units. There are now more than 20 million viewers out there.

If you'd like one, you can buy it here.


  • When: June 2014-March 2016
  • File under: Virtual reality Visual design Cardware Google
Photograph of a stack of v1 Cardboards
Step-by-step isometric illustration of setting up the v2 Cardboard viewer

Audi Hackathon

Steering wheel beats


Have you ever drummed on your car's steering wheel when you're stopped at a light?

I do it all the time, so at a weekend hackathon for Audi at the Sonoma Raceway, I wanted to enrich that interaction.

Over the course of about 45 minutes, Mitch Heinrich and I rigged up a Makey Makey, six pieces of copper tape, and a drum machine to the steering wheel. I talked with the engineers about building a little sequencer directly into the car radio—possible, but they estimated it would take about 7 years to make it through testing.


  • When: November 2013
  • File under: Physical Computing Interaction Design Prototyping Side Gigs

There are pads on both front and back sides of the steering wheel, so you can alternate easily between kick, snare, and hi-hat. The toms are at the bottom, and the crash cymbal is on the dash.

Talking about hand waving

I Hate Tom Cruise


I gave this Ignite talk about the folly of Minority Report-style gestural interfaces at Foo Camp '13.


  • When: June 2013
  • File under: Talks Gestural Interfaces

Computers with accents

Text-to-English-as-a-Second-Language


Black and white illustration of an old rotary dial telephone

Call 617.286.5064 and follow the prompts to hear the various accented computer voices.

(They are limited to the languages I can confidently transcribe in and for which I had access to TTS voices; some are better than others.)


  • When: December 2010
  • File under: Telephony Code

I have yet to hear a text-to-speech voice in English that doesn’t sound alienatingly mechanical. Text-to-English-as-a-Second-Language uses one form of speech distortion (an accent) to cover another (a speech synthesizer). I’ve heard computers speak very convincing Mandarin, and, possibly to spite the rest of the world, the French seem to have taught computers to speak their native tongue flawlessly. But English-speaking computers still sound like 1950s movie robots.

I wrote a script that takes English words and transcribes them phonetically into other languages—in Spanish for instance, hello becomes jelo.

Here's a full example:

Jelo. Mai nem is Inigo Montoya. Llu kild mai fáder, pripeer tu dai.

I tried various accents to mask the mechanical intonation. Spanish works OK, French works pretty well, Italian is great, Dutch just sounds weird. The voices live on an Asterisk server hosted by the fab folks at Tropo. Call and have a listen.
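A rule-based pass like the one that turns hello into jelo can be sketched as ordered string substitutions. These particular rules are my own illustration, not the original script's rule set:

```python
# Toy English -> Spanish-phonetics transliterator. Rules are ordered so
# digraphs ("ll", "sh") are handled before single letters.
SPANISH_RULES = [
    ("ll", "l"),   # "hello" -> "helo"
    ("sh", "ch"),
    ("h", "j"),    # Spanish j approximates English h
    ("w", "u"),
]

def transliterate(word, rules=SPANISH_RULES):
    """Apply ordered substitutions to respell an English word phonetically."""
    word = word.lower()
    for src, dst in rules:
        word = word.replace(src, dst)
    return word
```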

First-person recruiter

Advance!


Advance! is a sneaky little game about systemic biases in corporate settings masquerading as a resource allocation game à la Diner Dash. You run a job placement agency and have a neverending (and accelerating) queue of jobseekers who need to be placed in corporate jobs where they will thrive (and get you paid). Unfortunately, some candidates do better than others based on characteristics other than their professional merits.

I refined the mechanics and created the visual style for Advance! over a couple of months. Much of my work on the interface was iterative simplification: maintaining the game’s information-dense statistical feel while ensuring that players could easily and intuitively make multi-dimensional decisions, such as comparing a candidate’s qualifications against job requirements while previewing that candidate’s relationship with his/her potential colleagues.

Advance! was designed as a study tool for Jessica Hammer's doctoral dissertation.


  • When: September 2010
  • File under: Interaction Design Game Design Visual Design
Image of isometric gameboard

In Advance!, the player runs a job agency responsible for staffing a faceless but multi-ethnic corporation in a boring high-rise office building. As applicants enter the job queue, the player tries to find appropriate job openings for his clients. Each job requires certain minimum qualifications. If the player doesn’t fill a job quickly, the company will hire internally.

Image of isometric gameboard

The abstract notion of a score is replaced with a running tally of the job agency’s bank account balance. Running the business costs money, so this balance creeps downward throughout the game. Successfully placing job candidates results in cash bonuses, and the player receives a percentage of their salary as long as they’re employed. Training candidates improves their job prospects as well as the bounty a player receives for placing them, but as a character becomes more experienced, the cost of training grows exponentially.
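The economics above reduce to a couple of small functions. This sketch uses illustrative numbers of my own choosing, not the game's actual tuning:

```python
def training_cost(experience_level, base=100, growth=1.6):
    """Cost to train a candidate; grows exponentially with experience.
    `base` and `growth` are placeholder values, not Advance!'s real ones."""
    return base * growth ** experience_level

def placement_income(salary, rate=0.10):
    """The agency earns a percentage of a placed candidate's salary."""
    return salary * rate
```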

Image of isometric gameboard

Each job also comes with its own politics, symbolized on the game board by hearts and skulls beneath the colleagues that come with a particular job. The higher the ratio of hearts to skulls, the more likely a candidate will thrive in a position and become eligible for promotion.

Image of isometric gameboard

Promotions (and demotions) happen automatically when a job opens on a floor above a certain character’s current floor and his/her qualifications have increased sufficiently. Alternately, the player may choose to train characters to manually enhance their skills, though this can prove very expensive.

Paywalls

Value Propositions


ThePaywall.com documents the ideas in Value Propositions, my Master's thesis at ITP.

Its main contention is that infinite supply means that the perceived value of online content is proportional to some measure of how much it’s needed at a given moment and inversely proportional to its cost. Which is a fancy way of saying that free content feels a whole lot more valuable than the same content when it costs just a penny, and I argue that this feeling of value is intimately tied up with the nature of the interaction required to access the content.

You’re probably shaking your head. It’s a nebulous, postmodern argument, I agree, which is why I built a number of paywalls that put the argument into the realm of experience. And because there are two groups in the paywall debate—those protected by the paywall and those impeded by it—I sought to allow each group to experience the other’s position. All the walls are intended to interfere with a normal consumption of web content. They’re implemented in a variety of ways and arranged as a series of obstacles that the user must overcome to reach the ArrayWall (pictured above) and, hopefully, a visceral sense of what constitutes worth online.


  • When: May 2010
  • File under: Software Hardware Telephony ITP
Paywall (n.) — An ominous sounding term for a webpage that requires a user to pay to access it. Paywalls are most commonly implemented on sites that offer curated, professionally produced content—newspapers such as the Wall Street Journal or LexisNexis, for instance.
Description of all the different paywalls that made up the project

Hear Brooke Gladstone's skeptical assessment on NPR's On the Media at minute 17:10!

Laugh Track

Sound of White Noise


Sound of White Noise is Obama's first State of the Union address, all 69 minutes of it, reduced to its negative space, (nearly) all its utterances excised. What remains is a strangely compelling silent dance between the beats. How we don’t speak is as idiosyncratic as how we do.

The rhythm of Obama's silences follows the rhythm of his speech. Even without hearing a word, we can tell by watching the crowd how his rhetoric moves from introduction to exhortation and a final appeal to patriotism. The accompanying silence in the room is mesmerizing, both for its depth and its duration. Here is a man who can hold an audience for over an hour, and, I’d argue, comes pretty close to holding an audience for half an hour without saying a damn thing.


  • When: April 2010
  • File under: Video ITP

Acute entropy

Black Hole Box


Black Hole Box is a featureless enclosure that consumes an AA battery by continually checking the remaining charge.

When it drops below a threshold, the onboard microprocessor connects to the internet and orders another battery online. The battery, which it orders from Amazon, arrives within four days and the Black Hole Box’s owner must replace it in order for it to continue consuming and re-ordering batteries. I built this during a particularly dark time.
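The box's entire reason for being fits in a tiny decision function. This is a sketch of the logic only; the voltage threshold is an illustrative guess, and the actual ordering step (which talked to Amazon) is elided:

```python
LOW_VOLTAGE = 1.1  # hypothetical cutoff for a depleted AA cell, in volts

def should_reorder(measured_volts, already_ordered):
    """Decide whether to order a replacement battery.
    Orders at most once per depleted battery: once the order is placed,
    nothing happens until a fresh cell raises the voltage again."""
    return measured_volts < LOW_VOLTAGE and not already_ordered
```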


  • When: February 2010
  • File under: Hardware ITP
Photograph of a nondescript black plastic box with an ethernet port

A very short film

Eggsistence


Eggsistence is a pre-Vine video response to Hemingway’s terse but complete “For Sale: Baby shoes, never worn,” a micro-cinematic three-shot story.

I composed and shot a story about putting an egg carton with only two eggs left in it in the fridge and opening it the next morning to find the two eggs snuggling cozily in adjacent spaces, surrounded by half a dozen quail eggs.

Then I read Robert Bresson’s Notes sur le Cinématographe, and it made me think that maybe I shouldn’t force the egg to tell a story but rather portray the egg's intrinsic narrative.


  • When: February 2010
  • File under: Video ITP

Private browsing

Censor Me


Censor Me was a Flash app that used OpenCV to superimpose a black bar over any eyes it found in the frame. It also emitted a loud beep any time it detected sound. No need to wait for someone else to object to something you’ve said!

Try it here (you'll have to give the browser permission to access your camera and mic) or watch a video.

After Flash was finally deprecated in Chrome at the end of 2020, I spent a couple of hours rewriting it in Javascript. The face detector isn't quite as robust as the one I was using previously, but it's still pretty good, especially running directly in the browser.

Unlike the original Flash version, the Javascript version actually works on mobile devices, but to preserve the flavor of the original, I've disabled it. If you're seeing this message on a desktop browser, try increasing the size of your browser window.

I wanted to include voice activity detection so that the beeping would only be triggered by speech (rather than by sound volume), but the Javascript speech recognition API only works in Chrome and sends your audio to a server, which I don't want to do, and none of the lightweight options actually differentiate between speech and other continuous sounds.
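The volume trigger the app settles for amounts to a running energy threshold. A minimal RMS gate looks like this (window size and threshold are arbitrary illustrative values, and the real app runs in Javascript, not Python):

```python
import math

def rms(samples):
    """Root-mean-square energy of one window of audio samples in [-1, 1]."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def should_beep(samples, threshold=0.05):
    """Trigger on loudness alone; this is exactly why speech and any other
    sustained sound are indistinguishable to the app."""
    return rms(samples) > threshold
```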


  • When: January 2010
  • File under: Computer Vision Ex-Flash

Multimedia

Scratch-n-Sniff Television


Scratch-n-Sniff TV is exactly what it says: a television that behaves just like a scratch-n-sniff sticker.

Vigorously scratch a baseball player on the screen, lean in, and smell—it smells like sweat! Scratch the grass and it smells like grass!

Scratch-n-Sniff TV relies on clever interaction design, laminar flow, a microcontroller running custom firmware, a touchscreen monitor, a bunch of hacked air freshener dispensers, and some software to tie it all together.

I wrote a feature on it for Make Magazine. The print version looked nicer.


  • When: December 2009
  • File under: Hardware Interaction Design ITP

EXHIBITIONS

Maker Faire
2010. New York, NY
Natural Selections
2010. Greylock Arts, North Adams, MA
ITP Winter Show
2009. New York, NY
Wiring schematic for the scent dispensers

A language game that bites

Don't Bug Me!


A disembodied voice tells you where to bite the characters. If you bite the right spot, the mosquito feeds and refuels for his next flight. If you bite the wrong spot, the mosquito is grotesquely swatted.

Don't Bug Me was playable on this page until late 2020 when Flash was sunsetted and it sadly went the way of hundreds of other little Flash games. There's no practical way of porting it to run on the web without rewriting it from scratch, so I captured video of its most salient features.

One summer while I was in grad school, I worked in Tokyo at a division of TBS, where I developed Don't Bug Me!, a prototype for an English listening comprehension game for Japanese kids.

I spent a month conceiving the game, laying it out, developing the code, and art directing the incomparable Nina Widartawan-Spenceley, who created the characters and animated the gory death sequences.

The game was written entirely in ActionScript and designed in a modular fashion to allow a non-coding designer to easily switch out the background image, bite targets, and audio by simply editing a text file.


  • When: Summer 2009
  • File under: Game Design Code


screenshot from the game

A boy and a girl wave as a voice tells you where to bite them.


The game screen scrolls to reveal the children's full bodies and their unsuspecting dog, also a target of your unquenchable bloodthirst.

screenshot from the game

Bites leave uncomfortable looking bumps.


A misunderstood possessive pronoun can end in a blood bath.

Text munging for people

Al-gorithm


Al-gorithm is an analog text-munging program written to be executed by a person. It has three outputs.

The first is an interactive version of a passage from All the King’s Men created by cutting out all but an individual alphanumeric character from identical printouts of the quote. Writing text-munging algorithms is relatively painless, so much so that it’s easy to forget the text entirely. I wanted to rediscover text munging from the algorithm’s perspective, so I became the algorithm.
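Executed by a machine rather than by hand, the first output's algorithm is nearly a one-liner. The sample text below is a stand-in, not the actual All the King's Men passage:

```python
def cut_all_but(text, keep):
    """Blank out every character except occurrences of `keep`,
    preserving each character's position on the page (newlines survive)."""
    return "".join(
        ch if ch.lower() == keep.lower() or ch == "\n" else " "
        for ch in text
    )
```

Becoming the algorithm meant performing this substitution with a blade instead of a string method, once per character of the alphabet.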

The second is a bag filled with the discarded bits of text. In digital space, memory registers that held the initial text can be overwritten. Clearing memory in the analog world is a little more complicated.

The third is a visualization emphasizing both the physical origin of text and also its arbitrariness. The cutout for each letter produces a unique pattern. I recursively generated the text by substituting the patterns of each letter’s frequency for the letter within the original text (recursively makes it sound like it was done programmatically—I actually manually created the pattern files in Illustrator and then used them to create a font).


  • When: Spring 2009
  • File under: Computational Performance ITP
Photograph of installation, including a hanging bag of cut out words and letters
Photograph of multi-layered cut-out text on stacked pages
Photograph of multi-layered cut-out text on stacked pages
Text regenerated recursively using the unique physical pattern produced for each letter
The letter i rendered in this recursive manner

Playing with food

Mr. Meathead


Click and drag facial features to give Mr. Meathead a face. The fun really starts when you stack multiple features.

Mr. Meathead represents my infatuation with Flash—iPad or no, it remains one of my favorite graphical prototyping tools—I only wish I had discovered it when I still liked playing with my food.

I've disabled the game on mobile devices because that's what Flash would have done. If you're seeing this message on a desktop machine, try increasing the size of your browser window.

Sadly, Flash has gone the way of my youth, and Javascript (and gray hair) are poor substitutes. I've done what I can to port the original experience, though some of the mouse events are a bit wonky.

Still, it's better than the "Flash is no longer supported" message.

screenshot of Flash no longer supported error message

And just like that, I was deprecated.


  • When: February 2009
  • File under: Ex-Flash Visual Design
Copyright © 2009-