ProxiPlay: An Adaptive UI for Hands-Occupied Cooking
When voice and gestures fail, the screen adapts to your elbow.
Prototyper & Researcher
Concept → Validated Prototype
Accessibility · Adaptive UI
TensorFlow Lite · Swift
The Problem
72% of cooking video users pause to wash hands before tapping "Next"
Recipe video apps assume clean, dry fingers. But in reality, users are elbow-deep in flour, raw meat, or wet ingredients. Every interaction requires a hand-wash → dry → tap → resume cycle that breaks cooking flow and adds 12-18 seconds per step transition.
The Input Failure Cascade
Scenario: A cook is mid-recipe with flour-covered hands and a running range hood. Here's what happens when they try to advance to the next step.
Investigation
Understanding hands-occupied interaction patterns
To understand when and why standard inputs fail during cooking, I combined three methods: in-home observation to capture real behavior, sensor data to quantify environmental barriers, and diary entries to surface moments users couldn't articulate in interviews.
Key Findings
Three patterns emerged consistently across all 12 participants. Voice commands failed most during the noisiest cooking phases — exactly when users needed hands-free control the most. Gesture recognition degraded dramatically with occluded or messy hands. But the most surprising finding was behavioral: most participants had already invented their own workaround.
Kitchen Noise vs. Voice Recognition Threshold
Diagram & Strategy
Adaptive UI State Machine
A multi-signal trigger system that gracefully degrades from standard input to proximity-based macro interaction.
Try It: Trigger the Mode Switch
Click each condition to simulate the multi-signal evaluation. All three must be active to trigger Elbow-Bump Mode.
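The gate logic can be sketched in a few lines of Swift. This is a minimal illustration, not the production implementation: the signal names, thresholds, and types below are all assumptions chosen to show the AND-gate shape.

```swift
import Foundation

// Hypothetical signal snapshot: the three conditions that gate
// Elbow-Bump Mode. Field names and thresholds are illustrative.
struct SignalSnapshot {
    var voiceFailedAttempts: Int   // consecutive misrecognized voice commands
    var gestureConfidence: Double  // 0.0–1.0 from the gesture model
    var proximityMeters: Double    // distance of the approaching limb
}

enum TriggerDecision {
    case stayDefault
    case enterElbowBumpMode
}

// All three conditions must hold (AND-gate) before the UI adapts;
// any single noisy signal on its own cannot trigger the mode switch.
func evaluate(_ s: SignalSnapshot) -> TriggerDecision {
    let voiceFailing   = s.voiceFailedAttempts >= 2
    let gestureFailing = s.gestureConfidence < 0.4
    let userClose      = s.proximityMeters < 0.6
    return (voiceFailing && gestureFailing && userClose)
        ? .enterElbowBumpMode
        : .stayDefault
}
```

Requiring all three signals is what keeps false activations rare: a loud range hood alone, or a brief proximity blip alone, leaves the standard UI untouched.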
Design Rationale
Four core design decisions shaped the final system. Each one was driven by a specific failure we observed in the field — the multi-signal gate came from false activations in early prototypes, the oversized target from measuring real elbow contact areas, the semi-transparent overlay from users needing to see the recipe beneath, and the auto-timeout from preventing accidental lock-in.
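Two of those decisions, the mode switch and the auto-timeout, can be modeled as a tiny state machine. A hedged sketch: the state names and the 30-second timeout are assumptions for illustration, not values from the shipped feature.

```swift
import Foundation

// Hypothetical model of the adaptive UI states described above.
enum UIState: Equatable {
    case standard                  // normal touch controls
    case elbowBump(entered: Date)  // half-screen macro target
}

struct AdaptiveUI {
    private(set) var state: UIState = .standard
    let timeout: TimeInterval = 30  // auto-timeout prevents accidental lock-in

    // Called when the multi-signal trigger fires.
    mutating func triggerFired(at now: Date) {
        state = .elbowBump(entered: now)
    }

    // Called periodically; degrades back to standard controls
    // once the macro mode has been active past the timeout.
    mutating func tick(at now: Date) {
        if case let .elbowBump(entered) = state,
           now.timeIntervalSince(entered) >= timeout {
            state = .standard
        }
    }
}
```

The timeout path is the "graceful" half of graceful degradation: if the trigger fired spuriously, the interface recovers on its own instead of trapping the cook in macro mode.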
Standard touch target: 8-12mm · Elbow contact area: 40-60mm
Validation
Testing with flour-covered hands
We ran a controlled usability study with 18 participants cooking actual recipes. Each participant completed the same 8-step recipe twice: once with standard UI, once with ProxiPlay adaptive mode.
Results
I literally forgot I was using my elbow after the second step. It just felt natural—lean in, bump, done.
The big button appearing when I got close was like the phone reading my mind. I didn't have to think about how to interact.
Outcome
From concept to shipped feature
After three rounds of prototyping and the usability study with 18 participants, the adaptive mode was refined into a production-ready feature. The key transition, from standard controls to a half-screen elbow target, completes in under 400ms, fast enough to feel invisible in normal use.
UI Transition: Default → Macro Mode
Scale Impact
Beyond the controlled study, we tracked engagement metrics after rolling out the adaptive mode to a 500-user beta group over 4 weeks. The data showed that removing the hand-wash friction didn't just save time — it fundamentally changed how people completed recipes.
Reflection
What I learned
The real breakthrough came not from adding more input modalities, but from listening to what was already failing and responding to that context.
The conventional approach to input failure is to add more modalities — voice doesn't work? Add gesture. Gesture fails? Add gaze tracking. Each new channel adds complexity, edge cases, and cognitive load. ProxiPlay flipped this: instead of adding inputs, we read what's already failing and use that as the trigger to adapt the entire interface.
Stack voice, gesture, gaze, touch — hope one works. Each new layer adds latency, false positives, and user confusion.
Failure signals are the input. Two failed modalities + proximity = the system already knows what to do. Zero new learning curve.
Graceful degradation via multi-signal triggers isn't kitchen-specific. Any environment where users are physically separated from the interface — or where their hands, voice, or attention are compromised — benefits from the same AND-gate logic.
Hands occupied, device 1-2ft away. Voice drowned out by 65-80dB range hood. Elbow is the only clean contact point.
Remote misplaced, voice misheard. Proximity of approach + failed voice triggers simplified on-screen controls — same AND-gate pattern.
Heavy gloves block touch, machinery noise kills voice. Worker proximity + failed inputs triggers oversized panel controls.
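The context-agnostic version of this pattern is small enough to sketch directly. In this hypothetical Swift generalization, each environment (kitchen, living room, factory floor) supplies its own signals while the gate itself never changes; all type names are illustrative.

```swift
// Hypothetical generalization of the AND-gate trigger: any environment
// plugs in its own failure/proximity signals; the gate logic is shared.
protocol TriggerSignal {
    var isActive: Bool { get }
}

struct AndGate {
    let signals: [any TriggerSignal]

    // Adapt only when every signal reports failure or proximity.
    var shouldAdapt: Bool {
        !signals.isEmpty && signals.allSatisfy { $0.isActive }
    }
}

// Example signals for the TV-remote scenario above (names assumed).
struct FailedVoice: TriggerSignal { let isActive: Bool }
struct ApproachDetected: TriggerSignal { let isActive: Bool }
```

Because the gate only consumes a list of signals, moving from the kitchen to a factory floor means swapping signal implementations, not rewriting the adaptation logic.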