
An Adaptive UI for Hands-Free Cooking

When voice and gestures fail, the screen adapts to your elbow.

All product previews are confidential while the project is in stealth mode.
Role: Interaction Designer · Prototyper & Researcher
Timeline: 8 weeks · Concept → Validated Prototype
Domain: Cooking + IoT · Accessibility · Adaptive UI
Tools: Figma · ProtoPie · TensorFlow Lite · Swift
-91% hand-wash interruptions
-78% step transition time
97% elbow-tap success rate
+34% recipe completion

The Problem

72% of cooking video users pause to wash hands before tapping "Next"

Recipe video apps assume clean, dry fingers. But in reality, users are elbow-deep in flour, raw meat, or wet ingredients. Every interaction requires a hand-wash → dry → tap → resume cycle that breaks cooking flow and adds 12-18 seconds per step transition.

Live Scenario
1. User tries voice: "Hey, next step"
2. Range hood interference: ambient noise 72dB, above the 50dB voice threshold
3. Voice command fails: "Sorry, I didn't catch that. Try again?"
4. User falls back to manual: stop cooking → wash hands → dry hands → tap screen (+15s wasted)

The Input Failure Cascade

Scenario: A cook is mid-recipe with flour-covered hands and a running range hood. Here's what happens when they try to advance to the next step.

1. Voice: "Hey, next step" → failed (72dB noise)
2. Gesture: waves a flour-covered hand → confidence 34%, occluded
3. Proximity: leans toward the iPad → detected (<30cm)
4. Mode switch: Elbow-Bump Mode activates → bottom 50% of the screen becomes the target

App Analysis: Why Existing Solutions Fail

With 72% of users pausing to wash their hands while cooking, I audited top recipe apps (NYT Cooking, Tasty, etc.) to understand where their interaction models break down.

Touch Fails

Apps rely on precise micro-interactions — tiny arrows, swiping carousels. Hands coated in flour or oil turn the screen into a physical barrier.

Examples: NYT Cooking, Tasty, Paprika
Voice Fails

Basic voice commands ("Next step") fail in real kitchens. Sizzling pans, running water, and blenders cause high error rates and frustration.

Examples: Siri Shortcuts, Alexa Skills
The Multimodal Opportunity

The audit revealed a massive gap: users need to control the UI without clean hands or a silent room. This validated Cheffy's core premise.

Macro-Gestures + Voice Fallback + Adaptive UI

Investigation

Understanding hands-occupied interaction patterns

To understand when and why standard inputs fail during cooking, I combined three methods: in-home observation to capture real behavior, sensor data to quantify environmental barriers, and diary entries to surface moments users couldn't articulate in interviews.

Contextual Inquiry: 12 home cooks observed in-kitchen over 3-hour cooking sessions (36 hours recorded)
Sensor Data Analysis: front-camera proximity and ambient noise levels across cooking phases (2,400+ data points; see the sketch below)
Diary Study: 7-day logging of "moments I wanted to tap but couldn't" (84 diary entries)
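For the sensor-data method, ambient level can be sampled with AVAudioRecorder's built-in metering. The Swift sketch below is a rough illustration under that assumption, not the study's actual instrumentation; the NoiseSampler class and the fixed calibration offset used to map dBFS to an approximate dB SPL value are placeholders.

```swift
import AVFoundation

// Minimal sketch: sample ambient kitchen noise via AVAudioRecorder metering.
// NoiseSampler and the calibration offset are illustrative assumptions.
final class NoiseSampler {
    private var recorder: AVAudioRecorder?
    // Assumed offset to map dBFS (<= 0) to an approximate dB SPL reading;
    // real devices need per-hardware calibration.
    private let calibrationOffset: Float = 90

    func start() throws {
        let session = AVAudioSession.sharedInstance()
        try session.setCategory(.record, mode: .measurement, options: [])
        try session.setActive(true)

        // Record to a throwaway file; only the meter values are used.
        let url = FileManager.default.temporaryDirectory
            .appendingPathComponent("noise-sample.m4a")
        let settings: [String: Any] = [
            AVFormatIDKey: kAudioFormatMPEG4AAC,
            AVSampleRateKey: 44_100,
            AVNumberOfChannelsKey: 1
        ]
        let recorder = try AVAudioRecorder(url: url, settings: settings)
        recorder.isMeteringEnabled = true
        recorder.record()
        self.recorder = recorder
    }

    // Estimated ambient level in dB SPL, or nil if not recording.
    func currentLevel() -> Float? {
        guard let recorder = recorder else { return nil }
        recorder.updateMeters()
        // averagePower(forChannel:) returns dBFS: 0 is full scale, quieter is negative.
        return recorder.averagePower(forChannel: 0) + calibrationOffset
    }
}
```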

Key Findings

Three patterns emerged consistently across all 12 participants. Voice commands failed most during the noisiest cooking phases — exactly when users needed hands-free control the most. Gesture recognition degraded dramatically with occluded or messy hands. But the most surprising finding was behavioral: most participants had already invented their own workaround.

🎤 Voice failure rate spiked with the range hood on (65-80dB ambient)
👋 Gesture confidence fell below 40% with flour, mitts, or utensils in hand
💪 9/12 participants had already tried elbow taps, but 44px targets were too small to hit

Kitchen Noise vs. Voice Recognition Threshold

Idle 35dB · Prep 45dB · Chop 55dB · Sauté 65dB · Hood on 72dB · Frying 80dB
(Voice recognition degrades above the ~50dB threshold.)

Diagram & Strategy

Adaptive UI State Machine

A multi-signal trigger system that gracefully degrades from standard input to proximity-based macro interaction.

State Transition Diagram

Trigger conditions (all must be met):
1. 🎤 Voice input failed
2. 👋 Gesture confidence < 40%
3. 📷 Front-camera proximity < 30cm

All three met → state change: Default UI (standard controls) becomes Macro UI (Elbow-Bump Mode). The bottom 50% of the screen turns into a giant semi-transparent "Next Step" button, with a 400ms ease-out scale transition.
Any condition missing → fallback: stay in Default UI.
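In code, the trigger reduces to a single boolean conjunction. The Swift sketch below is a minimal, illustrative version of that evaluation; the UIMode and InputSignals types are assumptions made for this example, while the thresholds mirror the diagram.

```swift
// Minimal sketch of the multi-signal AND-gate; types are illustrative.
enum UIMode {
    case standard    // Default UI: normal recipe controls
    case elbowBump   // Macro UI: bottom 50% becomes the "Next Step" target
}

struct InputSignals {
    var voiceRecognitionFailed: Bool   // last voice attempt failed
    var gestureConfidence: Double      // 0.0...1.0 from the gesture model
    var proximityMeters: Double        // estimated distance from the front camera
}

func evaluateMode(signals: InputSignals) -> UIMode {
    // All three conditions must hold before the UI adapts;
    // any single signal on its own keeps the default controls.
    let allConditionsMet = signals.voiceRecognitionFailed
        && signals.gestureConfidence < 0.40
        && signals.proximityMeters < 0.30
    return allConditionsMet ? .elbowBump : .standard
}
```

Keeping the evaluation a pure function of the latest signals also makes it straightforward to replay against logged sensor data when tuning the false-activation rate.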


Design Rationale

Four core design decisions shaped the final system. Each one was driven by a specific failure we observed in the field — the multi-signal gate came from false activations in early prototypes, the oversized target from measuring real elbow contact areas, the semi-transparent overlay from users needing to see the recipe beneath, and the auto-timeout from preventing accidental lock-in.

🔀 Multi-Signal AND-Gate: a single trigger produced 23% false positives; requiring all signals brought this below 3%.
👆 Touch Target, 120x larger: a fingertip contact patch is 8-12mm, while the elbow zone is 40-60mm, so the target grew to 50% of the screen.
👁 Semi-Transparent Overlay: an 85% gradient keeps the recipe video visible beneath the button.
Auto-Revert, 8s timeout: after 8 seconds in macro mode with no proximity the UI reverts, and a double-tap forces an immediate exit (see the sketch below).
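A minimal sketch of the auto-revert rule, reusing the UIMode enum from the state-machine sketch above: the controller class and timer plumbing are illustrative, while the 8-second window and double-tap escape hatch come straight from the rationale.

```swift
import Foundation

// Illustrative controller for the 8s auto-revert and double-tap exit.
final class MacroModeController {
    private(set) var mode: UIMode = .standard
    private var revertTimer: Timer?

    func enterElbowBumpMode() {
        mode = .elbowBump
        scheduleRevert()
    }

    // Call whenever the proximity sensor still sees the user within 30cm;
    // staying close keeps resetting the 8-second window.
    func proximityStillDetected() {
        guard mode == .elbowBump else { return }
        scheduleRevert()
    }

    // A double-tap anywhere forces an immediate return to the standard UI.
    func handleDoubleTap() {
        revertToStandard()
    }

    private func scheduleRevert() {
        revertTimer?.invalidate()
        revertTimer = Timer.scheduledTimer(withTimeInterval: 8, repeats: false) { [weak self] _ in
            self?.revertToStandard()
        }
    }

    private func revertToStandard() {
        revertTimer?.invalidate()
        revertTimer = nil
        mode = .standard
    }
}
```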

Validation

Testing with flour-covered hands

We ran a controlled usability study with 18 participants cooking actual recipes. Each participant completed the same 8-step recipe twice: once with standard UI, once with ProxiPlay adaptive mode.

Control: Standard UI
1. Standard 44px touch targets
2. Voice command only (no adaptive mode)
3. Must wash hands to tap the screen

Test: ProxiPlay Adaptive
1. Multi-signal trigger detection
2. Automatic Elbow-Bump Mode activation
3. 50%-of-screen macro touch target

Results

-91% hand-wash interruptions (from avg. 6.2 to 0.5 per recipe)
-78% step transition time (from 15.2s to 3.3s avg.)
97% elbow-tap success rate (on the 50%-screen macro button)

I literally forgot I was using my elbow after the second step. It just felt natural—lean in, bump, done.

P7 · Home cook, 3 years experience

The big button appearing when I got close was like the phone reading my mind. I didn't have to think about how to interact.

P12 · Cooking instructor

Outcome

From concept to shipped feature

After three rounds of prototyping and the usability study with 18 participants, the adaptive mode was refined into a production-ready feature. The key transition — from standard controls to a half-screen elbow target — happens in under 400ms and feels invisible when it works well.

UI Transition: Default → Macro Mode

Default UI (Kimchi Jjigae · Step 3/8) → proximity triggered → Elbow-Bump Mode ("Next Step →", tap anywhere)
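As a rough illustration of that transition, here is a minimal SwiftUI sketch. The view names, styling, and the idea of driving elbowBumpActive from the AND-gate evaluator are assumptions for this example; the 400ms ease-out timing, half-screen target, and semi-transparent overlay follow the spec above.

```swift
import SwiftUI

// Minimal sketch of the Default → Macro transition; names and styling are illustrative.
struct RecipeStepView: View {
    @State private var elbowBumpActive = false   // in practice, set by the AND-gate evaluator
    var onNextStep: () -> Void = {}

    var body: some View {
        GeometryReader { geo in
            ZStack(alignment: .bottom) {
                RecipeContent()   // standard recipe UI stays visible beneath the overlay

                if elbowBumpActive {
                    Button(action: onNextStep) {
                        Text("Next Step →")
                            .font(.system(size: 44, weight: .bold))
                            .foregroundColor(.white)
                            .frame(maxWidth: .infinity)
                            .frame(height: geo.size.height * 0.5)      // bottom 50% of the screen
                            .background(Color.black.opacity(0.85))     // approximates the 85% gradient
                    }
                    .transition(.scale.combined(with: .opacity))
                }
            }
            // 400ms ease-out, matching the spec's transition timing
            .animation(.easeOut(duration: 0.4), value: elbowBumpActive)
        }
    }
}

// Placeholder for the normal recipe step content.
struct RecipeContent: View {
    var body: some View {
        VStack {
            Text("Kimchi Jjigae").font(.title)
            Text("Step 3/8 · Sauté aromatics")
            Spacer()
        }
    }
}
```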

Scale Impact

Beyond the controlled study, we tracked engagement metrics after rolling out the adaptive mode to a 500-user beta group over 4 weeks. The data showed that removing the hand-wash friction didn't just save time — it fundamentally changed how people completed recipes.

Engagement Lift
Recipe completion rate: +34%
Session duration: +22%
Return rate (7-day): +41%

Technical Performance
Trigger accuracy: 97.2%
False activation rate: 2.8%
Mode switch latency: <400ms

Reflection

What I learned

The real breakthrough came not from adding more input modalities, but from listening to what was already failing and responding to that context.
Context is the best input — When voice fails and gestures fail, the system already knows the user needs help. Reading failure signals beats adding new modalities.
Body mechanics drive digital design — The 50% screen button only works because we measured real elbow contact areas (4.2 × 6.8cm) and forearm approach angles first.
Multi-signal > single threshold — Adding voice failure and gesture confidence as co-signals dropped false activations from 23% to under 3%. The AND-gate pattern is now my default.
Design Philosophy Shift

The conventional approach to input failure is to add more modalities — voice doesn't work? Add gesture. Gesture fails? Add gaze tracking. Each new channel adds complexity, edge cases, and cognitive load. ProxiPlay flipped this: instead of adding inputs, we read what's already failing and use that as the trigger to adapt the entire interface.

Old Approach: 🎤 + 👋 + 👆 + ?
Add more input modalities

Stack voice, gesture, gaze, touch — hope one works. Each new layer adds latency, false positives, and user confusion.

New Approach: 🎤 + 👋 → ⚙️
Listen to failures, then adapt

Failure signals are the input. Two failed modalities + proximity = the system already knows what to do. Zero new learning curve.

Where This Pattern Applies

Graceful degradation via multi-signal triggers isn't kitchen-specific. Any environment where users are physically separated from the interface — or where their hands, voice, or attention are compromised — benefits from the same AND-gate logic.

🍳
Kitchen Counter

Hands occupied, device 1-2ft away. Voice drowned out by 65-80dB range hood. Elbow is the only clean contact point.

Voice ✗ Touch ✗ Elbow ✓
📺
10-ft Living Room

Remote misplaced, voice misheard. Proximity of approach + failed voice triggers simplified on-screen controls — same AND-gate pattern.

Remote ✗ Voice ✗ Walk-up ✓
🏭
Industrial Floor

Heavy gloves block touch, machinery noise kills voice. Worker proximity + failed inputs triggers oversized panel controls.

Touch ✗ Voice ✗ Proximity ✓