
ProxiPlay: An Adaptive UI for Hands-Occupied Cooking

When voice and gestures fail, the screen adapts to your elbow.

Scroll to explore the full case study
Role
Interaction Designer
Prototyper & Researcher
Timeline
8 Weeks
Concept → Validated Prototype
Domain
Cooking + IoT
Accessibility · Adaptive UI
Tools
Figma · ProtoPie
TensorFlow Lite · Swift
-91%
Hand-wash interruptions
-78%
Step transition time
97%
Elbow-tap success rate
+34%
Recipe completion

The Problem

72% of cooking video users pause to wash hands before tapping "Next"

Recipe video apps assume clean, dry fingers. But in reality, users are elbow-deep in flour, raw meat, or wet ingredients. Every interaction requires a hand-wash → dry → tap → resume cycle that breaks cooking flow and adds 12-18 seconds per step transition.

Live Scenario
User tries voice
"Hey, next step"
Range hood interference
Ambient noise: 72dB — above 50dB voice threshold
Voice command failed
"Sorry, I didn't catch that. Try again?"
User falls back to manual
1. Stop cooking
2. Wash hands
3. Dry hands
4. Tap screen
+15s wasted
72%
wash hands before tapping
~15s
avg. wasted per step change

The Input Failure Cascade

Scenario: A cook is mid-recipe with flour-covered hands and a running range hood. Here's what happens when they try to advance to the next step.

Voice
"Hey, next step"
FAILED · 72dB noise
Gesture
Waves flour-covered hand
CONF: 34% · occluded
Proximity
Leans toward iPad
DETECTED · <30cm
Mode Switch
Elbow-Bump Mode
ACTIVE · 50% screen

Investigation

Understanding hands-occupied interaction patterns

To understand when and why standard inputs fail during cooking, I combined three methods: in-home observation to capture real behavior, sensor data to quantify environmental barriers, and diary entries to surface moments users couldn't articulate in interviews.

Contextual Inquiry
12 home cooks observed in-kitchen over 3-hour cooking sessions
36 hours recorded
Sensor Data Analysis
Front camera proximity + ambient noise levels across cooking phases
2,400+ data points
Diary Study
7-day logging of "moments I wanted to tap but couldn't"
84 diary entries

Key Findings

Three patterns emerged consistently across all 12 participants. Voice commands failed most during the noisiest cooking phases — exactly when users needed hands-free control the most. Gesture recognition degraded dramatically with occluded or messy hands. But the most surprising finding was behavioral: most participants had already invented their own workaround.

🎤
65-80dB
Range hood noise
Above 50dB voice threshold
👋
<40%
Gesture confidence
Flour, mitts, utensils
💪
9/12
Tried elbow taps
But 44px targets too small

Kitchen Noise vs. Voice Recognition Threshold

Idle
35dB
Prep
45dB
Chop
55dB
Sauté
65dB
Hood On
72dB
Frying
80dB
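The noise levels above can be read as a simple gate on voice input. This is an illustrative Python sketch (not the shipped Swift implementation); the phase names and decibel values come from the chart, and the ~50dB threshold from the recognition failures observed in the study.

```python
# Sketch: which cooking phases leave voice input viable, given the
# ~50dB recognition threshold observed in testing.

VOICE_THRESHOLD_DB = 50  # above this, voice recognition failed in the study

PHASE_NOISE_DB = {
    "idle": 35,
    "prep": 45,
    "chop": 55,
    "saute": 65,
    "hood_on": 72,
    "frying": 80,
}

def voice_viable(phase: str) -> bool:
    """True if ambient noise for this phase is below the threshold."""
    return PHASE_NOISE_DB[phase] < VOICE_THRESHOLD_DB

viable_phases = [p for p, db in PHASE_NOISE_DB.items() if db < VOICE_THRESHOLD_DB]
# Only idle and prep stay below 50dB; every active cooking phase
# drowns out voice input.
```

This is why voice failed exactly when hands-free control mattered most: the threshold is crossed as soon as real cooking begins.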

Diagram & Strategy

Adaptive UI State Machine

A multi-signal trigger system that gracefully degrades from standard input to proximity-based macro interaction.

State Transition Diagram
TRIGGER CONDITIONS
🎤 Voice input: FAILED
👋 Gesture confidence: < 40%
AND
SENSOR INPUT
📷 Front camera proximity: < 30cm
ALL MET? YES → state change · NO → stay in Default UI
STATE CHANGE
Default UI (standard controls) → Macro UI (Elbow-Bump Mode)
Visual: bottom 50% of screen becomes a giant semi-transparent "Next Step" button
Transition: 400ms ease-out scale
Trigger (Failed Input)
Sensor Signal
State Change
Fallback (No Change)
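The AND-gate in the state diagram can be sketched in a few lines. This is illustrative Python, not the production Swift/TensorFlow Lite code; the thresholds (gesture confidence < 40%, proximity < 30cm) and the signal names follow the diagram, while the `Signals` container is my own framing.

```python
# Sketch of the multi-signal AND-gate that triggers Elbow-Bump Mode.

from dataclasses import dataclass

GESTURE_CONF_MIN = 0.40   # below this, gesture input counts as failed
PROXIMITY_MAX_CM = 30     # user leaning within 30cm of the front camera

@dataclass
class Signals:
    voice_failed: bool        # last voice command was not recognized
    gesture_confidence: float # 0.0-1.0 from the gesture model
    proximity_cm: float       # distance estimate from the front camera

def should_enter_elbow_bump(s: Signals) -> bool:
    """All three conditions must hold (AND-gate) to switch modes.
    Requiring every signal is what dropped false activations
    from 23% to under 3%."""
    return (
        s.voice_failed
        and s.gesture_confidence < GESTURE_CONF_MIN
        and s.proximity_cm < PROXIMITY_MAX_CM
    )

# The cascade from the scenario: drowned-out voice, flour-occluded
# gesture at 34% confidence, user leaning in -> mode switch fires.
fires = should_enter_elbow_bump(Signals(True, 0.34, 25.0))
# Any single failure alone is not enough:
stays = should_enter_elbow_bump(Signals(True, 0.80, 25.0))
```

Note that a single missing condition keeps the default UI, which is exactly the "NO → Stay Default" branch in the diagram.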

Try It: Trigger the Mode Switch

Click each condition to simulate the multi-signal evaluation. All three must be active to trigger Elbow-Bump Mode.

🎤
Voice Input
Click to simulate failure
👋
Gesture Confidence
Click to drop below 40%
📷
Proximity Sensor
Click to detect < 30cm
Default UI Active
Recipe Step 3/8
Elbow-Bump Mode
Next Step →
Tap anywhere

Design Rationale

Four core design decisions shaped the final system, each driven by a specific failure observed in the field: the multi-signal gate came from false activations in early prototypes, the oversized target from measuring real elbow contact areas, the semi-transparent overlay from users needing to see the recipe beneath, and the auto-timeout from preventing accidental lock-in.

🔀
Multi-Signal AND-Gate
Single trigger
23% false positive
Multi-signal
<3% false positive
👆
Touch Target: 120x Larger
Fingertip
8-12mm
vs
Elbow zone
40-60mm
Macro button: 50% of screen
👁
Semi-Transparent Overlay
Video visible
85% gradient → recipe still visible beneath button
Auto-Revert: 8s Timeout
Macro mode active + 8s with no proximity → revert to default
Or double-tap to force exit
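The auto-revert decision can be sketched as a tiny state object. This is a hedged illustration, not the shipped code: the 8s timeout and double-tap exit come from the design above, while passing timestamps in explicitly (rather than wiring up sensor callbacks) is a simplification for testability.

```python
# Sketch: Elbow-Bump Mode reverts after 8s with no proximity signal,
# or immediately on a double-tap.

REVERT_TIMEOUT_S = 8.0

class ElbowBumpMode:
    def __init__(self, now: float):
        self.active = True
        self.last_proximity = now  # entering the mode implies proximity

    def on_proximity(self, now: float):
        self.last_proximity = now  # any proximity event resets the timeout

    def on_double_tap(self):
        self.active = False        # user forces an exit

    def tick(self, now: float):
        # Auto-revert to the default UI once the timeout elapses.
        if now - self.last_proximity >= REVERT_TIMEOUT_S:
            self.active = False

mode = ElbowBumpMode(now=0.0)
mode.tick(now=5.0)       # 5s since proximity -> still active
mode.on_proximity(now=5.0)
mode.tick(now=12.9)      # 7.9s since last proximity -> still active
mode.tick(now=13.0)      # 8.0s elapsed -> reverts to default UI
```

Resetting the timer on every proximity event is what prevents the "accidental lock-in" the timeout was designed against: as long as the cook keeps leaning in, the macro UI stays.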

Validation

Testing with flour-covered hands

We ran a controlled usability study with 18 participants cooking actual recipes. Each participant completed the same 8-step recipe twice: once with standard UI, once with ProxiPlay adaptive mode.

Control: Standard UI
1
Standard 44px touch targets
2
Voice command only (no adaptive)
3
Must wash hands to tap screen
Test: ProxiPlay Adaptive
1
Multi-signal trigger detection
2
Auto Elbow-Bump Mode activation
3
50% screen macro touch target

Results

-91%
Hand-wash interruptions
From avg. 6.2 → 0.5 per recipe
-78%
Step transition time
From 15.2s → 3.3s avg.
97%
Elbow-tap success rate
On 50% screen macro button
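The headline percentages follow directly from the before/after averages reported above; a quick arithmetic check:

```python
# Verify the reported reductions from the raw before/after averages.

def pct_reduction(before: float, after: float) -> float:
    return (before - after) / before * 100

interruptions = pct_reduction(6.2, 0.5)  # hand-wash interruptions per recipe
transition = pct_reduction(15.2, 3.3)    # step transition time in seconds

print(int(interruptions))  # 91  -> reported as -91%
print(int(transition))     # 78  -> reported as -78%
```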

I literally forgot I was using my elbow after the second step. It just felt natural—lean in, bump, done.

P7 · Home cook, 3 years experience

The big button appearing when I got close was like the phone reading my mind. I didn't have to think about how to interact.

P12 · Cooking instructor

Outcome

From concept to shipped feature

After three rounds of prototyping and the usability study with 18 participants, the adaptive mode was refined into a production-ready feature. The key transition — from standard controls to a half-screen elbow target — happens in under 400ms and feels invisible when it works well.

UI Transition: Default → Macro Mode

Default UI
Kimchi Jjigae
Step 3/8
Proximity
triggered
Elbow-Bump Mode
Elbow-Bump Mode
Next Step →
Tap anywhere

Scale Impact

Beyond the controlled study, we tracked engagement metrics after rolling out the adaptive mode to a 500-user beta group over 4 weeks. The data showed that removing the hand-wash friction didn't just save time — it fundamentally changed how people completed recipes.

Engagement Lift
Recipe completion rate+34%
Session duration+22%
Return rate (7-day)+41%
Technical Performance
Trigger accuracy97.2%
False activation rate2.8%
Mode switch latency< 400ms

Reflection

What I learned

The real breakthrough came not from adding more input modalities, but from listening to what was already failing and responding to that context.
Context is the best input — When voice fails and gestures fail, the system already knows the user needs help. Reading failure signals beats adding new modalities.
Body mechanics drive digital design — The 50% screen button only works because we measured real elbow contact areas (4.2 × 6.8cm) and forearm approach angles first.
Multi-signal > single threshold — Adding voice failure and gesture confidence as co-signals dropped false activations from 23% to under 3%. The AND-gate pattern is now my default.
Design Philosophy Shift

The conventional approach to input failure is to add more modalities — voice doesn't work? Add gesture. Gesture fails? Add gaze tracking. Each new channel adds complexity, edge cases, and cognitive load. ProxiPlay flipped this: instead of adding inputs, we read what's already failing and use that as the trigger to adapt the entire interface.

Old Approach
🎤
+
👋
+
👆
+
?
Add more input modalities

Stack voice, gesture, gaze, touch — hope one works. Each new layer adds latency, false positives, and user confusion.

New Approach
🎤
+
👋
⚙️
Listen to failures, then adapt

Failure signals are the input. Two failed modalities + proximity = the system already knows what to do. Zero new learning curve.

Where This Pattern Applies

Graceful degradation via multi-signal triggers isn't kitchen-specific. Any environment where users are physically separated from the interface — or where their hands, voice, or attention are compromised — benefits from the same AND-gate logic.
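The same AND-gate generalizes by swapping out which signals feed it. The sketch below is an assumption-laden illustration (the condition sets and the 200cm "walk-up" distance for the living-room case are mine, drawn loosely from the scenarios in this section), showing how each environment is just a different configuration of the same all-must-hold trigger.

```python
# Generalized AND-gate: a trigger is a set of per-signal checks, and the
# adaptive mode fires only when every check holds.

from typing import Callable, Dict

SignalMap = Dict[str, float]

def and_gate(*conditions: Callable[[SignalMap], bool]) -> Callable[[SignalMap], bool]:
    """Combine per-signal checks into a single all-must-hold trigger."""
    return lambda s: all(cond(s) for cond in conditions)

# Kitchen: failed voice + low gesture confidence + close proximity
kitchen_trigger = and_gate(
    lambda s: s["voice_failed"] == 1,
    lambda s: s["gesture_conf"] < 0.40,
    lambda s: s["proximity_cm"] < 30,
)

# Living room (10-ft UI): failed voice + user walking toward the screen
# (200cm approach distance is a hypothetical threshold for illustration)
living_room_trigger = and_gate(
    lambda s: s["voice_failed"] == 1,
    lambda s: s["approach_cm"] < 200,
)

fires = kitchen_trigger({"voice_failed": 1, "gesture_conf": 0.34, "proximity_cm": 25})
```

The environment changes; the gate logic does not, which is what makes the pattern portable to the industrial-floor case as well.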

🍳
Kitchen Counter

Hands occupied, device 1-2ft away. Voice drowned out by 65-80dB range hood. Elbow is the only clean contact point.

Voice ✗ Touch ✗ Elbow ✓
📺
10-ft Living Room

Remote misplaced, voice misheard. Proximity of approach + failed voice triggers simplified on-screen controls — same AND-gate pattern.

Remote ✗ Voice ✗ Walk-up ✓
🏭
Industrial Floor

Heavy gloves block touch, machinery noise kills voice. Worker proximity + failed inputs triggers oversized panel controls.

Touch ✗ Voice ✗ Proximity ✓