Wide-Open Mouth Detection - Testing Guide

🎯 Detection States

The enhanced OCTv2 classifies each detected mouth into one of four states:

🟢 WIDE_OPEN (TARGET!)

  • What it detects: Mouth wide open like saying "AHHH" or yawning
  • Requirements:
    • Inner mouth aspect ratio > 0.6
    • Outer mouth aspect ratio > 0.4
    • Significant lip separation (>8 pixels, compared as avg_lip_thickness in the code)
  • Visual: Thick green border + "🎯 TARGET!" label
  • Action: OCTv2 WILL FIRE at these mouths
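
The guide does not define how the aspect ratios are computed. One common convention, assumed here and not confirmed by octv2_server_v2.py, is the height of the landmark bounding box divided by its width. The helper name aspect_ratio and the (x, y) tuple format are hypothetical:

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (x, y) in image pixels; assumed landmark format


def aspect_ratio(points: Sequence[Point]) -> float:
    """Height / width of the bounding box around a set of mouth landmarks.

    Assumption: this is only one plausible definition of the "aspect ratio"
    used by the detector; the real code may measure it differently.
    """
    if not points:
        return 0.0
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    if width == 0:
        return 0.0  # degenerate landmarks: treat as closed
    return height / width
```

Under this definition a mouth 10 px wide and 8 px tall has a ratio of 0.8, which clears the 0.6 inner-ratio requirement.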

🟠 SPEAKING

  • What it detects: Normal speech, moderate mouth opening
  • Requirements:
    • Inner mouth aspect ratio > 0.3
    • Outer mouth aspect ratio > 0.2
    • Moderate lip separation (3-8 pixels)
  • Visual: Orange border
  • Action: IGNORED - no firing

🟡 SMILING

  • What it detects: Smiles, grins, wide but closed mouths
  • Requirements:
    • Wide mouth but minimal vertical opening
    • Mouth corners raised above center
  • Visual: Cyan border
  • Action: IGNORED - no firing

⚪ CLOSED

  • What it detects: Normal closed mouth, neutral expression
  • Visual: Gray border
  • Action: IGNORED - no firing
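
The four states above can be sketched as a single decision function. This is a minimal illustration built from the thresholds listed in this guide, not the actual _analyze_mouth_state implementation. The MouthMetrics container, the classify_mouth name, and the corners_raised flag are assumptions, since the guide gives no numeric test for SMILING:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MouthMetrics:
    """Hypothetical container for the measurements named in this guide."""
    inner_ratio: float      # inner mouth aspect ratio
    outer_ratio: float      # outer mouth aspect ratio
    lip_separation: float   # vertical lip gap in pixels
    corners_raised: bool    # assumed flag: mouth corners above the lip center


def classify_mouth(m: MouthMetrics) -> str:
    """Return WIDE_OPEN, SPEAKING, SMILING, or CLOSED.

    States are checked from most to least open, so a mouth that clears
    the WIDE_OPEN thresholds is never reported as SPEAKING.
    """
    if m.inner_ratio > 0.6 and m.outer_ratio > 0.4 and m.lip_separation > 8:
        return "WIDE_OPEN"
    if m.inner_ratio > 0.3 and m.outer_ratio > 0.2 and m.lip_separation > 3:
        return "SPEAKING"
    if m.corners_raised:
        # Reached only when vertical opening is below the SPEAKING thresholds.
        return "SMILING"
    return "CLOSED"
```

Only the WIDE_OPEN result should lead to a shot; the other three states are display-only.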

🧪 Testing Protocol

Step 1: Basic Detection Test

# Start the server to see all mouth states
python3 octv2_server_v2.py

# While it is running, test each mouth state in front of the camera:
# 1. Keep mouth closed -> Should show "CLOSED" (gray)
# 2. Smile wide -> Should show "SMILING" (cyan)
# 3. Say "hello" -> Should show "SPEAKING" (orange)
# 4. Open mouth wide (say "AHHH") -> Should show "WIDE_OPEN" (green + target)

Step 2: Targeting Test

# Put app in AUTO mode and test:
# 1. Smile at camera -> Should NOT fire
# 2. Talk to camera -> Should NOT fire
# 3. Open mouth wide -> Should aim and fire after 2 seconds

Step 3: Fine-Tuning

If detection is too sensitive or not sensitive enough, edit these values in octv2_server_v2.py:

# In _analyze_mouth_state method:

# WIDE_OPEN thresholds (raise the values to make detection stricter)
if (inner_aspect_ratio > 0.6 and      # Try 0.7 for stricter
    outer_aspect_ratio > 0.4 and      # Try 0.5 for stricter
    avg_lip_thickness > 8):            # Pixels; try 10 for stricter
    ...                                # Existing WIDE_OPEN handling stays as-is

# SPEAKING thresholds (to avoid false positives)
elif (inner_aspect_ratio > 0.3 and    # Try 0.4 to reduce speaking detection
      outer_aspect_ratio > 0.2 and
      avg_lip_thickness > 3):          # Pixels
    ...                                # Existing SPEAKING handling stays as-is

📊 Expected Results

Perfect Wide-Open Mouth:

State: WIDE_OPEN
Confidence: 0.8-1.0
Inner Ratio: >0.6
Outer Ratio: >0.4
Lip Separation: >8px

Speaking/Talking:

State: SPEAKING
Confidence: 0.4-0.8
Inner Ratio: 0.3-0.6
Outer Ratio: 0.2-0.4
Lip Separation: 3-8px

Big Smile:

State: SMILING
Confidence: 0.3
Wide mouth, corners raised
Minimal vertical opening

🎮 Visual Feedback in App

When using the iOS app in AUTO mode, you'll see:

  • All faces detected with colored rectangles
  • Real-time state classification (CLOSED/SPEAKING/SMILING/WIDE_OPEN)
  • Confidence scores for each detection
  • Target indicators only for WIDE_OPEN mouths
  • Face/Target counters in top-left corner

🔧 Common Adjustments

Too Many False Positives (fires at speaking/smiling):

# Increase WIDE_OPEN thresholds
inner_aspect_ratio > 0.7        # Was 0.6
avg_lip_thickness > 10          # Was 8

Missing Real Wide-Open Mouths:

# Decrease WIDE_OPEN thresholds
inner_aspect_ratio > 0.5        # Was 0.6
avg_lip_thickness > 6           # Was 8

Poor Lighting/Distance Issues:

# Lip separation is measured in pixels, so it grows as the subject moves closer.
# Adjust the pixel-based threshold to match the typical camera distance:
avg_lip_thickness > 12          # For closer subjects
avg_lip_thickness > 5           # For farther subjects
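
Instead of hand-picking a pixel threshold per distance, the threshold could be scaled by the detected face width. The sketch below assumes the detector exposes the face bounding-box width in pixels. The function name and the 200 px reference width are hypothetical and would need calibrating against your camera:

```python
def scaled_lip_threshold(face_width_px: float,
                         base_threshold_px: float = 8.0,
                         reference_face_width_px: float = 200.0) -> float:
    """Scale the lip-separation threshold linearly with apparent face size.

    base_threshold_px is the WIDE_OPEN default from this guide (8 px).
    reference_face_width_px is an assumed calibration value: the face width
    at which the 8 px default behaves well.
    """
    if face_width_px <= 0:
        raise ValueError("face_width_px must be positive")
    return base_threshold_px * face_width_px / reference_face_width_px
```

With these assumed defaults, a 300 px face yields a 12 px threshold and a 125 px face yields 5 px, matching the manual values suggested for closer and farther subjects.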

🎯 Optimal Target Poses

Best Targets (will fire):

  • "AHHH" sound - wide open, relaxed
  • Yawning - maximum opening
  • Surprised expression - mouth wide with shock
  • Dentist position - deliberately wide open

Non-Targets (will ignore):

  • Normal conversation - moderate opening
  • Laughing - usually more smile than wide-open
  • Singing - varies, often not wide enough
  • Any closed-mouth expression

🍪 Safety Notes

  • 2-second cooldown between automatic shots
  • Only fires at WIDE_OPEN classification
  • Manual override always available
  • Emergency stop via STOP command
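
The first, second, and fourth rules can be combined into one gate that is consulted before every automatic shot. This is an illustrative sketch only. The FireGate class and its methods are hypothetical names, not part of octv2_server_v2.py, and manual override is left out:

```python
import time


class FireGate:
    """Decides whether an automatic shot is allowed (hypothetical helper)."""

    def __init__(self, cooldown_s: float = 2.0, clock=time.monotonic):
        self.cooldown_s = cooldown_s   # 2-second cooldown from this guide
        self._clock = clock            # injectable so the gate can be tested
        self._last_fired = None        # time of the last automatic shot
        self.stopped = False           # set by the STOP command

    def stop(self) -> None:
        """Emergency stop: block all further automatic shots."""
        self.stopped = True

    def should_fire(self, state: str) -> bool:
        """Return True only for WIDE_OPEN, outside the cooldown, and not stopped."""
        if self.stopped or state != "WIDE_OPEN":
            return False
        now = self._clock()
        if self._last_fired is not None and now - self._last_fired < self.cooldown_s:
            return False
        self._last_fired = now
        return True
```

Using time.monotonic rather than time.time keeps the cooldown correct even if the system clock is adjusted while the server is running.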

🎪 Fun Testing Ideas

  1. Challenge friends to get the system to fire
  2. See who can trigger it fastest with a wide-open mouth
  3. Test different expressions to understand boundaries
  4. Fine-tune for your specific use case (kids vs adults, etc.)

Your OCTv2 now has precision targeting that only fires at genuinely wide-open mouths! 🎯🍪