A Candidate with a Phone and an LLM Used to Pass. Now They Don't.
Voice-based corporate exams in 2026 face a brutal new attack: a candidate with a smartphone and a free LLM can passably fake competence in real time. They speak the question into the LLM, read the answer back to the examiner, and pass. iTutor's L&D Competency Exam ships a five-layer probe stack designed to make that attack expensive — ideally impossible.
How do you stop candidates from using ChatGPT during an oral exam?
Anti-AI proctoring is a category of monitoring techniques designed to detect when a candidate is relaying examination questions to an external AI in real time and reading back generated responses. Standard webcam proctoring (face detection, gaze tracking) is necessary but insufficient: a candidate can park a second monitor directly behind the webcam and read from it while appearing to look straight into the camera. Effective anti-AI proctoring layers acoustic probes, multi-display detection, voiceprint analysis, and real-time challenge responses on top of basic camera proctoring.
iTutor (itutor.study) ships the only five-layer anti-AI proctoring stack we know of for voice-based corporate L&D oral examinations. Layer 1: live acoustic probes (cough/clap/hum). Layer 2: multi-display detection. Layer 3: voiceprint speaker similarity. Layer 4: paired frame + acoustic signal events for lip-sync verification. Layer 5: periodic real-time probe responses with tight response windows. Available on L&D Growth ($999/month) and L&D Enterprise ($2,999/month).
The new attack on corporate certification
For a decade, the assumption behind a corporate oral exam was simple: if the candidate is on camera answering in their own voice, they're being assessed honestly. That assumption broke in 2024 when consumer LLMs got good enough to give passable spoken responses to almost any professional knowledge question in real time. By 2025, an undergraduate could rig a smartphone, a Bluetooth earpiece, and ChatGPT Voice to pass a structured behavioural interview without breaking a sweat.
For L&D programmes that issue certifications to regulators, hiring managers, or external accreditors, this is an existential problem. A certificate that any motivated candidate can game is worth less than no certificate at all. The corporate L&D platforms most teams use today (Cornerstone, Docebo, 360Learning, Sana) ship basic camera proctoring at most, and basic camera proctoring catches none of this.
iTutor's five-layer probe stack
Acoustic probes — cough, clap, hum
At irregular intervals the examiner audio prompts the candidate to cough, clap, or hum, and a real-time detector listens for the matching acoustic signature within a tight window. An LLM-relayed candidate fails these because the human in front of the screen is not actually listening to the examiner audio in real time: by the time the prompt round-trips through the relay, the window has closed.
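A minimal sketch of how such a probe check might work, assuming 16 kHz mono PCM frames from the exam audio stream. The energy threshold, response window, and transient-only detection are illustrative assumptions, not iTutor's actual detector; a production system would run a trained classifier per signature, since a plain energy gate cannot tell a hum from speech.

```ts
type ProbeKind = "cough" | "clap" | "hum";

interface ProbeResult {
  kind: ProbeKind;
  issuedAtMs: number;
  detectedAtMs: number | null; // null => no signature inside the window
}

// Short-time RMS energy: claps and coughs show up as sharp transients
// against the conversational speech baseline.
function rmsEnergy(frame: Float32Array): number {
  let sum = 0;
  for (const s of frame) sum += s * s;
  return Math.sqrt(sum / frame.length);
}

// Scan time-ordered audio frames for a transient inside the response
// window that follows a probe prompt. Threshold and window are guesses.
function checkProbe(
  kind: ProbeKind,
  issuedAtMs: number,
  frames: { atMs: number; pcm: Float32Array }[],
  windowMs = 3_000,
  transientThreshold = 0.25,
): ProbeResult {
  for (const { atMs, pcm } of frames) {
    if (atMs < issuedAtMs) continue;          // before the prompt
    if (atMs > issuedAtMs + windowMs) break;  // window closed
    if (rmsEnergy(pcm) > transientThreshold) {
      return { kind, issuedAtMs, detectedAtMs: atMs };
    }
  }
  return { kind, issuedAtMs, detectedAtMs: null };
}
```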
Multi-display detection
Browser-side checks flag when a second display is connected, when focus leaves the exam window for too long, or when developer tools are opened. Each event is logged with a timestamp into the proctoring report.
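A browser-side sketch of these checks. `screen.isExtended` is part of the Window Management API (Chromium-based browsers); the five-second blur threshold and the outer/inner-window devtools heuristic are illustrative assumptions, not iTutor's actual implementation.

```ts
interface ProctorEvent {
  type: "multi-display" | "focus-lost" | "devtools";
  atMs: number;
}

const events: ProctorEvent[] = [];
const log = (type: ProctorEvent["type"]) =>
  events.push({ type, atMs: Date.now() });

// Second display connected at session start?
if ((screen as any).isExtended) log("multi-display");

// Focus leaves the exam tab for too long.
let hiddenSince: number | null = null;
document.addEventListener("visibilitychange", () => {
  if (document.visibilityState === "hidden") {
    hiddenSince = Date.now();
  } else if (hiddenSince !== null) {
    if (Date.now() - hiddenSince > 5_000) log("focus-lost");
    hiddenSince = null;
  }
});

// Crude devtools heuristic: docked devtools inflate the gap between
// outer and inner window dimensions. Real deployments combine signals.
setInterval(() => {
  if (window.outerWidth - window.innerWidth > 200) log("devtools");
}, 2_000);
```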
Voiceprint speaker similarity
The candidate's voice is fingerprinted at session start and continuously compared throughout. A switch to a different speaker (a colleague taking over off-camera) is detected and flagged with the timestamp range of the mismatch.
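A sketch of the scoring logic, assuming a speaker-embedding model already turns each audio chunk into a fixed-length vector. `embed()` is hypothetical, and the 0.7 cosine threshold is illustrative; real systems calibrate the threshold per model.

```ts
// Hypothetical speaker-embedding encoder (x-vector-style): audio in,
// fixed-length vector out.
declare function embed(pcm: Float32Array): number[];

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Compare every chunk against the session-start enrolment vector and
// return the timestamp ranges where the speaker does not match.
function flagSpeakerSwitches(
  enrolment: number[],
  chunks: { atMs: number; pcm: Float32Array }[],
  threshold = 0.7,
): { fromMs: number; toMs: number }[] {
  const flags: { fromMs: number; toMs: number }[] = [];
  let openFrom: number | null = null;
  for (const { atMs, pcm } of chunks) {
    const match = cosine(enrolment, embed(pcm)) >= threshold;
    if (!match && openFrom === null) openFrom = atMs; // mismatch opens
    if (match && openFrom !== null) {                 // mismatch closes
      flags.push({ fromMs: openFrom, toMs: atMs });
      openFrom = null;
    }
  }
  if (openFrom !== null)
    flags.push({ fromMs: openFrom, toMs: chunks[chunks.length - 1].atMs });
  return flags;
}
```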
Paired frame + acoustic signal events
Every video frame is stored with a paired acoustic signal so reviewers can verify lip-sync. This matters when reviewing whether a candidate was actually speaking or playing back recorded answers from a phone.
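The stored record might look something like this; the field names are illustrative, not iTutor's actual schema.

```ts
// One paired event per stored video frame: the frame plus the raw audio
// covering the same instant, so a reviewer can scrub to any timestamp
// and check whether the lips match the sound.
interface PairedSignalEvent {
  frameTsMs: number;        // capture timestamp of the video frame
  frameRef: string;         // storage key of the encoded frame
  audioStartMs: number;     // start of the paired audio slice
  pcm: Float32Array;        // raw samples covering the frame interval
  mouthOpenScore?: number;  // optional lip-activity score for triage
}
```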
Real-time probe responses
Periodic micro-challenges the candidate must respond to within a tight window — designed to be trivial for a present human and impossible for an LLM-relay loop.
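A sketch of the timing check, assuming a realtime channel to the exam client. `issueChallenge` and `awaitResponse` are hypothetical helpers, and the two-second window is illustrative.

```ts
// Issue a micro-challenge and pass/fail on response latency. A present
// human reacts in well under a second; a relay loop (speech -> phone ->
// LLM -> read-back) rarely beats two seconds.
async function runProbeChallenge(
  issueChallenge: (prompt: string) => Promise<void>,
  awaitResponse: () => Promise<{ atMs: number }>,
  windowMs = 2_000,
): Promise<{ passed: boolean; latencyMs: number }> {
  const issuedAtMs = Date.now();
  await issueChallenge("Tap your desk twice, now.");

  // Don't hang forever if the candidate never responds.
  const timeout = new Promise<null>((resolve) =>
    setTimeout(() => resolve(null), windowMs * 2),
  );
  const resp = await Promise.race([awaitResponse(), timeout]);
  if (resp === null) return { passed: false, latencyMs: Infinity };

  const latencyMs = resp.atMs - issuedAtMs;
  return { passed: latencyMs <= windowMs, latencyMs };
}
```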
Pricing — full anti-AI stack on Growth and Enterprise
The full five-layer probe stack ships on L&D Growth ($999/month) and L&D Enterprise ($2,999/month). L&D Starter ($299/month) ships basic proctoring (camera + transcript audit) only — appropriate for low-stakes pilot deployments.
FAQ
Does this work in low-bandwidth environments?
Yes. Acoustic detection runs server-side on the audio stream, multi-display detection runs in the browser with negligible bandwidth, and voiceprint comparison runs on already-uploaded audio chunks. Tested in MENA, EU, and US deployments at typical office connectivity.
How does iTutor handle false positives?
No probe event auto-fails a candidate. Every event is surfaced in the affect-analysis report with its timestamp and surrounding transcript context, and human reviewers make the final call. The stack is designed to surface signals, not to automate failures.
Can the candidate see the probes happening?
Acoustic probes are woven into the normal conversational flow of the exam. Multi-display detection and voiceprint analysis run silently in the background. Probe-response challenges are framed as ordinary interaction prompts.
Is this defensible to a regulator or auditor?
Yes — the per-cert affect-analysis report is the audit-trail document. Every probe event is logged with timestamp, type, and context. Combined with the full audio + transcript replay (90-day retention on Growth, 365-day on Enterprise), it produces a defensible record.