How It Works
The mechanics: dual audio capture, real-time speech-to-text, latency, and how answer suggestions are generated.
This cluster is for people who want to understand the pipeline before they trust their interview to it. Reasonable.
End-to-end, an answer suggestion takes four steps: capture, transcribe, generate, render. Capture is OS-native — ScreenCaptureKit (macOS) or WASAPI (Windows) — pulling system audio at the OS level so the AI hears the interviewer the way your speakers do. The microphone is captured separately so the AI also has your audio for context and for the post-interview transcript. Transcription is real-time speech-to-text. Generation passes the question plus your resume, the job description, and the conversation history so far to GPT-4o, with a system prompt that constrains output to interview-appropriate length. Rendering streams the answer into a floating overlay window that exists outside the conferencing app's window — you can drag it anywhere, including off the screen-share area.
The end-to-end first-token latency budget is sub-400 milliseconds. Past that point your eyes shift off-camera while you read the answer, which defeats the purpose. The answers below cover each stage in detail, what happens when the budget is exceeded, and the trade-offs we picked. (For the deeper why-we-built-it context, see the founder letter.)
- Can recruiters use an AI interview assistant when applying for new roles?
- Can I use an AI interview assistant during a phone interview?
- How do AI interview assistants capture system audio on iOS?
- What are the real limitations of an interview copilot or AI interview assistant?
- What is a real-time interview copilot and how does it work?
- What is an AI interview answers generator and how does it work?
- How does an AI generate interview answer suggestions in real time, during a live interview?
- How much latency does an AI interview assistant add during a live interview?
- How do AI interview assistants capture system audio during a video interview?
- Can an AI interview assistant transcribe both the interviewer and the candidate?
- What is an interview copilot and how does it work?
- What is an AI interview assistant and how does it work?
- Where is the best place to position the AI interview assistant overlay on my screen during a real interview?
- Does an AI interview assistant need a browser extension to work during a live interview?
- How does real-time interview speech-to-text work?