Agent Interaction Lab

Eva Agent Model

How communication research becomes an avatar interaction

Eva demonstrates a practical idea: an agent should not only respond to the words people say, but also attend to how the interaction unfolds. The core model builds on research in audiovisual prosody, spoken dialogue, feedback, turn-taking, communication problems and emotion.

Two roles

Eva first talks with you, then analyses the interaction

During the live minute, Eva's first task is simple: have a short, natural dialogue. She responds to the situation you describe, asks follow-up questions and adapts her tone when clear affect cues appear.

After the interaction, Eva switches role. She analyses the transcript for moments where communication may become easier, harder, more emotional, less clear or more repair-oriented. That second role is where the research categories become visible to the user.

What Eva Watches For

A practical translation of audiovisual prosody and interaction research

The current demo is text-first, but the conceptual model comes from spoken, audiovisual interaction. Eva watches how people show understanding, hesitation, emotion, emphasis, repair and turn completion. She treats feedback words, pauses, overlap, uncertainty markers, emotionally loaded phrases and finality cues as small interaction signals. During the conversation those signals can change her wording, pace and follow-up question; after the conversation they become visible as practical feedback categories.
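As an illustration of treating small cues as interaction signals, the sketch below tags cue phrases in a single text turn. It is a minimal example under stated assumptions, not Eva's implementation: the cue lists, the `Signal` type and the `find_signals` function are all hypothetical names chosen for this example, and a spoken system would draw on audio and video rather than keyword patterns.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical cue lists, for illustration only (not Eva's actual vocabulary).
CUE_PATTERNS = {
    "feedback": [r"\bi see\b", r"\bokay\b", r"\bmm+\b"],
    "uncertainty": [r"\bmaybe\b", r"\bi guess\b", r"\bnot sure\b"],
    "repair": [r"\bi mean\b", r"\byou do not understand me\b"],
    "emotion": [r"\btoo much\b", r"\bfed up\b", r"\bso happy\b"],
}


@dataclass
class Signal:
    category: str  # which signal family the cue belongs to
    text: str      # the matched cue phrase, lowercased
    start: int     # character offset of the cue in the turn


def find_signals(turn: str) -> List[Signal]:
    """Return every cue found in the turn, ordered by position."""
    lowered = turn.lower()
    signals = []
    for category, patterns in CUE_PATTERNS.items():
        for pattern in patterns:
            for match in re.finditer(pattern, lowered):
                signals.append(Signal(category, match.group(0), match.start()))
    return sorted(signals, key=lambda s: s.start)


for signal in find_signals("I guess so, but you do not understand me."):
    print(signal.category, "->", signal.text)
```

The same tagged signals could feed both roles described above: a live response policy during the conversation and a per-category summary afterwards.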

Feedback

Eva looks for signs that people understand each other, need clarification, correct themselves, ask for grounding or show that a misunderstanding may be forming.

Turn-taking

Eva pays attention to pauses, overlap, interruptions and moments where someone may still be searching for words rather than handing over the turn.

Emotion and emphasis

Eva reads affective language, emphasis markers and emotionally loaded phrases as cues for adapting tone, pace and wording without turning that into a diagnosis.

Repair

Eva treats corrections, rephrasing, explicit disagreement and phrases such as "you do not understand me" as repair moments where shared understanding needs attention.

Finality cues

Eva looks for language and timing cues that suggest whether someone is finished, wants to continue, or needs a little more time before the agent responds.
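A text-only approximation of finality cues might combine how a turn ends with how long the person has paused. The sketch below is a hedged example: the marker sets, the 1.5-second default threshold and the `classify_turn_end` function are assumptions made for illustration, and the research this page draws on concerns auditory and visual cues rather than punctuation.

```python
import re

# Hypothetical markers, for illustration only.
FILLERS = {"um", "uh", "er", "hmm"}
CONTINUERS = {"and", "but", "because", "so", "or"}


def classify_turn_end(text: str, pause_seconds: float,
                      pause_threshold: float = 1.5) -> str:
    """Guess whether a speaker is finished, wants to continue or needs time."""
    stripped = text.strip()
    words = re.findall(r"[a-z']+", stripped.lower())
    last_word = words[-1] if words else ""

    # Trailing off is checked first, because "..." also ends with ".".
    if stripped.endswith(("...", "…")):
        return "needs_time"
    # Terminal punctuation is treated as a strong completion cue.
    if stripped.endswith((".", "?", "!")):
        return "finished"
    # Ending on a filler suggests the speaker is still searching for words.
    if last_word in FILLERS:
        return "needs_time"
    # Ending on a comma or conjunction suggests more is coming.
    if stripped.endswith(",") or last_word in CONTINUERS:
        return "wants_to_continue"
    # With no textual cue, fall back on the length of the pause.
    return "finished" if pause_seconds >= pause_threshold else "needs_time"


print(classify_turn_end("I talked to my manager and", pause_seconds=0.5))
print(classify_turn_end("I just, um...", pause_seconds=2.0))
```

Checking the textual cues before the pause length means a long silence after "and" still counts as an unfinished turn, which keeps the agent from cutting in too early.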

Affect adaptation

Eva can slightly shift tone, speed and wording when joy, sadness, stress, anger, confusion or relief appear, while keeping the conversation focused on the situation described.

Uncertainty

Eva treats hesitation, mixed wording and uncertainty markers as cues to slow down, clarify and avoid overconfident conclusions.
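One simple way to turn uncertainty markers into a pacing decision is to measure how densely they occur in a turn. The following is a rough sketch, assuming a hypothetical hedge list and an arbitrary threshold of 0.1; `hedge_density` and `choose_strategy` are illustrative names, not part of Eva.

```python
import re

# Hypothetical hedge phrases, for illustration only.
HEDGES = ["maybe", "perhaps", "i guess", "i think", "not sure",
          "sort of", "kind of", "i don't know"]


def hedge_density(text: str) -> float:
    """Return the number of hedge phrases divided by the number of words."""
    lowered = text.lower()
    word_count = len(re.findall(r"[a-z']+", lowered))
    if word_count == 0:
        return 0.0
    hits = sum(
        len(re.findall(r"\b" + re.escape(hedge) + r"\b", lowered))
        for hedge in HEDGES
    )
    return hits / word_count


def choose_strategy(text: str, threshold: float = 0.1) -> str:
    """Slow down and clarify when hedging is dense, otherwise carry on."""
    if hedge_density(text) >= threshold:
        return "slow_down_and_clarify"
    return "continue"


print(choose_strategy("Maybe I should quit, I guess, but I'm not sure."))
print(choose_strategy("I will start on Monday."))
```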

Grounding

Eva looks for moments where examples, shared references or concrete next steps can make an abstract concern easier to discuss.

Conversation trouble

Eva notices when the interaction itself becomes the topic, such as confusion, frustration, repetition or a request to change approach.

Emotion Model

From emotion words to conversational adaptation

Eva uses a compact affect layer inspired by universal emotion families such as joy, sadness, anger, fear, disgust and surprise. For conversation, she also recognises broader affect cues such as stress, overwhelm, shame, confusion, relief, anxiety, irritation and loneliness.

This is not a diagnosis layer. It is a communication layer. When someone says "this is getting too much", Eva can become calmer and help narrow the next step. When someone says "you do not understand me", she treats it as a repair signal.
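The two examples above can be sketched as a small lookup from phrases to affect cues and from affect cues to conversational adaptations. This is an illustrative sketch only: the phrase table, the labels, the `Adaptation` fields and the `adapt` function are assumptions for this example, and the output describes a communication choice, not a judgement about the person.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Adaptation:
    tone: str  # how the reply should sound
    pace: str  # how quickly the agent should move on
    move: str  # what the next conversational step should be


# Hypothetical phrase table, for illustration only.
PHRASE_TO_CUE = {
    "this is getting too much": "overwhelm",
    "you do not understand me": "repair_signal",
    "i am so relieved": "relief",
}

CUE_TO_ADAPTATION = {
    "overwhelm": Adaptation("calm", "slower", "narrow the next step"),
    "repair_signal": Adaptation("neutral", "slower", "restate and check understanding"),
    "relief": Adaptation("warm", "normal", "acknowledge and continue"),
}

DEFAULT_ADAPTATION = Adaptation("neutral", "normal", "ask a follow-up question")


def adapt(utterance: str) -> Tuple[Optional[str], Adaptation]:
    """Return the first matching cue label and the adaptation it maps to."""
    normalised = " ".join(utterance.lower().split())
    for phrase, cue in PHRASE_TO_CUE.items():
        if phrase in normalised:
            return cue, CUE_TO_ADAPTATION[cue]
    return None, DEFAULT_ADAPTATION


cue, adaptation = adapt("Honestly, this is getting too much for me.")
print(cue, adaptation.tone, adaptation.move)
```

Keeping the cue label separate from the adaptation makes it possible to change how the agent responds to a cue without changing how the cue is recognised.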

Research Basis

Audiovisual prosody in interaction is the research foundation behind Eva

These publications are relevant because Eva is about interaction: how people signal understanding, confusion, emotion, turn completion and communication trouble.

Doctoral thesis

Audiovisual prosody in interaction

Research on how voice and face contribute to feedback, turn-taking, communication problems and emotions.

Human-machine interaction

Problem detection in human-machine interactions based on facial expressions of users

Research into audiovisual cues to communication problems between users and a spoken dialogue system.

Turn-taking

The interplay between the auditory and visual modality for end-of-utterance detection

Research on cues that signal whether someone has finished speaking or still wants to continue.

Emotion

Crossmodal and incremental perception of audiovisual cues to emotional speech

Research on how emotional meaning is expressed across voice and visual cues, which informs Eva's emotion layer.


Additional References

Emotion and affect sources used as supporting layers

These sources add an extra affect layer to the interaction model: a compact way to map words and paraphrases to emotion families, tone choices and conversational responses.