Key insights from researchers Thomas O’Neill (University of Calgary), Beau Schelble (University of Tennessee), and Julie Shah (MIT) on building effective human-AI teams
The Fundamental Question That Changes Everything
As Dr. Thomas O’Neill opened his presentation, he cut straight to the heart of what’s transforming workplaces worldwide: “We have over a century of knowledge of high-performance work teams. We’d like to use some of that to inform our understanding of HATs (Human-Autonomy Teams), but it’s hard to use that if what we’re talking about is not an actual team.”
This isn’t just academic hair-splitting. It’s about designing collaborative systems that genuinely enhance human capability rather than simply automating tasks. The distinction between AI as a tool versus AI as a teammate fundamentally changes how we design, deploy, and optimize these systems for real-world impact.
The stakes extend far beyond individual productivity. As these researchers demonstrated, we’re witnessing the emergence of entirely new forms of collaboration that could either amplify human potential or create new forms of technological dependency.
Defining What Actually Makes a Team
The research community has established clear criteria that separate genuine human-AI teams from sophisticated automation. O’Neill’s systematic review identified the essential elements: “Each human and autonomous agent is recognized as a unique team member occupying a uniquely mission-critical role… Members strive to achieve a common goal as a collective.”
This technical distinction shapes everything that follows. For managers, it means moving beyond thinking about AI as advanced software toward considering it as a collaborative partner. For developers, it requires building systems that can engage in genuine teamwork behaviors, not just task execution. For workers, it suggests learning to coordinate with AI systems rather than simply operating them.
The research reveals a critical threshold: AI systems must demonstrate at least partial autonomy and meaningful interdependence with human team members to qualify as teammates rather than tools. This benchmark helps organizations identify when they’re dealing with genuine collaboration versus enhanced automation.
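To make that threshold concrete, here is a minimal sketch in Python of the review’s criteria encoded as a checklist. The field names and the all-conditions rule are our own illustration, not a published instrument.

```python
from dataclasses import dataclass

@dataclass
class AgentAssessment:
    """Illustrative checklist for the teammate-versus-tool threshold.

    Field names are our shorthand for the criteria in O'Neill's
    systematic review; this is not a published instrument.
    """
    partially_autonomous: bool   # acts with at least partial autonomy
    unique_critical_role: bool   # occupies a uniquely mission-critical role
    shares_common_goal: bool     # strives toward the collective's goal
    interdependent: bool         # humans must coordinate with it to succeed

def qualifies_as_teammate(a: AgentAssessment) -> bool:
    """Under this sketch, every criterion must hold to count as a teammate."""
    return all([a.partially_autonomous, a.unique_critical_role,
                a.shares_common_goal, a.interdependent])

# A code-completion assistant: narrowly autonomous, but with no uniquely
# mission-critical role and no real interdependence with the humans.
assistant = AgentAssessment(True, False, True, False)
print(qualifies_as_teammate(assistant))  # False -> enhanced automation
```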
The Performance Paradox That Everyone Misses
Dr. Beau Schelble delivered a crucial insight that challenges how most organizations measure success: “AI teammates struggle to support and participate in the teamwork behaviors that coordinate that task work… if an AI teammate dominates the task work of a team by quickly accomplishing the tasks of multiple roles, teams are going to appear strong on paper, but the actual state of the team is going to include inefficiency and waste.”
This performance paradox reveals why many AI implementations fail despite impressive metrics. Organizations optimize for the wrong outcomes—speed, accuracy, throughput—while neglecting the collaborative processes that sustain long-term effectiveness.
Schelble’s research demonstrates that evaluating human-AI team performance requires multifaceted measurement: “Performance is multifaceted and measuring the effectiveness of your efforts needs to be holistic, thoughtful and context-dependent.” This includes both objective performance outcomes and affective outcomes that influence future collaboration.
The implications are profound. Teams that look successful by traditional metrics may actually be creating unsustainable dependencies and reducing human agency. The most effective approaches balance AI capabilities with human strengths rather than allowing AI to dominate collaborative processes.
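As a rough illustration of what holistic measurement could capture, the sketch below pairs objective task outcomes with teamwork-process and affective measures. The specific fields are our assumptions, not Schelble’s instrument.

```python
from dataclasses import dataclass

@dataclass
class TeamEpisodeRecord:
    """One evaluation episode for a human-AI team (illustrative fields only)."""
    # Objective task-work outcomes -- what traditional dashboards capture
    tasks_completed: int
    error_count: int
    cycle_time_minutes: float
    # Teamwork-process outcomes -- where a dominating AI can hide waste
    coordination_overhead_minutes: float  # time spent resolving handoffs
    human_idle_fraction: float            # share of the episode humans were bypassed
    # Affective outcomes -- these shape willingness to collaborate next time
    trust_rating: float                   # e.g., mean of a 1-7 Likert scale
    perceived_workload: float             # e.g., a NASA-TLX-style score
```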
Why Context Determines Everything
Dr. Julie Shah’s MIT research revealed how dramatically context shapes human-AI interaction. Her hospital deployment study showed nurses and physicians accepting an AI system’s recommendations 90% of the time—but the question that drives everything is whether 90% acceptance represents optimal collaboration or over-reliance.
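A raw acceptance rate cannot answer that question by itself; you also need to know whether each accepted recommendation was sound. The sketch below is a hypothetical post-hoc analysis, not part of Shah’s study, that decomposes acceptance into calibrated and miscalibrated reliance.

```python
def reliance_profile(accepted: list[bool], ai_sound: list[bool]) -> dict:
    """Decompose raw acceptance into calibrated vs. miscalibrated reliance.

    accepted[i] -- did the clinician follow recommendation i?
    ai_sound[i] -- was recommendation i actually appropriate?
    Hypothetical analysis; the labels are our own, not Shah's.
    """
    n = len(accepted)
    over_reliance = sum(a and not ok for a, ok in zip(accepted, ai_sound))
    under_use = sum(not a and ok for a, ok in zip(accepted, ai_sound))
    return {
        "acceptance_rate": sum(accepted) / n,
        "over_reliance_rate": over_reliance / n,  # followed unsound advice
        "under_use_rate": under_use / n,          # rejected sound advice
    }
```

Two wards could both show 90% acceptance while one never follows an unsound recommendation and the other follows every one; only the decomposition tells them apart.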
Shah’s embodiment research uncovered critical differences: “There did appear to be sort of a type of saliency effect associated with an embodied system… there were different reliance and compliance behaviors of the embodied system. And in that work, it actually appeared that people were better able to calibrate their trust in the system when it was embodied.”
This finding challenges assumptions about virtual versus physical AI systems. Embodied AI may actually support better trust calibration, enabling more effective collaboration. However, Shah cautioned that “the more anthropomorphic, the more human-like the system is that’s providing you the recommendation, the more it may engender inappropriate trust.”
The research reveals nuanced design requirements. Context—physical, social, organizational—fundamentally shapes how humans perceive and interact with AI systems. One-size-fits-all approaches inevitably fail because they ignore these contextual dynamics.
The Trust Calibration Challenge That Defines Success
All three researchers emphasized trust calibration as the critical factor determining human-AI team effectiveness. Schelble explained: “Positive prior experiences working with an AI led to increased trust, reduced effort, and increased utility. All those negative prior experiences actually had the opposite effect.”
The challenge intensifies with AI’s “jagged frontier” of capabilities. As Shah noted from recent research: “For the creative task that was deemed to be within the current capabilities of AI… the consultants using the technology did see a 40% boost in their performance… However, for the analytical task, which is considered to be outside the AI’s capabilities, [users] were 19 percentage points less likely to arrive at the correct solution.”
This uneven capability profile makes trust calibration extraordinarily difficult. Users struggle to predict when AI will excel versus when it will fail catastrophically. The result is either over-reliance that leads to poor decisions or under-utilization that wastes AI capabilities.
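One way to see why a single trust level fails is to treat the frontier as per-task-type success rates, as in this toy illustration (the numbers are invented, loosely echoing the study above):

```python
# Invented per-task-type success rates for a hypothetical assistant:
capability = {
    "creative_drafting": 0.90,      # inside the frontier
    "summarization": 0.85,          # inside the frontier
    "quantitative_analysis": 0.45,  # outside the frontier
}

# A user who calibrates one overall trust level lands near the average...
overall_trust = sum(capability.values()) / len(capability)
print(f"single trust level ~ {overall_trust:.2f}")  # ~0.73

# ...which over-trusts the weak task and under-trusts the strong ones:
for task, p in capability.items():
    print(f"{task}: success {p:.2f}, calibration gap {p - overall_trust:+.2f}")
```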
The most successful implementations focus on building systems that help humans understand AI capabilities and limitations in real-time, rather than expecting humans to develop accurate mental models through trial and error.
Scale Changes Everything—In Ways We Don’t Expect
Coordination challenges compound rapidly as human-AI teams scale beyond a single human-AI pair. Schelble identified the core issue: “The increasing complexity of human AI teams, especially multi-agent interactions, will significantly expand the variance in performance outcomes for these teams.”
O’Neill’s research revealed unexpected dynamics in multi-agent environments: “In majority AI teams, humans would kind of sort out their shared cognition and coordination. And then from there, try to work with the other AI agents. So it’s always a little more complicated than you’d think.”
These findings challenge conventional wisdom about automation and teams. Rather than simply multiplying the benefits of human-AI collaboration, larger teams create new coordination challenges that require fundamentally different approaches to design and management.
Organizations deploying multiple AI systems need to prepare for emergent behaviors and coordination challenges that don’t exist in simpler configurations. The solutions require new frameworks for multi-agent coordination and human oversight.
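A simple combinatorial observation (ours, not from the webinar) shows one reason why: potential pairwise coordination links grow quadratically with team size, and in a human-AI team each link may need its own trust calibration and shared mental model.

```python
def pairwise_links(n_members: int) -> int:
    """Potential coordination links in a fully connected team of n members."""
    return n_members * (n_members - 1) // 2

for n in (2, 3, 5, 8, 12):
    print(f"{n:>2} members -> {pairwise_links(n):>2} pairwise links")
# 2 -> 1, 3 -> 3, 5 -> 10, 8 -> 28, 12 -> 66: the coordination surface
# grows far faster than headcount.
```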
The Industrial Integration Reality Check
O’Neill shared a sobering case study that reveals what happens when human-AI integration goes wrong: workers in an industrial setting began to “demean and try to outsmart the robots… call them stupid, scratch them, trick them out” when faced with unreliable automation and the threat of job displacement.
O’Neill’s reaction was blunt: “Nathan and I were really upset by this case, because we were very upset that this is really the desired future of humans and machines together. This was, in my mind, a completely failed experiment.”
This failure illustrates the critical importance of change management and human-centered design. Technical capabilities alone don’t determine success—the social and psychological dimensions of human-AI integration often prove decisive.
The contrast with successful implementations is striking. O’Neill described autonomous haul truck systems where “people who work with the autonomous trucks have had a chance to grow their skills in ways that they would never be able to otherwise. And every day they’re learning, they take a lot of pride in the work, and it’s a lot safer.”
The difference isn’t just technical—it’s about treating humans as partners in technological transformation rather than obstacles to be managed.
Moving Beyond the Tool versus Teammate Debate
The panel discussion revealed nuanced thinking about when AI systems function as tools versus teammates. O’Neill emphasized that the distinction depends on task interdependence: “In this task, in this context, in the way it’s been designed, both the system and the team task, do you need to rely on it to coordinate and collaborate to achieve your goals and still achieve a high level of performance?”
Shah offered a practical perspective from military applications, describing a practitioner who wants to move from tool to teammate: instead of telling the autonomy “go do this,” he wants to say, “let’s go do this.”
This linguistic shift—from “go do” to “let’s go do”—captures the fundamental difference between automation and collaboration. It requires AI systems that can engage in genuine coordination rather than simply executing pre-programmed responses.
The implications extend beyond individual interactions. Organizations need to design roles, processes, and incentives that support collaborative relationships rather than simple human-AI task division.
What Different Communities Should Do Now
⚙️ Technology Developers: Building True Collaborative Intelligence
Focus on Coordination Capabilities: Build AI systems that can engage in push-and-pull communication, share situation awareness, and adapt to human working patterns rather than just optimizing individual task performance.
Design for Trust Calibration: Implement transparency mechanisms that help humans understand AI capabilities and limitations in real-time, supporting appropriate reliance rather than blind trust or inappropriate skepticism (a minimal interface sketch follows this list).
Prioritize Complementarity: Design AI systems for roles that leverage machine strengths while preserving meaningful human roles, avoiding the trap of automation that marginalizes human contributions.
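To make the trust-calibration item concrete at the interface level, here is a minimal sketch, under our own assumptions, of a recommendation payload that carries confidence and limitation signals, plus a simple rule that routes uncertain or out-of-envelope cases to a human. None of this reflects a specific product or the panelists’ systems.

```python
from dataclasses import dataclass, field

@dataclass
class Recommendation:
    """Hypothetical payload an AI teammate could emit with each suggestion."""
    action: str
    confidence: float        # the system's own estimate, 0.0-1.0
    in_known_envelope: bool  # does the input resemble cases it handles well?
    rationale: str           # short, human-readable justification
    caveats: list[str] = field(default_factory=list)

def needs_human_confirmation(rec: Recommendation, threshold: float = 0.8) -> bool:
    """Escalate when the system is unsure or outside its envelope,
    supporting calibrated reliance rather than blind acceptance."""
    return (not rec.in_known_envelope) or rec.confidence < threshold
```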
📈 Organizational Leaders: Designing Human-Centered AI Integration
Invest in Change Management: Recognize that successful human-AI teams require extensive change management, training, and cultural adaptation, not just technical implementation.
Measure Holistically: Develop performance metrics that capture both objective outcomes and affective states, ensuring that short-term efficiency gains don’t undermine long-term collaboration effectiveness.
Preserve Human Agency: Structure roles and processes to maintain meaningful human decision-making and expertise development, avoiding over-automation that creates dependencies and skills atrophy.
🔬 Researchers and Academics: Advancing Collaborative Intelligence Science
Study Long-term Dynamics: Investigate how human-AI relationships evolve over time, including trust repair, learning curves, and adaptation to changing capabilities and contexts.
Address Scale Challenges: Develop frameworks for understanding and managing complex multi-agent teams that go beyond simple human-AI pairs.
Bridge Theory and Practice: Create actionable guidance that translates research insights into practical design principles and implementation strategies for real-world applications.
👥 Workers and Professionals: Developing Collaborative AI Skills
Develop Coordination Skills: Learn to work with AI systems as teammates, including how to provide effective feedback, set appropriate boundaries, and maintain situation awareness in collaborative contexts.
Understand AI Capabilities: Build mental models of AI strengths and limitations to support appropriate trust calibration and effective task allocation in collaborative work.
Maintain Human Expertise: Continue developing uniquely human capabilities—creativity, ethical reasoning, social intelligence—that complement rather than compete with AI capabilities.
The Ethical Dimension That Shapes Everything
Shah emphasized the complexity of ethical AI development: “There’s no easy way for a single person to design an ethical system. It should be taken out of the hands of any individual… It needs to be done in a multi-stakeholder fashion from the earliest possible point in time.”
This multi-stakeholder requirement extends beyond technical development to ongoing governance and accountability. As AI systems become genuine teammates, questions of responsibility and liability become more complex than traditional automation scenarios.
The researchers emphasized the need for “an ethical process by which the risks and benefits and values are articulated, that input is given, and very importantly that there’s a feedback loop to change, adjust and correct.”
This isn’t just about compliance or risk management—it’s about building AI systems that genuinely serve human flourishing rather than optimizing narrow technical metrics at the expense of broader social values.
The Long View Reveals Unprecedented Opportunities
All three researchers emphasized that we’re still in the early stages of understanding human-AI collaboration. O’Neill noted the rapid growth in research: annual output has climbed from just a few qualifying papers to double digits, with 10 papers meeting their systematic review criteria in 2024.
This research momentum suggests we’re approaching inflection points in our understanding of how to design and deploy effective human-AI teams. The convergence of insights from psychology, engineering, and robotics is creating new possibilities for collaborative intelligence that transcends the limitations of either human or AI capabilities alone.
Shah’s vision of the future emphasizes dynamic adaptability: “A system like this that can in real time dynamically profile, predict and then incorporate that behavior into its own planning for a collaborative goal” points toward AI systems that can truly learn and adapt to human collaborative patterns.
The opportunity before us is to shape this evolution intentionally, ensuring that human-AI collaboration enhances rather than diminishes human agency and capability.
Building the Collaborative Future We Need
These research insights reveal a future where the distinction between human and artificial intelligence becomes less important than their collaborative integration. The most successful implementations won’t simply add AI to existing workflows—they’ll reimagine work processes to leverage the complementary strengths of human and artificial intelligence.
The path forward requires moving beyond both technophobic resistance and uncritical automation toward thoughtful collaboration design. As Schelble emphasized in his closing remarks: “Teams are already capable of really great things and integrating these new technologies… can advance what these teams are capable of… And it’s absolutely imperative that we do that integration correctly.”
Rather than waiting for perfect AI systems or hoping that technical capabilities alone will solve integration challenges, we can proactively build the frameworks, skills, and organizational cultures that support effective human-AI collaboration.
The question isn’t whether AI will become our teammate—it’s whether we’ll design collaborative relationships that truly enhance human potential. The research presented in this webinar provides a roadmap for getting it right.
This analysis emerges from a public webinar hosted by the Board on Human-Systems Integration on September 9, 2025. The discussion brought together three leading researchers working directly with human-AI teaming challenges:
Thomas O’Neill, Professor of Industrial and Organizational Psychology at the University of Calgary, focusing on the psychology of human-autonomy teaming and high-performance work teams.
Beau Schelble, Assistant Professor at the University of Tennessee and Director of the AI and Robotics for Collaborative Systems lab, specializing in performance optimization and trust in human-AI teams.
Julie Shah, H.N. Slater Professor and Head of Aeronautics and Astronautics at MIT, directing research on interactive robotics and human-robot collaboration.
Event Details
Date: September 9, 2025, 2:00-3:30 PM ET
Host: Board on Human-Systems Integration (BOHSI)
Format: Research presentation webinar with interactive discussion