Bridging the Relevance Gap: AI-Driven Experiential Learning for the Future of Work
A formal analysis of how simulation-based pedagogy, transparent AI assessment, and stakeholder complexity design prepare students for organizational leadership in an era of technological transformation. The Future Work Academy platform applies these principles across a growing library of simulations; this paper uses the flagship module — The Future of Work at Apex Manufacturing — as its worked example.
Doug Mitchell, Ed.D. Candidate · Future Work Academy · 2025
Section 1
The Relevance Gap
Management education has long confronted a persistent disconnect between what is taught in classrooms and what is practiced in organizations. Hay and Heracleous identified this as a fundamental failure of business school pedagogy to prepare leaders for the ambiguity of real-world decision-making [5]. Bartunek and Rynes deepened this critique by documenting the paradoxical relationship between academic researchers and practitioners: despite shared goals, the two communities operate with divergent time horizons, success metrics, and knowledge validation standards [1].
The emergence of artificial intelligence as a transformative organizational force has amplified this gap. The 2025 UNDP Human Development Report warns that AI-driven workforce displacement will affect not only routine tasks but also knowledge-intensive professional roles, the very roles MBA graduates expect to fill [12]. Traditional case-based teaching, while valuable for developing analytical reasoning, offers students static snapshots of organizational life. Cases do not compound. Decisions made in Week 1 do not alter the strategic landscape of Week 4. There are no stakeholders who remember what you promised.
The Future Work Academy platform was designed to address this gap directly — creating a dynamic, multi-week simulation environment where students experience the compounding consequences of strategic leadership through AI-assisted workforce transformation.
Section 2
Theoretical Framework
The platform's pedagogical design draws on four complementary learning theories, each addressing a different dimension of the gap between classroom instruction and organizational practice.
Experiential Learning Cycles [9, 8]
Kolb's four-stage cycle — concrete experience, reflective observation, abstract conceptualization, and active experimentation — provides the structural backbone. The 8-week simulation repeats this cycle iteratively, directly addressing Kayes's critique that single-iteration case studies fail to capture the developmental arc of experiential learning.
Productive Failure [6, 7]
Kapur's research demonstrates that students who struggle with complex, ill-structured problems before receiving direct instruction generate deeper understanding and superior transfer performance. The simulation's compounding metric system — where early miscalibrations create progressively more constrained decision spaces — embodies this principle.
Situated Cognition [3]
Knowledge is most effectively acquired when embedded in authentic activity, context, and culture. The simulation's CEO role, 17 stakeholders with quantified personality traits, and industry-sourced intelligence articles create a situated learning environment where strategic reasoning is inseparable from organizational context.
Scaffolded Complexity [13]
The three-tier difficulty system (Standard, Advanced, Expert) progressively reduces scaffolding as student capability increases — fewer advisor consultations, tighter crisis thresholds, higher minimum performance expectations — implementing Vygotsky's Zone of Proximal Development through structured support withdrawal.
Together, these frameworks create what Biggs [17] describes as constructive alignment: the learning activities (weekly decisions), assessment method (rubric-based AI evaluation), and intended outcomes (strategic reasoning under uncertainty) are internally consistent and mutually reinforcing.
Section 3
Platform Design
The Future Work Academy platform supports a growing library of multi-week simulations, each built on the same engine: weekly decision arcs, a stakeholder cast with quantified traits, transparent rubric-based AI grading, and a Simulation Builder co-pilot (Akme) that lets educators author additional scenarios. The flagship module described in this section — The Future of Work at Apex Manufacturing — places students in the role of CEO at a mid-sized industrial firm navigating AI-driven workforce transformation. Over 8 simulated weeks, participants confront escalating strategic challenges — from initial automation investment decisions through union negotiations, workforce displacement, and competitive disruption. Subsequent modules apply the same architecture to other industries and dilemmas.
8-Week Simulation Arc
3-Tier Difficulty Scaffolding
Following Wood, Bruner, and Ross's scaffolding framework [13], the platform offers three difficulty tiers that progressively reduce instructional support:
Standard: Full access to all advisors, relaxed crisis thresholds, detailed guidance prompts
Advanced: Limited advisor consultations per week, tighter performance thresholds, reduced scaffolding
Expert: Minimal advisor access, aggressive crisis triggers, no guidance prompts; maximum productive failure
Stakeholder System & Salience
The simulation features 17 stakeholders across five organizational departments, each with four quantified personality traits (influence, hostility, flexibility, and risk tolerance) informed by Mitchell, Agle, and Wood's stakeholder salience framework of power, legitimacy, and urgency [10]. These traits are not decorative; they algorithmically determine how each stakeholder reacts to the student's strategic decisions, creating an authentically complex organizational environment.
Section 4
AI-Assisted Formative Assessment
The assessment architecture is grounded in three interconnected research traditions: formative assessment, feed-forward feedback, and automated essay scoring.
Rubric Transparency
Black and Wiliam's seminal review established that student achievement improves significantly when learners understand evaluation criteria before performing tasks [2]. The platform implements this finding literally: the four-criterion rubric (Evidence Quality, Reasoning Coherence, Trade-off Analysis, and Stakeholder Consideration, 25 points each) is displayed on every weekly decision page while students compose their responses.
Scoring Calibration
Drawing on Shermis and Burstein's research on automated essay scoring [11], the AI evaluation engine is calibrated against exemplar responses to achieve inter-rater reliability comparable to trained human evaluators. Scores map to published quality bands: Excellent (93–100%), Good (72–92%), Adequate (52–71%), and Poor (<52%). Each score includes a per-criterion breakdown that identifies specific strengths and improvement areas.
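The rubric and band structure described above can be sketched in a few lines of Python. This is a minimal illustration, not the platform's implementation: the criterion names, the 25-point cap, and the band boundaries come from the text, while the function names and the handling of fractional totals (a total of 92.5 falls into Good) are assumptions.

```python
# Criterion names and the 25-point cap come from the published rubric.
CRITERIA = (
    "Evidence Quality",
    "Reasoning Coherence",
    "Trade-off Analysis",
    "Stakeholder Consideration",
)
MAX_PER_CRITERION = 25

# Lower bound (in percent) of each published quality band, highest first.
BANDS = ((93, "Excellent"), (72, "Good"), (52, "Adequate"), (0, "Poor"))


def total_score(scores: dict) -> float:
    """Sum the four criterion scores after validating each one (hypothetical helper)."""
    for name in CRITERIA:
        value = scores[name]
        if not 0 <= value <= MAX_PER_CRITERION:
            raise ValueError(f"{name} must be between 0 and {MAX_PER_CRITERION}")
    return sum(scores[name] for name in CRITERIA)


def quality_band(total: float) -> str:
    """Map a 0-100 total to its quality band (hypothetical helper)."""
    if not 0 <= total <= 100:
        raise ValueError("total must be between 0 and 100")
    for floor, label in BANDS:
        if total >= floor:
            return label
    return "Poor"  # unreachable for valid input; kept as a safe default


example = {
    "Evidence Quality": 22,
    "Reasoning Coherence": 20,
    "Trade-off Analysis": 18,
    "Stakeholder Consideration": 21,
}
example_total = total_score(example)      # 81
example_band = quality_band(example_total)  # "Good"
```

Because the four criteria are worth 25 points each, the total is already a percentage, so no rescaling step is needed before the band lookup.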
Feed-Forward Feedback
Hattie and Timperley's model of effective feedback [4] distinguishes four levels at which feedback operates: the task, the process, self-regulation, and the self. The platform's per-criterion commentary after each weekly submission is designed to inform the next performance rather than merely evaluate the last, creating what Hattie and Timperley term "feed-forward" loops across the 8-week arc.
Crucially, instructors retain full override authority. Every AI-generated score can be reviewed, adjusted, and annotated before grades are finalized — ensuring that automated assessment augments rather than replaces professional pedagogical judgment.
Section 5
Stakeholder Complexity
One of the platform's most distinctive design features is its stakeholder system: 17 individuals spanning five departments (Executive Leadership, Finance & Legal, Operations & HR, Production Floor, External Stakeholders), each modeled with four quantifiable behavioral dimensions.
Influence: Determines how much a stakeholder's opinion shifts organizational metrics and narrative outcomes
Hostility: Controls the difficulty of scenarios associated with that stakeholder; high-hostility stakeholders create more constrained decision spaces
Flexibility: Governs how a stakeholder responds to disruptive or unconventional strategic choices
Risk Tolerance: Shapes the stakeholder's appetite for bold, high-variance strategic moves versus conservative approaches
These trait dimensions are directly informed by Mitchell, Agle, and Wood's stakeholder salience theory [10], which posits that managers allocate attention to stakeholders based on their perceived power, urgency, and legitimacy. The stakeholder system translates this theoretical framework into computational mechanics: high-influence, high-hostility stakeholders demand attention precisely because they possess both the power and the urgency to disrupt organizational stability.
The relationship mapping across stakeholders creates organizational tension that mirrors real leadership challenges. A decision that satisfies the CFO's risk-averse financial priorities may alienate the union representative's workforce protection demands. Students learn to navigate these competing interests iteratively, building stakeholder management capabilities that are intended to transfer to professional practice.
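One way to picture the four-trait stakeholder model is as a small data structure with a ranking rule. The sketch below is hypothetical throughout: the 0-to-1 trait scale, the trait values, and the `attention_priority` formula (influence multiplied by hostility) are assumptions chosen only to illustrate the idea that high-influence, high-hostility stakeholders demand the most attention. The platform's actual mechanics are not described in this paper.

```python
from dataclasses import dataclass

TRAITS = ("influence", "hostility", "flexibility", "risk_tolerance")


@dataclass(frozen=True)
class Stakeholder:
    """A stakeholder with the four traits named in the text (scale is assumed)."""
    name: str
    influence: float       # assumed 0.0-1.0 scale
    hostility: float       # assumed 0.0-1.0 scale
    flexibility: float     # assumed 0.0-1.0 scale
    risk_tolerance: float  # assumed 0.0-1.0 scale

    def __post_init__(self):
        for trait in TRAITS:
            value = getattr(self, trait)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{trait} must be in [0, 1], got {value}")


def attention_priority(s: Stakeholder) -> float:
    """Hypothetical salience proxy: power (influence) times urgency (hostility)."""
    return s.influence * s.hostility


def rank_by_attention(stakeholders):
    """Return stakeholders ordered from most to least demanding of attention."""
    return sorted(stakeholders, key=attention_priority, reverse=True)


# Illustrative values only; the roles are taken from the example in the text.
cast = [
    Stakeholder("CFO", influence=0.9, hostility=0.4,
                flexibility=0.3, risk_tolerance=0.2),
    Stakeholder("Union Representative", influence=0.7, hostility=0.8,
                flexibility=0.4, risk_tolerance=0.3),
]
ranked = rank_by_attention(cast)
```

With these invented values, the union representative ranks ahead of the CFO despite lower influence, because the assumed formula weighs hostility equally with influence.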
Section 6
Radical Transparency
Transparency is not a marketing position — it is a design principle embedded at every layer of the platform. In an era of growing concern about AI opacity in educational settings, the platform takes the opposite approach: every evaluation mechanism, scoring algorithm, and assessment criterion is published and visible to all stakeholders.
Published Methodology
The complete AI grading methodology — rubric criteria, scoring bands, calibration process, and quality thresholds — is published on a dedicated public page accessible without authentication.
Visible Rubrics During Task
Students see the exact four-criterion rubric while composing their weekly decision essays. There are no hidden criteria, no secret weighting schemes, no post-hoc evaluation dimensions.
Instructor Override
Every AI-generated score is a recommendation, not a verdict. Instructors can review, adjust, annotate, and override any grade before finalization, maintaining human pedagogical judgment as the ultimate authority.
Optional Curved Scoring
When enabled, the platform applies statistical normalization using a configurable class curve. The curve parameters, methodology, and effect on individual scores are fully visible to instructors.
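The paper does not specify which normalization the curve uses. As a sketch under that caveat, one common approach is a linear z-score rescale: each raw score is standardized against the class mean and standard deviation, then mapped onto a target mean and spread. The function name and the default parameter values below are assumptions for illustration.

```python
from statistics import mean, pstdev


def curve_scores(raw, target_mean=80.0, target_sd=10.0, lo=0.0, hi=100.0):
    """Rescale raw scores to a target mean and standard deviation.

    Hypothetical example of a configurable class curve; results are
    clipped to the [lo, hi] range so no score leaves the grading scale.
    """
    if not raw:
        return []
    mu = mean(raw)
    sigma = pstdev(raw)  # population standard deviation of the class
    curved = []
    for x in raw:
        # If every score is identical, treat each z-score as zero.
        z = 0.0 if sigma == 0 else (x - mu) / sigma
        curved.append(min(hi, max(lo, target_mean + target_sd * z)))
    return curved


class_raw = [60, 70, 80]
class_curved = curve_scores(class_raw)
```

The transformation preserves rank order, and when no score is clipped the curved class mean equals `target_mean`. Exposing `target_mean` and `target_sd` as parameters is what would let an instructor see exactly how the curve moves each score.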
This commitment to transparency is rooted in both pedagogical evidence (Black and Wiliam's finding that criterion visibility improves learning [2]) and ethical conviction: students have a right to understand how they are being evaluated, especially when AI systems are involved in the assessment process.
Section 7
Future Research
The platform's built-in assessment infrastructure creates opportunities for rigorous empirical research on simulation-based pedagogy. Several research directions are currently in development.
Validated Survey Constructs [14, 15, 7]
The platform includes a 9-dimension student feedback survey measuring engagement, complexity appreciation, decision confidence, feedback quality, platform usability, overall satisfaction, self-efficacy (per Bandura, 1997), transfer confidence (as a Kirkpatrick Level 3 proxy), and productive struggle (per Kapur, 2016). These constructs are designed for pre/post administration to capture developmental change across the simulation arc.
Pre/Post Assessment Design
The planned quasi-experimental design compares students' strategic reasoning quality before and after the 8-week simulation, using rubric-scored essay responses as the dependent measure. This design controls for prior ability through baseline assessment while measuring growth in evidence quality, reasoning coherence, trade-off analysis, and stakeholder consideration.
Cross-Institutional Studies
The platform's cloud-based architecture and privacy-mode enrollment (which requires no personally identifiable information) enable multi-site deployment across universities, community colleges, and corporate training programs. Comparative studies across institutional types, student populations, and disciplinary contexts will test the generalizability of simulation-based experiential learning outcomes.
AI Scoring Reliability [11]
Ongoing calibration research compares AI-generated rubric scores against expert human raters to establish and maintain inter-rater reliability benchmarks. This work builds directly on Shermis and Burstein's automated essay scoring foundations and extends them into the domain of strategic reasoning assessment.
These research directions are designed to produce publishable empirical evidence that advances both the platform's effectiveness claims and the broader field of simulation-based management education.
References