Scientific Consistency at General Electric
For an AI model to solve a real-world problem, it needs to be trained to represent reality; we call this validated state scientific consistency. AI project teams achieve it by comparing a model’s output with a reality surrogate, such as a domain expert or confirmatory evidence (empirical phenomena also known as “ground truth”). Comparing activities are key to achieving scientific consistency: they illuminate model mechanics and performance, revealing model-solution mismatches that teams remediate by adjusting the data, the features, the algorithm, or the domain knowledge.
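For illustration, the comparing step can be as simple as lining up model predictions against labels supplied by a reality surrogate and flagging every disagreement for inspection. The brief Python sketch below is hypothetical; the document identifiers, labels, and `compare` helper are illustrative and not drawn from any particular GE system.

```python
# Illustrative sketch: compare model predictions with expert ("ground truth") labels
# and surface every mismatch for review. All names and data here are hypothetical.

def compare(predictions: dict[str, bool], expert_labels: dict[str, bool]) -> list[str]:
    """Return the IDs of documents where the model and the expert disagree."""
    return [doc_id for doc_id, predicted in predictions.items()
            if predicted != expert_labels.get(doc_id, predicted)]

predictions   = {"doc-001": True, "doc-002": False, "doc-003": True}
expert_labels = {"doc-001": True, "doc-002": True,  "doc-003": True}

mismatches = compare(predictions, expert_labels)
print(f"{len(mismatches)} document(s) need review: {mismatches}")
# Each mismatch prompts the team to ask whether the fix lies in the data,
# the features, the algorithm, or the encoded domain knowledge.
```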
GE’s Corporate Environment, Health, and Safety (EHS) team delivered company-wide governance and oversight for the unit’s area of focus.[foot]B. H. Wixom, I. A. Someh, and C. M. Beath, “GE’s Environment, Health, and Safety Team Creates Value Using Machine Learning,” MIT Sloan CISR Working Paper No. 448, November 2020, https://cisr.mit.edu/publication/MIT_CISRwp448_GEMachineLearning_WixomSomehBeath.[/foot] In 2016, a team of EHS leaders formulated a set of standards it called Life Saving Principles (LSP) that would guide work practices for high-risk operations. To confirm that GE contractors had robust EHS programs in place, GE expanded its contractor onboarding process to include LSP oversight.
The contractor onboarding process was exacting and labor intensive: hundreds of GE EHS professionals vetted the approximately 80,000 contractors the company hired annually. By 2020, GE had developed and implemented an AI-enabled Contractor Document Assessment (CDA) application, a bolt-on tool that all GE EHS professionals could use during contractor onboarding. Delivering the tool began to free EHS professionals to focus more of their expertise on field execution and higher-value EHS work.
At the outset, a team of data scientists, digital/IT professionals, and EHS experts selected a natural language processing algorithm that required EHS experts to develop from scratch a list of words and phrases the algorithm would draw on for modeling. The chosen technique required a minimum of 600 contractor documents for model training; however, at that time only about 300 documents existed with classification labels, and of those only 230 were in a machine-readable format. The team therefore amassed properly digitized documents, and EHS experts read each document from start to finish and annotated it. The experts identified key words and terms that indicated whether LSP requirements were met or not, and why, and they classified each document as satisfying the LSP or not. The team had expected the project to take three months, but building the training data set alone required more than six months.
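GE has not published its algorithm or lexicon, but under the description above the approach can be pictured as a standard text-classification pipeline whose features are restricted to the expert-curated words and phrases. The sketch below uses scikit-learn’s TfidfVectorizer and LogisticRegression as stand-ins; the vocabulary, documents, and labels are invented for illustration and are not GE’s.

```python
# Illustrative sketch only: a generic NLP pipeline in the spirit of the one described,
# using an expert-curated vocabulary as features. All data below is hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Words and phrases domain experts flagged as evidence of LSP compliance (hypothetical).
expert_vocabulary = ["lockout tagout", "fall protection", "confined space",
                     "permit to work", "rescue plan"]

documents = [
    "Contractor maintains a lockout tagout program and a permit to work process ...",
    "The submission covers general housekeeping but no high-risk work procedures ...",
]
labels = [1, 0]  # 1 = satisfies the requirement, 0 = does not

model = make_pipeline(
    TfidfVectorizer(vocabulary=expert_vocabulary, ngram_range=(1, 3), lowercase=True),
    LogisticRegression(),
)
model.fit(documents, labels)

# predict_proba yields a probability like the one reviewers later saw as a percentage.
print(model.predict_proba(["The site plan includes fall protection and a rescue plan."]))
```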
To facilitate model training and development, developers created a user interface that listed the applicable LSP requirements for each document and indicated whether the document passed (displayed in blue) or failed (displayed in red) each one, along with the model’s probability (displayed as a percentage) that the assessment was correct. Reviewers could then drill down into the document to inspect the evidence firsthand. When a reviewer disagreed with the machine’s assessment, they could comment on the evidence the model should have considered, specify where in the document that evidence was located, and override the model’s satisfied/not satisfied decision for that requirement. The interface provided access to the full review history across evaluators and displayed a dashboard that monitored what was rejected, what was reviewed, and what the decisions were.
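One way to picture the record behind such an interface is a per-requirement assessment that carries the machine decision, its probability, and any reviewer override with supporting comments. The Python sketch below is a hypothetical schema; the class and field names are illustrative, not GE’s.

```python
# Hypothetical sketch of the data behind a review interface like the one described;
# field names are illustrative, not GE's schema.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class RequirementAssessment:
    requirement_id: str          # which LSP requirement this row covers
    model_satisfied: bool        # machine decision (blue = pass, red = fail)
    model_probability: float     # confidence shown to the reviewer as a percentage
    reviewer_satisfied: Optional[bool] = None  # set only when the reviewer overrides
    reviewer_comment: str = ""                 # evidence the model should have considered
    evidence_location: str = ""                # where in the document that evidence sits

@dataclass
class DocumentReview:
    document_id: str
    assessments: list[RequirementAssessment] = field(default_factory=list)

    def overridden(self) -> list[RequirementAssessment]:
        """Assessments where the reviewer disagreed with the machine decision."""
        return [a for a in self.assessments
                if a.reviewer_satisfied is not None
                and a.reviewer_satisfied != a.model_satisfied]
```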
Over time, the comparing activities led to model adjustments and retraining that made the model more accurate and eliminated false negative results. The activities also exposed that domain experts sometimes made errors in judgment and that different reviewers sometimes interpreted the same document text in slightly different ways. With that understanding, the project team developed feedback and education for evaluators.
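Conceptually, each cycle folds the reviewers’ corrected labels back into the training data, retrains, and re-checks the error the team was most concerned about, false negatives. The sketch below illustrates that loop in general terms; the function names and data structures are hypothetical, not GE’s pipeline.

```python
# Hypothetical sketch of a retrain-and-compare cycle; names are illustrative only.

def false_negative_rate(predictions: list[bool], truth: list[bool]) -> float:
    """Share of truly satisfied requirements that the model marked as not satisfied."""
    positives = [(p, t) for p, t in zip(predictions, truth) if t]
    if not positives:
        return 0.0
    return sum(1 for p, _ in positives if not p) / len(positives)

def retraining_cycle(training_data, reviewer_overrides, train, evaluate):
    """One comparing-and-adjusting cycle: corrected labels in, new model and FNR out."""
    training_data = training_data + reviewer_overrides  # fold expert corrections back in
    model = train(training_data)                         # retrain on the adjusted data
    predictions, truth = evaluate(model)                 # compare against ground truth again
    return model, false_negative_rate(predictions, truth)
```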