Machine learning systems need informed (i.e., human) oversight
UPDATED: September 29, 2021
Don't be surprised if your family doctor pulls out her cell phone to diagnose that unusual bump on your skin. She's using an AI-assisted system, based on machine learning (ML) technology, to determine if the bump is potentially troublesome and needs a specialist's eye. But what if the software itself is biased, because the data it trained on was biased improperly toward one race or another?
In new research, Professors Mike H.M. Teodorescu and Gerald C. Kane of the Boston College Carroll School of Management, along with Lily Morse of West Virginia University and Yazeed Awwad of MIT, propose that some measure of human oversight is necessary for any ML-reliant decision-making process. They refer to such oversight as "augmentation."
In machine learning, networks of algorithms "learn" by looking for patterns in immense amounts of data. But if the training data is skewed (in the dermatology system, for instance, toward whites rather than Blacks, Latinos, or Asians), the output could be faulty. The researchers offer a matrix to guide organizations in instituting appropriate human augmentation to ensure that the ML systems they rely on are as fair and unbiased as possible.
Their article appeared in MIS Quarterly as part of a special issue on "Managing AI." In it, the authors propose that organizations add humans to the equation according to two factors: where the final decision is made (by human or machine) and the complexity of the fairness criteria underpinning the ML system. Different scenarios dictate at what stage and to what extent human oversight or intervention is advised.
In the case of, say, fintech companies that use ML models to verify identities for validating loan worthiness, only what the authors call "reactive oversight" is needed: that is, the ML model spits out an outcome that is examined by the human decision maker. If the model doesn't routinely produce results that, post facto, are deemed in compliance with bias regulations, managers should then provide clearer fairness objectives to their development teams.
When fairness is more complicated and ML systems still make the final decision (for instance, in content filtering for online social media platforms), more human augmentation is called for. That is because the filters themselves could reflect their developers' opinions, creating "ideological echo chambers." Such models could even be manipulated by savvy social media users who flood the platform with their own views. In such a case, the authors suggest "proactive oversight" at an earlier stage: humans should manually vet feedback and guide the models "toward the pathways that come closest to meeting agreed-upon standards of right and wrong."
In cases where final decisions rest with humans and fairness criteria are relatively simple, such as with the AI-assisted dermatology exam, the ML model provides "decision support." Educating the human decision-makers about how the software may be systematically unfair is sufficient in this situation.
Sometimes, however, it is possible to recognize that an ML system might be skewing outcomes but impossible to determine exactly how. An example is a platform that tracks job candidates' behavior in video interviews to assess their employability. For all the hiring client knows, the system could simply be basing its calculation on past decisions: if a client company comprises mostly males, the model is more likely to recommend men. In these cases, either managers or even other automated systems should double-check the system's recommendations.
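For readers who think in code, the four scenarios above can be loosely sketched as a lookup keyed on who makes the final decision and how complex the fairness criteria are. This is only an illustration, not the authors' matrix: the names `DecisionMaker`, `FairnessComplexity`, `OVERSIGHT_MATRIX`, and `recommend_oversight` are hypothetical, the descriptions paraphrase this article, and placing the video-interview example in the "human decides, complex fairness" cell is an assumption drawn from the examples given here.

```python
from enum import Enum


class DecisionMaker(Enum):
    """Who makes the final decision (hypothetical labels for illustration)."""
    HUMAN = "human"
    MACHINE = "machine"


class FairnessComplexity(Enum):
    """How complicated the fairness criteria are (hypothetical labels)."""
    SIMPLE = "simple"
    COMPLEX = "complex"


# Illustrative mapping that paraphrases the article's four scenarios.
# The (HUMAN, COMPLEX) entry is an assumption: the article does not name
# that cell explicitly.
OVERSIGHT_MATRIX = {
    (DecisionMaker.MACHINE, FairnessComplexity.SIMPLE):
        "reactive oversight: a human examines the model's outcomes after the fact",
    (DecisionMaker.MACHINE, FairnessComplexity.COMPLEX):
        "proactive oversight: humans vet feedback and guide the model earlier on",
    (DecisionMaker.HUMAN, FairnessComplexity.SIMPLE):
        "decision support: educate decision-makers about possible systematic unfairness",
    (DecisionMaker.HUMAN, FairnessComplexity.COMPLEX):
        "double-check: managers or other automated systems review recommendations",
}


def recommend_oversight(decision_maker: DecisionMaker,
                        complexity: FairnessComplexity) -> str:
    """Return the kind of human augmentation suggested for a scenario."""
    return OVERSIGHT_MATRIX[(decision_maker, complexity)]


if __name__ == "__main__":
    # Example: content filtering, where the ML system decides and fairness is complex.
    print(recommend_oversight(DecisionMaker.MACHINE, FairnessComplexity.COMPLEX))
```

A plain dictionary is used here because the article describes a fixed two-by-two grid; a real organization would presumably need far more context than two labels to decide how much oversight a system requires.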
Understanding the nuances of ML technology is critical, the authors conclude: "A robust research agenda regarding fairness and augmentation can help organizations more effectively leverage ML models' benefits while limiting the potential adverse societal effects."
Marilyn Harris is a reporter, writer, and editor with expertise in translating complex or technical material for online, print, and television audiences.