Artificial intelligence and data have become watchwords in the business and technology worlds. The related concepts were highlighted at the EmTech Digital 2022 conference, held virtually and in person in Boston today, in a series of discussions under the rubric “Better Data, Better AI.”
Moderated by Will Douglas Heaven, senior AI editor at MIT Technology Review, the discussions spanned notions of data-centric AI and AI ethics and offered a primer on issues to watch for as organizations strive to generate practical outcomes from what has been a largely theoretical undertaking.
AI luminary Andrew Ng suggested that organizations need data-centric AI to avoid the “junk food” problem in the pursuit of artificial intelligence. “Data feeds AI,” Ng said, adding that AI needs the right data to provide the right “nutrients,” rather than either reams of bad data or too little data in vertical use cases. He argued that a data-centric AI approach can reduce the uncertainties inherent in human-led efforts such as data labeling and tagging.
AI Ethics Too Often an Afterthought
In the session “Thought Leadership in Ethical AI,” DataRobot global AI ethicist Haniyeh Mahmoudian tackled the question of human involvement in AI.
Mahmoudian made a compelling case for the need to examine all assumptions and proxies used to define the building blocks of AI. She pointed to hidden discrimination in healthcare, citing a healthcare company whose AI model used “total historical cost” as a proxy for a person’s level of sickness. This proxy ignored the fact that immigrant communities and some racial minorities in the U.S. use the healthcare system less often than others, so they incur lower costs and appear healthier in the data than they are. She demonstrated how this led to different healthcare outcomes for some communities, and argued that properly questioning the assumptions on which the algorithm was built would have revealed the deficiency.
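The mechanism behind that example can be illustrated with a toy calculation. The Python sketch below is not the company's model or anything presented at the session; the patient records, the utilization rates, and the function names are all invented for illustration. It assumes two groups with identical levels of sickness, one of which seeks only half the care it needs, and compares who gets selected for extra care when patients are ranked by historical cost versus by actual need.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    group: str          # hypothetical community label ("A" or "B")
    sickness: int       # true level of need, 1 (low) to 5 (high)
    utilization: float  # assumed fraction of needed care actually sought


def historical_cost(p: Patient, cost_per_unit: float = 1000.0) -> float:
    """Proxy score: cost rises with sickness only as far as care is used."""
    return p.sickness * p.utilization * cost_per_unit


def select_for_care(patients, k, score):
    """Pick the k patients with the highest score."""
    return sorted(patients, key=score, reverse=True)[:k]


def count_group(selected, group):
    return sum(1 for p in selected if p.group == group)


# Both groups have the same spread of sickness (1..5), but group B is
# assumed to use the healthcare system half as often as group A.
patients = (
    [Patient("A", s, 1.0) for s in range(1, 6)]
    + [Patient("B", s, 0.5) for s in range(1, 6)]
)

by_cost = select_for_care(patients, 4, historical_cost)
by_need = select_for_care(patients, 4, lambda p: p.sickness)

print("Group B selected when ranking by cost:", count_group(by_cost, "B"))
print("Group B selected when ranking by need:", count_group(by_need, "B"))
```

In this invented data set, ranking by need splits the four places evenly between the groups, while ranking by cost leaves group B with a single place even though its members are just as sick. Comparing the proxy against the quantity it is supposed to stand for, group by group, is the kind of questioning that exposes the gap.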
Racial and gender discrimination are not the only concerns, Mahmoudian said: privacy, security, and compliance are also at risk if AI models are not subjected to ethical scrutiny.
Rajiv Shah, principal data scientist at Snorkel AI, spoke at length about the need for data-based labeling in a variety of practical applications, from spam management to document interpretation. He suggested that different people will always produce different outcomes, but that human input remains necessary in the process.
All the speakers touched on the sensitive question of the balance between humans and machines in the pursuit of practical, ethical AI. All agreed (though Mahmoudian was the most direct) that ethical considerations must be built in from the beginning, because even slight changes in assumptions can have far-reaching implications down the road. Heaven said that if ethical questions aren’t handled appropriately, then “AI will never get wide adoption.”
The EmTech Digital speakers suggested that the technology has come a long way but there remains a long way to go in realizing the dream of helpful, universal AI. Examining AI honestly and ethically will help us bring AI to where it needs to be.