Facebook is sharing a new and diverse dataset with the wider AI community. In an announcement spotted by VentureBeat, the company says it envisions researchers using the collection, dubbed Casual Conversations, to test their machine learning models for bias. The dataset includes 45,186 videos of 3,011 people and gets its name from the fact that it features those individuals providing unscripted answers to the company’s questions.
What’s significant about Casual Conversations is that it involves paid actors whom Facebook explicitly asked to share their age and gender. The company also hired trained professionals to label the ambient lighting in each video and the skin tones of those involved according to the Fitzpatrick scale, a dermatologist-developed system for classifying human skin colors. Facebook claims the dataset is the first of its kind.
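To make the idea of testing a model for bias concrete, here is a minimal sketch of one common approach: compute a model's accuracy separately for each annotated subgroup and compare the results. The record layout, the field names (`skin_type`, `label`, `prediction`) and the toy values are assumptions made up for illustration. They are not the dataset's actual schema, and this is not Facebook's evaluation method.

```python
from collections import defaultdict


def accuracy_by_group(records, group_key):
    """Return {group: accuracy}, grouping predictions by one annotation field."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        group = record[group_key]
        total[group] += 1
        if record["prediction"] == record["label"]:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


def largest_gap(per_group):
    """Difference between the best and worst per-group accuracy."""
    return max(per_group.values()) - min(per_group.values())


# Hypothetical toy records: skin_type stands in for a Fitzpatrick type (1 to 6).
records = [
    {"skin_type": 1, "label": 1, "prediction": 1},
    {"skin_type": 1, "label": 0, "prediction": 0},
    {"skin_type": 6, "label": 1, "prediction": 0},
    {"skin_type": 6, "label": 0, "prediction": 0},
]

per_group = accuracy_by_group(records, "skin_type")
print(per_group)               # {1: 1.0, 6: 0.5}
print(largest_gap(per_group))  # 0.5
```

A large gap between groups suggests the model performs unevenly across them. The same function could be pointed at an age, gender or lighting field instead, assuming the records carry one.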
You don’t have to look far to find examples of bias in artificial intelligence.
While Facebook describes Casual Conversations as a “good, bold first step forward,” it admits the dataset isn’t perfect. To start, it only includes people from the United States. The company also didn’t ask participants to identify their origins, and when it came to gender, the only options they had were “male,” “female” and “other.” However, over the next year, it plans to make the dataset more inclusive.