When fun is in the eye of the AI beholder: Hamblen

Artificial intelligence is now ingrained in so many operations that it is hard to keep track. It has become something we take for granted.

We get recommendations on things to buy, things to do and places to visit from websites. AI is helping lenders make decisions on loans and even prompting innovations as varied as deep medical research and dog training.

RELATED: AI surfaces for dog training and Upstart’s low-interest loans

Recently, AI affected me directly through a short movie created automatically on my iPhone from old pictures and videos I had shot in Washington. One morning this week, a short video montage of a 2017 trip to the nation’s capital suddenly appeared on my phone. On that trip, relatives and I had toured the then-new National Museum of African American History and Culture, where I took a few shots inside the museum and of nearby monuments outside.

What showed up in the short montage were several of my shots from inside the museum, mainly of tragic examples of slavery and racism. One stark image is a life-size sculpture of founding father Thomas Jefferson next to a stack of bricks representing the more than 600 slaves he owned. The outside shots include one of the nearby Jefferson Memorial.

The movie was indeed a surprising reminder of our museum visit, but it was a bit jarring to hear it set to music. Even more unsettling was the title the AI ascribed to the 23-second movie: “Fun in Washington/May 4, 2017.”

Fun? What was fun about it? The museum that day held some of the most sobering displays I have ever seen, about lynchings, imprisonment, Jim Crow and torture.

Admittedly, the visit had also produced some beautiful, poignant moments: seeing small children learning history from their parents; enjoying ethnic food in the cafeteria; talking to the friendly museum store manager about how our visit prompted me to reflect on growing up in Kansas City, Missouri, including the riots after Martin Luther King Jr.’s assassination in 1968. None of those poignant moments appeared in the images I had taken on my phone, however, so obviously they were not part of the short movie.

Yes, in one sense it had been a fun day trip, but as I watched, I thought the pictures in the movie mostly pointed to pain. A human headline writer would never have called it a “fun” trip. AI tools are now being applied to writing headlines and entire news stories, and I wonder how they would describe such a visit.

All this made me think about how AI decides which images to pick for a movie and which titles and music to apply. In a vastly oversimplified explanation, AI decisions are governed by the way an AI application is trained and the inferences it makes. During the training phase, enormously large data sets are introduced; in the case of images, that could mean many millions of depictions of faces or trees or buildings. In one real-world example recently covered in Fierce Electronics, researchers who wanted their system to recognize a sitting dog gathered thousands of images of dogs of all varieties sitting in different lighting and settings.
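To make that training phase a little more concrete, here is a minimal sketch in Python of how such a “sitting dog” recognizer might be trained, assuming a folder of labeled photos and a standard pretrained network. The folder name, the two labels and the choice of model are my own illustration, not details from the Colorado State project.

```python
# Illustrative training-phase sketch: teach a network to tell "sitting" from
# "not sitting" using thousands of labeled dog photos. Paths and labels are
# hypothetical placeholders.
import torch
from torch import nn, optim
from torchvision import datasets, models, transforms

# Resize every photo the same way so the network sees consistent input.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Thousands of example images, organized into one folder per label
# (e.g. dog_photos/train/sitting and dog_photos/train/not_sitting).
train_data = datasets.ImageFolder("dog_photos/train", transform=preprocess)
loader = torch.utils.data.DataLoader(train_data, batch_size=32, shuffle=True)

# Start from a network pretrained on millions of generic images, then
# replace its final layer so it outputs just our two classes.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-4)

# Each pass over the data nudges the model toward telling the two poses apart.
for epoch in range(5):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```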

During the inference phase of this dog-training work, done by Colorado State University researchers, a powerful processor and image sensor compared a live view of a dog commanded by its owner to sit against the previously stored images, in effect confirming, “Yes, this dog right here in front of us is actually sitting.” In that real-world example, if the dog was indeed confirmed to be sitting as commanded, and not lying down or standing, a machine would be activated to automatically feed the dog a treat.
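In the same hypothetical spirit, here is what that inference step might look like, reusing the model from the sketch above. The dispense_treat() function, the camera-frame file and the confidence threshold are stand-ins I made up for the researchers’ actual feeder hardware.

```python
# Illustrative inference-phase sketch: compare one camera frame against what
# the trained model learned, and trigger a treat on a confident "sitting" call.
from PIL import Image
import torch

def dispense_treat():
    print("Treat released!")  # placeholder for the real feeder hardware

model.eval()  # reuse the model and preprocess pipeline from the training sketch
frame = preprocess(Image.open("camera_frame.jpg")).unsqueeze(0)

with torch.no_grad():
    probs = torch.softmax(model(frame), dim=1)[0]

# Class 1 = "sitting" in this illustrative setup; only reward a confident call.
if probs[1] > 0.9:
    dispense_treat()
```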

To prepare the short Washington movie on my phone, Apple’s AI apparently has stored images of many, many objects and faces in the cloud, probably including plenty of iconic Washington buildings such as the Washington Monument and the Jefferson Memorial, both of which showed up in my movie. My iPhone also records a timestamp and geolocation for each photo as a reference for the AI. I wondered: would an AI algorithm compare my obvious images of Washington to its own and infer that a visit to Washington, with pictures taken, would be “fun”? Or, more broadly, are most trips away from home judged by most people to be fun? Or maybe the algorithm picked up something in the faces in some of the museum portraits included in the movie that seemed to indicate fun.
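Purely as a thought experiment, and not anything Apple has disclosed, the kind of title-picking rule I was imagining could be as simple as this, with every label, city name and rule invented for illustration:

```python
# Hypothetical title heuristic: combine photo metadata (where, when) with a
# few recognized scene labels. None of this reflects Apple's actual logic.
def pick_title(city, date, home_city, scene_labels):
    # Guess 1 from the column: a trip away from home defaults to "fun."
    mood = "Fun" if city != home_city else "Memories"
    # Guess 2: smiling faces in the photos would only reinforce that choice.
    if "smiling_face" in scene_labels:
        mood = "Fun"
    return f"{mood} in {city}/{date}"

print(pick_title("Washington", "May 4, 2017", "My Hometown",
                 ["monument", "museum"]))
# -> "Fun in Washington/May 4, 2017"
```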

To that end, I scanned all the images again looking for smiles on the faces but did not see any. A woman drinking tea in a portrait wears a bright red shirt and yellow scarf. Some people might see that image as interesting and possibly fun, in a fashion vernacular, but who knows? Ultimately, Apple’s AI may simply treat most any short trip to Washington as fun, referencing what thousands of middle schoolers do every year, many yelling with glee as they pour out of buses onto the National Mall.

To be clear, I am blown away by what Apple has done with these short movies, pulling memories from my photo library, piecing them together and finding music. I can even customize a movie, making it longer or shorter before sharing it. I can usually choose music from several categories: dreamy, sentimental, gentle, chill, happy, uplifting, epic, club or extreme. In this case I stuck with the default that the AI produced. (I switched during one playback to club music, which didn't last long.)

All told, these short movies are a tremendous feature on my phone even if the AI missed this time by calling my Washington experience “fun.”

Still, it made me wonder if there could be some bias in the data set. Bias is what many people worry about with AI. Engineers often defend AI by saying that they themselves are probably not biased but that their data sets could be, so they spend time finding a high-quality, standard data set or creating their own. (John Deere has spent years taking pictures of weeds in crops to train its AI for a precise weed-control spraying application.)

Data sets need to be enormous and reflective of the entire world an AI application will encounter, but how can Apple or any company anticipate all the situations a phone user might run into for comparison? Does Apple have a good store of images related to slavery or racism? Perhaps, but I do wonder.

One obvious concern arises when AI is deployed in future autonomous vehicles. Will it be trained on sufficiently large data sets free of bias? In every inference, every split-second decision, a vehicle must know whether an approaching object is a threat. That seems like an almost insurmountable and scary calculation to me, but one (I am told) within the grasp of smart data scientists and design engineers.

Not to be impertinent, but I must ask: Might this serious endeavor ever be deemed fun?

Matt Hamblen is Editor-in-Chief of Fierce Electronics.