The Winter of AI is Here (What to Do About It?)

Well, we’re here. AI Winter. The videos no longer look like hot garbage or uncanny valley monsters coming for you. So what now? How do you navigate this new world?

Approaching From the Wrong Angle

The conversation of ‘AI looks bad, and it won’t ever fool anyone’ was always doomed. Text-based AI programs like ChatGPT are fundamentally flawed even as they sound more and more humanlike: being trained on content available on the open web means that poor-quality information, rag fiction, and conspiracy theories all compete with actual truth and objective reality, and unfortunately, in many cases objective reality is the less appealing option to the algorithm. For image-based generative AI, however, the tech bros were correct (for once) in saying ‘this is the worst it will ever look’ every time it generated a picture of someone with eight fingers. The image and video stuff is getting to the point where someone casually scrolling may not notice, even as text-based generative AI is left in the dust. An AI video fooled me, and I thought I’d always be able to tell!

So, the videos and pictures are frighteningly good. A new app called Sora has made an entire platform out of generating videos, and while plenty are innocuous, plenty more are being used to push narratives to people who don’t realize just how good these videos have gotten. Sora’s watermark on its generated content is currently so small and easy to remove that it’s barely a deterrent to sharing videos as though they were real, and someone determined enough can fully yoink it from the video with a relatively minimal amount of tweaking. If someone’s looking to start problems, that’s plenty worth the effort.

Before the current trend of small children and the elderly being blown up by dogs on their front porches (always a porch, because Amazon Ring camera footage is a huge source of video), there was a brief rash of videos generated after the SNAP program was frozen when a government deadlock led to a shutdown. These videos were deeply mean in spirit and, more importantly, reflected a reality that confirmed the biases of people who already disagree with the system – and if they can’t tell the video is fake, then they’re going to go into their next local election believing the things they saw are reality, and vote against the best interests of their neighbors and themselves because of deliberate disinformation.

The videos from before are gone. The new generation is here. We are not ready for it!

How Did We Get Here?

Pictures, each a single still image, were always relatively easy to doctor – even when we were working off of negatives, there were ways to tamper with a photo. Painting the negative, double exposures, taking a photo of a photo with elements cut out of other photos and laid on top – truly, we have been altering photos since photos were there to alter. Ergo, it’s gotten a lot harder to fool someone with something that looks improbable: the average person knows Photoshop exists.

Videos were always harder. To ‘correct’ a video the way you would a photo would mean painting over dozens of frames for every second of the reel, for however long some element needed to be removed from the footage. CGI has only recently stopped looking plastic, and even then it takes a really competent person (or team) to make movie monsters look genuinely lifelike.

Ergo, videos were pretty reliable. The bigger problem with video up until this point was soft-faking, where a video’s original context is not shown to the audience. In small cases, this can look like someone screaming at an employee, only to pull out their phone and begin filming once the employee has had enough and snaps back, making the filmer look like the victim. In bigger cases, it can look like filming unrelated events, such as the riots that sometimes break out after major sports losses, and portraying them as average days in a given city, with the intent to paint these places as dark, lawless hellholes that nobody sensible would ever enjoy living in. That framing, of course, then justifies funneling another XX million dollars away from social programs and directly into the hands of big tech CEOs to monitor citizens without their knowledge. Soft-faking the context makes it possible to lie with the truth!

Bizarrely, because low-quality door cameras like Ring are so common now, it’s easy to trick people with poor-quality AI videos: each incorrect element may only be present for 1/30th of a second, and movement disguises poorly generated elements the same way old TV monsters could look real through a combination of CRT fuzz and clever movement.

Blurry, faceless ‘people’ in the background don’t register as blurry and faceless because it’s AI; they register as the doorbell camera simply not capturing enough detail. Odd motion lines and weirdly textured fur on animals get the same pass.

The same goes for what are supposed to be smartphone videos – if people look weirdly airbrushed and everything looks like motion blur is on, well, maybe they really are airbrushed, because phones come with beauty filters enabled by default.

Right now, if the generator can correctly align the facial features of whoever’s in the middle of the screen and keep them aligned, the clip is good enough to pass for real – at least until someone makes a debunking video pointing out that a person in the background morphs until their head is on backwards without ever turning around.

Dealing With It

Winter is here. There is no ‘dealing with it’. Actually fixing this situation would require people to totally re-wire how they consume social media, and whether or not that’s even possible, it’s not in the best interests of the tech giants that control said social media.

Politically motivated clips faking the “reality” of who receives SNAP benefits generate incredible amounts of outrage: either from the people who hate SNAP (and the people who qualify for it), simply astonished that their tax money is going to exactly the stereotype they picture, or from the people who can tell it’s AI, shouting desperately in the comments that the video is fake and praying that people hear them before they end up believing something so intensely mean and untrue about their fellow human beings.

This is an outrage goldmine, the kind of thing people used to have to hire actors for; now, if someone wants to create discord and political dissent, they can just say “make me a video where a demographic I don’t like looks stupid!” And generally speaking, AI programs will do it, and then the tech giants won’t remove it – or if they do, they’ll wait until it has hundreds of thousands of views and has been reshared to dozens of different platforms. The rage is great for views and impressions. It sells ad space and makes the website owners a lot of money. Your wellbeing and your relationships with others barely come into the equation at all.

Your parents, if they’re the type to spend hours scrolling through online forums and Facebook reels, might never realize AI videos have been slipped to them at all, and may come away meaner, angrier people because of it. Assessing videos for them individually requires them to admit a video concerned them enough to show it to you – and if the video reinforces their beliefs, why would it concern them? Even seemingly innocuous videos, where cats and dogs blast people with water after they open their front doors, are bizarre and unsettling, and if your parents aren’t aware AI has gotten this good, the ones where the animal blows up like a claymore might actually frighten them into thinking this is a real threat created by real terrorists. You then must convince them that this thing they had an emotional reaction to was faked.

The average person doesn’t like feeling stupid, and the average person’s literacy is sinking, meaning their ability to think critically is likely sinking with it. If they realize on their own that a video was faked, that’s great, but they likely won’t enjoy being told something fooled them after the fact, especially if they invested any anger into it while they believed it. Even ‘tech’ generations like Gen Z and Gen Alpha are falling for these videos, even when they know the technology exists, because the way they were raised didn’t account for perfect nonsense machines showing up specifically to trick them! Obvious scams like cryptocurrencies, where rugpulls are basically an inevitability at this point, have found a new generation of hosts, and the state of tech today is terrifying: few people are trying to stop it, while dozens of companies that stand to gain from your confusion and anger dominate the market. This is the digital winter. This is what the fights should have been about. Reality can no longer reliably be found online.