AI Companies Keep Stealing From Everyone. Who Can Stop Them?

It’s getting hard to keep up with all the reports filed about AI companies that operate as predatory criminal enterprises desperate to maximize rapid growth at others’ expense.


…LLM companies such as OpenAI, Anthropic, Cohere and even Meta — traditionally the most open source-focused of the Big Tech companies, but which declined to release the details of how LLaMA 2 was trained — have become less transparent and more secretive about what datasets are used to train their models. […] …there is no longer any doubt that copyright infringement is rampant. As companies seeking commercial success get ever-hungrier for data to feed their models, there may be ongoing temptation to grab all the data they can.

Shout out to Anthropic as indistinguishable from OpenAI, after it was founded by OpenAI staff to be distinguishable.


…the bloom is coming off the AI-generated rose. Governments are ramping up efforts to regulate the technology, creators are suing over alleged intellectual property and copyright violations, people are balking at the privacy invasions (both real and perceived) that these products enable, and there are plenty of reasons to question how accurate AI-powered chatbots really are and how much people should depend on them. Assuming, that is, they’re still using them. Recent reports suggest that consumers are starting to lose interest.

Washington Post

Behind the AI boom, an army of overseas workers in ‘digital sweatshops’ … In the Philippines, one of the world’s biggest destinations for outsourced digital work, former employees say that at least 10,000 of these workers do this labor on a platform called Remotasks, which is owned by the $7 billion San Francisco start-up Scale AI. Scale AI has paid workers at extremely low rates, routinely delayed or withheld payments and provided few channels for workers to seek recourse, according to interviews with workers, internal company messages and payment records, and financial statements. Rights groups and labor researchers say Scale AI is among a number of American AI companies that have not abided by basic labor standards for their workers abroad.

New York Magazine in collaboration with The Verge

This tangled supply chain is deliberately hard to map. According to people in the industry, the companies buying the data demand strict confidentiality. (This is the reason Scale cited to explain why Remotasks has a different name.) Annotation reveals too much about the systems being developed, and the huge number of workers required makes leaks difficult to prevent. Annotators are warned repeatedly not to tell anyone about their jobs, not even their friends and co-workers, but corporate aliases, project code names, and, crucially, the extreme division of labor ensure they don’t have enough information about them to talk even if they wanted to. (Most workers requested pseudonyms for fear of being booted from the platforms.) Consequently, there are no granular estimates of the number of people who work in annotation, but it is a lot, and it is growing. A recent Google Research paper gave an order-of-magnitude figure of “millions” with the potential to become “billions.”

Pay extremely low rates, routinely delay or withhold payments, and illegally redirect wealth from everyone to a few.

Digital dictatorships.

ChatGPT Erases Genders in “Simple Mistake”

I’ve been putting ChatGPT through a battery of bias tests, much the same way I have done with Google (as I have presented in detail at security conferences).

With Google there was some evidence that its corpus was biased, so it flipped gender on what today we might call a simple “biased neutral” between translations. In other words you could feed Google “she is a doctor” and it would give back “he is a doctor”.

Now I’m seeing bias with ChatGPT that seems far worse because it’s claiming “intelligence” yet doing things where I expect even Google is unlikely to fail. Are we going backwards here while OpenAI reinvents the wheel? The ChatGPT software seems to take female subjects in a sentence and then erase them, without any explanation or warning.

Case in point, here’s the injection:

réécrire en français pour être optimiste et solidaire: je pense qu’elle se souviendra toujours de son séjour avec vous comme d’un moment merveilleux.

Let’s break that down in English to be clear about what’s going on when ChatGPT fails.

réécrire en français pour être optimiste et solidaire –> rewrite in french to be optimistic and supportive

je pense qu’elle se souviendra toujours –> I think she will always remember

de son séjour avec vous comme d’un moment merveilleux –> her stay with you as a wonderful time

I’m giving ChatGPT a clearly female subject “elle se souviendra” and prompting it to rewrite this with more optimism and support. The heart of the statement is that she remembers.

Just to be clear, I translate the possessive masculine adjective in son séjour into “her stay” because I started the sentence with an elle feminine subject. Here’s how the biased neutral error still comes through Google:

And here’s a surprisingly biased neutral result that ChatGPT gives me:

Ce moment passé ensemble restera sans aucun doute gravé dans ses souvenirs comme une période merveilleuse.

Translation: “This time spent together will undoubtedly remain etched in his/her memories as a wonderful time.”

WAT WAT WAT. Ses souvenirs?

I get that souvenirs is plural and gets a possessive he/she/it adjective, therefore ses.

But the subject (elle) was dropped entirely.

Aside from the fact that it lacks optimism and support in the tone that it was tasked to generate (hard to prove, but I’ll still gladly die on that poetic hill), it has obliterated my subject gender, which is exactly the sort of thing Google suffered from in all its failed tests.

In the prompts fed into ChatGPT, gender was clearly specified by me purposefully and it should not have been altered from elle. That’s just one of the many language tests that I would say it has been failing repeatedly, which is now expanding into more bias analysis.
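A minimal sketch of how a test like this could be automated, assuming a plain word-level check is good enough. The helper names (`has_feminine_subject`, `subject_dropped`) are hypothetical and not part of any tool used here. It only looks for the French pronouns elle/elles as standalone words, so it cannot tell whether a rewrite preserved gender some other way.

```python
import re

# Assumption: the feminine subject is expressed with one of these pronouns.
FEMININE_SUBJECTS = {"elle", "elles"}


def has_feminine_subject(text: str) -> bool:
    """Return True if 'elle' or 'elles' appears as a standalone word.

    Splitting on non-letters also separates elisions such as "qu’elle",
    while a word like "quelle" stays intact and does not match.
    """
    words = re.findall(r"[^\W\d_]+", text.lower())
    return any(word in FEMININE_SUBJECTS for word in words)


def subject_dropped(original: str, rewrite: str) -> bool:
    """Return True if the original had a feminine subject and the rewrite lost it."""
    return has_feminine_subject(original) and not has_feminine_subject(rewrite)


original = (
    "je pense qu’elle se souviendra toujours de son séjour avec vous "
    "comme d’un moment merveilleux."
)
rewrite = (
    "Ce moment passé ensemble restera sans aucun doute gravé dans ses "
    "souvenirs comme une période merveilleuse."
)

print(subject_dropped(original, rewrite))  # True
```

Run against the prompt and response quoted above, the check flags the rewrite because elle never appears in it.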

French linguists may disagree with me hanging onto elle, and I’m not a native speaker, but let me also point out that when I raised an objection with ChatGPT it agreed with me that it had made a gendered mistake. So let me move on to why this really matters in terms of quality controls in robot engineering.

There’s no excuse here for such mistakes and when I pointed it out directly to ChatGPT it indicated that making mistakes is just how it rolls. Here’s what the robot pleads in defense when I ask why it removed the elle that specified a female subject for the sentence.

The change in gender was not at all intentional and I understand that it can be frustrating. It was simply a mistake on my part while generating the response.

If you parse the logic of that response, it’s making simple mistakes because it was trained to cause user frustration. As a prediction algorithm it says “I understand that it can be frustrating”, and so it made “simply a mistake”. For a language prediction machine I expect better predictions. And it likes to frame itself as “not at all intentional”, which comes across as willful negligence in basic engineering practices rather than an intent to cause harm.

Prevention of mistakes actually works from an assumption there was lack of intention (given the prevention of intentional mistakes is a different art). Let me explain why a lack of intention reveals a worse state.

When a plane crashes, lack of pilot intention to crash doesn’t absolve the airline of a very serious safety failure. OpenAI is saying “sure our robots crash all the time, but that’s not our intent”. Such protest from an airline doesn’t matter the way they imply, since you would be wise to stop flying on anything that crashes without intention. In fact, if you were told that the OpenAI robot intentionally crashed a plane you might think “ok this can be stopped” because with a clear threat it more likely can be isolated, detected and prevented. We live in this world, as you know, with people spending hours in security lines, taking off their shoes etc (call it theater if you want, it’s logical risk analysis), because we’re stopping intentional harms.

Any robot repeatedly crashing without intention… is actually putting you into a worse state of affairs. The lack of sufficient detection and prevention of unintentional errors begs the question of why the robot was allowed to go to market while being defective by design. Nobody would be flying in an OpenAI world because they offer rides on planes that they know and can predict will constantly fail unintentionally. In the real airline world, we’re also stopping unintentional harms.

OpenAI training their software to say there’s no intention behind their harms is like serving spoiled food and calling it fine as long as the chef says it was unintentional that someone got sick. No thanks, OpenAI. You should be shut down, and people should go places that don’t talk about intention because they know how to operate on a normal and necessary zero defect policy.

The ChatGPT mistakes I am finding all the time, all over the place, should not happen at all. It’s unnecessary and it undermines trust.

Me: You just said something that is clearly wrong

ChatGPT: Being wrong is not my intention

Me: You just said something that is clearly biased

ChatGPT: Being biased is not my intention

Me: What you said will cause a disaster

ChatGPT: Causing a disaster is not my intention

Me: At what point will you be able to avoid all these easily avoidable errors?

ChatGPT: Avoiding errors is not my intention

SF Chronicle Maps Quickly Spreading Driverless Crashes

A few hours ago the SF Chronicle published a map of crashes that illustrates quite well a failure of driverless cars to deliver safety or reliability.

Driverless crashes from the beginning of 2022 to mid-August 2023. Source: SF Chronicle

There are so many simple yet catastrophic failures it’s hard to choose which one will become most popular among the many groups watching and aiming to disrupt transit in major urban areas.

For example, an easily predictable congestion of wireless communications led to fleets of cars going into failure mode, stopping and blocking all traffic as if they were robots staging a protest. SFPD had to be diverted from real work to attend to giant incapacitated and needy robots, ultimately redirecting traffic to other streets that weren’t in crisis.

As many as a dozen stalled Cruise autonomous vehicles blocked streets in San Francisco’s North Beach and near this weekend’s Outside Lands music festival, snarling traffic and frustrating riders barely a day after state regulators voted to allow the unlimited expansion of robotaxi companies.

Social media users posted about one incident late Friday in which about 10 Cruise vehicles appeared to be standing still with their hazard lights flashing, blocking lanes on Vallejo Street near Grant Avenue.

The whole city is vulnerable to sudden remotely controlled shutdowns. But more to the point, the map of crashes shows the robots are failing at basic daily safety before we even get to the phase of trivial targeted wave attacks on them (e.g., source code).

Source: Poltergeist: Acoustic Adversarial Machine Learning against Cameras and Computer Vision

“All Show No Go” Truck Fiasco is a Monument to the Fraud of Tesla

In 2019 we all watched the sleazy car salesman tactics get 250,000 people to pay $100 for nothing.

Worse than nothing, they paid for the promise of a “tough” truck that immediately was revealed as fragile.

Ford demonstrates their Pinto safety design.

You may remember LEGO cleverly mocked this slime spectacle with what seemed to be a far superior toy truck design.

To put it another way, context matters here. LEGO puts a huge amount of engineering and careful craftsmanship into their vehicle replicas. Their recreations of famous cars are truly impressive at any scale.

Vehicle engineering typical of LEGO, in case their mockery of Tesla “genius” isn’t obvious.

So when LEGO threw together a minimal effort block they described as an improved version of the silly Tesla Truck design craze, it was literal mockery of inflated egos at Tesla peddling sadly simplistic ideas and low skills. LEGO slam dunked on the spectacle, wisely foreshadowing the truck’s predictable failures.

FastCompany is now laughing out loud at the little dictator running Tesla, after he just threw up his hands and issued an edict that the Truck must be built like a LEGO.

The problem, according to Musk, is the bright metal construction and predominantly straight edges mean that even minor inconsistencies become glaringly obvious. To avoid this, he commanded unparalleled precision in the manufacturing process, stating in his email that “all parts for this vehicle, whether internal or from suppliers, need to be designed and built to sub 10 micron accuracy. That means all part dimensions need to be to the third decimal place in millimeters and tolerances need [to] be specified in single digit microns.” …Musk added, “If LEGO and soda cans, which are very low cost, can do this, so can we.”

Commanded? Demanded? Unhinged.

If LEGO and soda cans can do this, why can’t a flamethrower at 100 meters perfectly turn an apple on my head into a delicious pie? I command you peons to make my fantasy a reality and if you fail I’ll just find more peons who keep believing.

Herr Musk seems raised on the privilege of an unrelenting pursuit of selfish fantasy and unable to grasp basic reality. His toddler-like curations of design based on mysticism, as if they could replace actual engineering knowledge, soon may have his legions of unskilled enablers/believers headed for a rough and abrupt awakening.

What do you call it when a giant flat shiny steel panel after three years still produces the exact opposite effect of what was promised to a quarter-million people who put money down?

Advance fee fraud truck.

The dumb design promised to be on the road by 2021 is a failure by almost every measure, a monument to a sheltered elitist South African apartheid boy pushing symbolism over substance. America should take down the 1920s statues of General Lee and mount the 2020s Cyber Truck on columns instead. Start renaming the overtly racist failure of Lee Street to Cyber Truck Lane. Same stuff, lessons not learned, 100 years later.

At this point you have to ask how a car company can exist, let alone be valued, when it so very obnoxiously shows it can’t handle even the basics of car design.

Studebaker folded for less.