Which of the following sounds more reasonable?
-
I shouldn’t have to pay for the content that I use to tune my LLM model and algorithm.
-
We shouldn’t have to pay for the content we use to train and teach an AI.
By calling it AI, the corporations are able to advocate for a position that’s blatantly pro-corporate and anti-writer/artist, and trick people into supporting it under the guise of a technological development.
I’ll note that there are plenty of models out there that aren’t LLMs and that are also being trained on large datasets gathered from public sources.
Image generation models, music generation models, etc.
Heck, it doesn’t even need to be about generation. Music recognition and image recognition models can also be trained on the same sort of datasets, and arguably come with similar IP rights questions.
It’s definitely a broader topic than just LLMs, and attempting to exhaustively enumerate the flavors of AIs/models/whatever that should be part of this discussion is fairly futile given the fast-evolving nature of the field.
Still, all those models are, even conceptually, far removed from AI. They would most properly be called Machine Learning Models (MLMs).
The term AI was coined many decades ago to encompass a broad set of difficult problems, many of which have become less difficult over time.
There’s a natural temptation to remove solved problems from the set of AI problems, so playing chess is no longer AI, diagnosing diseases through a set of expert system rules is no longer AI, processing natural language is no longer AI, and maybe training and using large models is no longer AI nowadays.
Maybe we do this because we view intelligence as a fundamentally magical property, and anything that has been fully described has necessarily lost all its magic in the process.
But that means that “AI” can never be used to label anything that actually exists, only to gesture broadly at the horizon of what might come.
They would, but that doesn’t sound as sexy to investors.
That’s what it all comes down to when businesses use words like AI, big data, blockchain, etc. It’s not about whether it’s an accurate descriptor, it’s about tricking dumb millionaires into throwing money at them.
Fair enough.