Are there any initiatives aimed at training generative AI using 100% public domain works and works authorized by the creator?

HiddenLayer555@lemmy.ml · 4 months ago

sturlabragason@lemmy.world · 4 months ago

“Hey PubDomainLLM, tell me something that only exists in that proprietary dataset.” “I’m sorry, you’ve caught me lackin’.”

You would want your LLM to be trained on as comprehensive a dataset as you can get. But I would suggest we should be coming up with better ways to license proprietary works for uses like this, instead of walling them up in the cable TV of proprietary knowledge gardens.

I agree with you partially in principle but not in practice.

Ultimately we want as smart LLMs as we can. Just compare the best models with the mediocre ones, or use them all day long; there is a vast difference.

cecilkorik@lemmy.ca · 4 months ago

“Ultimately we want as smart LLMs as we can”

We do? I want LLMs to die in a fire (which they will likely cause by vastly and rapidly accelerating global warming, so the problem at least solves itself).

We are not the same.

sturlabragason@lemmy.world · edited 4 months ago

I’ll make sure to let them know once they come for us.

But yeah, I agree about energy efficiency… and that we are probably pretty polarized on this matter.

Edit: which is cool, because looking at your comment history we agree on a lot of other shit 😊

cecilkorik@lemmy.ca · 4 months ago

I was just making some snide commentary for fun. It was a little bit at your expense, I admit. I appreciate you not taking it personally! This is why we can sometimes have nice things.

sturlabragason@lemmy.world · 4 months ago

Thanks! I love lemmy. ❤️

I use LLMs daily and think they are amazing technology.

Capitalism sadly seems to agree with me very aggressively.