The biggest issue with generative AI, at least to me, is the fact that it’s trained using human-made works where the original authors didn’t consent to, or even know, that their work is being used to train the AI. Are there any initiatives to address this issue? I’m thinking of something like an open source AI model and training data store that only contains works that are in the public domain or under highly permissive no-attribution licenses, as well as original works submitted by the open source community and explicitly licensed to allow AI training.

I guess the hard part is moderating the database and ensuring all works are licensed properly and people are actually submitting their own works, but does anything like this exist?

  • sturlabragason@lemmy.world
    1 day ago

    “Hey PubDomainLLM, tell me something that only exists in that proprietary dataset.” “I’m sorry, you’ve caught me lackin’.”

    You would want your LLM to be trained on as comprehensive a dataset as you can get. But I would suggest we should be coming up with better ways to license proprietary works for uses like this, instead of walling them off into the cable TV of proprietary knowledge gardens.

    I agree with you partially in principle but not in practice.

    Ultimately we want as smart LLMs as we can. Just compare the best models with the mediocre ones, or use them all day long; there is a vast difference.

    • cecilkorik@lemmy.ca
      1 day ago

      Ultimately we want as smart LLMs as we can,

      We do? I want LLMs to die in a fire (which they will likely cause by vastly and rapidly increasing global warming, so the problem at least solves itself)

      We are not the same.

      • sturlabragason@lemmy.world
        1 day ago

        I’ll make sure to let them know once they come for us.

        But yeah, I agree about energy efficiency… and that we are probably pretty polarized on this matter.

        Edit: which is cool, because looking at your comment history we agree on a lot of other shit 😊

        • cecilkorik@lemmy.ca
          21 hours ago

          I was just making some snide commentary for fun. It was a little bit at your expense, I admit. I appreciate you not taking it personally! This is why we can sometimes have nice things.

          • sturlabragason@lemmy.world
            19 hours ago

            Thanks! I love lemmy. ❤️

            I use LLMs daily and think they are amazing technology.

            Capitalism sadly seems to agree with me very aggressively.