Unfortunately the developer trained it on some CSAM which I think means they’re not free of guilt - we really need to rebuild these models from the ground up to be free of that taint.
Given it’s public dataset not owned or maintained by the developers of Stable Diffusion; I wouldn’t consider that their fault either.
I think it’s reasonable to expect a dataset like that should have had screening measures to prevent that kind of data being imported in the first place. It shouldn’t be on users (here meaning the devs of Stable Diffusion) of that data to ensure there’s no illegal content within the billions of images in a public dataset.
That’s a different story now that users have been informed of the content within this particular data, but I don’t think it should have been assumed to be their responsibility from the beginning.
Unfortunately the developer trained it on some CSAM which I think means they’re not free of guilt - we really need to rebuild these models from the ground up to be free of that taint.
Reading that article:
Given it’s public dataset not owned or maintained by the developers of Stable Diffusion; I wouldn’t consider that their fault either.
I think it’s reasonable to expect a dataset like that should have had screening measures to prevent that kind of data being imported in the first place. It shouldn’t be on users (here meaning the devs of Stable Diffusion) of that data to ensure there’s no illegal content within the billions of images in a public dataset.
That’s a different story now that users have been informed of the content within this particular data, but I don’t think it should have been assumed to be their responsibility from the beginning.