Is Lensa AI Stealing From Human Art? An Expert Explains The Controversy – ScienceAlert

The Lensa photo and video editing app has shot into social media prominence in recent weeks, after adding a feature that lets you generate stunning digital portraits of yourself in contemporary art styles.

It does that for just a small fee and the effort of uploading 10 to 20 different photographs of yourself.

2022 has been the year text-to-media AI technology left the labs and started colonizing our visual culture, and Lensa may be the slickest commercial application of that technology to date.

It has lit a fire among social media influencers looking to stand out – and a different kind of fire among the art community. Australian artist Kim Leutwyler told The Guardian she recognized the styles of particular artists – including her own style – in Lensa’s portraits.

Since Midjourney, OpenAI’s Dall-E, and the CompVis group’s Stable Diffusion burst onto the scene earlier this year, the ease with which individual artists’ styles can be emulated has sounded warning bells.

Artists feel their intellectual property – and perhaps a bit of their soul – has been compromised. But has it?

Well, not as far as existing copyright law sees it.

If it’s not direct theft, what is it?

Text-to-media AI is inherently complicated, but it is possible for non-computer-scientists to understand it conceptually.

To really grasp the positives and negatives of Lensa, it’s worth taking a couple of steps back to understand how artists’ individual styles can find their way into, and out of, the black boxes that power systems like Lensa.

Lensa is essentially a streamlined and customized front-end for the freely available Stable Diffusion deep learning model. It’s so named because it uses a system called latent diffusion to power its creative output.

The word “latent” is key here. In data science, a latent variable is a quality that can’t be measured directly, but can be inferred from things that can be measured.
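To make the idea concrete, here is a toy sketch of a latent variable. The "quality" being inferred, the feature names, and all the numbers are invented for illustration; in a real model the weights would be learned from data, not set by hand, and this is not how Stable Diffusion itself works.

```python
# Toy illustration of a latent variable: the "moodiness" of a painting
# can't be measured directly, but it can be inferred from things that can
# be measured. Every name and number here is invented for illustration.
observed = {
    "color_saturation": 0.2,   # muted colors
    "brightness": 0.3,         # fairly dark
    "edge_sharpness": 0.8,     # crisp lines
}

# Weights linking each measurable feature to the hidden quality.
# In a real system these weights are learned from data, not hand-set.
weights = {
    "color_saturation": -0.6,  # more muted -> moodier
    "brightness": -0.8,        # darker -> moodier
    "edge_sharpness": 0.3,
}

# The latent score is inferred as a weighted combination of the measurements.
latent_moodiness = sum(weights[k] * observed[k] for k in observed)
print(round(latent_moodiness, 2))  # prints -0.12
```

The point is only that a hidden quality can be recovered from measurable ones; systems like Stable Diffusion do this with millions of interacting latent dimensions rather than one.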

When Stable Diffusion was being built, machine-learning algorithms were fed a large number of image-text pairs, and they taught themselves billions of different ways these images and captions could be connected.

This formed a complex knowledge base, none of which is directly intelligible to humans. We might see “modernism” or “thick ink” in its outputs, but Stable Diffusion sees a universe of numbers and connections.
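A rough feel for that "universe of numbers" is how an image and a caption can be compared once both are reduced to vectors: similar vectors suggest related content. The four-number vectors below are invented toys; real encoders use learned vectors with hundreds of dimensions, and this sketch is only an analogy, not Stable Diffusion's actual machinery.

```python
import math

# Invented stand-ins: a real system would produce these vectors with a
# learned encoder, and they would be much longer.
image_vec   = [0.9, 0.1, 0.3, 0.7]   # stands in for an encoded image
caption_vec = [0.8, 0.2, 0.4, 0.6]   # stands in for an encoded caption

def cosine_similarity(a, b):
    """Score how aligned two vectors are: 1.0 means pointing the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# A high score means the numbers "connect" the image to the caption.
score = round(cosine_similarity(image_vec, caption_vec), 3)
print(score)
```

Training on billions of image-text pairs amounts to adjusting the encoders so that matching pairs score high and mismatched pairs score low, which is why no single number in the model is intelligible on its own.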

And all …….