Revolution of computer-generated creation


Artificially generated images and videos are taking over the communication sector. It is essential that we use this technology in an ethically responsible way, argues Jens Krahe.

 

The revolution of artificially generated images has passed the point of no return: we can all already create artificial images, thanks to freely accessible image generators such as DALL-E 2 and Midjourney, or Stable Diffusion as an open-source solution. The basis for this is what is known as a ‘prompt’: an English text description that can be supplemented with sketches. It can be assumed that computer-generated creations will be part of everyday work in agencies and companies within three years at the latest. This fascinating technology transcends all the limitations of visual creativity. Not only is the process itself extremely fast, but the technology will also develop rapidly. At the same time, it spells an end to laboriously searching through image databases to illustrate PowerPoint presentations or visualise creative ideas and campaigns in Photoshop.

 

Interpretation of the terms will define the image

 

It sounds so simple: enter text, generate an image. In practice, however, it is not quite that easy, because the prompt must be worded so that the technology behind it ‘understands’ it and generates the required image. Adding rough sketches or images can actively steer image generation and improve both image quality and the creative outcome. Image creators must also know which text inputs and drawings produce the best results. It becomes especially difficult when searching for terms to illustrate abstract brand values such as trust, safety, or purpose. This is why there are portals like ‘PromptHero’, where users share their prompts and experience free of charge, and platforms like ‘PromptBase’, which is run as a commercial venture.

In addition, a prompt in DALL-E will not necessarily give the same result in Midjourney. This is because the resulting image does not solely depend on the word count, but also on the interpretation of the term and on the underlying model. So, prompt specialists need to be familiar with as many different systems as possible.

Since prompts can also be used for video creation, effective and descriptive keywords alone are not enough; suitably drafted storyboards are required as well. This is another instance where sketches can improve quality. In the future, this technology may be able to generate entire 3D worlds and thereby become part of the Metaverse. The experience would be comparable with that of a holodeck, which is loaded and generated according to the description in a ‘prompt’. Here, too, the right terms must be entered.

 

Downsides and dangers of the technology

 

However, the infinite world of computer-generated images and videos also has its dark side. For instance, the copyright and rights of use for the pre-existing images and film footage that the systems draw on to create new material have yet to be clarified. In addition, the copyright in the generated creations remains a grey area. To date, only DALL-E has stipulated that image rights belong to the creator. Moreover, image generators are only as good as their training: the learning process is based solely on image or 3D data. The fear, therefore, is that the generated images and video clips will be biased: they could be socially prejudiced, exclude specific population groups, or depict those groups in a false light. Companies and agencies therefore have a duty to collect their own unbiased, copyright-free data and use them to train the model. These trained models are known as ‘diffusers’ or ‘VQGANs’.

There is enormous potential in the new technology, and these opportunities must not be cast aside. So, the technology needs to be managed responsibly, with the right skills in the team, so that computer-aided communication remains attractive, credible, and diverse. It will revolutionise our outlook and bring with it efficiency gains, just as the introduction of computers did in terms of digitalised publishing, image processing, and typography.

 

Author: Jens Krahe, Managing Partner at Plan.Net Cologne   
