Several recommended AI painting tools
Google:Imagen
Imagen official website address: https://imagen.research.google/
Just one month after DALL·E 2 was released, Google announced its own artificial intelligence system, Imagen.
Imagen is a text-to-image diffusion model developed by the Brain Team at Google Research. Its slogan is "unprecedented photorealism × deep level of language understanding": given a prompt, it generates an image that closely matches the meaning of the text and has a photorealistic feel.
As we introduced before, DALL·E 2 can generate images from text, modify image content based on text prompts, and produce multiple variations with similar style and content from a single image. In contrast, Imagen focuses on generating highly realistic images from text.
According to Imagen's official website, Google established a text-to-image evaluation benchmark called DrawBench in order to compare Imagen with other text-to-image models (such as DALL·E 2). DrawBench is a list of 200 text prompts. Each prompt is fed into the different models, and human raters then judge the resulting images. Google said that under this benchmark, raters preferred Imagen over other models in side-by-side comparisons, in terms of both sample quality and image-text alignment.
Imagen's results compared with other models on the DrawBench benchmark. Image source: Imagen official website
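The side-by-side comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch, not Google's evaluation code: the function name `preference_rates`, the criterion labels, the model names, and all of the votes are made up for the example. It assumes each human judgment simply records which model's image was preferred on a given criterion (or a tie).

```python
from collections import Counter, defaultdict


def preference_rates(judgments):
    """Return, per criterion, the share of judgments won by each option.

    `judgments` is an iterable of (criterion, preferred) pairs, where
    `preferred` names the model the rater picked, or "tie".
    All names are hypothetical and chosen for this example only.
    """
    counts = defaultdict(Counter)
    for criterion, preferred in judgments:
        counts[criterion][preferred] += 1
    return {
        criterion: {
            option: n / sum(tally.values()) for option, n in tally.items()
        }
        for criterion, tally in counts.items()
    }


# Made-up judgments for illustration; these are NOT real DrawBench results.
example_judgments = [
    ("sample_quality", "model_a"),
    ("sample_quality", "model_b"),
    ("sample_quality", "model_a"),
    ("sample_quality", "model_a"),
    ("alignment", "model_a"),
    ("alignment", "tie"),
    ("alignment", "model_b"),
    ("alignment", "model_a"),
]

rates = preference_rates(example_judgments)
for criterion, shares in rates.items():
    for option, share in sorted(shares.items()):
        print(f"{criterion}: {option} preferred in {share:.0%} of judgments")
```

Keeping the two criteria separate mirrors the way the benchmark reports image quality and image-text match as distinct results.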
Google:Parti
Parti official website address: https://parti.research.google/
Parti is another text-to-image generation model that Google launched shortly after Imagen. Both focus on generating realistic images from text. The difference is that Imagen is a diffusion model, while Parti (Pathways Autoregressive Text-to-Image) is an autoregressive model aimed at high-fidelity, highly realistic image generation.
According to the official website, Parti is trained on a large collection of images and learns from them to generate new ones. During development, Google scaled the model from 350 million to 20 billion parameters, and the output became more realistic as the model grew. The reported match between the generated images and the text reached 75.9%.
Google also found that at 20 billion parameters, Parti was particularly good at generating images involving abstraction, world knowledge, specific perspectives, and writing and symbols. Parti could also handle long and complex prompts, especially those that require it to:
Accurately reflect world knowledge
Compose many participants and objects, with fine-grained details and interactions
Adhere to a specific image format and style
Google also lists multiple sets of prompt text and output images as examples to show how Parti responds to changes in participants, activities, descriptions, locations and formats.
Although Google demonstrated Parti's advantages in image generation on its official website, it also admitted that the examples displayed were carefully selected from many experimental results. It also said that although Parti can produce high-quality output for a wide range of prompts, the model still has many limitations, such as getting the number and attributes of objects wrong, and mishandling negation or prompts that say something should be absent.
Meta:Make-A-Scene
Official introduction: https://ai.facebook.com/blog/greater-creative-control-for-ai-image-generation/
Make-A-Scene is a new AI technology announced by Meta on July 14, 2022. Its biggest feature is that it generates images from a rough sketch drawn by the user combined with a text prompt, which makes the generated images more controllable.
"In order to fully realize the goal of artificial intelligence to promote creative expression, people must be able to influence and control the content produced by these intelligent models. Users should be able to express their thoughts in any way they like, including voice, text, gestures and even drawings, and should be easy to use and intuitive." This is the point Meta makes in its introductory article on Make-A-Scene, and it neatly sums up why Make-A-Scene matters.
Compared with models such as DALL·E 2 and Imagen, which generate images based solely on prompt text, the images created by Make-A-Scene are more controllable. Users can control the final image through sketches.