Imagine you're looking at a photo and you wonder: what's just outside the frame? Outpainting tackles this curiosity. It's like a speculative artist extending the boundaries of a picture with what they believe could be there.
It's an AI-powered extrapolation of visual data, extending your photo's story further than what's originally captured. In other words, it's image expansion or AI-powered zooming out.
Input image of a car standing on a mountain pass at a resolution of 1024 x 1024 pixels. | Outpainted image generated in Boostpixels with what could potentially exist beyond the visible frame. |
To understand this concept in a straightforward manner, picture this scenario: A photographer is standing on a mountain pass, the dirt road underfoot. Before him is a white Ford Expedition vehicle, a symbol of adventure. He captures a portion of this awe-inspiring scene with his camera. Now, imagine you are viewing this photograph and you're intrigued. You want to know what lies beyond the corners of the captured image. What might the rest of the scene look like?
Outpainting means instructing the AI to broaden the photo beyond its original frame. Essentially, the model takes an existing image and generates a plausible extension, "imagining" what might exist outside the currently visible area.
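One common way to frame this task is to place the photo on a larger canvas and mark the new border as the region to generate. The sketch below shows only that setup step on a tiny grayscale grid. The function name `make_outpaint_canvas` and the `pad` parameter are illustrative assumptions, not part of any tool mentioned here.

```python
def make_outpaint_canvas(image, pad, fill=0):
    """Place `image` (a list of pixel rows) at the centre of a larger canvas.

    Returns (canvas, mask). The mask is 1 where the model would have to
    generate new pixels and 0 where pixels are copied from the original.
    """
    height, width = len(image), len(image[0])
    new_h, new_w = height + 2 * pad, width + 2 * pad
    canvas = [[fill] * new_w for _ in range(new_h)]
    mask = [[1] * new_w for _ in range(new_h)]
    for y in range(height):
        for x in range(width):
            canvas[y + pad][x + pad] = image[y][x]
            mask[y + pad][x + pad] = 0
    return canvas, mask


# Tiny 2x3 grayscale "image", padded by one pixel on every side.
image = [[10, 20, 30],
         [40, 50, 60]]
canvas, mask = make_outpaint_canvas(image, pad=1)
for row in mask:
    print(row)
```

The generative model itself is left out on purpose: it would receive the canvas and the mask and fill in only the pixels where the mask is 1.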
This process is more advanced than earlier methods like 'Content-Aware Fill,' introduced in Photoshop CS5. Content-Aware Fill only populated missing parts based on the original image's content.
Outpainted image generated in Boostpixels. | Content-Aware Fill in Photoshop 24.7 Beta. |
Adobe is making strides in implementing generative AI functionality, and its Generative Fill achieves remarkable results on the same outpainting task. Judging by the output, Adobe's algorithm appears to analyze the input image closely, paying attention to its key features so that the extension stays faithful to the original.
Outpainted image in Boostpixels. | Generative Fill in Photoshop 24.7 Beta. |
Adobe adopts a more cautious approach when it comes to augmenting images with additional objects and patterns. By adhering closely to the details present in the original image, it minimizes the risk of introducing unintended artifacts. This approach aims to increase the likelihood of generating images that align with the expectations and preferences of Photoshop's user base.
Recursive Image Outpainting
But why limit ourselves to a single instance of outpainting? We can initiate this process repeatedly to further broaden our image, unveiling more and more of the AI's imaginative creation. This recursive outpainting can be continued indefinitely, opening up a world of artistic possibilities that can result in uniquely compelling visual narratives.
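To get a feel for how quickly the AI's contribution takes over, the recursion can be sketched as a simple loop that feeds each result back in as the next input. Here `outpaint_once` is a hypothetical stand-in that only tracks canvas size, and the 256-pixel border per round is an assumed value, not a setting of any tool discussed here.

```python
def outpaint_once(size, pad):
    """Stand-in for one model call: only tracks how the canvas grows."""
    width, height = size
    return (width + 2 * pad, height + 2 * pad)


def outpaint_recursively(size, pad, rounds):
    """Feed each result back in as the next input and record every size."""
    sizes = [size]
    for _ in range(rounds):
        size = outpaint_once(size, pad)
        sizes.append(size)
    return sizes


def original_share(original, current):
    """Fraction of the current canvas that still comes from the photo."""
    return (original[0] * original[1]) / (current[0] * current[1])


sizes = outpaint_recursively((1024, 1024), pad=256, rounds=3)
shares = [original_share(sizes[0], s) for s in sizes]
for size, share in zip(sizes, shares):
    print(size, f"{share:.0%} original pixels")
```

Under these assumptions, after three rounds only 16% of the canvas comes from the camera. The rest is the model's invention, which is why small errors can compound with each pass.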
2x Outpainted image in Boostpixels. | 2x Generative Fill in Photoshop 24.7 Beta. |
Race for the Best AI-Driven Image Outpainting Tool
Major players in the field, including Stability AI, OpenAI (DALL-E), and Midjourney, are locked in a race to perfect outpainting.
Stability AI, a key organization in the Stable Diffusion ecosystem, has made a substantial stride with the launch of a new free tool called Uncrop. The engine driving Uncrop is Stability AI's text-to-image model, Stable Diffusion XL. It is free to try out here, with no need to log in: https://clipdrop.co/uncrop
2x Outpainted image in DALL-E. | 2x Outpainted image in Clipdrop by Stability AI. |
Compared to Adobe Photoshop and Boostpixels, Uncrop's result in this example probably does not match the desired outcome.
Although DALL-E fares somewhat better, its images contain repetitive elements, such as the clouds, that don't look realistic.
Outpainting builds on a more fundamental AI technique called inpainting, which is the process of filling in missing parts of an image in a realistic way. To effectively inpaint an image, the system needs to understand the image's large-scale structure and be proficient in image synthesis. Inpainting systems are therefore trained on vast datasets, which are generated by masking random portions of real images.
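Generating such a training example can be sketched in a few lines: hide a random region of a real image and keep the untouched image as the target. The rectangular mask and the function names below are simplifying assumptions for illustration. Real systems typically use more varied mask shapes.

```python
import random


def random_box_mask(height, width, rng):
    """Return a mask with a random rectangle of 1s (pixels to hide)."""
    box_h = rng.randint(1, height)
    box_w = rng.randint(1, width)
    top = rng.randint(0, height - box_h)
    left = rng.randint(0, width - box_w)
    return [[1 if top <= y < top + box_h and left <= x < left + box_w else 0
             for x in range(width)]
            for y in range(height)]


def make_training_pair(image, rng):
    """Build (masked_input, mask, target) from one real image."""
    height, width = len(image), len(image[0])
    mask = random_box_mask(height, width, rng)
    masked = [[0 if mask[y][x] else image[y][x] for x in range(width)]
              for y in range(height)]
    return masked, mask, image


rng = random.Random(0)  # fixed seed so the example is repeatable
image = [[y * 4 + x + 1 for x in range(4)] for y in range(3)]
masked, mask, target = make_training_pair(image, rng)
for row in masked:
    print(row)
```

During training, the network would see the masked input together with the mask and be penalized for differences between its output and the target.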
The key to successful inpainting (and thus outpainting) is having a large effective receptive field in the network. This term refers to the region in the input space that a particular neural network unit can "see" or analyze. A broad receptive field allows the system to understand the global structure of an image.
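To make the idea concrete, the theoretical receptive field of a stack of convolution layers can be computed with a short recurrence. The sketch below uses illustrative layer settings that are not taken from any specific model. The effective receptive field mentioned above is usually smaller than this theoretical upper bound.

```python
def receptive_field(layers):
    """Theoretical receptive field in input pixels along one axis.

    `layers` is a list of (kernel_size, stride, dilation) tuples,
    ordered from the input to the output of the network.
    """
    field = 1  # a single input pixel sees only itself
    jump = 1   # distance in input pixels between adjacent units
    for kernel_size, stride, dilation in layers:
        field += (kernel_size - 1) * dilation * jump
        jump *= stride
    return field


plain = receptive_field([(3, 1, 1)] * 3)      # three 3x3 convs, stride 1
strided = receptive_field([(3, 2, 1)] * 3)    # same kernels, stride 2
dilated = receptive_field([(3, 1, 1), (3, 1, 2), (3, 1, 4)])
print(plain, strided, dilated)  # 7 15 15
```

Three plain 3x3 convolutions cover only 7 pixels per axis, a tiny window on a 1024-pixel image, which is why ordinary convolution stacks struggle to see an image's global structure without many layers, strides, or dilations.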
Boostpixels uses Large Mask Inpainting (LaMa), which combines Fast Fourier Convolutions (FFCs) with a perceptual loss based on a semantic segmentation network.
Source: Resolution-robust Large Mask Inpainting with Fourier Convolutions, Suvorov and Logacheva et al.
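The reason Fourier-based layers help with the receptive field can be shown on a toy signal: an operation applied to a single frequency coefficient changes every sample after transforming back. The sketch below is not an FFC implementation. It uses a naive one-dimensional discrete Fourier transform, and the function names are illustrative assumptions.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse transform, returning the real part of each sample."""
    n = len(spectrum)
    result = []
    for t in range(n):
        total = sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                    for k in range(n))
        result.append(total.real / n)
    return result


def spectral_pointwise(signal, weights):
    """Scale each frequency coefficient independently, then go back."""
    spectrum = dft(signal)
    return idft([w * c for w, c in zip(weights, spectrum)])


signal = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
weights = [1.5] + [1.0] * 7  # touch only the lowest frequency
output = spectral_pointwise(signal, weights)
changed = sum(1 for s, o in zip(signal, output) if abs(s - o) > 1e-9)
print(changed)  # 8: every sample moved
```

A spatial convolution with a small kernel would have changed only the samples next to the impulse, whereas the frequency-domain edit reaches the whole signal in one step.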
Challenges of Outpainting
It's crucial to acknowledge that outpainting, while undeniably innovative, is a complex challenge. Essentially, it's like asking an artist to create a fitting continuation of a famous painting just by looking at a small piece of it. There are no concrete rules or data points to rely upon, just the understanding of the initial image.
In the example of the white Ford Expedition on the mountain pass, the outpainting algorithm faces multiple uncertainties. Should there be more mountains, or perhaps a forest? Should it include more cars, people, or a lone deer grazing? Each of these decisions is speculative, based on the model's understanding of the existing picture and the massive datasets it was trained on. The resulting outpainted image is the AI's best calculated guess at what might exist outside the original frame.
Outpainted Mona Lisa painting in Boostpixels. |