There are thousands of AI tools out there, and as part of the IDist AI newsletter I will only cover applications I have used and reviewed myself, and that I consider the gold standard in AI development.
Here we’re going to compare two AI applications designed to create photorealistic images from text-to-image “prompts” (instructions): Stable Diffusion and Midjourney. Both have their unique strengths and weaknesses, and in this article we’ll highlight some of the pros and cons of each.
Let’s start with Midjourney. This application has some impressive features, including multi-image input blending, the new “describe” feature, and the ability to generate high-quality images from short text prompts. The overall model quality is excellent, and the application is relatively easy to use. Personally, I would rather have a typical application dashboard than enter text commands in a Discord server, but you can still be up and running very quickly, and there are other benefits to using Discord.
Stable Diffusion offers several advantages that set it apart from Midjourney. For one, users can train custom models, which means they can create their own characters, styles, items, locations, fashion, poses, and more.
The application also offers control over the output through features like ControlNets and LoRAs. Moreover, Stable Diffusion has a large community of custom models that users can build from, and it offers features like animation, inpainting, outpainting, and custom upscale models. My favorite aspect of SD is its fine-grained control settings, which let users choose which VAE (Variational Autoencoder) to use and control the specific images and parameters to modify.
One significant advantage of Stable Diffusion is that it has limited censorship, which is not the case with Midjourney. Additionally, Stable Diffusion offers a range of tools to control the output, making it a valuable tool for creating high-quality photorealistic images. With some training, Stable Diffusion v2.1 can create stunning images of humans.
Here is an example prompt with Stable Diffusion:
Prompt: a portrait of a beautiful blonde woman, fine-art photography, soft portrait shot, 8k, mid length, ultrarealistic uhd faces, unsplash, kodak ultra max 800, 85mm, intricate, casual pose, centered symmetrical composition, stunning photos, masterpiece, grainy, centered composition : 2 | blender, cropped, lowres, poorly drawn face, out of frame, poorly drawn hands, blurry, bad art, blurred, text, watermark, disfigured, deformed, closed eyes : -2 / Stable Diffusion v2.1-768
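The “: 2” and “: -2” suffixes in the prompt above are prompt weights in the pipe-separated weighted-prompt format some Stable Diffusion front ends accept: positive weights pull the image toward those terms, negative weights push it away. As a rough sketch of that format only (the helper name is my invention, and the exact syntax varies by front end, so check the docs of the tool you use), such a string could be assembled like this:

```python
# Sketch: build a pipe-separated weighted prompt of the form
# "terms : weight | terms : weight". The helper name is hypothetical;
# the exact weighting syntax depends on the front end you use.
def weighted_prompt(segments):
    """segments: list of (text, weight) pairs."""
    return " | ".join(f"{text} : {weight}" for text, weight in segments)

prompt = weighted_prompt([
    ("a portrait of a beautiful blonde woman, fine-art photography", 2),
    ("blurry, text, watermark, deformed", -2),
])
```

The positive and negative halves live in one string, which is why the example prompt above reads as a single long line rather than two separate fields.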
Here is the same prompt with Midjourney (including the negative prompt; in Midjourney, write --no before all the items you do not want included):
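To make the difference in negative-prompt syntax concrete, here is a small sketch that turns a base prompt plus a list of unwanted items into a Midjourney-style prompt using the --no parameter (the --no flag is Midjourney’s; the helper function is my own illustration):

```python
# Sketch: append Midjourney's --no parameter, which takes a
# comma-separated list of things to exclude from the image.
def midjourney_prompt(base, negatives):
    if not negatives:
        return base
    return f"{base} --no {', '.join(negatives)}"

cmd = midjourney_prompt(
    "a portrait of a beautiful blonde woman, fine-art photography",
    ["blurry", "text", "watermark", "deformed"],
)
```

Unlike Stable Diffusion’s weighted-prompt string, there is no numeric weighting here: everything after --no is simply excluded.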
If a user wants to render an image in another style, they can use LoRAs. If they want the generator to draw inspiration from a specific concept, they can use textual inversions. Stable Diffusion also has a plugin that can be used in real time while working in Photoshop.
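In the AUTOMATIC1111 web UI, for example, a LoRA is activated by adding a tag of the form <lora:name:weight> to the prompt text, while a textual-inversion embedding is invoked simply by including its trigger word. A minimal sketch of building such a prompt string (the tag syntax is A1111’s; the helper function and example names are my assumptions):

```python
# Sketch: append an A1111-style LoRA tag to a prompt. In the AUTOMATIC1111
# web UI, <lora:filename:multiplier> activates a LoRA, while a
# textual-inversion embedding is triggered just by its name in the prompt.
def with_lora(prompt, lora_name, weight=0.8):
    return f"{prompt} <lora:{lora_name}:{weight}>"

styled = with_lora("portrait photo, film grain", "analog_style", 0.7)
```

Lowering the multiplier blends the LoRA’s style more subtly with the base model, which is part of the fine-grained control mentioned earlier.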
However, there are some downsides to using Stable Diffusion. For one, it has a steep learning curve and is not considered newbie-friendly, although UI tools like AUTOMATIC1111 (a1111) ease this over time. Additionally, while Stable Diffusion offers extensive control over the output, this level of control can make the application challenging to use; even for the people building these tools, knowing what every setting does remains something of a black box.
Both Stable Diffusion and Midjourney have unique strengths and weaknesses when it comes to creating photorealistic images. Midjourney excels at text2img, while Stable Diffusion’s range of tools and extensive control over the output make it better suited to creating precisely tailored, high-quality images. However, Stable Diffusion has a steeper learning curve and is not newbie-friendly. Ultimately, the choice between the two depends on the user’s needs and preferences.