EP 45: A Guide for Complex Prompts in Midjourney
Crafting Compelling Imagery with AI
Posted on February 21, 2024 by Fusion Connect
In this enlightening episode of Tech UNMUTED, George and guest Sam Husbands investigate the fascinating world of Midjourney, an AI tool revolutionizing image creation. They dissect the intricate process of prompt engineering, demonstrating how to craft diverse and vibrant images that resonate with Fusion Connect Marketing's unique style. As they navigate through Midjourney's latest version, they reveal the nuances of image quality, the importance of prompt layering, and the impact of AI on marketing efficiency. This episode is a treasure trove for anyone eager to harness AI's potential in visual creativity and marketing strategy.
Transcript for this Episode:
INTRODUCTION VOICEOVER: This is Tech UNMUTED. The podcast of modern collaboration – where we tell the stories of how collaboration tools enable businesses to be more efficient and connected. With your hosts, George Schoenstein and Santi Cuellar. Welcome to Tech UNMUTED.
GEORGE: Welcome to the latest episode of Tech UNMUTED. Today I'm joined by Sam Husbands. We're going to take a look at Midjourney and some complex prompts and do a breakdown of the layers of things you need to think about when you're creating images within Midjourney. I do want to point out, if you're not a subscriber, please subscribe. Give us a like. Give us a comment. We're definitely looking for the feedback. With that, Sam, I'm going to turn it over to you and we can start to kick this off.
SAM: Yes. Thanks, George. As always, it's an honor to be here. We'll share the screen, and we'll get straight into the Discord channel, where we generate our Midjourney images. For those who haven't seen the previous episode, we used Midjourney to generate most of the art and imagery for Fusion Connect Marketing. If you go on the blog, it's always the master image. You can see here on screen that we have a set style that we like to go for. We like colors. We like white space. We like white screens. We like things to have a perspective to come from the center.
It's taken a while to get to a point where that prompt can provide images that have so much variety. As I scroll down here, these are all images that we will use. You can see that they're very different, but they're from the same family. They're recognizable from us. Now, to get to that point, we have to constantly rebuild prompts. Midjourney has just released its new version, 6.0. Again, we had to go and break down the prompts and we had to understand how to get to a point where we were producing images like this.
GEORGE: What happened, Sam, from version to version? Two questions. Did the prompts not work correctly is one question, and the second, what got better or worse?
SAM: Yes. Good question, George. Thanks. I think the main difference is the quality of image, especially when it comes to people. There aren't many oddities now. If we ask for a lifelike image, they are lifelike images. The layers that they add on just got a lot more detail. They've obviously got a lot more data to pull from. Whilst they weren't wildly different, sometimes they just gave a different perspective. This version to the last is all about light and angles and how we produce an image that looks like it's from the family. It's small details every single time.
You'll see as we go through this that the quality of image and how it blends is the best it's ever been. You can see I'm highlighting here the prompt that we get to this image. It's around 600 characters. It's not the biggest prompt. We can break it down into six parts. I pre-built those six parts. We'll start with a first live demo of a really basic version of the prompt. We're asking it to produce a vertically split image of people in front of two technology offices. It should take five to fifteen seconds, so we might have to kill some time in between generations today, George. We have three or four to go through.
GEORGE: As you go through this, are you stacking one after the other after the other? Is that how you build?
SAM: Yes. The first piece is the most important. It's what the subject of the image is going to be. Vertically split image of people in front of two technology offices. The chances are, as this is generating, we will get something that looks like the base of our images. As before, depending on the time of day, it does depend on the speed of the generation. We can already see, even though it's not completed, that whilst these images are striking and one of them does have a central split and the right kind of perspective, they are so far removed from the type of artwork that we need.
We now need to layer in new elements of the prompt. The second piece for me is I would look at an image like this and think, "Okay, but we don't have any white light that's centering it. It's not a head and shoulders shot that we look for." If we go back to the next prompt, I've added in two lines. The two lines are talking about the white space, the light coming from a central location. Instead of having distance group shots of people, I'd like a head and shoulder shot of technology experts. In the background, I want to add some of our styling to it. I've asked for two abstract offices and then some computers and data centers to be layered in an abstract way.
This should get a little closer to where we are aiming. These are live tests, by the way, I haven't done these in preparation. I just wanted to follow the logic and see what happened.
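The stacking Sam describes, where each pass keeps the earlier wording and adds one more layer, can be sketched in a few lines of Python. This is a minimal illustration, not the actual Fusion Connect prompt: only the first layer is quoted from the episode, the other layer texts are paraphrased from Sam's description, and the names LAYERS and build_prompt are hypothetical.

```python
# Hypothetical layer wording. The first entry is quoted in the episode;
# the rest are paraphrased and are NOT the real production prompt.
LAYERS = [
    "vertically split image of people in front of two technology offices",
    "white space with light coming from a central point",
    "head and shoulders shot of technology experts",
    "two abstract offices in the background with computers and data centers layered in an abstract way",
]


def build_prompt(layers, depth):
    """Join the first `depth` layers into one comma-separated prompt string."""
    if not 1 <= depth <= len(layers):
        raise ValueError("depth must be between 1 and the number of layers")
    return ", ".join(layer.strip() for layer in layers[:depth])


if __name__ == "__main__":
    # Print the prompt after each pass so the effect of every new layer is visible.
    for depth in range(1, len(LAYERS) + 1):
        print(f"Pass {depth}: {build_prompt(LAYERS, depth)}")
```

Keeping the layers in a list makes it easy to regenerate after each addition and compare results, which is roughly what the live demo does by hand.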
GEORGE: What is the rendering time? What times of day work and don't work because you're in the UK? Are earlier mornings better because the U.S. isn't online?
SAM: Yes, exactly that. Two o'clock, which is roughly now, 2:00 PM my time is the worst because East Coast America, it's when you guys are just getting going and it's when we're coming back from lunch generally. This is the slowest time. Instantly, we see that there are some big changes. A few more percent to go before we can see the full set of images. There are now a lot more similarities to our end piece, but to me, they look more like technology, corporate stock footage with some technology and perspectives laid on in the background.
I like the angles, I like where the light's coming from, but the colors, the perspective, the lack of abstract through the people still doesn't match our style. Now we need to go back and think about the layers that would do that. Color is going to be a key part and some words around-- A bit more abstract use of light and colors, but we have a color palette. Here you can see that I've just used the RGB codes across our three main colors to see what that can do. Luckily, that's shown me that I've spelt colors incorrectly. Quick change. By the way, with spelling, Midjourney is amazing. It doesn't really make a difference how you type it in. It will understand if you've made a bad spelling, but still pick it up.
GEORGE: Does the complexity slow it down or not?
SAM: No, no it really doesn't. I find it a surprise, even when we're doing huge prompts with sometimes up to 6,000, 7,000 characters, it doesn't make a difference. It can be lost. You can tell that, especially in Midjourney, if you're writing really long prompts, some of the stuff towards the end of the prompt doesn't get found. It will always favor things at the beginning, which is why the first phrase is the most important.
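Sam's point that material near the end of a long prompt tends to get lost suggests a simple habit: order the layers by importance and check the total length before submitting. The sketch below is one way to do that under stated assumptions. The 600-character default echoes the prompt length Sam mentions earlier and is only a working budget, not a Midjourney limit, and fit_to_budget is a hypothetical helper name.

```python
def fit_to_budget(layers, max_chars=600):
    """Keep layers in priority order while the joined prompt fits max_chars.

    Layers are assumed to be sorted most important first (subject first).
    Once one layer does not fit, it and every later layer are dropped, so
    the priority order is never reshuffled.
    Returns (prompt, dropped_layers).
    """
    kept, dropped = [], []
    for layer in layers:
        candidate = ", ".join(kept + [layer])
        if not dropped and len(candidate) <= max_chars:
            kept.append(layer)
        else:
            dropped.append(layer)
    return ", ".join(kept), dropped


if __name__ == "__main__":
    # Placeholder layer texts purely for illustration.
    demo_layers = ["subject phrase", "composition notes", "palette notes", "style keywords"]
    prompt, dropped = fit_to_budget(demo_layers, max_chars=40)
    print(prompt)
    print("Dropped:", dropped)
```

Anything that lands in the dropped list is a candidate for shortening or for moving earlier if it matters more than what precedes it.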
GEORGE: I've found, I do some images in ChatGPT, and the engine that's in there, it definitely gets lost, or if you try to, in sequence, modify the original image, very often it just gets stuck in a loop on a thing that you can't remove. Then you have to start over in particular words. If it puts words in the image, it's really hard to get it to stop putting words in the image, and a lot of times it's got a misspelling.
SAM: Yes. That's more advanced than Midjourney. Midjourney really struggles with words. I try and avoid it completely and do graphical overlays afterwards.
Just adding in the colors has made a huge difference. We could probably get away with using some of these images as ours and people would still understand that they're coming from us, but for me, they're a bit stuffy. They're not abstract enough. I don't think they have enough creativity or artwork, but they are starting to use the colors. The perspectives, the light source from the middle does match our styling.
Given that, I'm just going to add one more layer to this prompt. The last layer was about vibrancy, colors, splitting the white space and the colors, and giving the RGB codes. The next two layers are keywords. The keywords are a super important part of this, and this is why you don't want the prompt to be too long so they get lost. The keywords dictate the style and the final layers that they put across. The only additions to the last prompt, a colorful, artistic, minimalistic, creative nebula, which was an odd one that we stumbled across, but it really adds some of the abstract, almost sci-fi nature to some of our images, rainbow and impactful.
Now, just those words alone, plus a slightly different prompt on the ratio, so you'll see that all the images that have come so far are square, we use landscape images. It'll just give it a slightly different ratio. That won't make a difference to the outputs, just how it's laid out. I've layered in Version 6 as well. This is now using its fastest, quickest, most accurate engine, which will add a bit of time to the building of the image generation. Even as that's loading, you can see that these are much more akin to either the background that you see behind me now or the images that we saw earlier and everything that are on our blog.
Actually, we built the base, we added in the perspective, then we added in some of the technology layers and the colors, but it was these keywords that really crafted the style and the impact of the image. If we click on these, I can instantly see that we would definitely use three out of those four. Once we add our Fusion logo, once we add some color hues, and the Fusion swoosh, we're good to go. There wouldn't need to be much change on those.
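To recap the order Sam walks through (subject, composition, colors, keywords, then ratio and version), here is a rough Python sketch of assembling the final prompt. Several details are assumptions rather than facts from the episode: the RGB triples are placeholders because the real brand codes are not given, the keyword list is one reading of the words Sam reads out, and 16:9 is a guess at "landscape". The --ar and --v parameters are standard Midjourney options, but the class and field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Hypothetical container for the layers discussed in the episode."""
    subject: str
    composition: list = field(default_factory=list)
    palette_rgb: list = field(default_factory=list)  # list of (r, g, b) tuples
    keywords: list = field(default_factory=list)
    aspect_ratio: str = "16:9"  # assumed; the episode only says "landscape"
    version: str = "6"

    def render(self):
        """Return the prompt text followed by the Midjourney parameters."""
        parts = [self.subject, *self.composition]
        if self.palette_rgb:
            colors = " ".join(f"rgb({r},{g},{b})" for r, g, b in self.palette_rgb)
            parts.append(f"color palette {colors}")
        parts.extend(self.keywords)
        text = ", ".join(parts)
        return f"{text} --ar {self.aspect_ratio} --v {self.version}"


if __name__ == "__main__":
    spec = PromptSpec(
        subject="vertically split image of people in front of two technology offices",
        composition=["white space with light coming from a central point"],
        palette_rgb=[(0, 0, 0), (128, 128, 128), (255, 255, 255)],  # placeholders
        keywords=["colorful", "artistic", "minimalistic", "creative",
                  "nebula", "rainbow", "impactful"],
    )
    print(spec.render())
```

Placing the keywords after the subject and composition keeps the most important phrase at the front, in line with Sam's earlier remark that Midjourney favors the beginning of the prompt.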
GEORGE: Amazing. This is like eight, ten minutes. In fact, if you just went to the final prompt, you're talking about one, two minutes.
SAM: Yes, exactly.
GEORGE: We've been doing this for over a year now, so we've had these conversations for quite a long time. If you were to go out and try to create these images from scratch, you would get one set of variations after about a week. We can get multiple variations in a day and then make a decision whether we're going to modify a path forward from an imaging standpoint. It's a dramatic difference from where we were as a marketing team going back eighteen months ago.
SAM: I do think that one of the reasons why we choose the Midjourney prompts to talk about on the pod is because it's easy to understand how great you can make something. It's exactly the same process for GPT or any of the copy and text-based prompts. If you are really taking the prompt engineering to the next level, even from a text perspective, you can create really, really unique and powerful work that just needs some human intervention to change and perfect. It's just a lot easier to see and visualize from a Midjourney perspective.
GEORGE: Awesome. Sam's going to drop his screen. I do want to do a quick closeout. One, if you're interested in what we ran through, please give us feedback. We will look to do more of those and we want to direct this towards what people want to see. The second thing is what we've continued to talk about on the podcast, which is you get a tremendous productivity hack out of these tools, and you're either using them or you're not. If you're not, you're likely going to fall behind both individually and as an organization. If you are, as you can see from the work Sam's done, you really rapidly enhance the level of productivity you have individually, and then again, it bubbles back up to the team.
Sam, I don't know if you have any final comments or advice for folks before we close this out.
SAM: I'd like to mirror what you just said. It's made such a difference to how we work as a marketing function, and not just on output, but the spend involved in that process. We could not have produced those images. It would have been beyond any in-house designer or agency's ability to do that quickly. We've jumped three or four stages and cut out some of the budgets required to get to a really eye-catching point. Using them here has changed the way that we view everything. All AI tools can help every part of what we do. I honestly believe that we've 10x'ed productivity, and the results are starting to show already. If anyone has any questions, don't hesitate to reach out, and I'll get back to you if I can.
GEORGE: Awesome. Thanks, Sam. With that, we're going to close out today's episode. Again, subscribe and like, give us some feedback. We're happy to engage with folks with questions about this and any of the other podcast topics. With that, we're going to sign off for today. Thanks, Sam.
SAM: Thanks, George. See you guys.
CLOSING VOICEOVER: Visit www.fusionconnect.com/techunmuted for show notes and more episodes. Thanks for listening.
Episode Credits:
Produced by: Fusion Connect
Tech UNMUTED, the podcast of modern collaboration, where we tell the stories of how collaboration tools enable businesses to be more efficient and connected. Humans have collaborated since the beginning of time – we’re wired to work together to solve complex problems, brainstorm novel solutions and build a connected community. On Tech UNMUTED, we’ll cover the latest industry trends and dive into real-world examples of how technology is inspiring businesses and communities to be more efficient and connected. Tune in to learn how today's table-stakes technologies are fostering a collaborative culture, serving as the anchor for exceptional customer service.
Get show notes, transcripts, and other details at www.fusionconnect.com/techUNMUTED. Tech UNMUTED is a production of Fusion Connect, LLC.