Nate Herk | AI Automation • I Built a Photoshop AI Agent in n8n with no code • 17:25 • 8 segments
n8n‑powered Photoshop AI agent combines Nano Banana image generation with Google Drive file tools, automates renaming, and is controlled via Telegram for quick ad‑creative workflows.
The system uses a unified JSON message format, a minimal “personal assistant” prompt, GPT‑5.1 with a Sonnet fallback, and chat‑ID memory to handle both text and photo inputs seamlessly.
Custom workflows download images, upload them to ImgBB for public URLs, call the Nano Banana API via fal.ai, poll for completion, then store results back in Drive; the workflow templates are free to download.
fal.ai offers a single credential to access multiple image and video models at roughly $0.04 per image (about 25 images per $1), simplifying cost‑effective API integration in n8n.
The channel provides free community resources, plus paid courses and a monetization community, while suggested next steps include a dedicated prompt‑generation agent, detailed logging to Google Sheets, image‑to‑video tools, and reusing these custom workflows across multiple agents for production use.
"Please combine the Nate and granola pictures to make a photorealistic image where the man is holding their granola while hiking on a mountain."
00:50
"" So, if you remember over here, we have a raw file of granola"
00:50
"And remember, these images are only going to get better and better when you come in here and you customize the prompts and you customize all this other stuff because what's actually going on is this agent is creating its own prompt when it sends over variables to this workflow"
00:50
"You are a personal assistant agent. Your job is to use the tools you have access to to help the user with their request."
05:02
"What would you like me to name that photo in your Google Drive?"
05:02
Today I'm super excited to share with you guys this Photoshop AI agent that I built in n8n with no code. It uses Google's new Nano Banana image generation model which is absolutely insane. It's really changing the game for ad creatives and UGC content. So I don't want to waste any time. We're going to dive into it and I'm going to show you guys exactly how this system works. And as always, I'm going to give all the resources that you need to set this up completely free. So stick around so I can show you guys how to do that. All right. So here is the master Photoshop agent. And you can see it's not too complex. Up front, what we have is the ability to take text or image as an input. And then the agent has five different tools to choose from. It's got two image gen tools where it can combine images or edit existing images. And then it has three file handling tools in Google Drive where it can change the name of a file, search through raw files, or search through AI generated images that it's created for us. And it can manage all of this right in your pocket, right through Telegram. So let's hop into a demo.
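For orientation, here is a rough sketch (not from the video) of the agent's five tools written as a plain dispatch table; every function name is a placeholder for the corresponding n8n tool or sub-workflow, and the parameters mirror the inputs described later in the walkthrough.

```python
# Hedged sketch: the master agent's five tools as a simple dispatch table.
# All names are placeholders for the n8n tools/sub-workflows described above.

def combine_images(prompt, image_id_1, image_id_2, image_title): ...
def edit_image(prompt, image_id, image_title): ...
def change_name(file_id, new_name): ...
def search_raw_files(query): ...
def search_ai_images(query): ...

TOOLS = {
    "combine_images": combine_images,      # merge two source images with Nano Banana
    "edit_image": edit_image,              # edit one existing image with Nano Banana
    "change_name": change_name,            # rename a Google Drive file by ID
    "search_raw_files": search_raw_files,  # look through the raw "media" folder
    "search_ai_images": search_ai_images,  # look through the AI-generated images folder
}
```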
Okay, so you can see that our workflow is listening to us. I'm going to send a picture on Telegram. Okay, so I just shot that off and now you can see it's uploading that to Google Drive. The agent's going to see that and then it's going to ask us what we want to name that file in our Google Drive. So there you go. It just responded with "What would you like me to name that photo?" And I'm shooting back a message that says call it Nate. And what it's going to do now is use this change name tool to change the name in our Google Drive to Nate. As you can see, if I switch over real quick to media, it just got called Nate. And originally it would have been called today's date. So now it's asking us what we want to do next. And I'm going to shoot off this picture of a bag of granola. And once again, it's going to ask us what do we want to name this file. And just to show you guys real quick, in the media folder that it's dropping into, like I said, it defaults to naming everything today's date. So now I'm going to show you guys that it actually does change it. All right, so shooting off this message that says to call that picture granola. And once again, we'll see it go here and change that picture to granola. And then we'll have it combine the images together. All right, that just finished up. We'll check the media folder, and we can see that picture got called granola. Okay, so I've got Telegram open on my desktop now just to make this easier. And it's asking us what to do next. I'm shooting off this message that says, "Please combine the Nate and granola pictures to make a photorealistic image where the man is holding their granola while hiking on a mountain." So, if you remember over here, we have a raw file of granola. We have a smiling selfie of me. And it's going to put these together. Right here, you can see it's using its combine images tool, and it's going to use Nano Banana to make it look like I'm holding the granola on a mountain. So, I'll check in with you guys when that's finished up. Oh, and one other thing. While this is running, normally the agent would go search through the files to make sure it has the right IDs to combine the images. But because it's using its memory, it already knows that. So that's why it didn't hit the other file handling tools. But you can see we got a response over here. It's combined the images. Now, let me click into this Google Drive link and we'll take a look. So there we go. We've got the KIND granola with all the spelling correct. We've got me and my face holding the granola while hiking on a mountain. And just to show you guys what that would look like if the files weren't just immediately uploaded and we wanted to pull different ones. So let's see. We have a Hormozi picture right here that's insanely low quality. We've got a JBL speaker picture right here. So what I'm going to do now is say please combine the Hormozi image with the JBL speaker image to make it look like the man is listening to the speaker on a boat. And now you can see it's searching through the raw files. It has those correct IDs. And now that it has those file IDs, it can pass that to the combine images tool in order to actually send those both to Nano Banana and get back an image. There you go. You can see it's calling the combine images tool. So, I'll check in with you guys when we get that back. All right, so it just told us it was done. I'm going to go to the AI image generation folder, and you can see we do have a new one. Let's click in to see how it turned out. Not too bad at all.
We have the JBL speaker. We've got Hormozi in his acquisition.com beater, and it looks pretty good. And remember, these images are only going to get better and better when you come in here and you customize the prompts and you customize all this other stuff, because what's actually going on is this agent is creating its own prompt when it sends over variables to this workflow. So, if we had a dedicated agent just focused on creating optimized AI image generation prompts, the results would probably be a lot better. All right. And just to show off the functionality of the edit image tool, I'm going to shoot off this message that says to create a photorealistic advertisement of the granola image and make it look like it's being held in front of the Eiffel Tower. So, this one is going to have to find the file ID of the granola image, whether that's in the memory or if it has to search the raw files. Looks like it just searched the raw files as you can see. And now it's going to call the edit image tool in order to change the appearance of that image. Okay, so looks like that just finished up. I'm going to go to the AI image generation folder. You can see we have granola ad Eiffel. If I click on it, you can see that it is our bag of granola, which looks exactly like that, and it is being held in front of the Eiffel Tower. So, if that's not a good ad creative, then I don't know what is. And all of the words are pretty much spot-on except for right up here at the top. As you can see, this is supposed to say ingredients you can see and pronounce. But two things: the first thing is that in the raw files, this is a pretty low quality image, so I can't even read what that says right there. And the second thing is this image generation model is only going to get better and better, and of course the prompting that is in this current workflow is very minimal. But anyways,
now that you guys have seen how this works, let's start to dive into what's going on with all of these nodes. So the first thing I'm going to start off with is the text or image input. What happens here is we're using a switch basically to detect if a photo exists or if text exists. And if a photo exists, we go up this way where we download it, upload it to Google Drive, and then we shoot that off to the agent. But if text exists, then we just shoot the text straight away. But what we have to do is make sure that the variables are the same so that it's always coming through in the same field, json.message.text. So we basically just standardize the input so the agent looks at it no matter what comes through. Naturally, the next thing that we'll go over is the system prompt, which is very, very minimal. I said, "You are a personal assistant agent. Your job is to use the tools you have access to to help the user with their request." I listed out all the tools and I gave a very minimal description of each of them, because in the tools themselves there's a brief description, but I still like to put them here. And then for the instructions, I only ended up giving it one instruction, which was if the user submits a photo, ask what they want to name it by saying, "What would you like me to name that photo in your Google Drive?" Then once they respond, change the name using your change name tool. The system prompt is very minimal. We have only five tools to choose from, and we're using GPT-5.1 and it's doing really well. And from here, what I would do is, as I'm testing more and as you guys download the template and start to play around with different things, just add instructions to your system prompt when you realize it's failing to do certain things. And then real quick, just to hit on what's going on over here before we dive into the tools: we're using GPT-5.1, we're using Sonnet 3.5 as the fallback model, and then we're just using simple memory with the session ID being the Telegram chat ID.
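Outside n8n, the same input-standardization idea looks roughly like the sketch below; it is not the video's workflow, just an illustration that assumes a Telegram update payload and a hypothetical download_and_upload_to_drive helper. The point is that photo and text updates both end up in one message field before reaching the agent, and conversation memory is keyed by the Telegram chat ID.

```python
# Hedged sketch, not the actual n8n nodes: normalize Telegram input so the agent
# always receives a single text field, and key conversation memory by chat ID.
from collections import defaultdict

chat_memory = defaultdict(list)  # simple memory keyed by Telegram chat ID

def download_and_upload_to_drive(telegram_file_id: str) -> str:
    # Placeholder: in the real workflow this downloads the Telegram photo and
    # uploads it to the Drive "media" folder (named today's date by default),
    # returning the new Drive file ID.
    return "drive-file-id-placeholder"

def normalize_update(update: dict) -> dict:
    """Return {'chat_id': ..., 'message': ...} for both photo and text updates."""
    msg = update["message"]
    chat_id = msg["chat"]["id"]
    if "photo" in msg:
        file_id = download_and_upload_to_drive(msg["photo"][-1]["file_id"])
        text = f"The user sent a photo. It was uploaded to Drive as file {file_id}."
    else:
        text = msg["text"]
    return {"chat_id": chat_id, "message": text}

def handle(update: dict) -> None:
    item = normalize_update(update)
    chat_memory[item["chat_id"]].append({"role": "user", "content": item["message"]})
    # ...the agent call (GPT-5.1 with a Sonnet fallback in the video) would go here...
```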
All right, so first I'm going to go over the file handling tools because they're super simple and we'll just get that out of the way. The first one is change name. So if I click into this tool, you can see what we're doing is we're updating a file, and we're updating it by the ID, and the ID is automatically found by the agent. So if the agent doesn't have the ID of a file to change the name, it will have to go use its other tools to find that ID and then it will pass the ID to this tool. And the other thing that gets passed to this tool is the new name. So when we say, hey, call that photo Nate, it knows to fill this in right here with the word Nate. And then these other two tools are doing the exact same thing. The only difference is the folder that they're searching in. So if I say to search for raw files, what we're doing is we're searching within a folder called media and we're searching for files. But if the agent realizes that we want to search through an image that it already created for us, it would use this one, which is search AI images, and it's doing the exact same thing except we change the folder that it's looking through. So like I said, those three are super simple, but those are necessary in order to change things, look at things, and search through the database.
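For readers who want to see the equivalent calls outside n8n, here is a rough sketch of the three file handling tools using the Google Drive v3 API via google-api-python-client; the folder IDs and credential file are placeholders, not values from the video.

```python
# Hedged sketch of the three Drive tools using the Google Drive v3 API.
# Folder IDs and the credentials file are placeholders.
from google.oauth2 import service_account
from googleapiclient.discovery import build

creds = service_account.Credentials.from_service_account_file(
    "service-account.json", scopes=["https://www.googleapis.com/auth/drive"]
)
drive = build("drive", "v3", credentials=creds)

MEDIA_FOLDER_ID = "<raw-files-folder-id>"        # the "media" folder
AI_IMAGES_FOLDER_ID = "<ai-images-folder-id>"    # the AI-generated images folder

def change_name(file_id: str, new_name: str) -> None:
    """Rename a Drive file by ID (what the agent's change name tool does)."""
    drive.files().update(fileId=file_id, body={"name": new_name}).execute()

def search_folder(folder_id: str) -> list[dict]:
    """List files in one folder; the two search tools differ only in folder ID."""
    resp = drive.files().list(
        q=f"'{folder_id}' in parents and trashed = false",
        fields="files(id, name)",
    ).execute()
    return resp.get("files", [])

# search raw files  -> search_folder(MEDIA_FOLDER_ID)
# search AI images  -> search_folder(AI_IMAGES_FOLDER_ID)
```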
Okay, so now let's get into the fun stuff, which are these two custom workflows that I built. And what that means is if I open up this workflow, you can see that it is a custom n8n workflow that I built myself. We're able to make our main agent call on it because you can see that if we add a tool right here, we have an option to call an n8n workflow as a tool. So that's how we can create these really modular systems, because now if I ever want to create another agent that can combine images, I can just plug it into this workflow right here. So anyways, there's a lot of stuff going on here. So let's just take it node by node. The first thing that's going on is we have the input where I'm defining to the main Photoshop agent what to send over. So you can see we're sending over an image prompt, image one, which is an image ID, image two, which is the image ID of the second image, and then the image title for the new image that's created. And that may seem a little confusing, but when we go back to the main Photoshop agent, it looks at this and it says, "Okay, when I need to call this tool, which is combining images, I need to send over a prompt, image one, image two, and image title." So, this is how it knows to use its brain to give the second workflow all of this information. And to make this easier, I pulled in the live data from that run we did in the demo, just so you guys can see that what comes in is the image prompt about Hormozi and the speaker, the ID of the first image, the ID of the second one, and the new name of the AI generated image. So from there, what we do is we edit fields just to create an array of the two image IDs. We have to create an array so that we can split these out into two separate items, because we need to download two files and we need to get a public URL of both of those files. So in the Google Drive node, we're downloading by ID. So you can see we're getting both of these pictures here. There's the Hormozi one and there is the speaker one. And then the way that the Nano Banana API works, it has to take a public URL of an image in order to actually change it. So what we do is we use a free service called ImgBB, and I'm basically just uploading these images as binary and then it gives us back a public URL that represents the image. So here you can see this is what we get back, and if I open that up, it is the picture of Hormozi. So that's just a cool little workaround if you need to get a binary image into a public URL. There's other ways to do it, but this is just a free, easy way that I do it. So once we have both those URLs back, we aggregate them so that we can make one API request rather than two. So we're making one request to Nano Banana through a service called fal.ai. I'm not going to dive super deep into what's going on here. If you want to watch an API video I made, I'll tag that right up here. But what's going on is we're using our fal credentials. So I got my API key from fal.ai. And then this JSON body request is really simple. We're passing over a prompt and we're passing over two image URLs. And the only reason why this prompt looks a little confusing is because I put these two replace functions in there that basically make sure if the prompt has new lines or double quotation marks, it gets rid of that, because that would break the JSON body request. So now it has everything it needs. It has both the images and it has the prompt. And then it basically says, "Okay, I received your request. We're working on that right now."
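As a non-n8n illustration of those same steps, here is a rough Python sketch of the ImgBB upload and the Nano Banana request. The ImgBB endpoint is its documented upload API; the fal.ai model path and request fields are assumptions based on what the video describes (a prompt plus two public image URLs), so check fal's API docs for the exact schema. The API keys are placeholders.

```python
# Hedged sketch of the combine-images workflow outside n8n. The fal.ai model
# path and body fields below are assumptions, not confirmed details.
import base64
import requests

IMGBB_KEY = "<imgbb-api-key>"
FAL_KEY = "<fal-api-key>"

def to_public_url(image_bytes: bytes) -> str:
    """Upload binary image data to ImgBB and return a public URL."""
    resp = requests.post(
        "https://api.imgbb.com/1/upload",
        params={"key": IMGBB_KEY},
        data={"image": base64.b64encode(image_bytes).decode()},
    )
    resp.raise_for_status()
    return resp.json()["data"]["url"]

def sanitize(prompt: str) -> str:
    # Same idea as the replace() functions in the HTTP node: strip newlines and
    # double quotes so the prompt can't break the JSON body.
    return prompt.replace("\n", " ").replace('"', "'")

def submit_combine_request(prompt: str, url_1: str, url_2: str) -> dict:
    """Queue a Nano Banana job on fal.ai with two source image URLs."""
    resp = requests.post(
        "https://queue.fal.run/fal-ai/nano-banana/edit",  # assumed model path
        headers={"Authorization": f"Key {FAL_KEY}"},
        json={"prompt": sanitize(prompt), "image_urls": [url_1, url_2]},
    )
    resp.raise_for_status()
    return resp.json()  # contains the request/status info to poll later
```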
And because it's working on it, we wait for about 10 seconds and then we make a request back to fal to see if it's done. And if it's not done, it will come here and it will wait for 30 seconds. And honestly, this should probably change to something like 4 seconds, because images are really quick. And then it will just continuously check until it's done. So that's just another cool guardrail you can have in your workflows. And then once it gets that result back, it basically gives us a URL. And so what I do is I make a simple GET request to that URL to actually get the image itself. And now we have the image as binary. We can upload it to Google Drive. And then we can set our response to the main agent, which is basically saying, "Hey, the image was created and this is what we named it and here's the link to that image in your Google Drive." And then the main agent gets that and responds to us, the user. So hopefully that wasn't too confusing and at a high level you can understand what's going on. Remember, this will be a free workflow you guys can download in my free Skool community as well. So the best way to really understand it is to download all the assets, run it, and then go node by node and understand what's going on. And real quick, the way you would actually get all this is you join my free Skool community. The link for that is down in the description. When you get there, this is what it will look like. And all you need to do is go to YouTube resources and find the post associated with this video. If you're having trouble finding it, you can also use the search bar to search for the title of the video. But then once you get in there, you'll see the video and then you'll also see all of these JSON files that you can download and then import directly into your n8n. And then when you get all the stuff set up, there will be a big sticky note right here called a setup guide and it will tell you exactly what you need to do to get up and running. Okay, cool. So that was the combine images node where we used Nano Banana. Now what's cool is the edit image node is very, very similar. So if I open this up and I actually just click on view sub-execution, we can see exactly what happened when it was called. And it's very similar to the previous one. Literally the only difference is that instead of passing in two image IDs, we're only passing in one. So what happens once again is the input is image title, image prompt, and image ID. So here's the run from the granola Eiffel Tower ad. As you can see, we're going to go to Google Drive once again and pass in that image ID so we can download the file. Then what comes next is we are once again sending that binary data to ImgBB to get a public URL. So right here you can see this is the public URL we get for our granola picture. And now that we have that, we can make another request to fal.ai to use Nano Banana, where we send over the prompt and then just one image URL instead of two. So it's very, very similar. We're using the same guardrails in here to get rid of new lines and quotation marks and all that kind of stuff. And then it basically just edits that image. We're doing the exact same thing here with a polling check. And then when we get the result, we download it as binary rather than keeping it as a URL, put it in our Google Drive, and then we send a response back to the main agent.
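Continuing the earlier sketch, the polling step could look roughly like this; the status_url / response_url field names and the response shape are assumptions about fal's queue API rather than details shown in the video, and the upload step is left as a placeholder.

```python
# Hedged continuation of the sketch above: poll fal.ai until the job finishes,
# then download the result image as binary.
import time
import requests

def wait_for_result(submission: dict, fal_key: str) -> bytes:
    status_url = submission["status_url"]      # assumed field name
    response_url = submission["response_url"]  # assumed field name
    headers = {"Authorization": f"Key {fal_key}"}

    time.sleep(10)  # initial wait, as in the workflow
    while True:
        status = requests.get(status_url, headers=headers).json()
        if status.get("status") == "COMPLETED":  # assumed status value
            break
        time.sleep(4)  # the video suggests shortening the 30-second retry wait

    result = requests.get(response_url, headers=headers).json()
    image_url = result["images"][0]["url"]  # assumed response shape
    return requests.get(image_url).content  # the finished image as binary

# Usage idea (upload_to_drive is a hypothetical helper):
# upload_to_drive(wait_for_result(submission, FAL_KEY), name="hormozi-jbl-boat")
```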
Real quick, just wanted to touch on some of the pricing for this system. The first thing that I wanted to cover was fal.ai. What fal is is basically a place where you have a ton of different image and video generation models, and you can just use one credential and, you know, get all of them through there. So, it's really cool. You can see right here my recently used models are Nano Banana Edit, and we've got Veo 3 Fast and Veo 3. You can go to explore and you can see all of these other models that they have available. But anyways, the reason why I wanted to show you guys this is because, first of all, it's only about 4 cents per image, which is not too bad at all. But what you can do is you can play around with image URLs and prompts here. So you can really refine the way you want to have your prompting before you get into n8n and start messing around there. And then the way that we can call it through n8n is by going to their API documentation. I would change this to HTTP curl. And now you can look through how to get your API key, how to set that up, how to submit a request, how to upload files, all this kind of stuff. And I also did see about a week ago that you could get free image generation through OpenRouter for Gemini Nano Banana, because if you go to the Gemini web app, it's free to try out there. And so for a while it was free here on OpenRouter, but it looks like they might have just taken that off. So unfortunate, but once again, you can get 25 images for only a dollar. So it's not
too bad. Okay, so that's basically all that's going on in this workflow. Now, let's talk about a few ways that, when you get this template, you could customize this and make it a little more production ready. So, something that I alluded to earlier was, in these two tools where you're doing image generation, to have a dedicated AI that would create the AI image generation prompts. So what you could do is literally right here just add an AI agent node, and this one is prompted in a way where it's specializing in creating AI image generation prompts, and then you pass that prompt all the way down to the create image node rather than relying on this main agent, who has tons of other jobs to do, to make the image prompt. It would be better to have a second agent in this workflow that would specialize in that. There's of course other things you can do, like having a logger. So in a previous agent I've made that was kind of similar, this one was an ultimate media team, what I did here was I had it returning all of its steps into a Google Sheet, whether it errored or it was successful. So you could see exactly what input was processed, what tools were called, how many tokens that was taking you. So you could use this video, which I will tag right up here if you haven't seen it. You can download those resources and do the exact same thing in this agent, where you can have a Google Sheet that's logging everything; see the sketch below for one way that could look. And then of course another really cool next step would be being able to take these AI generated images and pass that to another workflow that can create videos out of them, and you could use something like Veo 3 Fast to create those videos for you. That's another thing that I did in this media agent, where I have an image-to-video tool and a create video tool. So definitely check out that video as well if you want to literally use some of these workflows and just hook them up to your Photoshop agent, because that's the beauty of these custom workflows as a tool: you can have them hooked up to as many different agents as you want.
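As one concrete example of the logging idea, here is a rough sketch using the gspread library to append one row per agent run to a Google Sheet; the spreadsheet name, column layout, and sample values are made up for illustration and are not from the video.

```python
# Hedged sketch of the logging idea using gspread; sheet name and columns are
# hypothetical, not the layout from the ultimate media team video.
from datetime import datetime, timezone
import gspread

gc = gspread.service_account(filename="service-account.json")
log_sheet = gc.open("Photoshop Agent Logs").sheet1  # hypothetical spreadsheet

def log_run(user_input: str, tools_called: list[str], tokens: int, status: str) -> None:
    """Append one row per agent run: timestamp, input, tools, token count, outcome."""
    log_sheet.append_row([
        datetime.now(timezone.utc).isoformat(),
        user_input,
        ", ".join(tools_called),
        tokens,
        status,  # e.g. "success" or "error"
    ])

# Example (made-up values):
# log_run("Combine the Nate and granola pictures...",
#         ["search_raw_files", "combine_images"], 2140, "success")
```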
So once again, all the resources for this video and every single other YouTube video you've seen on my channel, you can get completely free in my free Skool community. You just have to come here, look through the YouTube resources, and as you can see, every single video has all of the resources right here. And if you're looking to take your skills a little further with n8n and you're also looking to understand how you can monetize your AI automation knowledge, then definitely check out my plus community. The link for that will also be down in the description. We've got thousands of members in here who are building and selling n8n services every single day, and they're always sharing their learnings and their challenges. It's a really cool space to be. We've also got two full courses at the moment with a third one coming about monetizing your AI skills. But if you're a complete beginner, you can start with the foundations, and then you get in here and you master n8n, and then you learn how to start selling or consulting with this knowledge. So, I'd love to see you guys in these communities. But that's going to do it for the video. If you enjoyed or you learned something new, please give it a like. It definitely helps me out a ton. And as always, I appreciate you guys making it to the end of the video. I'll see you on the next one. Thanks everyone.