Teaching Claude How to Make Memes with MCP

June 17, 2025

As MCP started taking off several months ago, I decided to see if I could use it to help Claude create memes. Here’s a look at my meme MCP server and what I’ve learned about the protocol.

What is MCP?

At a high level, MCP (Model Context Protocol) is a standardized protocol that lets AI models use "tools" and external data. LLMs (large language models) on their own only know about their training data and prompts. Tools enhance the capabilities of an LLM by allowing it to read from and write to external data sources, such as a text file on your local machine or today's weather queried from an API. MCP provides a standardized, modular way for LLMs to invoke tools, replacing bespoke integrations with reusable components.

The MCP docs provide much more detail on the protocol, and a very helpful tutorial for getting your feet wet.

MCP Clients

In order to use MCP, you need an MCP client. You can either set up and run your own client locally using the tutorial in the MCP quickstart guide, or you can use a managed client built into an existing product.

Throughout this blog post I will be using Claude Desktop.

Currently the standard for configuring an MCP client is using a JSON file like the one below:

json

{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-filesystem",
        "/Users/username/Desktop",
        "/Users/username/Downloads"
      ]
    }
  }
}

This file determines which MCP servers, and therefore which tools, are available to your AI model. The name and location of this file depend on the MCP client you are using, so refer to the relevant docs for configuration details.
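To make the shape of this file concrete, here is a minimal Python sketch that parses a config like the one above and prints the command each server entry would be launched with. The `describe_servers` helper is hypothetical and not part of any MCP client; real clients do this work internally.

```python
import json
import shlex

# Example config, copied from the JSON shown above.
CONFIG_TEXT = """
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-filesystem",
        "/Users/username/Desktop",
        "/Users/username/Downloads"
      ]
    }
  }
}
"""


def describe_servers(config_text: str) -> dict[str, str]:
    """Map each configured server name to the shell command that starts it."""
    config = json.loads(config_text)
    servers = config.get("mcpServers", {})
    return {
        name: shlex.join([spec["command"], *spec.get("args", [])])
        for name, spec in servers.items()
    }


if __name__ == "__main__":
    for name, command in describe_servers(CONFIG_TEXT).items():
        print(f"{name}: {command}")
```

Each key under `mcpServers` is just a label; the `command` and `args` pair is what the client runs to start that server as a local process.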

MCP Servers

What do we put in this JSON file? In the config above, we are using one of the example MCP servers provided by Anthropic. The @modelcontextprotocol/server-filesystem server allows your MCP client to read and write files on your local machine. This means if you ask Claude "Write me a story and save it to my Desktop" it can use the tool and actually save the file to your machine.

While this is a simple example, you can imagine how MCP can become a powerful natural language layer on top of many tools and workflows.

Making Memes

Now that we know how MCP works, let's build our own MCP server to teach Claude how to make memes.

Getting Started

To start, I found an example MCP server on GitHub from someone who had a similar idea: https://github.com/haltakov/meme-mcp. You can follow the README to set up and run the server locally with your own MCP client. The server itself is just a simple Node.js server with a single tool defined:

typescript

// Arguments used to register the tool: name, description, and input schema
"generateMeme",
"Generate a meme image from Imgflip using the numeric template id and text",
{
  templateNumericId: z.string(),
  text0: z.string(),
  text1: z.string().optional(),
},

When Claude uses the tool, it chooses and supplies values for these fields. The server passes these values to the ImgFlip API to actually create the meme, then it fetches and returns the image in the response. Let's try it out.
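As a rough illustration of that hand-off, the sketch below builds the kind of form-encoded POST request that Imgflip's `caption_image` endpoint accepts. It is written in Python rather than the server's Node.js, it does not send anything, and the `build_caption_request` helper and the demo credentials are hypothetical. The real server's implementation may differ.

```python
from urllib.parse import urlencode
from urllib.request import Request

CAPTION_URL = "https://api.imgflip.com/caption_image"


def build_caption_request(template_id, text0, text1=None, *, username, password):
    """Build (but do not send) the POST request that asks Imgflip to caption a template."""
    fields = {
        "template_id": template_id,
        "username": username,
        "password": password,
        "text0": text0,
    }
    if text1 is not None:  # text1 is optional, mirroring the tool schema
        fields["text1"] = text1
    body = urlencode(fields).encode("utf-8")
    return Request(CAPTION_URL, data=body, method="POST")


if __name__ == "__main__":
    # Placeholder credentials; a real call needs an Imgflip account.
    req = build_caption_request(
        "55311130", "prod is down", "this is fine",
        username="demo_user", password="demo_pass",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the resulting request to `urllib.request.urlopen` would perform the call. The tool then has to download the generated image and return it to the client.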

Where do the IDs come from?

Interestingly, Claude just knows the ImgFlip IDs of many common meme templates. This is likely because they are included in training data and popular memes are referenced widely on the internet, but still this came as a surprise to me initially.

I did find, though, that Claude sometimes chooses a template that doesn't quite fit, and seems to have a bias toward a handful of popular templates when generating memes. I wanted to choose a relevant template more reliably based on the user prompt, rather than always relying on the latent mapping of concept to ImgFlip meme template that's baked into the LLM.

Adding RAG and Semantic Search

To enhance Claude's meme making capabilities, I want to provide another tool to run a semantic search based on the given prompt to find relevant meme templates. This new tool will implement the following interface:

javascript

  "findMemeTemplates",
  "Find meme templates that match the semantic meaning of a query",
  {
    query: z.string().describe("The concept, situation, or idea you want to express with a meme"),
    limit: z.number().optional().describe("Number of templates to return (default: 5)"),
  }

In order to do this, I need to set up a vector database, populate it with embeddings of ImgFlip meme templates, and create a simple backend server to run a semantic search against the vector DB based on the user prompt. Claude can then select from the template IDs in the response to create a meme using our existing MCP tool.

Backfilling the Meme Template DB

I decided to use Supabase with Postgres and pgvector for my vector database. I created a table with the following schema:

sql

CREATE TABLE meme_templates (
  id SERIAL PRIMARY KEY,
  template_id VARCHAR NOT NULL,  -- Imgflip template ID
  name VARCHAR NOT NULL,
  url VARCHAR NOT NULL,
  width INTEGER NOT NULL,
  height INTEGER NOT NULL,
  box_count INTEGER NOT NULL,
  captions INTEGER,
  description TEXT,
  embedding VECTOR(1536)  -- For OpenAI embeddings
);

CREATE INDEX ON meme_templates USING ivfflat (embedding vector_cosine_ops);

Next, I set up a backfill script to fetch meme templates from ImgFlip, generate a detailed description, compute an embedding, and then write a record to the database. To start off, I only included the top 100 meme templates from the ImgFlip API, but this could be extended to an arbitrary number. After fetching each template from ImgFlip, I used the prompt below to generate a context-rich description of it.

python

    prompt = f"""
    Create a rich semantic description for the meme "{meme['name']}" that explains:
    1. What situations this meme is typically used for
    2. The emotional tone it conveys
    3. Common text patterns that work well with it
    4. What concepts or scenarios it best represents
    
    Meme details:
    - Name: {meme['name']}
    - Image: {meme['url']}
    - Text box count: {meme['box_count']}
    
    Provide only the description without any introductions or explanations.
    """

For example, here is the description produced by this prompt for the "This is Fine" meme template:

The "This Is Fine" meme is typically used to depict situations of chaos, crisis, or overwhelming stress where the individual involved is in denial or choosing to ignore the severity of the circumstances.
The emotional tone conveyed by this meme is one of ironic detachment or resignation. It humorously captures the feeling of being overwhelmed or facing a disaster while maintaining a facade of composure or pretending that everything is under control.
Common text patterns that work well with this meme include phrases such as "This is fine," "I'm okay," or "Everything is alright," juxtaposed against a background of chaos or destruction, highlighting the stark contrast between the character's words and the reality of the situation.
The "This Is Fine" meme best represents the concept of coping mechanisms, denial, and the tendency to downplay or ignore serious problems in the face of adversity. It is often used to satirize situations where individuals are in denial about the seriousness of a crisis or are trying to maintain a sense of normalcy in chaotic circumstances.

Finally, the backfill script vectorizes this description for each template using the OpenAI embeddings API, and writes a record to the database.
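The overall shape of the backfill loop can be sketched as follows. This is not the actual script: `build_row` and `backfill` are hypothetical names, the `describe`, `embed`, and `write` callables stand in for the LLM call, the OpenAI embeddings call, and the database insert, and the meme dictionary keys are assumed to match the columns of the table above.

```python
from typing import Callable


def build_row(meme: dict,
              describe: Callable[[dict], str],
              embed: Callable[[str], list]) -> dict:
    """Turn one template into a record shaped like the meme_templates table."""
    description = describe(meme)        # LLM-generated description
    embedding = embed(description)      # vector computed from that description
    return {
        "template_id": meme["id"],
        "name": meme["name"],
        "url": meme["url"],
        "width": meme["width"],
        "height": meme["height"],
        "box_count": meme["box_count"],
        "captions": meme.get("captions"),  # nullable column
        "description": description,
        "embedding": embedding,
    }


def backfill(memes, describe, embed, write, limit=100):
    """Process up to `limit` templates and return how many records were written."""
    count = 0
    for meme in memes[:limit]:
        write(build_row(meme, describe, embed))
        count += 1
    return count


if __name__ == "__main__":
    # Placeholder template and stub callables, for illustration only.
    sample = [{"id": "55311130", "name": "This Is Fine",
               "url": "https://example.com/this-is-fine.jpg",
               "width": 100, "height": 100, "box_count": 2}]
    rows = []
    written = backfill(
        sample,
        describe=lambda m: f"Placeholder description of {m['name']}",
        embed=lambda text: [0.0] * 1536,
        write=rows.append,
    )
    print(written, rows[0]["name"], len(rows[0]["embedding"]))
```

Passing the three steps in as callables keeps the loop easy to test without network access or API keys.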

Vector Search from MCP Server

Now that we've backfilled the vector DB, we can implement the findMemeTemplates tool. I added the following function to the database to query the meme_templates table:

sql

CREATE OR REPLACE FUNCTION match_memes(query_embedding VECTOR(1536), match_threshold FLOAT, match_count INT)
RETURNS TABLE (
    id INT,
    template_id VARCHAR,
    name VARCHAR,
    url VARCHAR,
    width INT,
    height INT,
    box_count INT,
    captions INT,
    description TEXT,
    similarity FLOAT
)
LANGUAGE plpgsql
AS $$
BEGIN
    RETURN QUERY
    SELECT
        mt.id,
        mt.template_id,
        mt.name,
        mt.url,
        mt.width,
        mt.height,
        mt.box_count,
        mt.captions,
        mt.description,
        1 - (mt.embedding <=> query_embedding) as similarity
    FROM
        meme_templates mt
    WHERE
        1 - (mt.embedding <=> query_embedding) > match_threshold
    ORDER BY
        mt.embedding <=> query_embedding
    LIMIT match_count;
END;
$$;

When Claude calls the findMemeTemplates tool, the server will invoke the match_memes function using an embedding computed from the user's prompt. We will rely on Claude and our tool definition to map the user prompt to the right intention and semantic query.
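For readers less familiar with pgvector, the sketch below reproduces the logic of match_memes in plain Python: `<=>` is cosine distance, similarity is one minus that distance, rows below the threshold are dropped, and the rest are returned closest first. The in-memory `rows` list and the toy two-dimensional vectors are assumptions for illustration. The real query runs inside Postgres against 1536-dimensional embeddings, and with the ivfflat index its results may be approximate.

```python
import math


def cosine_distance(a, b):
    """Equivalent of pgvector's <=> operator (assumes non-zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def match_memes(rows, query_embedding, match_threshold, match_count):
    """Return up to match_count rows whose similarity exceeds match_threshold."""
    scored = []
    for row in rows:
        distance = cosine_distance(row["embedding"], query_embedding)
        similarity = 1.0 - distance
        if similarity > match_threshold:             # WHERE clause
            scored.append((distance, {**row, "similarity": similarity}))
    scored.sort(key=lambda pair: pair[0])            # ORDER BY distance
    return [row for _, row in scored[:match_count]]  # LIMIT


if __name__ == "__main__":
    rows = [
        {"name": "a", "embedding": [1.0, 0.0]},
        {"name": "b", "embedding": [0.0, 1.0]},
        {"name": "c", "embedding": [1.0, 1.0]},
    ]
    for match in match_memes(rows, [1.0, 0.0], 0.5, 5):
        print(match["name"], round(match["similarity"], 3))
```

In the toy data, "b" is orthogonal to the query and falls below the threshold, so only "a" and "c" come back, in that order.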

Putting it all Together

Now we can tell Claude about this new tool by updating our MCP JSON file with the new tool configuration and restarting Claude Desktop. Let's try it with the same prompt from earlier.

javascript

{
  `limit`: 5,
  `query`: `worried about losing job technology automation fear`
}

Found 5 matching meme templates:

"Anime Girl Hiding from Terminator" (ID: 224514655)
1. This meme is typically used to humorously depict situations where an individual, represented by the anime girl, is trying to hide or evade a powerful, unstoppable force symbolized by the Terminator. It is commonly applied to scenarios where someone is attempting to avoid a looming threat or an unavoidable outcome.... Text boxes: 2
"AJ Styles & Undertaker" (ID: 234202281)
1. This meme featuring AJ Styles and Undertaker is typically used to humorously depict a situation where someone confidently challenges a formidable opponent or situation, only to be met with unexpected consequences or overwhelming odds.... Text boxes: 2
"Marked Safe From" (ID: 161865971)
1. The "Marked Safe From" meme is typically used to humorously declare oneself as safe from a specific event, trend, or occurrence that others may find undesirable, absurd, or dramatic. It is commonly applied to situations that are exaggerated or unrealistic, serving as a satirical commentary on modern concerns or societal issues.... Text boxes: 2
"This Is Fine" (ID: 55311130)
1. The "This Is Fine" meme is typically used to depict situations of chaos, crisis, or overwhelming stress where the individual involved is in denial or choosing to ignore the severity of the circumstances.... Text boxes: 2
"0 days without (Lenny, Simpsons)" (ID: 427308417)
1. This meme is typically used to humorously highlight a lack of success or continuous failures in a specific area. It is often applied in situations where someone or something is prone to making mistakes or experiencing setbacks regularly.... Text boxes: 2

Not bad! Let's try another one.

javascript

{
  `limit`: 5,
  `query`: `comparison old way versus new way simple versus complicated`
}

Found 5 matching meme templates:

"One Does Not Simply" (ID: 61579)
1. The "One Does Not Simply" meme is typically used to convey the idea that a certain task or action is not as easy or straightforward as it may seem on the surface. It is often employed to express a sense of difficulty, complexity, or impossibility in achieving a particular goal or completing a given task.... Text boxes: 2
"Two Paths" (ID: 309668311)
1. The "Two Paths" meme is typically used to depict a decision-making scenario where two contrasting options or choices are presented. It is commonly employed in situations where individuals must weigh the pros and cons of different paths before making a decision.... Text boxes: 3
"Two Buttons" (ID: 87743020)
1. The "Two Buttons" meme features an image of two red buttons, each labeled with a contrasting option or outcome. This meme is typically used to humorously depict decision-making dilemmas or moral quandaries, where the choices presented are extreme or absurd. It resonates with situations where individuals are faced with difficult choices or conflicting desires.... Text boxes: 3
"Drake Hotline Bling" (ID: 181913649)
1. The "Drake Hotline Bling" meme is typically used to showcase situations involving two contrasting options or choices, with Drake (on the right) approving of one option and disapproving of the other. It is commonly employed to humorously illustrate decision-making dilemmas or choices that are easily distinguishable in their desirability.... Text boxes: 2
"Gus Fring we are not the same" (ID: 342785297)
1. The meme "Gus Fring we are not the same" is typically used to highlight stark differences between two entities or individuals in various situations, such as contrasting ideologies, behaviors, or outcomes. It serves as a visual representation of asserting superiority or distinctiveness.... Text boxes: 3

Overall, the candidate templates and the resulting memes feel much more relevant to the initial prompts than they did with our original approach, where the naive MCP server relied on Claude's latent mapping of concepts to ImgFlip template IDs.

Of course, the quality of memes is something that's subjective and difficult to measure. More extensive testing, evaluations, and prompt adjustments are necessary to assess and tune the quality of the results.

Beyond Memes: When is MCP Useful?

Using MCP to generate memes is of course a toy use case, but it illustrates the power of this protocol and the idea of tool use. There is a lot of excitement around MCP, and some have dismissed the tech as a fad. While I agree there is a lot of hype, I do think there is a wide set of use cases where standardized tool use is very powerful.

Natural Language Glue and Skipping the UI

Most MCP servers expose tools that are thin wrappers around existing product or service APIs. These APIs are deterministic, but discovering the right parameters is often painful and time consuming for users. Complex platforms ship dozens of endpoints with sprawling option sets, and their most useful features are hidden behind cluttered UIs or outdated docs. This is one use case where MCP shines: providing a natural language interface over existing APIs.

With MCP, the user can ask for what they want in natural language, then iterate with an LLM to refine the result. The user doesn't need to know how the underlying tool or APIs work; that responsibility is passed on to the LLM and the MCP server.

This natural language interface also acts as an alternative to a traditional visual user interface. Products with a clunky UI/UX or steep learning curve can lower the barrier to entry for new users with a tight MCP integration. Creative pro suites (graphics, 3-D, video, audio), industrial CAD tools, enterprise dashboards (CRM, work-item trackers), and DevOps tools are prime candidates for an MCP-powered natural-language layer. Some of the early examples I have seen, like this Blender MCP server demo, are an incredible glimpse at what this tech unlocks.

Agentic Workflows

MCP moves us beyond the human-prompted chatbot and into the realm of agents: LLMs equipped with tools that can act autonomously. These agents can string together tools from multiple MCP servers to accomplish complex tasks independently, and they’re quickly becoming capable collaborators that work alongside us. It’s an exciting future to watch and to help build.

For some more detailed resources on agents, see: