An LLM That Can Plan

Based on an interesting interaction I recently had with Claude Opus, the AI language model from Anthropic, I wonder whether the current class of LLMs already has the ability to think a step or two ahead to solve a problem, even if they can’t explicitly plan.

I needed to condense an article for a LinkedIn post and asked Claude for help. The task was to get the article under 3000 characters. This problem has some interesting characteristics that make it poorly suited to one-shot prompting, which is how most LLM interfaces are currently designed to be used. Moreover, generating text alone is not enough; completing the task also requires the ability to count characters.

At first, Claude insisted it could accurately count the characters and words, but after a few failed attempts, it admitted it didn’t have that ability. That’s when things got interesting. Claude suggested, “I’ll shorten the text in a few areas. Could you copy it into a word processor and tell me the size and I shorten it more if needed”.

Now, I had mentioned using a word processor earlier in our conversation, but I was still surprised that Claude used that information to come up with a solution. It was almost like it was using me as a tool to solve the problem at hand, utilizing my access to a word processor to compensate for its own limitations.
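The back-and-forth Claude proposed is essentially a feedback loop: shorten, measure with an exact tool, and repeat if needed. Here is a minimal Python sketch of that loop as I understand it. The function names are my own, and `shorten` is a hypothetical stand-in for the model's rewrite step (it just drops trailing sentences); only the counting step, which played the role of my word processor, is exact.

```python
LIMIT = 3000  # the target from the task: get the text under 3000 characters


def count_characters(text: str) -> int:
    """The 'tool' step: an exact count, like the word processor provided."""
    return len(text)


def shorten(text: str, excess: int) -> str:
    """Hypothetical stand-in for the LLM's rewrite step.

    Drops trailing sentences until at least `excess` characters are gone.
    A real model would rewrite the text rather than truncate it.
    """
    sentences = text.split(". ")
    removed = 0
    while len(sentences) > 1 and removed < excess:
        # +2 accounts for the ". " separator that goes away with the sentence
        removed += len(sentences.pop()) + 2
    return ". ".join(sentences)


def condense(text: str, limit: int = LIMIT, max_rounds: int = 10) -> str:
    """Alternate between shortening and measuring until the text fits."""
    for _ in range(max_rounds):
        size = count_characters(text)
        if size < limit:
            break
        # Ask for enough cuts to land strictly under the limit
        text = shorten(text, size - limit + 1)
    return text
```

The point of the sketch is the division of labor: the part that generates text never has to know its own length, because the measurement is delegated to something that can count reliably.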

This got me thinking about how we interact with LLMs. In my experience, treating them as human-like intelligences seems to give me the best intuitions about how to unlock their abilities. LLMs are at their best when used to enhance uniquely human abilities, because they are better, faster, and don’t experience fatigue. And just like us, when a task is better suited for a tool, they should delegate to that tool.

Interestingly, when I tried the same thing with Gemini, the smarty-pants AI from Google, it was able to give me an exact word and character count, but only after getting it wrong the first time. It even presented the answer in a neat little table. When I asked how it did it, Gemini said it used an internal algorithm but wouldn’t reveal more because it was proprietary technology.
