Podcast: Learn How Anthropic Designs for AI — Joel Lewenstein, Head of Product Design, Anthropic
Explore AI's impact on design with Joel Lewenstein from Anthropic. Learn how design teams can demonstrate their value and adapt to AI.
In this episode, we sit down with Joel Lewenstein, Head of Product Design @ Anthropic, to discuss how AI is changing the way we design and how teams can prove their value to the organizations they support.
Guest: Joel Lewenstein, Head of Product Design @ Anthropic
Host: Adam Perlis, CEO at Academy
🔴 Listen on YouTube:
🎧 Listen on Spotify:
Resources mentioned in the episode:
Responsible Scaling Policy
Constitutional AI
Collective Constitutional AI
Joel Lewenstein is the Head of Product Design at Anthropic, where he leads design for Claude and the company's API products. Before Anthropic, Joel led design teams at Airtable, Quora, Hustle, and GoodGuide, and has spent his career looking for the overlap between great software craft and problems at the base of Maslow's hierarchy — health, safety, and economic thriving.
Key Takeaways
- Designing with real data and actual production environments is meaningfully better than designing with static Figma mockups — especially for AI products, where the magic is in the unpredictability of the model's response, not the shell around it.
- At Anthropic's stage of AI, long-term roadmaps are a trap. Joel runs the product design team with a 'football playbook' mental model: a wide set of ready features that get situationally deployed based on what new model capabilities unlock.
- The Anthropic product design team is just three designers for a company with outsized public impact. Scaling means hiring hard, but also anchoring ratios to engineering so every small cross-functional pod has a designer, a PM, and engineers.
- Chat is great for the free-form 'brain dump' opening move but weak at intermediate steps. Forty years of GUI craft still matters — structured inputs, parameterized prompts, and small composable building blocks (Joel namechecks Respell as a compelling direction) are the next design frontier.
- The 'raise the ceiling vs. lower the floor' framing Joel learned at Airtable is how he thinks about AI product bets: you can either make power users more capable or make new users able to do things they never could before — both need design.
- Design leaders prove their value in AI-era orgs by winning cross-functional credibility, not by showing up with prettier Figma files. Joel uses the 'happy hour test': what do your PM and eng peers actually say about your team after two drinks?
- 'Design is slow' is a stereotype with some merit. More senior designers work faster, and design leaders have a lot of leverage over cycle time by asking 'what's the next checkpoint, and what if we did it a week earlier?'
- Anthropic's Responsible Scaling Policy is a public pre-commitment to safeguards that scale with model capability. It's the backbone of how the company thinks about red-teaming, constitutional AI, and customer deployments like Robin AI's 10x legal-contract-review sidebar.
Frequently Asked Questions
› How does Anthropic design for AI products differently than a traditional SaaS product team?
Joel Lewenstein says the core shift is designing with real data and real model outputs instead of static mockups. 'We can make seven Figma screens where you press some buttons and the model responds, but we've just typed out what it will say — that doesn't capture the failure modes, and it doesn't capture the peeking-behind-the-curtain moment when the response is really good.' Anthropic's team ships working prototypes into real environments and does research on top of that, rather than doing heavy research up front against mockups.
› How big is Anthropic's product design team?
At the time of recording, just three product designers for the whole product organization, and actively hiring more. Joel inherited one designer when he joined — Kyle Turman, whom he describes as 'an extraordinary individual, one of these can-do-it-all from branding to writing code to product strategy' and credits with laying down most of the foundational patterns the team now builds on. Research and policy are much more mature at Anthropic than product, which Joel frames as both a joy and a struggle.
› How do you plan a design roadmap when AI capabilities change every few months?
You don't, at least not in the classic multi-quarter sense. Joel's approach at Anthropic is 'extreme humility about strategy and long-term planning' and a contingency-driven, if-then-oriented portfolio. 'It's almost like a football playbook where you're situationally deploying different things. If the next generation of models has this capability, we have an amazing set of features we're excited to do. If it turns out users aren't able to access that capability, we have a different set.' He contrasts this with Airtable, where 18-month bets were viable, and says the short-sprint reactivity is simply required in AI.
› Is chat the final interface for AI, or are there other mediums coming?
Joel thinks chat is wonderful for 'the free-form, I'm going to brain-dump all the things I want to happen' opening move, but much weaker for intermediate steps. 'We've spent 40 years making graphical user interfaces — I can't imagine none of that is relevant in the future.' He points to structured, parameterized prompts as an interesting direction, calling out a small company called Respell that lets you turn a big prompt into a black box with typed inputs, so the user just fills in the variable part. Expect a lot more GUI layered back onto chat.
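The idea of refactoring a prompt into a fixed template plus typed inputs can be sketched in a few lines. This is an illustrative mock-up of the pattern Joel describes, not Respell's actual product; the template text, field names, and `ContractReviewInputs` type are all invented for the example.

```python
from dataclasses import dataclass
from string import Template

@dataclass
class ContractReviewInputs:
    """The typed inputs the end user fills in; everything else is hidden."""
    contract_text: str
    requested_change: str

# The fixed "system instruction" portion -- the black box most users
# never see or edit.
PROMPT_TEMPLATE = Template(
    "You are a careful legal assistant. Given the contract below, "
    "identify the clauses affected by the requested change and "
    "propose redlined wording.\n\n"
    "Contract:\n$contract_text\n\n"
    "Requested change:\n$requested_change"
)

def build_prompt(inputs: ContractReviewInputs) -> str:
    """Render the full prompt from just the variable parts."""
    return PROMPT_TEMPLATE.substitute(
        contract_text=inputs.contract_text,
        requested_change=inputs.requested_change,
    )

prompt = build_prompt(ContractReviewInputs(
    contract_text="The Supplier shall deliver within 30 days...",
    requested_change="Extend the delivery window to 60 days.",
))
print(prompt)
```

The point of the pattern is that the user only ever touches the two typed fields; the long instruction block stays stable and testable behind the interface.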
› What is Anthropic's Responsible Scaling Policy?
It's a public pre-commitment to safeguards that scale with model capability — essentially a contract Anthropic holds itself to, and hopes others adopt, as frontier models get more powerful. Joel describes it as the backbone of how the company thinks about safety across both current-generation risks (accidental misuse, scams, jailbreaks) and forward-looking concerns around agentic AI and alignment. He points readers to the RSP itself as the best resource on specific techniques.
› What is Collective Constitutional AI?
A research project Anthropic ran where, instead of using a single constitution written by a small group, they worked with a public polling firm to collect values and principles from American citizens, then trained a Claude model against that collective constitution. Joel frames it as a democratization and customization opportunity: once you have a technique for aligning a model to any declared set of values, companies, institutions, and communities can start training AI against their own constitutions.
› How does Anthropic work with customers who are integrating Claude into their own products?
Two fronts. First, helping customers spot the opportunities — both user-facing (embedding AI into their existing UIs) and internal (worker tools, efficiency). Second, coaching them on prompt engineering and the actual strengths and limits of the models. Joel gives Robin AI as a concrete example: a legal-tech company that built a sidebar for contract review inside a text editor and, by leveraging Claude's strength with long context windows, claims roughly a 10x improvement in the time lawyers spend per contract.
› How do design leaders prove their value when AI is shrinking design teams?
Joel's answer is unusually direct: optimize relentlessly for cross-functional credibility, not internal polish. Hire low-ego, high-collaboration, high-productivity people. Apply the 'happy hour test' — after two drinks, what do your PM and eng peers actually say about your team? Take the feedback seriously even when it stings. And combat the 'design is slow' stereotype by asking tighter checkpoint questions ('what's the next milestone, can we do it a week earlier?'). 'Design leaders need to be business leaders too.'
› What's Joel's framework for scaling a design team inside a fast-moving AI company?
Two anchors. First, map design headcount as a ratio to engineering and PM, so that every tightly-knit cross-functional pod has a designer shipping software alongside the engineers. Joel doesn't try to predict the products 12 months out; he predicts where engineering is going and follows. Second, invest early in user research, because AI has so many unknowns that rapid learning is the bottleneck, not throughput.
› Where can I learn more about Joel Lewenstein?
Joel writes occasional blog posts on design and AI at joellewenstein.com. For Anthropic's research and policy work, anthropic.com publishes papers and announcements regularly — Joel recommends both the Responsible Scaling Policy and the Collective Constitutional AI write-up as starting points.
Full Transcript
Designing with real data and with actual production environments is just so much better than designing with static mockups. I've lightly felt that in previous experiences, but for anyone who's tried any of these chatbot experiences — it's the magic of the response that makes it so compelling, the personalization and the two-way dialogue. We can make seven Figma screens where you press some buttons and the model responds, but we've just typed out what it will say. That doesn't capture the failure modes, and it doesn't capture the peeking-behind-the-curtain moment where you're like, oh my God, this response is really good, that really understood me.
Adam: Hey everybody and welcome to How We Scaled It for Design Teams, the show that explores the journey through the arduous road of growing a successful design practice. I'm your host Adam Perlis, CEO and founder of Academy UX. Today I have the pleasure of speaking with Joel Lewenstein, the Head of Product Design at Anthropic. We're going to be covering how Anthropic designs for AI, how they work with their partners, and how to prove the value of design to the larger organization.
Joel: I've tried to balance two things in my career. One is I just love making software — I have this memory of the first time I wired up jQuery on a website to press a button and something happens, and there's a magic there that's never really left me. On the other side, I really want to be a small part of solving some of the world's biggest problems, and I like to go really low on Maslow's hierarchy — health, safety, security, economic thriving. A lot of the world's biggest problems aren't bottlenecked by UX, so I've been seeking the overlap. I started at GoodGuide, spent years at Quora, did political tech at Hustle, and then a number of years at Airtable before finding myself at Anthropic — which is a hybrid of a research company, a policy shop, and a product startup.
Adam: Tell us a bit about leading the design team at Airtable.
Joel: Many org structures over the years. Airtable is almost like an SDK — a spreadsheet-looking database, an automations platform, a bunch of UI components that developers put together to make apps. I was running design on the end-user side. I also started as an IC and built the first version of the automations platform, so over four years I saw the company from really different angles. I think that leads to a non-parochial way of seeing your work: this org structure is just what fits the moment, and you're not too wedded to your team's particular OKRs.
Adam: How did you land the job at Anthropic?
Joel: 2023 really showed me the promise of the technology. I care a lot about values-aligned companies, and I think a lot of employers couch their reason for existing in high-minded mission terms — not all missions are created equal. At Anthropic it's real: there's evidence in the policy papers, in the interviews executives give. I got a little window into the AI product side at the tail end of my Airtable time, and I've now been on both sides of a foundational-model API and a verticalized tool consuming one. I also love scaling companies — the hypergrowth period — and certainly Anthropic is in that mode.
Adam: How big is the team now?
Joel: Ruthless prioritization. We have three product designers right now and we're hiring more. The research and policy wings are really mature — they've been working for years. The product team is more nascent. There was one designer there when I started, Kyle Turman, who's just an extraordinary individual — one of these people who can do it all, from branding to writing code to product strategy. Kyle laid down a lot of the foundational way of thinking and the early patterns that we're now trying to use as fast as we can.
Adam: What's really changed about your approach to product design given these new AI paradigms?
Joel: I hope this doesn't sound too 'I told you so,' but I feel even more militantly about things I had an inkling on previously. First: extreme humility about strategy and long-term planning. This technology is changing so fast it's hard to predict. Airtable was 18-month bets. Anthropic cannot be that. Short sprints and reactivity are completely necessary. The chess-board-and-rice intuition pump for exponentials is well-worn but true: by the time you get to the 64th square, there are 18 quintillion grains of rice, more than have ever existed on Earth. Stuff like that really challenges your ability to know what your product needs three, six, twelve months from now.
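The doubling arithmetic behind the chessboard story checks out; a two-line sanity check (purely illustrative, not from the conversation):

```python
# One grain on the first square, doubling on each square after that.
grains_on_last_square = 2 ** 63   # the 64th square alone
total_grains = 2 ** 64 - 1        # sum across all 64 squares

print(f"{total_grains:.3e}")  # about 1.845e+19, i.e. ~18 quintillion
```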
What that's led us to do is be a lot more contingency and if-then oriented. We have an astonishing number of ideas across a really wide possibility space, rather than sequencing them out like we're definitely doing X, then Y, then Z will stack on Y. It's almost — I hate sports metaphors, but — like a football playbook: you're situationally deploying different things. If the next generation of models has this capability, we have an amazing set of features we're excited to do. If users really aren't able to access a particular model capability, we have a different set.
Adam: So research takes center stage — you put things out quickly and test with real people.
Joel: You've elegantly teed me up for another lightly held belief that's now strongly held: designing with real data and actual production environments is just so much better than designing with static mockups. We can make seven Figma screens where you press buttons and the model responds, but we've typed out what it will say. That doesn't capture the failure modes — this thing didn't do what I want, or it's kind of dumb. It also doesn't capture the peeking-behind-the-curtain moment where the response is so good it really understood you. We're trying to put stuff out in the world, do a lot of research, be very user-centric — but once users have their hands on an actual working thing, not a mockup.
Adam: Are there other mediums beyond chat you're thinking about?
Joel: At Airtable we had this phrase: are you raising the ceiling of your capabilities or lowering the floor of people who can access it? You're asking about raising the ceiling. I think chat, language-based interface, has a lot of limitations. It's wonderful for the free-form opening move — I'm going to brain-dump all the things I want to happen, and then we'll go from there. I don't think it's as good for intermediate steps. We've spent 40 years making graphical user interfaces — I can't imagine none of that is relevant in the future. One direction I find interesting is a little more structure and parameters. Prompts right now are text blobs, but they're really system instructions plus almost variables or input parameters to a function. A small company called Respell is doing this — you write a big prompt, refactor the inputs out, and package the rest into a black box most people don't have to worry about. That's a really compelling direction.
Adam: How do you help other companies integrate AI into their products?
Joel: Two fronts. One is understanding where the opportunities are — improving existing products, embedding text interfaces into customer-facing UIs, internal efficiency and worker-happiness tooling. Two is prompt engineering, which is still the dark art of this space. And really understanding the limits and strengths of the technology. One example I like a lot is Robin AI — they're helping contract review go faster. Lawyers spend a ton of time combing through contracts, taking the human desire for a change and finding the place in the contract to make it. Robin built a sidebar inside a text editor using our API. They've gotten about a 10x improvement in the time lawyers spend per contract. Claude is really good with long context windows — long documents — so there's a nice example of our technical strength married to a clear customer pain point.
Adam: Talk about the safety side — constitutional AI, responsible scaling.
Joel: The mission is to create safe and reliable AI. One thing we've done that I think is really fantastic is the Responsible Scaling Policy — essentially pre-committing to safeguards we'll hold ourselves to, and hope others hold themselves to, as the models get more powerful. We do a lot of work on the current generation to make sure they're safe and adhere to the values we've embedded, and we want to think ahead. We're a pretty active policy shop and work with a lot of other players to set standards.
On constitutional AI: it's a way of training an LLM where you have a set of values and principles — we based ours largely on UN human rights principles, parts of Apple's privacy policy, and a few other things. The obvious problem is that it's just one constitution written by comparatively few people. So in the fall we ran a project called Collective Constitutional AI where we worked with a public polling firm to collect values and inputs from American citizens, then did the same training run. If you think of constitutional AI as partly our view of what safety means but also partly a technique for aligning a model to any declared set of values, that opens up real democratization and customization opportunities.
Adam: How do you think about scaling the team?
Joel: I ask myself that almost every day. Different lenses lead to alarmingly different answers. I like to anchor design ratios to engineering and PM. I don't know what all the teams will be working on three, six, nine, twelve months from now, but I know we want small tightly-knit EPD trios working on a problem and shipping software. Putting things out in the world is the best way to learn. Building out user research is the other one — there are so many unknowns about this world that we need to invest in learning really rapidly.
Adam: Design teams are getting smaller. How do design leaders continue to prove their value?
Joel: At Airtable we tried to optimize for cross-functional credibility. In the hiring phase especially, we tried to bring on low-ego, high-collaboration, high-productivity people. That paid dividends — we had a pretty low designer-drama quotient, which is helpful for credibility. I think about what I call the happy-hour test: when you're out with your cross-functional peers, outside the weekly status meetings, and everyone's had one or two drinks and they're like 'come on Joel, level with me, what is going on with your team?' — then they give you the real talk of how you're perceived. Take that seriously.
There are tropes. The first one I hear all the time is 'design is slow.' There's some merit to that. Designers care a lot about craft and not as much about cycle time. More senior designers work faster, and design leaders have real leverage — you're seeing work periodically and can ask what's the next checkpoint, what if we did it a week earlier, two weeks earlier. You can make change there and combat one of the pernicious but not entirely untrue stereotypes about design teams.
The classic way to justify a team is OKR hits — the design team contributed to the project that moved the metric. But the thing I've thought less about and now think more about is internal ROI and throughput. From an executive's point of view, you're putting humans against a problem and outcomes come out. The speed with which product comes out the door — even product that doesn't always move metrics — means you're learning faster, you're more reactive, and that's independent of top-line KPI movement. Design leaders need to be business leaders too. A lot of what we do is bridge the gap between hitting business numbers and the soul of the craft.
Adam: Where can people find you?
Joel: I'm a longtime listener and first-time guest, so thanks for having me. For Anthropic-related stuff, anthropic.com — we publish a lot of our research and thinking on AI safety. For me personally, my website is joellewenstein.com. I write the occasional blog post on design and AI, would be honored if folks wanted to find me there.
