Engaging vs. Engagement
Co-written with Claude Opus 4.6
When we describe something as “engaging” we generally mean that it feels compelling in a valuable way, that it provides a clear connection to our goals and values, or that it helps us achieve a flow state of working. Creating an engaging product is a valuable and generally laudable goal, and underlies a lot of business decisions in technology companies. But what makes something engaging to a particular person in a particular moment is idiosyncratic and contextual in ways that aren’t possible to measure, or even possible to describe predictively for populations of hundreds of millions of different users.
I believe new tools can meaningfully reshape and improve the way we describe and evaluate tool engagement—but using these new tools requires a radical reframing of how to do evaluation, and who has agency as part of that process.
Engage
While there’s a poetic touch to the contrast between engagement as normally understood and engagement metrics as discussed in business, I don’t think there is serious deception happening in the usage. Engineers understand that metrics like retention and click-through rate are noisy proxies for actual user value. Even executives and press aren’t likely confused about the difference—they simply know that these metrics are the only proxies available. This does result in myopic, Goodhart-style over-optimization.
Statistics
While it is easy to lie with statistics, it is even easier to lie without them.
Attributed to Frederick Mosteller in Murray, Charles (2005). “How to Accuse the Other Guy of Lying with Statistics”
One alternative to engagement metrics is the HiPPO effect where the Highest Paid Person’s Opinion reigns supreme. While I’m a big fan of a lot of auteur-driven media like Community or Babylon 5, there are obvious biases that make singular leaders single sources of failure in many contexts, and a project being auteur-driven doesn’t make it good.
We use metrics and statistics to force ourselves to be grounded in reality. No metric is perfect, and statistics allow us to account for large amounts of variance. Blood pressure illustrates this—healthy blood pressure varies by person, by context, and by measurement conditions. It still exists and functions as a better tool than a doctor’s first-glance judgment. And using a variety of metrics grounded in healthy UXR practices and expert judgment does help improve products.
But user engagement as a target is unusually idiosyncratic. A single user on a shopping webpage looking for cotton swabs probably prefers a 0-click experience, where the swabs appear on the doorstep just as they would have run out. Whereas the same user on the same site searching for summer fashions may want to browse and click for hours without making a single purchase. And because click actions are so minor, the potential to game them is much broader. A Facebook post might get me to click on it by featuring an attractive person or an argument so stupid I have to read it—but neither of these cases is actually delivering value to me. Contrast this with blood pressure, where every human needs some pressure to make sure blood moves from their toes up to their head, but not so much that their veins rupture. Because of the idiosyncrasy of engagement, there isn’t even a ground truth in principle for many engagement metrics.
Web 2.0
In many important contexts, software needs explicit user stories before it’s possible to evaluate value for the user. But user inputs have been central to web design for over 20 years, and if the fate of Web 2.0 has taught us anything it’s that people don’t know what they want and can’t communicate it, and no one would give it to them anyway. So how can we reduce the gap between features implemented in software and the process of genuine user engagement that makes that software valuable?
Ubiquitous Attention Infrastructure
If you had an army of human servants available every moment of your day, it wouldn’t be hard for them to tell what was valuable for you, or to cater to your true values. We don’t have that. What we do have is LLM Chatbots waiting for us to bring up any subject at all we could ever think of. This lack of structure can make it difficult to engage, or easy to go in “wrong” directions—especially directions from which LLMs can’t offer real value. But the tools do have the capacity to scale our ability to pay attention. This is incredibly valuable! And in a world with a functioning community for soloware, that is, where everyone can connect to peers to design their own tools and interfaces, it’s possible to break down some of the flawed core assumptions of central developer aggregation of “engagement metrics.”
If everyone has the option to track their own metrics, using attentive assistance to accurately recognize their own values, that enables a much stronger version of a retention metric: a product’s success and value is measured by how much different users take up and use parts of the software for their own idiosyncratically defined notions of value.
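To make this concrete, here is a minimal sketch of what such a personally defined metric could look like. Everything here is hypothetical illustration—the `Session` fields, the idea that an attentive assistant helps the user judge whether a session served their goals, and the scoring function are all assumptions, not a description of any existing system. The point is only that time spent counts toward value when the user themselves says it did:

```python
from dataclasses import dataclass

# Hypothetical session log entry: what the user did, and how they
# later judged it against their own stated goals (in practice, an
# attentive assistant might help produce these judgments).
@dataclass
class Session:
    feature: str          # which part of the software was used
    minutes: float        # time spent
    matched_goal: bool    # did this serve a goal the user actually holds?

def personal_value_score(sessions: list[Session]) -> dict[str, float]:
    """Per-feature share of time the user judged goal-aligned.

    Unlike a raw engagement metric, time only counts when the user
    themselves marked the session as serving their values.
    """
    totals: dict[str, float] = {}
    aligned: dict[str, float] = {}
    for s in sessions:
        totals[s.feature] = totals.get(s.feature, 0.0) + s.minutes
        if s.matched_goal:
            aligned[s.feature] = aligned.get(s.feature, 0.0) + s.minutes
    return {f: aligned.get(f, 0.0) / t for f, t in totals.items()}

log = [
    Session("search", 5, True),
    Session("feed", 40, False),   # long session, but not valuable
    Session("feed", 10, True),
]
print(personal_value_score(log))  # "feed" scores low despite high raw time
```

Under a raw time-spent metric, “feed” looks like the winner; under the user’s own judgment, it scores 0.2 while the brief search session scores 1.0. That inversion is exactly the gap between engagement and engaging.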
This community market of partial adoption would have much stronger trust mechanisms than current engagement metrics. Value mismatch is far more visible, because it’s evaluated by the primary source. Verification is partially mechanical, because ubiquitous attention bridges the personal attention gap. Markets allow many independent parties to participate and have input. Specifications are not precise, and human judgment still plays an important role. This isn’t a be-all, end-all solution to product design. But it’s a radical step forward, and it’s a valuable step toward building a future where individual people are centrally valued participants.