User Experience Optimization: From Heuristic Intervention to Unified Value Modeling
In the evolution of Search, Ads, and Recommendation systems, User Experience (UX) is an unavoidable core challenge. In this article, we quantify UX as LT (Life Time), specifically referring to user retention (e.g., the number of days a user opens the app within a 7-day window).
Unlike pure content recommendation, which seeks to maximize total watch time, optimization in commercial or marketing-oriented sectors (Ads, E-commerce, Live Streaming) is about maximizing business value (Cost, GMV, etc.) while staying above a “UX Redline.” More accurately, it is about maximizing the efficiency of exchanging “Unit LT” for “Business Metrics.”
We typically use Holdout Experiments (Reverse Experiments) to measure how a business strategy affects LT. The factors influencing these results are multifaceted:
- Explicit Factors: These are the most direct and well-known, involving the position and density of business items (e.g., start_pos, gap, load).
- Implicit Factors: Supply quality and ranking accuracy. The quality and diversity of ad creatives or live-streaming content determine the appeal of the distribution queue. If supply is insufficient or ranking is inaccurate, users are less likely to be attracted to the platform.
- Opportunity Cost (Backfill Logic): A frequently overlooked point. A holdout experiment compares the “Business Queue” with a “Backfill Queue” (usually the organic recommendation queue). The final impact on LT is essentially:
\[\Delta LT = LT_{Business} - LT_{Backfill}\]
The negative impact on LT is minimized only when the business content is as attractive as—or more attractive than—the organic content it replaces.
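The comparison above can be sketched in a few lines. This is a hypothetical illustration (the per-user LT values and the helper name are made up): each arm is a list of per-user open-day counts in a 7-day window, and \(\Delta LT\) is simply the difference of the arm means.

```python
# Hypothetical sketch: estimating the LT impact of a business strategy
# from a holdout (reverse) experiment. All numbers are illustrative.

def delta_lt(business_group, holdout_group):
    """Mean 7-day open-days in the business arm minus the backfill arm.

    Each group is a list of per-user LT values (days the user opened
    the app within a 7-day window, 0..7).
    """
    lt_business = sum(business_group) / len(business_group)
    lt_backfill = sum(holdout_group) / len(holdout_group)
    return lt_business - lt_backfill

print(round(delta_lt([5, 6, 4, 7, 5], [6, 6, 5, 7, 5]), 2))  # -0.4
```

A negative value means the business queue retained users worse than the backfill (organic) queue it replaced.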
While improving supply and ranking accuracy are the fundamental drivers of LT, they take time to yield results. In day-to-day engineering, adjusting load, start_pos, and gap remains the most immediate lever. Furthermore, we must establish strict defensive mechanisms to prevent short-term gains from masking long-term, cumulative UX damage.
This article explores the evolution of UX optimization through three stages: Short-term Defense (Heuristic Protection), Mid-term Tuning (Experience Modeling), and Long-term Alignment (Unified Value Modeling).
Short-term Strategy (Heuristic Defense & Protection)
In the early stages, the most effective tools are rule-based strategies keyed to traffic attributes or user segments. The core logic is “Rapid Loss Prevention” and “Layered Defense.”
By manually or semi-manually “patching” the system (e.g., raising thresholds for specific segments), we protect high-risk or sensitive users. This includes protection for new users, returning users, or low-activity users. In search, it might involve setting different thresholds for “active intent” vs. “passive browsing” traffic.
Implementation Logic
We segment traffic by request features (e.g., channel entry) or user profiles (e.g., historical report/bounce rates). Mechanistically, we set independent, higher eCPM thresholds or increase start_pos/gap losses in the ranking stages to ensure business content is reduced or removed for these segments.
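The patching mechanism described above can be sketched as a per-segment threshold table. This is a minimal illustration, assuming hypothetical segment names and multipliers; real systems would key on richer request features and tune the values experimentally.

```python
# Illustrative sketch of heuristic "patching": per-segment eCPM thresholds.
# Segment names and multiplier values are hypothetical.

BASE_ECPM_THRESHOLD = 1.0

# Sensitive segments get a stricter bar before business content can show.
SEGMENT_MULTIPLIER = {
    "new_user": 2.0,        # protect the onboarding experience
    "returning_user": 1.5,  # recently churned and came back
    "low_activity": 1.3,
    "default": 1.0,
}

def passes_threshold(ecpm: float, segment: str) -> bool:
    """A business item is eligible only if its eCPM clears the segment bar."""
    multiplier = SEGMENT_MULTIPLIER.get(segment, SEGMENT_MULTIPLIER["default"])
    return ecpm >= BASE_ECPM_THRESHOLD * multiplier

print(passes_threshold(1.6, "new_user"))  # False: new users need >= 2.0
print(passes_threshold(1.6, "default"))   # True
```

The same table-lookup shape works for start_pos/gap losses: replace the boolean gate with an additive penalty per segment.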
Pros: Simple to implement, low experiment cost.
Cons: Purely reactive (post-hoc), lacks generalization, and can easily become obsolete.
The Paradox: Why do these “patches” work?
If a system has a Static Threshold, it has theoretically fixed the exchange efficiency between business goals and LT. For example, in ads, a fixed eCPM threshold dictates that a “show” must bring in at least \(X\) revenue to justify the LT loss.
In an ideal world, a single static threshold should be globally optimal. However, these patches work because of three realities:
1. Value Estimation Bias: The system may overestimate the eCPM for certain groups (like new users). Raising the threshold acts as a manual calibration for this overestimation.
2. UX Loss Heterogeneity: Even if value estimation is accurate, the LT loss from the same “show” varies by user. A sensitive user might churn after one bad ad, while a resilient user remains unaffected.
3. Suboptimal Exchange Efficiency: Even with perfect predictions for value and experience, the “weight” given to each in the ranking formula might not be globally optimal.
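Point 2 can be made concrete with a toy calculation. Assuming a hypothetical exchange rate \(\lambda\) (revenue units per unit of LT), the break-even eCPM for a show is \(\lambda\) times that user's LT loss per show, so heterogeneous LT losses imply heterogeneous optimal thresholds:

```python
# Toy illustration of UX-loss heterogeneity (all numbers hypothetical):
# the same "show" costs a sensitive user far more LT than a resilient one,
# so a single static eCPM threshold cannot be optimal for both.

def break_even_ecpm(lt_loss_per_show: float, lam: float = 10.0) -> float:
    """Minimum eCPM at which a show justifies its LT loss, given an
    assumed exchange rate `lam` (revenue units per unit of LT)."""
    return lam * lt_loss_per_show

for segment, lt_loss in [("sensitive", 0.20), ("resilient", 0.02)]:
    print(f"{segment}: break-even eCPM = {break_even_ecpm(lt_loss):.2f}")
```

A static threshold set between the two break-even points over-serves sensitive users and under-serves resilient ones, which is exactly the gap the heuristic patches exploit.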
However, while patches are necessary today, they shouldn’t exist permanently. The goal is to improve the accuracy of both value and experience modeling and optimize their exchange ratio.
Mid-term Strategy (Experience Signal Modeling)
Heuristic strategies rely on the assumption that certain groups are sensitive. Mid-term optimization moves away from manual interventions toward dynamic, model-driven interventions.
We explicitly model the user’s experience loss. We can categorize these modeling approaches into two types:
1. Direct Experience Signal Modeling (Correlation)
- Offline: Predict the probability of negative behaviors (leave, report, dislike) after a given exposure: \(P(Negative | Context)\).
- Online: Based on the predicted probability, dynamically increase thresholds or gaps to reduce “Load” for that specific request.
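The offline-to-online handoff above can be sketched end to end. This is a toy example, not a production model: the logistic scorer, its weights, and the gap-widening rule are all hypothetical.

```python
import math

# Sketch of direct experience-signal modeling (all names and weights are
# illustrative): a model scores P(negative | context), and the serving
# layer maps that probability to a wider gap for the request.

def p_negative(features, weights, bias=0.0):
    """Toy logistic model for P(leave/report/dislike | context)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def dynamic_gap(base_gap, p_neg, max_extra=5):
    """Widen the gap between business items as predicted risk rises."""
    return base_gap + round(max_extra * p_neg)

p = p_negative([1.0, 0.5], weights=[2.0, -1.0], bias=-1.0)
print(dynamic_gap(base_gap=3, p_neg=p))  # 6: high risk, so the gap grows
```

The same mapping can drive a threshold boost instead of a gap change; the key point is that the intervention is now per-request rather than per-segment.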
2. Uplift Modeling (Causality)
- Offline: Use causal inference to model the change in LT (\(\Delta LT\)) and the change in business value (\(\Delta Cost\)) when a “Treatment” (showing a business item) is applied.
- Online: Calculate the Marginal Exchange Efficiency
\[Efficiency = \frac{\Delta LT}{\Delta Cost}\]
We select traffic and users with the highest efficiency to perform “LT recovery” (reducing load), maximizing the LT regained for every dollar of revenue sacrificed.
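The selection step can be sketched as a sort by marginal exchange efficiency. This is a hypothetical sketch: the candidate tuples and their predicted \(\Delta LT\)/\(\Delta Cost\) values are made up, and a real system would take the top slice under a revenue budget rather than rank everything.

```python
# Hypothetical sketch of uplift-based "LT recovery": rank candidate
# requests by marginal exchange efficiency (delta_lt / delta_cost) and
# reduce load where each sacrificed dollar buys back the most LT.

def recovery_order(candidates):
    """candidates: list of (request_id, delta_lt, delta_cost), where
    delta_lt is the predicted LT regained and delta_cost the revenue
    lost if the business item is withheld on that request."""
    return sorted(
        candidates,
        key=lambda c: c[1] / c[2],  # efficiency = delta_lt / delta_cost
        reverse=True,
    )

candidates = [("r1", 0.02, 0.5), ("r2", 0.05, 0.5), ("r3", 0.01, 0.2)]
print([rid for rid, *_ in recovery_order(candidates)])  # ['r2', 'r3', 'r1']
```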
However, uplift modeling has the following challenges:
(1) Label Sparsity: LT is a long-term, sparse metric. We often rely on Proxy Metrics (e.g., short-term stay time, interaction), and the correlation between these proxies and true LT determines the model’s success.
(2) Counterfactual Data: Uplift models require both “Treatment” and “Control” data. This requires a small portion of “exploration traffic” where ads are withheld, which can be costly.
Long-term Strategy (Unified Value Modeling)
The ultimate goal is to move away from “patching LT” and instead treat experience signals as a “Unified Currency” that is interchangeable with business metrics (Revenue, GMV).
We no longer treat LT as an external filter but as an internal cost/benefit within the ranking formula, enabling automated end-to-end optimization.
Deriving the Optimal Ranking Formula
Most ranking problems can be formalized as a constrained optimization problem. For an ad queue with \(n\) candidates, let \(x_i \in \{0,1\}\) indicate whether candidate \(i\) is shown:
\[\begin{align} \max_{x} \quad &\sum_{i=0}^{n-1} x_i \cdot ecpm_i \\ \text{s.t.} \quad &\frac{1}{n} \sum_{i=0}^{n-1} x_i \cdot lt_i \ge LT^{*} \end{align}\]
Using the Lagrangian dual, we can derive the ranking score; a similar derivation can be found in 《搜索相关性:从建模到排序机制》 (Search Relevance: From Modeling to Ranking Mechanisms):
\[Score_{i} = ecpm_i + \lambda \cdot (lt_{i} - LT^{*})\]
Here, \(\lambda\) acts as the “Shadow Price” of LT—the exchange rate between LT and eCPM. If a user’s predicted \(lt_i\) for a specific request is high, the final \(score_i\) increases, making the item more likely to win.
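The scoring rule is trivial to apply once \(\lambda\) is fixed. A minimal sketch, assuming made-up values for \(\lambda\), \(LT^{*}\), and the candidates:

```python
# Minimal sketch of the derived ranking score (symbols as in the text):
#   score_i = ecpm_i + lambda * (lt_i - LT*)
# LAMBDA and LT_TARGET are assumed, illustrative values.

LAMBDA = 8.0     # shadow price: eCPM units per unit of LT
LT_TARGET = 0.9  # the LT constraint LT*

def score(ecpm: float, lt: float) -> float:
    return ecpm + LAMBDA * (lt - LT_TARGET)

ads = [("a", 2.0, 0.85), ("b", 1.5, 0.95)]  # (name, ecpm, predicted lt)
ranked = sorted(ads, key=lambda x: score(x[1], x[2]), reverse=True)
print([name for name, *_ in ranked])  # ['b', 'a']
```

Note that “b” wins despite its lower eCPM: its higher predicted \(lt_i\) more than pays for the revenue gap at this \(\lambda\), which is exactly the exchange the shadow price encodes.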
There are several key considerations in this method:
- Proxy Metrics: Since direct LT prediction is difficult, we use easily observable process metrics as proxies.
- Listwise Context: The best place for this prediction is the Evaluator/Rerank stage, where the model has the most context to predict how a specific sequence of items affects LT.
- Solving for \(\lambda\): \(\lambda\) is not static. It can be solved via offline replay of historical data to find the global optimum or adjusted in real-time via a PID Controller to ensure the LT constraint (\(LT^*\)) is met.
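The real-time option in the last bullet can be sketched as a small control loop. This is a toy sketch, not a production controller: the gains, the initial \(\lambda\), and the observed-LT stream are all hypothetical.

```python
# Hypothetical sketch of online lambda control: a PID loop nudges the
# shadow price so that observed LT tracks the constraint LT*. The gains
# (kp, ki, kd) and all numbers below are illustrative.

class LambdaPID:
    def __init__(self, lt_target, kp=2.0, ki=0.5, kd=0.0, lam0=1.0):
        self.lt_target = lt_target
        self.kp, self.ki, self.kd = kp, ki, kd
        self.lam = lam0
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, observed_lt):
        # error > 0 means LT is below target: raise lambda to favor experience.
        error = self.lt_target - observed_lt
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        self.lam = max(
            0.0,
            self.lam
            + self.kp * error
            + self.ki * self.integral
            + self.kd * derivative,
        )
        return self.lam

pid = LambdaPID(lt_target=0.9)
for lt in [0.80, 0.84, 0.88]:  # observed LT recovering toward target
    print(round(pid.update(lt), 3))
```

As observed LT closes in on \(LT^{*}\), the per-step increments shrink; the offline-replay alternative would instead search for the single \(\lambda\) that is optimal over historical logs.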
Each of these considerations is described only briefly here; in practice, every one requires detailed investigation to solve the problem properly.
Conclusion
User Experience optimization in complex Search, Ads, and Recommendation systems is not a one-time fix but a layered, systematic challenge.
Short-term (Heuristic Defense): Focuses on rapid identification and loss prevention. It uses “patches” to protect the system when value estimation or exchange rates are not yet optimal.
Mid-term (Experience Modeling): Uses correlation or Uplift models to scientifically quantify experience. It makes hidden costs explicit and provides the data foundation for long-term integration.
Long-term (Unified Value): The final form of optimization. It breaks the duality of “Business vs. Experience” by converting UX into a “Unified Currency.” LT becomes a native value participating in automated resource allocation.
These three stages represent an evolution from Local Optima toward Global Optima. In practice, they should coexist: use short-term rules to hold the redline, mid-term models to improve efficiency, and continuously iterate toward the ideal of unified modeling.