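In 2017, “Attention Is All You Need” quietly appeared and introduced the Transformer architecture, which has since been widely adopted in search, advertising, and recommendation, and became the foundation of the LLMs that later set off a new wave of the AI revolution. The Transformer gave machines a kind of “super attention”: a powerful mechanism that processes global information in parallel and computes association weights between all elements of a sequence. It does not need to read and understand word by word as humans do; it can “see” everything at once and pick out the most important connections.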

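But a huge paradox and predicament is unfolding. Humans have built machines that appear to have “infinitely scalable attention”, yet what humans themselves possess is an ancient and limited biological attention system. The machine's core is “more, faster, more global”, while the human core is “select, focus, subtract”. We are using our own minds, which need rest, get tired, and are easily distracted, to fight a technological environment that is driven by “super attention” algorithms and designed to capture our attention without limit.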

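For carbon-based life, an even crueler fact is that “Attention Is All You Have”. For humans, attention is a core life asset that we are born with and that comes with a fixed daily quota. Wherever you spend this one and only currency is where you invest your life, and ultimately where you will end up. We keep saying that “people are products of their environment”; perhaps the reason is that, in any given environment, a person has to pour their attention into whatever the rules of that environment care about most, which in turn determines what kind of product they become.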

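If attention is a person's most valuable wealth, then how to protect it is a question worth exploring. This article attempts some exploration of it, including why human attention is limited, how this pitifully small amount of attention is fought over in today's attention economy, and how to build your own attention framework. It is the record of my getting to the bottom of things and looking for a remedy after feeling unproductive at work recently. Happy reading~

Read more »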

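Recently I've been studying relevance in search, a problem quite specific to search. Relevance in the search scenario means that the content shown to users must bear a certain relationship to the query they entered. For example, a search for “KFC” should not return “McDonald's” content.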

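In the feed scenario, users have essentially no expectation about the content, so feed recommendation algorithms can exploit based on users' browsing history, recent trending content, and so on, or explore by probing users' new interests. In the search scenario, however, the query a user actively enters usually carries strong intent, and the returned content has to match that expectation; otherwise the search is wasted, which in turn hurts search retention (LT). From the user's point of view, if the platform's search algorithm is good enough, they should find what they want on the first page, which also means that browsing depth per PV in search is much lower than in feed.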

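These scenario differences mean the optimization objectives of search also differ considerably from those of feed. For example, time spent is not the most important metric when measuring search LT. From the ranking perspective, the added relevance constraint limits the candidates available for ranking under a given query (compared with feed), and ranking formulas often also include a relevance factor to meet the relevance objective, which reduces the efficiency of maximizing the original objective (revenue, in the case of ads).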

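The relevance constraint on ranking is also why many ranking iterations that work in feed perform poorly or not at all in search. Take the phenomenon raised in the question “Why are there so few technical articles on search systems but so many on recommender systems?”: one important reason is that, for a given query, the shortage of relevant candidates leaves ranking too small a search space, whereas the gains from ranking itself should grow as the number of candidates increases.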

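This article mainly discusses approaches to the relevance problem in search. Roughly speaking, relevance involves two parts: modeling relevance, and the mechanism through which the relevance model's predicted scores take effect. This article attempts to discuss both in detail.

Read more »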

Recently I’ve been researching search relevance, a problem with search-specific characteristics. Search relevance means that the content shown to users must be sufficiently related to the query they entered. For example, searching for “KFC” shouldn’t show “McDonald’s” content.

In feed scenarios, users have essentially no expectations about content, so feed recommendation algorithms can exploit user browsing history, recent hot content, etc., or explore new interests. In search scenarios, however, user-initiated queries carry strong intent. Returned content must match this expectation; otherwise the search is wasted, causing a loss in search retention (LT). From the user’s perspective, if the platform’s search works well, they should find the desired content on the first page, which leads to a much lower browsing depth per PV in search than in feed.

Scenario differences lead to different optimization objectives for search vs. feed. For example, duration isn’t the most important metric when measuring search LT. From the ranking perspective, the added relevance constraint limits the candidates available for ranking under a specific query (compared to feed), and ranking formulas often add a relevance factor to meet relevance objectives, which interferes with maximizing the original objective (such as revenue for ads).
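
As a rough illustration of the two effects described above (a relevance filter shrinking the candidate set, and a relevance factor in the ranking formula trading off against revenue), here is a minimal Python sketch. The `Candidate` fields, the threshold `min_relevance`, the exponent `relevance_weight`, and the multiplicative formula are all hypothetical choices made for illustration, not a formula taken from any production system.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    doc_id: str
    relevance: float  # hypothetical relevance model score in [0, 1]
    value: float      # hypothetical business value, e.g. expected ad revenue


def rank(candidates, min_relevance=0.5, relevance_weight=1.0):
    """Drop candidates below the relevance threshold, then sort the rest
    by value * relevance ** relevance_weight (highest first)."""
    eligible = [c for c in candidates if c.relevance >= min_relevance]
    return sorted(
        eligible,
        key=lambda c: c.value * c.relevance ** relevance_weight,
        reverse=True,
    )


# Toy candidates for the query "KFC" (all numbers are made up).
candidates = [
    Candidate("kfc_official", relevance=0.95, value=2.0),
    Candidate("kfc_coupon", relevance=0.80, value=3.0),
    Candidate("mcdonalds_promo", relevance=0.10, value=9.0),
]

default_ids = [c.doc_id for c in rank(candidates)]
strict_ids = [c.doc_id for c in rank(candidates, relevance_weight=3.0)]
print(default_ids)
print(strict_ids)
```

With the default weight the higher-value `kfc_coupon` ranks first; raising `relevance_weight` to 3 puts the more relevant `kfc_official` on top, which is the kind of interference with the revenue objective mentioned above. The `mcdonalds_promo` candidate is filtered out in both cases despite having the highest value.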

Relevance constraints in ranking are why many ranking iterations that are effective in feed are suboptimal or ineffective in search. For example, consider the phenomenon raised in “Why are there few search system tech articles but many recommendation system articles?”: an important reason is that, for a given query, insufficient relevant candidates limit the ranking search space, while the benefit of ranking should grow as the candidate count increases.

This article discusses solution approaches for search relevance. Roughly, relevance involves two parts: relevance modeling, and the mechanism for applying the relevance model’s estimated scores. This article attempts a detailed discussion of both.

Read more »

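I've been very busy lately: so busy that my entire mental bandwidth is maxed out, and when I get back to my rented apartment I only want to lie flat and zone out, or can't help mindlessly scrolling short videos; so busy that I feel my job has changed (into a firefighter, putting out fires every day); so busy that I don't even have time for longer-term planning. I can clearly feel my vocabulary, semantic precision, ability to express myself, and desire to express myself declining rapidly, while my curiosity and enthusiasm for knowledge, for other people, and for the world seem to be getting extinguished. While you are in the middle of it you may not feel anything is seriously wrong, because you have already been “institutionalized”. But once there is a longer stretch of free time and you start doing the thing that “makes God laugh” (thinking), the horror of this state becomes more and more apparent. I figured it was probably time to give myself a diagnosis, hence this article.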

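This article is mainly rambling about some recent experiences and doubts, along with an attempt to find a way out of them. It covers work, emotions, short videos, and the pursuit of freedom. It is extremely divergent and extremely subjective, just an emotional outlet and a psychological massage written for myself. If you're willing to read it, happy reading~

Read more »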

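This article originated when I took a closer look at 《远在咫尺》, a song I had heard countless times while selectively ignoring the story told in its lyrics. The final line, “If the art of being together is nothing but long forbearance, one should be grateful for a lifelong partner”, struck a chord and dragged out many scattered ideas stored in my head. Those ideas are still a mess, though, so I'm using this article to try to connect the dots.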

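This article attempts to explore the subject of “love”, covering where the “spark” comes from, the game of “projection and identification” played as people come and go, the blind-date process that has become a standardized product, and the art of being “together”. As usual, the content diverges wherever my mind goes. Happy reading~

Read more »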

This Spring Festival, DeepSeek pushed discussion of large language models to new heights. Even in the New Year atmosphere of an eighteenth-tier small town, unexpected technological ripples were hidden. My cousin shouts at her phone in dialect, “Write me a New Year greeting”; my nephew is busy chatting with a gentle, understanding “school beauty” on Doubao; even the Spring Festival couplet vendor at the alley entrance has learned to customize gold-stamped patterns with generative AI. These digital ripples in daily life, like a silent enlightenment movement, have woven “large language models” into this small town’s capillaries.

Two years ago, AI was like a bronze colossus, trembling with the roar of compute as it ingested data, able to swallow stars only in the data centers of first-tier cities. Today’s large models have become flowing streams, seeping along 5G base stations into the QR-code scanning systems of county auto repair shops, kneaded into Kuaishou streamers’ dialect scripts, even lying dormant in the voice assistants of seniors’ phones, giving a little cough to remind them to take their medication. From the “brute-force aesthetics” of the hundred-billion-parameter arms race to the MoE architecture deftly slicing the compute cake, from the multimillion-dollar lab aristocracy to DeepSeek-R1 tearing open a commoner’s admission ticket with a $6M training cost: this evolution is not just a technological leap, but a metaphor for the tech narrative shifting from “monologue from the altar” to “dialogue with humanity”. When large models learn to dance a cost-optimized ballet on the remains of GPUs, technology’s capillaries finally touch the heartbeat of daily life.

Read more »

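This Spring Festival, DeepSeek pushed the discussion of large models to a peak. Even amid the New Year atmosphere of an eighteenth-tier small town, some unexpected folds of technology were hidden. My cousin shouted at her phone in dialect, “Write me a New Year greeting”; my little nephew was busy chatting with a soft-voiced, understanding “school beauty” in Doubao; even the Spring Festival couplet vendor at the alley entrance had learned to customize gold-stamped patterns with generative design. These digital ripples amid everyday life, like a silent enlightenment movement, wove the words “large model” into the capillaries of this eighteenth-tier town.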

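Two years ago, AI was still like a bronze colossus, its whole body trembling with the roar of compute as it took in and put out data, able to swallow and spit out stars only in the data centers of Beijing, Shanghai, Guangzhou, and Shenzhen. Today's large models have turned into wandering streams, seeping along 5G base stations into the QR-code scanning systems of county auto repair shops, kneaded into the dialect scripts of Kuaishou streamers, even lying dormant in the voice assistants of seniors' phones and giving a cough to remind them to take their medicine. From the “brute-force aesthetics” of the hundred-billion-parameter arms race to the blade of the MoE architecture deftly slicing the compute cake, from lab aristocrats costing tens of millions of dollars to the commoner's admission ticket that DeepSeek-R1 tore open with a training cost of 6 million US dollars: this evolution is not only a technological leap, but also a metaphor for the tech narrative turning from “monologue on the altar” to “dialogue in the human world”. When large models learn to dance a cost-optimal ballet on the wreckage of graphics cards, the capillaries of technology finally touch the heartbeat of everyday life.

Read more »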

The previous mix-ranking article mainly introduced the basic approach from “Ads Allocation in Feed via Constrained Optimization” and extended the discussion to open problems in mix-ranking. That paper solves request-level insertion rules but doesn’t consider practical constraints such as adload, i.e., the limit on the proportion of ads shown.

Once adload constraints are considered, we can’t just compare values within a request; we need to maximize value across requests, or at the session level. For example, we can show more ads on high-value requests and fewer or no ads on low-value requests to maximize revenue under the adload constraint. The paper “Hierarchically Constrained Adaptive Ad Exposure in Feeds” provides a solution approach. Its overall approach still uses beam search and the generator + evaluator mix-ranking paradigm, but incorporates the overall constraints into the online solving process. Compared to the conventional approach of controlling adload independently of mix-ranking, this is a better idea and worth reading~
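
To make the cross-request idea concrete, here is a toy offline sketch in Python. It is not the method from the paper (which uses beam search with a generator + evaluator); it only assumes one ad slot per request, a known value per request (the hypothetical `request_values`), and an adload cap expressed as a fraction of requests.

```python
import math


def allocate_ads(request_values, adload):
    """Return one boolean per request: True if that request shows an ad.

    At most floor(len(request_values) * adload) requests get an ad, and the
    highest-value requests are chosen first (toy offline setting)."""
    n = len(request_values)
    max_ads = math.floor(n * adload + 1e-9)  # small epsilon guards float rounding
    by_value = sorted(range(n), key=lambda i: request_values[i], reverse=True)
    chosen = {i for i in by_value[:max_ads] if request_values[i] > 0}
    return [i in chosen for i in range(n)]


# Made-up per-request ad values for ten requests.
values = [0.2, 5.0, 1.0, 3.5, 0.1, 0.8, 4.0, 0.3, 2.0, 0.05]

show = allocate_ads(values, adload=0.3)
optimal_total = sum(v for v, s in zip(values, show) if s)

# Naive baseline with the same adload: always use the first 30% of requests.
naive_total = sum(values[:3])

print(show, optimal_total, naive_total)
```

An online system cannot sort future requests, so this is only meant to show why comparing value across requests beats a fixed per-request rule under the same adload.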

Read more »

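My earlier mix-ranking article mainly introduced the basic approach in “Ads Allocation in Feed via Constrained Optimization” and also extended the discussion to some open problems in mix-ranking. That paper addresses insertion rules at the request level, but it does not consider more practical constraints such as adload, i.e., the limit on the proportion of ads shown.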

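Once the adload constraint is taken into account, it is no longer enough to compare values within a single request; we have to maximize value across requests, or at the session level, for example by showing more ads on high-value requests and fewer or none on low-value requests, so as to maximize revenue under the adload constraint. The paper “Hierarchically Constrained Adaptive Ad Exposure in Feeds” offers one way to solve this problem. Its overall approach is still the mix-ranking paradigm of beam search with generator + evaluator, but it brings the overall constraints into the online solving process. Compared with the conventional practice of controlling adload independently, outside of mix-ranking, this is a better idea and worth a read~

Read more »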

Recently I’ve been researching multi-channel bidding problems. Quoting Google Ads’ “Power More Conversions and Value through Cross-Channel Bid Optimization”:

Traditionally, advertisers have applied automated bidding to campaigns that target a single channel. For example, they might use a bid strategy that maximizes conversion value on separate campaigns for Search, Display, and Video. But there are limitations to this siloed approach. But multi-channel bid optimization can help you to drive better results compared to single-channel bid optimization by maximizing marginal CPA or ROAS in each and every auction

Simply put, when a campaign runs across more traffic placements simultaneously, the marginal utility of the budget can be better optimized. This is intuitive: with richer traffic inventory, the same budget can theoretically achieve better efficiency. This is similar to the “universal delivery” products recently launched by various domestic media platforms. These products lower the barrier for advertisers by sparing them from allocating budget or setting bids per channel, while platforms use their algorithmic capabilities to improve budget efficiency.
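
The intuition can be illustrated with a toy Python sketch. The numbers in `marginal_returns` are made up, and the assumption that each channel has diminishing (non-increasing) marginal conversions per budget unit is mine; under that assumption, spending each unit greedily on the best remaining opportunity is optimal, and a pooled budget can never do worse than a fixed per-channel split.

```python
import heapq


def pooled_allocation(marginal_returns, budget_units):
    """Spend each budget unit on the channel whose next unit yields the most.

    marginal_returns maps channel -> list of gains for its 1st, 2nd, ... unit,
    assumed non-increasing. Returns (units spent per channel, total gain)."""
    # Max-heap via negated gains: (negated gain, channel, index of that unit).
    heap = [(-gains[0], ch, 0) for ch, gains in marginal_returns.items() if gains]
    heapq.heapify(heap)
    spend = {ch: 0 for ch in marginal_returns}
    total = 0.0
    for _ in range(budget_units):
        if not heap:
            break  # every channel is exhausted
        neg_gain, ch, idx = heapq.heappop(heap)
        total += -neg_gain
        spend[ch] += 1
        if idx + 1 < len(marginal_returns[ch]):
            heapq.heappush(heap, (-marginal_returns[ch][idx + 1], ch, idx + 1))
    return spend, total


def siloed_allocation(marginal_returns, units_per_channel):
    """Total gain when every channel gets the same fixed number of units."""
    return sum(sum(g[:units_per_channel]) for g in marginal_returns.values())


# Hypothetical conversions gained per extra budget unit on each channel.
marginal_returns = {
    "search": [10, 8, 6, 4, 2],
    "display": [5, 4, 3, 2, 1],
    "video": [7, 3, 2, 1, 1],
}

spend, pooled_total = pooled_allocation(marginal_returns, budget_units=6)
siloed_total = siloed_allocation(marginal_returns, units_per_channel=2)
print(spend, pooled_total, siloed_total)
```

With the same six budget units, the pooled allocation shifts spend toward `search` and away from `video`, and ends up with a higher total than the equal split.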

From a technical perspective, multi-channel raises two questions:

  1. Is unified bidding optimal? If not, how should per-channel bidding be done?
  2. Should budget be explicitly allocated to each channel?

The multi-channel examples above are all within one large platform, where budget and bids can easily be shared across channels. Another common definition of multi-channel is cross-platform, e.g., advertisers running on both Google and Meta, where budget and bids clearly can’t be shared. From the advertiser’s perspective, how to allocate optimally is also worth discussing.

This article mainly discusses the former: budget allocation and bidding when running on multiple channels within the same platform. It also briefly mentions research on cross-platform scenarios.

Read more »