Wu Liangchao's Study Notes

Reading Notes on "Modeling Delayed Feedback in Display Advertising"

Posted on 2020-05-17. Tags: computational advertising, machine learning

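In computational advertising, conversions are delayed: a user may convert only some time after the click, and the deeper the conversion funnel, the longer the delay tends to be. When training CVR / deep-CVR models this leads to two failure modes: (1) samples are fed to the model too early, so events that will eventually convert but whose labels have not yet been reported are treated as negatives, and the model underestimates; (2) samples are fed too late, i.e. every sample waits a sufficiently long time before being used, so the model is not updated in time.

The conversion reporting delay therefore needs to be modeled explicitly. The paper "Modeling Delayed Feedback in Display Advertising" is Criteo's solution to this problem. The main idea is that a sample with no observed conversion yet is not simply treated as a negative; instead, the gradient it contributes to the model depends on how much time has elapsed since the click. The paper reports that the method was validated on real Criteo data. The path from problem formulation to solution is also well laid out and worth reading.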
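To make the idea concrete, here is a minimal per-sample loss sketch. It assumes the exponential-delay likelihood as I understand it from the paper: a conversion probability p(x) from a logistic model and a delay rate lambda(x) = exp(w_d . x). The function and argument names are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def delayed_feedback_nll(logit, log_rate, converted, time_value):
    """Negative log-likelihood of one click (illustrative sketch).

    logit      -- conversion score w_c . x (hypothetical name)
    log_rate   -- log of the delay rate lambda(x) (hypothetical name)
    converted  -- True if the conversion has already been observed
    time_value -- click-to-conversion delay if converted,
                  otherwise the time elapsed since the click
    """
    p = sigmoid(logit)
    lam = math.exp(log_rate)
    if converted:
        # Likelihood: p * lam * exp(-lam * delay)
        return -(math.log(p) + log_rate - lam * time_value)
    # Likelihood: (1 - p) + p * exp(-lam * elapsed)
    # Either the click never converts, or it converts later than `elapsed`.
    return -math.log(1.0 - p + p * math.exp(-lam * time_value))
```

Under this sketch, a click with no conversion and almost no elapsed time contributes a loss close to zero, so it barely pushes the model toward "negative". As the elapsed time grows, the loss approaches the ordinary negative-example loss -log(1 - p).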

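Introduction to Distillation (Chinese version)

Posted on 2020-03-01. Tags: machine learning

This post briefly describes how distillation works in machine learning. Distillation can be roughly split into model distillation and feature distillation. As the name suggests, it compresses the original model or features, either to reduce model size (model distillation) or because some features are available only at training time and not at serving time (feature distillation). In practice, both techniques can be applied flexibly depending on the scenario.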

Introduction to Distillation

Posted on 2020-03-01. Tags: machine learning

This article briefly describes the principles of distillation in machine learning. Distillation can be roughly divided into model distillation and feature distillation. As the name suggests, distillation compresses the original model or features, either to reduce model size (model distillation) or because certain features are available only during training and not during serving (feature distillation). In practice, both techniques can be applied flexibly depending on the scenario.
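As a rough illustration of model distillation, the sketch below mixes a soft-target loss against a teacher's temperature-softened outputs with the usual hard-label loss. This is the common temperature-based formulation, not something stated in the excerpt above; `temperature`, `alpha` and the function names are assumptions made for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of soft-target and hard-label cross-entropy."""
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    # Cross-entropy between the softened teacher and student distributions.
    soft_loss = -sum(t * math.log(s)
                     for t, s in zip(soft_teacher, soft_student))
    # Ordinary cross-entropy against the ground-truth class index.
    hard_loss = -math.log(softmax(student_logits)[label])
    # temperature**2 keeps the two terms on a comparable gradient scale.
    return alpha * temperature ** 2 * soft_loss + (1.0 - alpha) * hard_loss
```

For feature distillation the same loss shape can be reused: the teacher is trained with the training-only features, and the student, which sees only serving-time features, is trained to match the teacher's outputs.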

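Reading Notes on "Smart Pacing for Effective Online Ad Campaign Optimization"

Posted on 2019-10-03. Tags: computational advertising

"Smart Pacing for Effective Online Ad Campaign Optimization" is a paper on budget pacing published by Yahoo in 2015. Like Budget Pacing for Targeted Online Advertisements at LinkedIn, which I wrote about earlier, its goal is to spend the budget evenly. Beyond that goal, it also proposes a way to keep cost under control while the budget is being spent evenly, which makes it a multi-objective optimization. The method is validated both offline and in a real environment. It is a fairly practical paper and worth reading.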

Bayesian Optimization Methods for Hyperparameter Search

Posted on 2019-07-10. Tags: machine learning

Machine learning involves numerous hyperparameters: model hyperparameters, optimizer hyperparameters, loss function hyperparameters, and so on. Users set these based on experience and adjust them according to training results. Optimal values vary by task and dataset, and there are no universally good defaults.

This step is often tedious and time-consuming. Hyperparameter optimization research aims to simplify it by searching for good hyperparameters automatically. The most common methods are grid search and random search; more advanced ones include heuristic search and Bayesian optimization. This article introduces Bayesian optimization for hyperparameter search, a common approach that Google also offers as a service on Google Cloud. We focus on the GPR (Gaussian Process Regression) + GP-BUCB (Gaussian Process Batch Upper Confidence Bound) method.
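The sketch below shows the general shape of GPR plus a batch UCB rule on a one-dimensional search space. It is a simplified illustration, not the full method from the article: the RBF kernel, `length_scale`, `noise`, `beta` and the function names are all assumptions. The batch step relies on the fact that the GP posterior variance does not depend on the observed values, so points already chosen for the batch can shrink the variance before their results arrive.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two sets of 1-D points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_obs, y_obs, x_query, noise=1e-4):
    """Posterior mean and std of a zero-mean GP at x_query."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_oq = rbf_kernel(x_obs, x_query)
    solved = np.linalg.solve(k_oo, k_oq)       # K^-1 k(x_obs, x_query)
    mean = solved.T @ y_obs
    var = 1.0 - np.sum(k_oq * solved, axis=0)  # prior variance k(x, x) = 1
    return mean, np.sqrt(np.clip(var, 0.0, None))


def select_batch(x_obs, y_obs, candidates, batch_size=3, beta=4.0):
    """Choose a batch of candidates with a batch-UCB rule."""
    # The mean uses real observations only.
    mean, _ = gp_posterior(x_obs, y_obs, candidates)
    x_cond = x_obs.copy()
    batch = []
    for _ in range(batch_size):
        # The variance ignores y, so pending points are added with dummy values.
        _, std = gp_posterior(x_cond, np.zeros(len(x_cond)), candidates)
        best = int(np.argmax(mean + np.sqrt(beta) * std))
        batch.append(float(candidates[best]))
        x_cond = np.append(x_cond, candidates[best])
    return batch
```

A typical loop would evaluate the returned batch in parallel, append the results to `x_obs` and `y_obs`, and call `select_batch` again.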

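Bayesian Optimization Methods for Hyperparameter Search (Chinese version)

Posted on 2019-07-10. Tags: machine learning

Machine learning involves many hyperparameters, such as those of the model, the optimizer and the loss function. Users have to set them from experience and adjust them according to training results, because the best values depend on the task and the dataset and there is no broadly applicable default.

This step is often tedious and time-consuming, which motivated research on hyperparameter optimization, whose goal is to search for the best hyperparameters automatically. The most common methods are grid search and random search. There are also more advanced methods, such as heuristic search and the statistics-based Bayesian optimization. This post introduces Bayesian optimization for hyperparameter search, a fairly common approach that Google also offers as a service on Google Cloud. The focus is on the GPR (Gaussian Process Regression) + GP-BUCB (Gaussian Process Batch Upper Confidence Bound) method.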

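A Free Image Upload Tool for Markdown

Posted on 2019-04-20. Tags: tools

For people used to writing in Markdown, one of the most annoying everyday problems is probably how to display images. A Markdown file stores only the image's address (a local path or a URL), and the URL is the more common choice, at least in my own writing. Some Markdown editors also offer image hosting, such as Youdao Cloud Notes and cmd_markdown, but these editors are either too ugly (Youdao Cloud Notes, I'm looking at you) or rather niche (cmd markdown), so there is always the worry that the images in a post will disappear if the service shuts down one day. These services also usually charge a fee.

So is there a way to generate a public URL for a local image that keeps the data highly available and is ideally free? GitHub in effect already provides such a service, but the steps are tedious. I wrote a small tool to simplify the process; the code is open source, see MarkdownImageUploader. This post covers its basic principle and usage.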
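As a rough sketch of the underlying idea, a file committed to a public GitHub repository can be addressed through a raw-content URL, which can then be dropped into a Markdown image tag. The helper names and the example user, repository and path below are hypothetical, and this is not necessarily how MarkdownImageUploader builds its links.

```python
from urllib.parse import quote


def github_raw_url(user, repo, branch, path_in_repo):
    """Build the raw-content URL of a file in a public GitHub repository."""
    # quote() escapes spaces and similar characters but keeps "/" intact.
    return "https://raw.githubusercontent.com/{}/{}/{}/{}".format(
        user, repo, branch, quote(path_in_repo))


def markdown_image(alt_text, url):
    """Format a Markdown image tag for the given URL."""
    return "![{}]({})".format(alt_text, url)
```

For example, `markdown_image("diagram", github_raw_url("alice", "imgs", "master", "2019/my pic.png"))` produces a tag pointing at `https://raw.githubusercontent.com/alice/imgs/master/2019/my%20pic.png`.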

Deploying Machine Learning Models with Flask, Docker, Jenkins and Kubernetes

Posted on 2019-04-19. Tags: machine learning, tools

This article introduces an automated approach to deploying machine learning models using Flask, Docker, Jenkins and Kubernetes. The basic principle: Flask provides a RESTful API that receives prediction requests from clients; Docker packages the service into an image for easy deployment and migration; Jenkins triggers an automatic build whenever the code or the model is updated; Kubernetes manages the containers for scalability and reliability. The article is based on Deploy a machine learning model in 10 minutes with Flask, Docker, and Jenkins, with improvements and extensions such as a simple shell script that triggers Jenkins and a section on Kubernetes deployment. All code is available at DeployMachineLearningModel.
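The Flask part of such a setup can be sketched as below. This is only an illustration of the request flow: the `/predict` route, the `features` field and the `predict_one` stand-in model are assumptions made for the example, not the code from DeployMachineLearningModel.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_one(features):
    """Hypothetical stand-in for a trained model: returns the feature sum."""
    return float(sum(features))


@app.route("/predict", methods=["POST"])
def predict():
    # silent=True returns None instead of raising on a missing or bad body.
    payload = request.get_json(silent=True) or {}
    features = payload.get("features")
    if not isinstance(features, list):
        return jsonify({"error": "expected a JSON body with a 'features' list"}), 400
    return jsonify({"prediction": predict_one(features)})
```

The service can be started locally with the `flask run` command and then packaged into a Docker image; a client sends a POST request with a JSON body such as `{"features": [1, 2, 3]}`.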

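Deploying Machine Learning Models with Flask, Docker, Jenkins and Kubernetes (Chinese version)

Posted on 2019-04-19. Tags: machine learning, tools

This post describes an automated way to deploy machine learning models using, as the title says, Flask, Docker, Jenkins and Kubernetes. The basic idea is that Flask exposes a RESTful API that receives predict requests from clients; the service is packaged into a Docker image so that it is easy to deploy and migrate; Jenkins triggers an automatic build of a new Docker image whenever the code or the model is updated; and managing the containers with Kubernetes makes the whole service scalable and reliable. The post draws mainly on Deploy a machine learning model in 10 minutes with Flask, Docker, and Jenkins and improves and extends it, for example by triggering Jenkins from a simple shell script and by adding a section on Kubernetes. All the code for this post is available at DeployMachineLearningModel.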

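LeetCode Solution Notes (496, 975, 503): Next Greater / Smaller Element

Posted on 2019-03-25. Tags: trees, dynamic programming, stack

This post covers the problem shared by LeetCode 496. Next Greater Element I, 975. Odd Even Jump and 503. Next Greater Element II: the next greater element. For each element of an array, the task is to find an element whose index and value are both greater than its own. Depending on the requirement, the problem splits into the nearest of the next greater elements and the smallest of the next greater elements: the former is the next greater element closest to the current one, the latter is the next greater element with the smallest value. Both can be solved with a stack, and the latter can also be solved with a TreeMap. Finally, the problem is extended by making the array circular, with its head and tail joined; the solution is to duplicate the original array and thereby unroll the ring. See the full post for details.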
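The three variants can be sketched with a monotonic stack as follows. The function names are my own. The "smallest" variant follows the greater-or-equal convention of 975. Odd Even Jump, and the circular variant walks the array twice with a modulo index instead of physically duplicating it.

```python
def nearest_next_greater(nums):
    """Index of the nearest j > i with nums[j] > nums[i], or -1."""
    result = [-1] * len(nums)
    stack = []  # indices still waiting for an answer
    for j, value in enumerate(nums):
        while stack and nums[stack[-1]] < value:
            result[stack.pop()] = j
        stack.append(j)
    return result


def smallest_next_greater_or_equal(nums):
    """Index j > i holding the smallest nums[j] >= nums[i], or -1."""
    result = [-1] * len(nums)
    stack = []
    # Visit indices in increasing order of value (ties: smaller index first);
    # the first later index seen after i is then its answer.
    for j in sorted(range(len(nums)), key=lambda i: (nums[i], i)):
        while stack and stack[-1] < j:
            result[stack.pop()] = j
        stack.append(j)
    return result


def nearest_next_greater_circular(nums):
    """Value of the next greater element in a circular array, or -1."""
    n = len(nums)
    result = [-1] * n
    stack = []
    for j in range(2 * n):  # second pass plays the role of the duplicate
        value = nums[j % n]
        while stack and nums[stack[-1]] < value:
            result[stack.pop()] = value
        if j < n:
            stack.append(j)
    return result
```

For instance, `nearest_next_greater_circular([1, 2, 1])` returns `[2, -1, 2]`: the last 1 wraps around and finds the 2 at index 1.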
