Notes on: He, X., Pan, J., Jin, O., Xu, T., Liu, B., Xu, T., Shi, Y., … (2014): Practical lessons from predicting clicks on ads at Facebook

• $$\mathbf{x}$$ is the feature-transformed ad impression
2.1 TODO Does it matter if we use $$\{ -1, 1 \}$$ or $$\{ 0, 1 \}$$ for labels?
• In the paper they use the sign of $$y$$ to set the direction of the gradient, so a label of $$-1$$ would negatively weight the negative samples
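
One way to probe the TODO is a small numerical check. This is only a sketch under an assumption: it uses plain logistic loss, which is not necessarily the exact update in the paper, and the names `grad_pm1` and `grad_01` are made up for these notes. With $$y \in \{ -1, 1 \}$$ the loss is $$\log(1 + e^{-y \mathbf{w}^\top \mathbf{x}})$$; with $$t \in \{ 0, 1 \}$$ it is the usual cross-entropy. If $$y = 2t - 1$$, the two gradients should coincide, so the encoding matters only in that the formula has to match it.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def grad_pm1(w, x, y):
    """Gradient w.r.t. w of log(1 + exp(-y * w.x)), for y in {-1, +1}."""
    s = sum(wi * xi for wi, xi in zip(w, x))
    # The sign of y flips the direction of the update.
    coeff = -y * sigmoid(-y * s)
    return [coeff * xi for xi in x]


def grad_01(w, x, t):
    """Gradient w.r.t. w of binary cross-entropy, for t in {0, 1}."""
    s = sum(wi * xi for wi, xi in zip(w, x))
    # Here the label enters as an offset on the predicted probability.
    coeff = sigmoid(s) - t
    return [coeff * xi for xi in x]


# Hypothetical weights and a sparse binary feature vector, for illustration only.
w = [0.5, -1.2, 0.3]
x = [1.0, 0.0, 1.0]

for t in (0, 1):
    y = 2 * t - 1  # map {0, 1} -> {-1, +1}
    g_pm1 = grad_pm1(w, x, y)
    g_01 = grad_01(w, x, t)
    assert all(abs(a - b) < 1e-12 for a, b in zip(g_pm1, g_01))
```

Under this assumption the two encodings give the same gradient. What would break is mixing them, e.g. plugging $$y = 0$$ into a formula that multiplies by $$y$$, which would zero out the negative samples instead of negatively weighting them.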