104.
• MAB design
- Bandit Algorithm = Thompson Sampling
- Reward = Click (with Unclick as the failure signal)
- Play Arms = Cluster Most Popular
- Non-Stationary = Exponential Decaying
• 2 metrics
- CTR = # of clicks / # of impressions
- Coin-use rate = # of use_coins / # of impressions
1. MAB
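The design above (Thompson Sampling over Bernoulli click rewards, with exponential decay for the non-stationary setting) can be sketched as follows. This is a hedged illustration, not the author's implementation; the class and parameter names (`DecayedThompsonSampling`, `decay`) are assumptions.

```python
import random

class DecayedThompsonSampling:
    """Bernoulli Thompson Sampling with exponentially decayed counts.

    Illustrative sketch only: arm selection samples from Beta posteriors
    built from click / unclick counts; decaying all counts before each
    update lets recent feedback dominate (the non-stationary handling
    named on the slide).
    """

    def __init__(self, n_arms, decay=0.99):
        self.alpha = [1.0] * n_arms  # prior + decayed click counts
        self.beta = [1.0] * n_arms   # prior + decayed unclick counts
        self.decay = decay

    def select_arm(self):
        # Sample one draw per arm from its Beta posterior; play the argmax.
        samples = [random.betavariate(a, b)
                   for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, clicked):
        # Exponential decay of all counts handles non-stationarity.
        self.alpha = [a * self.decay for a in self.alpha]
        self.beta = [b * self.decay for b in self.beta]
        if clicked:
            self.alpha[arm] += 1.0  # reward = Click
        else:
            self.beta[arm] += 1.0   # Unclick counted as failure
```

In a "Cluster Most Popular" setup, each arm would correspond to the most popular item of one cluster rather than an individual item.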
105.
• Use Coin ( )
- Extend the MAB Reward from Click to Use Coin, or to Click + Use Coin
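The reward extension above (Use Coin, or Click + Use Coin) could be expressed as a weighted sum of the two signals. A minimal sketch, where the function name and the weight values are illustrative assumptions, not from the slides:

```python
def combined_reward(clicked, used_coin, w_click=1.0, w_coin=2.0):
    """Combined MAB reward: Click + Use Coin as a weighted sum.

    The weights are illustrative assumptions; in practice they would be
    tuned to reflect how much more a coin use is worth than a click.
    """
    return w_click * float(clicked) + w_coin * float(used_coin)
```

Feeding this into the bandit update would require a reward model that accepts graded (non-binary) rewards rather than the Bernoulli click model.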
by @brandon.lim