I'm going to talk about AI for large imperfect-information games, and in particular about how we made an AI that beat top humans in no-limit poker. Okay. So, for starters, this talk is going to be about imperfect-information games in general. I'm not going to talk about perfect-information games like chess or Go. It will be applicable to poker, but also more generally to any strategic interaction that involves hidden information, for example, security interactions or negotiations. I think this is really important for bringing AI into the real world, because the truth is most real-world strategic interactions involve some amount of hidden information.
So, when it comes to these games, poker has served as the primary benchmark challenge going back decades. In fact, if you look at the original papers on game theory, pretty much the only application they talk about is poker, because it so accurately captures the challenge of hidden information. In particular, there's a variant of poker called heads-up no-limit Texas hold'em that has emerged as the primary benchmark for these games. Heads-up no-limit Texas hold'em is a massive game. It has about 10^161 different decision points. It is also the most popular variant of poker in the world. For example, no-limit Texas hold'em is the game that is played at the World Series of Poker main event. Every year the winner is determined by heads-up no-limit Texas hold'em. It's also featured in popular movies about poker, for example, Casino Royale and Rounders. In some ways, you could argue it's the purest form of poker. That's subjective, but it is a very strategic game: whether you win or lose is entirely up to your skill. It's not up to the other players at the table, except for your own opponent, I guess. So, there are no kingmaker effects, for example. And no prior AI had been able to beat top humans in this game. That is, until 2017.
So, in 2017, we organized something called the Brains vs AI Challenge. We created an AI called Libratus, which we played against four of the world's best heads-up no-limit Texas hold'em specialists. These are all people that make about seven figures per year playing this game online. We played 120,000 hands of poker over the course of 20 days, and there was a $200,000 prize pool divided among the pros to incentivize them to play their best. So, they weren't risking money, but how much money they won depended on how well they did relative to the other players. So, obviously, even if you're familiar with poker, you might not have heard of these pros.
So, I wanted to say a word about how strong these pros are, because it really is important to play against the top players. Unfortunately, there are no objective rankings of professional poker players. But like I said, these are all players that make millions of dollars a year. In fact, here's a question from the poker subreddit, where somebody was asking, "How good are these players that we were playing against?" Somebody responded, "These players will absolutely trounce all the 2000s heroes that you might have heard of. The heroes from the 2000s would be division three college players, whereas these guys are all-star caliber pros." So, this is a pretty accurate description, I would say. There is a big skill difference between the pros that you see on ESPN and these guys who actually play this game for a living. The guys you see on ESPN are basically celebrities. These guys are the ones that actually make a living playing this game.
The final result is that Libratus beat the humans in this game by a lot. The victory margin was 147 mbb/game (milli-big-blinds per game), which is a measurement of win rate in poker. Unless you are an actual poker player that doesn't mean much, but to give you some perspective, this is about three times the win rate of a top pro versus an average pro. It was statistically significant at about four standard deviations, and each human lost individually to the AI. This was a big surprise to everybody. In fact, when we announced the competition, there was a betting market on the outcome, because it's the poker world and, obviously, people like to gamble on these things. When we first announced that we were going to do this competition, the betting odds were four to one against us. In fact, even after we won on the first day, the betting odds were still two to one against us. I think it wasn't until like the third day that the betting odds were even, and by the eighth day, you couldn't even bet on the outcome of the competition anymore.
You could just bet on how much each human would lose on each individual day, because it was clear at that point that the AI was going to win. In fact, even if you had asked us, we were not very confident that we would win. I put our odds at about 60 percent, maybe 65, but I didn't think we would have a victory as large as this.
Actually, after this competition, we did another competition against Chinese pros. So, basically, Kai-Fu Lee in China called us and he said, "We would like you to do another competition in China against Chinese players. We will broadcast it, it will be a lot of fun." We were like, "Well, why should we do this? We just played against the top humans. These Chinese players are not as good." He said that he would pay us. So, we said, "Okay, great." So, we played 36,000 hands against six Chinese players. We beat them by even more than we beat the top humans in America. That was actually a huge hit in China. It was watched live by millions of people during the competition. They had really nice production where you could see a poster like this. It was way better than what we did in America.
All right. So, why are imperfect-information games so hard? After all, we have AIs that can beat humans in games like chess, we have AIs that beat humans in Go. In fact, you might have heard recently about AlphaZero, which is essentially superhuman in chess, Go, and shogi, all using the same algorithm. So, what is it about imperfect-information games that makes them so difficult? One of the major challenges, not the only one, but one of the major ones, is that in an imperfect-information game, the optimal strategy for a subgame, for part of the game, cannot be determined in isolation. It cannot be determined using information in just that subgame alone. So, let me show you what I mean. Before I get to that, deep learning has taken a lot of credit recently for a lot of the breakthroughs in AI.
Actually, our AI did not use any deep learning, no deep learning at all. But I would also argue that a big reason why all these AIs are superhuman in various games like chess, Go, even backgammon, is that they use real-time planning. The planning component is huge. AlphaGo, for example, used Monte Carlo tree search, and Deep Blue used alpha-beta pruning. In fact, if you look at AlphaZero without real-time planning (I guess this is washed out on the slide), it ends up being right around there without Monte Carlo tree search at play time. Top human performance is right around here. So, in fact, without Monte Carlo tree search, AlphaZero is not superhuman. The tree search gets you about 2,000 Elo in addition. So, real-time planning is really important, not just in Go, but also in poker, it turns out. This is actually the key breakthrough that allowed us to beat top humans: figuring out how to do real-time planning. But it turns out that in poker, it ends up being way harder, which is what I'll get to right now.
So, in perfect-information games, you take some action, your opponent takes some action, and you find yourself in a particular subgame. Now, you can forget about everything that came before, all the other situations you did not encounter. The only thing that matters is the situation that you're in, and the situations that can be reached from this point on. So, for example, if I were to show you this chess board, you don't have to know how we ended up in this situation, you don't have to know about the Sicilian Defense or the Queen's Gambit. You can just look at this board, and if you're white, you can say, "Okay, well, if I do a search, I can see that if I move my white queen there, then it's checkmate, and the game is over. So, I should just do that." You don't have to know anything about the strategy of chess.
But in imperfect-information games, if you take some action, and your opponent takes some action, and you find yourself in a particular subgame, now some other subgame that you are not in, and in fact might not even be able to reach from this point on, can affect what the optimal strategy is for the subgame that you are in. This is counterintuitive, but I'm going to give you a concrete example in a little bit that illustrates this.
Now, before I get to that, I want to talk a little bit about what our goal is in these games. Our goal is to find a Nash equilibrium, which in two-player zero-sum games is the same thing as a minimax equilibrium. I won't get too technical about the definition, but basically, in a two-player zero-sum game, if you're playing the Nash equilibrium, you are guaranteed to not lose in expectation. Now, it's not always easy to find a Nash equilibrium, but it's always guaranteed to exist in a finite two-player zero-sum game. So, for example, in rock, paper, scissors, the Nash equilibrium is to mix randomly between rock, paper, and scissors with equal probability, because if you do that, then no matter what your opponent does, you will not lose in expectation. Now, in rock, paper, scissors, that also means you're not going to win in expectation, but in a complicated game like poker, where there are a lot of suboptimal actions that aren't actually played in the Nash equilibrium, it's likely that your opponent will make mistakes and you will end up winning in practice as well. Yes.
>> How important is it that the game is heads-up? If I compare heads-up to, say, a game with about seven players?
>> That is a great question. So, I'll get to this, but let's talk about it now. In poker, it doesn't really matter. So, in poker, if you were to use these same techniques for six-player poker, you would almost certainly win.
That said, in general, poker is a special game. I don't know if you play poker, but there are two special things about it. One is that it's really hard to collaborate with other players. So, you can't say, "Hey, let's team up against this other person at the table." In fact, if you tried to do that, it would be against the rules of poker. The other thing that's unique about poker is that people fold in the game. So, even if you have six players at the start of the game, it very quickly comes down to two players because people fold. So, you can use these techniques that are only guaranteed for two-player zero-sum games and they will just work in six-player poker. But a big challenge is extending these techniques to other games that do allow for collaboration. And we don't really have a good approach for those games yet.
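Going back to the rock, paper, scissors example from earlier: the claim that the uniform mix cannot lose in expectation can be checked numerically. The sketch below is only an illustration under assumed names (`payoff`, `expected_payoff`, `random_strategy` are made up here), with a win scored as +1, a loss as -1, and a tie as 0.

```python
import random

ACTIONS = ("rock", "paper", "scissors")
# BEATS[a] is the action that a defeats.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine: str, theirs: str) -> int:
    """+1 if `mine` wins, -1 if it loses, 0 on a tie."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def expected_payoff(my_strategy: dict, their_strategy: dict) -> float:
    """Expected payoff when both players sample independently."""
    return sum(
        my_strategy[a] * their_strategy[b] * payoff(a, b)
        for a in ACTIONS
        for b in ACTIONS
    )


def random_strategy(rng: random.Random) -> dict:
    """An arbitrary mixed strategy, used as a stand-in opponent."""
    weights = [rng.random() for _ in ACTIONS]
    total = sum(weights)
    return {a: w / total for a, w in zip(ACTIONS, weights)}


uniform = {a: 1 / 3 for a in ACTIONS}
always_rock = {"rock": 1.0, "paper": 0.0, "scissors": 0.0}
always_paper = {"rock": 0.0, "paper": 1.0, "scissors": 0.0}

rng = random.Random(0)
worst_case = min(
    expected_payoff(uniform, random_strategy(rng)) for _ in range(1000)
)
# The uniform mix breaks even (up to rounding) against every sampled opponent,
# while a predictable strategy such as always playing rock can be exploited.
print(abs(worst_case) < 1e-12)
print(expected_payoff(always_rock, always_paper))
```

Sampling opponents is not a proof, of course. It just makes the guarantee concrete: the equilibrium strategy never loses in expectation, while any deviation from it can be punished.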