33 Replies to “Tuomas Sandholm: Poker and Game Theory | Lex Fridman Podcast #12”

  1. When will they make a Stratego AI that can beat humans? Most Stratego games are simple to beat for even mediocre human players.

  2. Here are my quick notes on this. Btw, Lex, if you're listening, I see you have some videos in the podcast with very little info in the description and no timestamps. If you wish, I could save you some time and timestamp them myself, for a reasonably small fee we can later negotiate. My background is similar to yours, even though I'm not as experienced or distinguished, I'm an electrical engineer working with software development (and I also have a small YouTube channel, so I know some of your plights). My email is [email protected] . I also have a website, if you wish to know more about me before making any decisions: fanaro.io .

    ————————————

    1. 0:00 Introduction
    – The leader of the first team to beat top Texas Hold'em players.
    – He had the best paper at NIPS in 2017.
    2. 1:20 Texas Hold'em
    – Imperfect information
    – Most popular imperfect information game in AI.
    3. 4:00 Libratus
    – 4 expert players.
    – Couldn't really play for money.
    – Thought it would be 50-50. Betting sites thought it would be 1:4.
    4. 8:50 The Role of Tells in Poker and AI
    – Top players are too good at hiding tells.
    5. 10:30 What kinds of abstractions are useful in Poker?
    – There's information abstraction and action abstraction.
    6. 13:05 Does information matter or are actions enough? Do the hands really matter?
    7. 15:00 Libratus did not use learning methods
    – In imperfect information games, it's all about the opponent's beliefs.
    – It works on a search at the leaves.
    – DeepStack had another deep learning search method.
    8. 19:00 Beliefs are explicitly modeled, but not assumed
    – Libratus learns the belief system of the opponent.
    – Beliefs are derived from Bayes' rule.
    – It's all about Nash equilibrium, about a rational opponent.
    – Opponent adaptations are still in the works.
    9. 25:30 Different Types of Games
    – Repeated games: the same game is played over and over.
    – Purely repeated games are very rare.
    – Stochastic games: distribution on next games.
    – Poker.
    – General sum games with multiple players seem to be much more difficult to model.
    – All finite games have Nash equilibrium, the problem is that there could be many.
    – Cooperation and collusion would make things much harder in Poker.
    – Bridge is like that.
    10. 33:00 Possibly transferring knowledge to related real world domains.
    – E.g. auctions, negotiations.
    – He has 2 startups in this area:
    – Strategic Machines
    – Strategic Robots
    11. 34:00 How far is human behavior with these techniques?
    – You shouldn't care if the opponent is not following your model of rational behavior; the end result would be even better for you.
    – Could jay-walking and other driving interactions be modeled as in Poker and other imperfect information games?
    – Could you pre-negotiate certain situations?
    12. 38:00 Takeaways from going end to end (science to engineering and commercialization)
    – He does like going through the whole spectrum.
    – Testing things in reality is necessary, after all we live in the real world…
    – People became very much afraid of this new AI stealing their money.
    – Bravery to implement the ideas in the real world? Tuomas thinks it's more about a lot of work.
    13. 43:00 Mechanism Design Lessons
    – Designing mechanisms such that desired outcomes come more frequently and easily.
    – Not a panacea.
    – You can't really solve everything within a class, but you can put undesirable results in certain islands.
    – Myerson-Satterthwaite theorem: efficient trade is impossible in certain settings. They've shown that you can find settings where trade is still efficient.
    – Mechanism design is still not able to take all of the details from reality.
    – In auctions, telling the truth is usually not the best strategy with current mechanism designs.
    14. 48:00 The Next Big AI Breakthrough
    – Doesn't really know the answer.
    – Poker was one of the major milestones.
    – There's no consensus.
    – Starcraft
    – Dota II
    – Stock Market?
    – He wants to push AI into practical projects.
    – Game Theory is not really on par in terms of application with respect to Machine Learning.
    15. 52:00 In Machine Learning, neural networks are typically at first not explainable. Is it the same with Game Theory?
    – Yes and no.
    – Nash Equilibrium has provable properties.
    – But that doesn't mean strategies are human understandable.
    – Would military applications be understandable?
    – He believes so.
    16. 54:00 Concerns about the negative impact on society.
    – He is very much optimistic.
    – AI is going to make the world much safer.
    – Does value misalignment worry you?
    – The problem is not in value alignment, but in objective alignment. If your objective is to have maximum utilization of assets, then you're gonna load everything to the max and drive in circles.
    – The 2 biggest threats: climate change, nuclear war.
    – The underlying problem is political will.
    – Mentions the simple game theory of MAD (Mutually Assured Destruction).
    17. 1:01:00 Plans for the future
    – Coming up with techniques for game solving and applying them to the real world.
    – It's hard to come up with things to be done, it's much easier to have the applications telling you what needs to be done.
    18. 1:02:00 The challenge of dealing with the transition from older technologies and bureaucratic systems
    – Tech companies are taking the lead in this regard sometimes.
    19. 1:03:30 What ideas and techniques do you have as promising or exciting right now?
    – Longer games, collusion, hidden action games, etc.
    – More scalable techniques for integer programming.

  3. As a poker player I find this conversation really interesting. Not much comes out of the high-stakes world online, but I can give some information. In both Texas Hold'em and Pot-Limit Omaha the top players use a lot of tools to study and play GTO. However, it's not humanly possible to play GTO: you can spend hundreds of hours on spots that, even if you play professionally, will never occur during your lifetime. Therefore top players focus hard on studying the low-hanging fruit: commonly occurring situations. As I said, all top players study GTO, but the player I think is the best is Berri Sweet, who plays super-exploitatively. For sure he knows and studies GTO, but his playstyle is exploit

  4. This professor with the breakthrough that he made in AI deserved a better interview platform than this… PR game is weak, maybe AI can help you with that Professor LOL????

  5. I share the opinion that applying game theoretical tools to computing applications is challenging. Some 15-20 years ago there was a game-theory hype and some interesting results appeared. Around that time, I was working on those kinds of models. In my experience, applying game theoretical solutions concepts necessitates some kind of overhead work. For instance, in my case I applied Nash bargaining theory for multi robot coordination. The robots had to formulate their private utility profiles at the coordination point, share the profiles, and calculate the full game equilibria (with the help of the other robot). Since the game in the general case had multiple equilibria, the initiating robot also shared a random number with the other robot and both used the number to choose the same joint solution, and then finally implement the individual part.

  6. Why wasn't each poker pro given an HUD to real-time track Libratus' betting stats? This is something all online pros use (and such pros would never agree to play high stakes online without one). Libratus certainly had access to the equivalent of an HUD, right? I presume this created significant imbalance between Libratus and the humans.

  7. Games that go on forever … Pi calculation?
    I know it's not the same and I haven't looked up if this has been done but it just occurred to me: I wonder how well a learning algorithm would do at predicting Pi digits.

  8. 19 minutes in, and this has turned out to be the most interesting conversation I have listened to so far. He makes a point here: other than deep reinforcement learning (to be precise, learning-only methods), there are non-learning methods.
    If we can somehow introduce learning into such methods, then that will be something new.

  9. It would be interesting to see an AI secretly "tell" a person how to play poker against another human. I wonder if there would be any difference.
    Or if people thought that they were playing another human being.

  10. Big thanks to you, Lex, for bringing some of the smartest researchers and practitioners to the table and sharing these great interviews with the world.

  11. It seems like people have a hard time predicting things with exponential growth. So humans will probably systematically underestimate the performance of AI for a while yet. At some point it will switch and people will assume AI to be better at everything.

  12. Can anyone knowledgeable about AI and autonomy recommend a reading list for the average layman to gain better insight into such systems? Thanks for this video, Lex.
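
The coordination scheme described in comment 5 (share utility profiles, compute the equilibria of the full game, then use a shared number so both robots pick the same one) can be roughly sketched as below. This is only an illustration under stated assumptions, not the commenter's actual implementation: it uses pure-strategy Nash equilibria of a small two-player matrix game as a stand-in for the Nash bargaining solution the comment mentions, and the names `pure_nash_equilibria`, `select_joint_solution`, and the example payoffs are all hypothetical.

```python
import itertools


def pure_nash_equilibria(u1, u2):
    """Return all pure-strategy Nash equilibria (i, j) of a two-player game.

    u1[i][j] and u2[i][j] are the payoffs to robot 1 and robot 2 when
    robot 1 plays action i and robot 2 plays action j (assumed layout).
    """
    rows, cols = len(u1), len(u1[0])
    equilibria = []
    for i, j in itertools.product(range(rows), range(cols)):
        # Robot 1 cannot gain by switching rows while robot 2 keeps column j.
        row_is_best = all(u1[i][j] >= u1[k][j] for k in range(rows))
        # Robot 2 cannot gain by switching columns while robot 1 keeps row i.
        col_is_best = all(u2[i][j] >= u2[i][l] for l in range(cols))
        if row_is_best and col_is_best:
            equilibria.append((i, j))
    return equilibria


def select_joint_solution(u1, u2, shared_number):
    """Pick one equilibrium deterministically from a number both robots know.

    Both robots run this with the same shared profiles and the same number,
    so they arrive at the same joint solution without further messages.
    """
    equilibria = sorted(pure_nash_equilibria(u1, u2))
    if not equilibria:
        return None  # no pure equilibrium; a fuller version would handle mixed ones
    return equilibria[shared_number % len(equilibria)]


# Hypothetical coordination game: each robot prefers a different meeting point,
# but both prefer agreeing over disagreeing, so there are two equilibria.
u1 = [[2, 0], [0, 1]]
u2 = [[1, 0], [0, 2]]

joint = select_joint_solution(u1, u2, shared_number=7)
robot1_action, robot2_action = joint  # each robot implements only its own part
print(pure_nash_equilibria(u1, u2), joint)
```

Indexing a sorted list with the shared number, rather than seeding a random generator, keeps the choice identical on both robots even if their random number libraries differ.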
