This has been bothering me a long time. GIB works by simulating hands which match the bidding and play* so far, and then choosing the card that gives the best average score, double dummy. So why does it sometimes get it so wrong - such as steve2005's recent example? While we don't know how many hands basic GIB generates (barmar has said the advanced robots simulate "a few dozen"), when I ran the numbers, even simulating 10 matching hands would have a success rate of > 99.99%.
The real question seems to involve going back one step. How does it simulate hands in the first place?
As with Dealer, there is really only one viable starting point: randomly shuffle all cards that aren't in your hand / haven't been played among the other 3 players, then test whether the result is a 'valid' deal. If you ask Dealer to generate 500 hands where North has exactly 4-3-3-3, it will generate about 19,000 deals in order to find 500 valid ones.
The more tightly defined the constraints, the more deals it needs to generate in order to find valid matches. For example, when I considered steve2005's deal from West's perspective, Dealer was only able to find about 80 matching deals after generating 10 million (and that takes about 10 seconds). In an earlier post from helene_t, it came up with only 0 or 1 matches among the 10 million deals generated in 10 seconds.
And of course, basic GIB, at least, clearly plays faster than 10 seconds per card.
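The generate-and-test loop described above is easy to sketch. Here is a minimal rejection-sampling example in Python (my own code, assuming nothing about Dealer's internals) for the 4-3-3-3 constraint mentioned earlier:

```python
import random

# Dealer-style generate-and-test (rejection sampling): shuffle the deck,
# deal, and keep only deals meeting the constraint.  Constraint here:
# North is exactly 4=3=3=3 (four spades, three cards in each other suit).

SUITS = "SHDC"
DECK = [(s, r) for s in SUITS for r in range(2, 15)]  # 52 cards

def random_deal(rng):
    cards = DECK[:]
    rng.shuffle(cards)
    return {"N": cards[0:13], "E": cards[13:26],
            "S": cards[26:39], "W": cards[39:52]}

def exact_shape(hand):
    return tuple(sum(1 for s, _ in hand if s == suit) for suit in SUITS)

rng = random.Random(1)
generated = accepted = 0
while accepted < 20:
    generated += 1
    if exact_shape(random_deal(rng)["N"]) == (4, 3, 3, 3):
        accepted += 1

# Roughly 1 in 38 random deals gives North exactly 4=3=3=3, so expect
# on the order of 750 generated deals for 20 matches.
print(f"generated {generated} deals to find {accepted} matches")
```

With tighter constraints the acceptance rate collapses, which is exactly the 80-in-10-million problem described above.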
At this point you may think that we have an answer - if GIB isn't able to simulate matching hands, it just chooses cards at random. But that cannot be true; even when humans make bids that are 100% impossible, it's capable of at least some very basic trick taking - while it can occasionally throw away a winner late in the play, it doesn't just start playing every single card randomly from trick 1.
GIB must therefore have some logic built in that bulks up its simulated results with hands that are 'close' to what it knows so far.
From a human perspective, we could think of some ways this could be done. For example, if someone has shown a certain point range, allow hands slightly outside that range. Or perhaps stretch something about the length or quality shown in a particular suit. That is something we could easily do when putting constraints into Dealer - if we have hcp(south)>=15, try hcp(south)>=14.
Yet that is 100% impossible for GIB. There are no constraints it could loosen, because it has basically no understanding whatsoever of the suit lengths or point ranges each player holds - the descriptions you see for bids are an estimate for humans, generated completely independently of the logic used for making the bid. All it has to go on is: does this hand make this bid according to the bidding database - yes or no? Even for a simple 1NT opening, the way the bidding database works, it has zero way of knowing whether a hand is 'close' to a 15-17 opener - either it matches the pattern, or it doesn't.
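To see how stark that binary test is, here's a toy sketch (the rules and names are entirely invented, not GIB's) of a bidding 'database' that can only answer whether a hand reproduces a given auction:

```python
# Toy model (invented rules): the 'database' maps an auction prefix plus a
# hand to the one bid the rules produce.  The simulator can only ask
# "does this deal reproduce the auction?" -- there is no notion of 'close'.

def toy_bid(prefix, hcp):
    if prefix == ():                       # opening seat
        return "1NT" if 15 <= hcp <= 17 else "PASS"
    return "PASS"                          # simplification: all else passes

def deal_matches_auction(hcp_by_seat, auction):
    for i, (seat, bid) in enumerate(auction):
        prefix = tuple(b for _, b in auction[:i])
        if toy_bid(prefix, hcp_by_seat[seat]) != bid:
            return False                   # one mismatch rejects the deal
    return True

# A 14-count is one point away from a 1NT opener, but the answer is binary:
print(deal_matches_auction({"E": 16}, [("E", "1NT")]))   # True
print(deal_matches_auction({"E": 14}, [("E", "1NT")]))   # False
```

There is nothing here a simulator could 'loosen' - the point range lives inside the rule, not in any description it can inspect.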
So I started thinking - from a coding perspective, how could you actually do this? You've generated a random deal, have an auction, and a bidding database that tells you what bids would have been made - how can you tell if something is close?
For steve2005's example, I was thinking perhaps it couldn't find enough hands where East doubled, and ended up adding lots of hands where East didn't double. But that appears to be a lot more complex than it sounds at first sight - if you take away a bid in the middle of an auction, it completely affects every bid that occurs afterwards (especially given the database works by finding what comes after a given prefix of all prior bids). And it would be a bit of a nightmare to code - consider every possible bid, try replacing it with some other bid, see if all of the future bids would end up being the same.. and if you still don't have enough results after doing that, um...
I don't think that really works. A more codeable algorithm would be:
- generate random deals
- for each deal, step through the auction one bid at a time and count how many bids match - stop as soon as you find a single bid that doesn't match, because who knows what'd happen after that
- sort the results by the number of matching bids
- take the top N and perform double dummy analysis - weighting the results based on how many bids matched.
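The steps above can be sketched in a few lines. This is a hedged guess at the scheme, not GIB's actual code - every name and rule below is invented:

```python
# Score each random deal by how many bids of the real auction it
# reproduces before the first mismatch, keep the best matches, and
# weight them for the double-dummy average.

def matching_prefix_length(deal, auction, bid_for):
    n = 0
    for i, (seat, bid) in enumerate(auction):
        prefix = [b for _, b in auction[:i]]
        if bid_for(deal, seat, prefix) != bid:
            break          # who knows what would happen after a mismatch
        n += 1
    return n

def weighted_sample(deals, auction, bid_for, top_n):
    scored = [(matching_prefix_length(d, auction, bid_for), d) for d in deals]
    scored.sort(key=lambda x: -x[0])
    # Weight each kept deal by its matched-bid count (+1 so that even
    # zero-match deals carry nonzero weight if nothing better exists).
    return [(d, m + 1) for m, d in scored[:top_n]]

# Toy demo: a one-rule 'database' keyed on the deal's HCP for that seat.
def toy_bid_for(deal, seat, prefix):
    if not prefix:
        return "1NT" if 15 <= deal[seat] <= 17 else "PASS"
    return "PASS"

auction = [("E", "1NT"), ("S", "PASS")]
deals = [{"E": 16, "S": 8}, {"E": 12, "S": 8}, {"E": 17, "S": 9}]
print(weighted_sample(deals, auction, toy_bid_for, 2))
```

The appeal of this version is that it never needs to ask *why* a deal mismatched - it only counts how far the auction got.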
But that would still comfortably result in a club being led in steve2005's example, since the double comes reasonably early in the auction, and a club lead is still a clear favorite even if you allow South to have 3 hearts, or North to have non-Smolen hands, and so on..
So.. still not certain, and as usual would love to see the code to know for sure, but I can see lots of ways for GIB to mess up based on the way this has been coded.
* Side point - an interesting fact from Ginsberg's original paper, often overlooked here, is that it doesn't just take into account what cards have been played to date - it also factors in whether cards should have been played. That is, during the double dummy analysis, if a player didn't play a card, Bayes' rule is used to weight each deal based on whether double dummy analysis says they would have played it had they held it.
Musings on why GIB makes poor plays
#2
Posted 2022-March-18, 18:18
smerriman, on 2022-March-18, 15:20, said:
* Side point - an interesting fact from Ginsberg's original paper, often overlooked here, is that it doesn't just take into account what cards have been played to date - it also factors in whether cards should have been played. That is, during the double dummy analysis, if a player didn't play a card, Bayes' rule is used to weight each deal based on whether double dummy analysis says they would have played it had they held it.
Wait what??? I find it hard to believe that the GIBs on BBO use any logic like that. So many examples where it works to give GIB another option - say you have xx opposite x in one side suit, x opposite Axx (in dummy) in another, and you have to setup the third side suit. It works so often to cash the ace to give GIB an option to try to cash the wrong side suit.
I've always been 100% convinced that GIB doesn't draw any inferences from what I play.
Are you sure this is actually implemented in the actual GIBs running on actual BBO?
The easiest way to count losers is to line up the people who talk about loser count, and count them. -Kieran Dyke
#3
Posted 2022-March-18, 19:22
cherdano, on 2022-March-18, 18:18, said:
Wait what??? I find it hard to believe that the GIBs on BBO use any logic like that. So many examples where it works to give GIB another option - say you have xx opposite x in one side suit, x opposite Axx (in dummy) in another, and you have to setup the third side suit. It works so often to cash the ace to give GIB an option to try to cash the wrong side suit.
I've always been 100% convinced that GIB doesn't draw any inferences from what I play.
Are you sure this is actually implemented in the actual GIBs running on actual BBO?
No, I don't know for sure, but I would think it unlikely it was removed.
Your example wouldn't be affected though - because whether you cash the ace or not wouldn't matter double dummy, so there would be no inferences to make. It's more about the Grosvenor type situation. Edit - actually, I see what you mean. Well, I don't know - seems strange given how BBO basically haven't touched the play engine at all that they'd delete half of the algorithm.
From his original paper (obviously the defensive signalling part was turned off):
Quote
To conform to the card play thus far, it is impractical to test each hypothetical decision against the cardplay module itself. Instead, GIB uses its existing analyses to identify mistakes that the opponents might make. As an example, suppose GIB plays the ♠5. The analysis indicates that 80% of the time that the next player (say West) holds the ♠K, it is a mistake for West not to play it. If West in fact does not play the ♠K, Bayes' rule is used to adjust the probability that West holds the ♠K at all. The probabilities are then modified further to include information revealed by defensive signalling (if any), and the adjusted probabilities are finally used to bias the Monte Carlo sample, replacing the evaluation Σ_d s(m,d) with Σ_d w_d·s(m,d), where w_d is the weight assigned to deal d. More heavily weighted deals thus have a larger impact on GIB's eventual decision.
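That Bayes step is easy to illustrate with a toy calculation (the 80% figure comes from the quote; everything else below is my invention):

```python
# Sketch of the quoted Bayes adjustment.  Likelihood of the observation
# "West did not play the ♠K": 1.0 if the deal doesn't give West the K,
# and (1 - 0.8) = 0.2 if it does.

def bayes_weights(deals, west_has_k, p_would_play=0.8):
    return [(d, (1 - p_would_play) if west_has_k(d) else 1.0)
            for d in deals]

# Toy prior: half the sampled deals give West the king.
deals = [{"west_has_k": True}] * 3 + [{"west_has_k": False}] * 3
weighted = bayes_weights(deals, lambda d: d["west_has_k"])

posterior = (sum(w for d, w in weighted if d["west_has_k"])
             / sum(w for _, w in weighted))
print(f"P(West holds the K | didn't play it) = {posterior:.2f}")  # 1/6

# The final evaluation is then the quoted weighted sum over deals,
# Σ_d w_d·s(m,d), so down-weighted deals simply count for less.
```

So a 50-50 prior drops to 1-in-6 after West fails to play a card he 'should' have played 80% of the time - which is exactly the kind of inference cherdano doubts the BBO GIBs are actually making.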
#4
Posted 2022-March-18, 22:19
If the 'random choice when faced with equal cards' idea were correct, wouldn't that imply that, faced with the same situation, GIB would sometimes choose different cards?
This doesn't seem to be the case.
I've played hands multiple times at a teaching table and it always seems to play the same cards even when faced with apparently equal options.
In any event, are there any testable hypotheses that one could examine based on your thoughts?
Fortuna Fortis Felix
#5
Posted 2022-March-18, 22:28
pilowsky, on 2022-March-18, 22:19, said:
I've played hands multiple times at a teaching table and it always seems to play the same cards even when faced with apparently equal options.
This comes down to what random seed is used. Once the seed is chosen, the random number generator will always produce the same results, so equal bids/plays will lead to the same choices - and it appears that at a teaching table, the same random seed is always used. A quick hack for forcing a different random seed at a teaching table is rotating the hands 90 degrees.
I suspect the seed might be chosen somewhat deterministically, which is why it's easy to end up with equal seeds in a robot tournament. There was a recent post on BridgeWinners about a human tournament (though with many players having robot partners) where, every time the robot's LHO was a human, it played one card, and when the robot's LHO was a robot, it played a different card, despite everything else being equal.
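The seed point is just standard PRNG behavior - nothing GIB-specific. A tiny illustration with Python's standard library:

```python
import random

def simulated_deal_ids(seed, n=13):
    # Stand-in for a robot's deal sampling: n pseudo-random draws.
    rng = random.Random(seed)
    return [rng.randrange(10**9) for _ in range(n)]

# Same seed -> identical samples, so identical bids/plays follow.
assert simulated_deal_ids(42) == simulated_deal_ids(42)
# Different seed -> (almost surely) different samples, hence possibly a
# different choice among cards that are equal double dummy.
assert simulated_deal_ids(42) != simulated_deal_ids(43)
print("same seed reproduces the simulation exactly")
```

So 'random' tie-breaking and 'always plays the same card at a teaching table' are perfectly compatible, as long as the seed is fixed.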
#6
Posted 2022-March-19, 02:57
OK, so some may not have thought this possible, but GIB is considerably buggier than I thought.
Setting the second point about played cards aside for a later date, I decided to test whether an 'impossible' bid by a human influences how GIB plays. To do this, I set up a two-way finesse:
With East dealer, undisturbed, the auction will go 1NT - 7NT, while if I double 1NT, it will get passed out.
(To avoid North pulling my double, I gave North 4233 shape as well, and gave myself the ♥Q at South, as otherwise North sometimes stupidly threw it under the first heart honor. Varying the spots allows this to be tested at a teaching table with the ability to get different results each deal.)
Quote
predeal east SK, HAT9, DAKQ, CJ
predeal west SAQJ, HKJ8, DJ, CAKQ
predeal south HQ
condition shape(east,3433) and shape(west,3433) and shape(south,3343)
This is of course a 100% guess. The intention was to test whether GIB will think I have the heart queen for doubling, because that is somehow 'closer' to the 15+ points it's meant to show.
But things got weird before I could get that far.
When I pass normally as South:
a) I led a low club 30 times.
On all 30 occasions, GIB won with the Ace of spades and led a heart to the Ace.
Huh? Shouldn't it play North for the heart queen half the time? Well, maybe there's a logical explanation - perhaps something about the spade suit makes it always win in dummy, and something about the heart guess makes it want to play it at trick 2. But..
b) I led a low spade 30 times.
On all 30 occasions, GIB won in dummy. On 17 occasions, it led a heart to the Ace. On the other 13 it.. took an immediate finesse at trick 2!
Never trying to drop the singleton queen first. Of course, double dummy will tell it in that case it's safe to lead low, because you'll be able to see the opponents' hands before it's time to make a decision.
c) I led a low diamond 30 times. This is somewhat symmetrical to clubs, so based on case a), it'll always play to the king, right?
Of course not, this is GIB. On 20 occasions it played to the king. On the other 10 it again took an immediate finesse at trick 2.
But wait, there's more. I discovered that on 4 of those 20 times it played to the king, it then tried to drop the queen rather than finessing. So I went back and checked case a) 10 more times - yep, it always plays to the ace, then seems to always cash a spade and diamond.. but then 3 of the 10 times it tried to drop the queen, only taking the finesse on 7 occasions.
I think if I work on this a little more, I'll be able to figure out how many sims GIB is doing. But whatever it is doing, it's not what it should be doing.
I never did quite get to compare what happens when I double, but the behavior of the 'control case' is so extraordinary that I don't think it will even be possible to compare..
#7
Posted 2022-March-19, 13:52
cherdano, on 2022-March-18, 18:18, said:
I've always been 100% convinced that GIB doesn't draw any inferences from what I play.
Well, you're correct. This time I dealt the robots the following:
I gave myself Kx in South, and ducked when they took the spade finesse.
On all 20 occasions they crossed back to dummy, and led towards hand..
.. then on an incredible 15/20 occasions, played the Ace of spades, only taking the marked finesse 5 times. And thus going down on virtually all 3-1 splits where the queen was onside. Wow.
I'm going to need to test these situations with an Advanced robot to see if it does any better (think I have to wait a week for the basic one to expire though.)
#8
Posted 2022-March-19, 15:00
I bumped up the sample size considerably. So far it has only finessed 1/6 of the time.
Given GIB clearly makes no assumption that South wouldn't duck with the King, the drop would only be a very slight favorite based on vacant places. If GIB simulated 100 hands - and it definitely does less than this - it would still be finessing at least 1/4 of the time.
I wonder if GIB is applying a warped version of 'restricted choice' here - thinking that North is less likely to have the King because they didn't drop it earlier, all cards being "equal" double dummy.. if so, that would be a remarkable bug. Will find out in a week's time when I can rent the advanced bot..
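The 'slight favorite + small sample' point can be quantified with a toy binomial model (the 0.52 edge is my made-up number, not a real vacant-places calculation): if each simulated deal independently places the honor so the drop wins with probability just over 1/2, and the robot follows the majority of its N samples, the finesse still gets chosen often even at N = 100:

```python
from math import comb

# P(a majority of n sampled deals favors the finesse), assuming each
# deal independently favors the drop with probability p_drop and the
# robot follows the sample majority (ties split evenly).

def p_finesse_chosen(n, p_drop=0.52):
    total = 0.0
    for k in range(n + 1):                  # k deals favoring the drop
        prob = comb(n, k) * p_drop**k * (1 - p_drop)**(n - k)
        if 2 * k < n:
            total += prob
        elif 2 * k == n:
            total += prob / 2
    return total

for n in (10, 25, 50, 100):
    print(n, round(p_finesse_chosen(n), 3))
# Even with 100 samples the finesse is still chosen roughly a third of
# the time under this model - consistent with "at least 1/4" above.
```

Which is why finessing only 1/6 of the time is hard to explain by sampling noise alone, whatever N actually is.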