1) Generate a bunch of deals roughly consistent with the auction.
2) Calculate the double dummy result for each card you can play.
3) Pick the card that averages the best score.
(On top of this, when advanced GIB is declaring, there is a single-dummy algorithm that kicks in after trick 2 that tries to make a plan to avoid putting off guesses. But this is not relevant to defending, or the cases discussed in this post).
There is of course a lot of complexity in step 1 - how does it find hands that match the auction? What if the auction is impossible, or so rare it can't be simulated? If it allows for some variation - as has been stated by barmar in the past - how does it know if a deal is 'close' to matching?
But steps 2 and 3 are trivial.
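To make "trivial" concrete, here is a minimal sketch of step 3 in Python. It assumes step 2 has already produced one score per candidate card for each simulated deal. The function name and the sample scores are mine, purely for illustration - this is not GIB's code.

```python
from statistics import mean

def pick_card(deal_scores):
    """Average each card's score over all simulated deals and pick the best.

    deal_scores: one dict per simulated deal, mapping a candidate card to the
    double dummy score (from the point of view of the side to play).
    """
    cards = deal_scores[0].keys()
    averages = {card: mean(deal[card] for deal in deal_scores) for card in cards}
    best = max(averages, key=averages.get)
    return best, averages

# Illustrative (made-up) scores for three simulated deals.
deals = [
    {"HA": 100, "HJ": -1430},
    {"HA": 100, "HJ": -1430},
    {"HA": -1430, "HJ": 100},
]
best, averages = pick_card(deals)
print(best, averages)
```

The whole method stands or falls on the per-deal scores being right, which is why the rest of this post matters.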
5 years ago gwnn posted a hand where GIB plays a card at trick 10 that is guaranteed to be strictly worse than any other card, if any nonzero number of hands is simulated:
That blew the usual 'maybe you just got a very very unlucky set of sims' excuse out the window.
It has been bugging me ever since - to be honest, over the last several years it's been very rare for a week to go past where I don't think "gah, I wish I knew what GIB was doing".
While my attempts to get access to the source code have failed, I can finally announce I know why it made (and continues to make) these mistakes.
And that is because, rather shockingly, GIB does not perform step 2 as everyone believes it does.
--
A couple of weeks ago, Lorand Dali posted about his new AI bridge project. Very interesting stuff and worth a read. I was slightly disappointed to find out it was entirely reliant on having a pre-existing robot - learning to bid from a huge sample of hands generated by the GIB robot.
Until I discovered how he generated the hands - not via online GIB hand records as I had expected, but by piping them into the bridge.exe program that freely comes with the Windows downloadable version of BBO.
Wait, there's a free command line version of GIB?
Yes - though of course, it's a version from 2012, so extremely out of date in terms of the bidding. If you think GIB has bidding flaws now, BBO did an amazing job of improving it from where it started while they were still working on it.
How does this help? Because Matt Ginsberg added some debugging flags, documented in an archived version of his website. While the BBO version appears to have been altered somewhat, with some of the flags not working and some workarounds needed, there's still one available which outputs a trace of all of the simulated hands, how GIB scored them, and how that averaged out to its choice of play.
For example, when trying to decide what to lead to trick 1 in gwnn's example hand, it generates 100 hands (the number is set by another flag - I used 100, but BBO will use even fewer) and displays each of them in this format:
deal 76:
        S A Q 7 5 4
        H 9
        D A K 7 2
        C J 7 6
S 9 8 3         S J T
H A J T 7 5     H K 6 4 3 2
D J 4           D Q T 6 5
C K T 5         C Q 4
        S K 6 2
        H Q 8
        D 9 8 3
        C A 9 8 3 2
West to lead; S trumps
mismatch 32.00
CK: 100 CT: 100 C5: 200 DJ: 300 D4: 300 HA: 200 HJ: 200 H7: 200 H5: 200 S9: 200 S3: 200
The first bunch of deals don't include the 'mismatch' line - the last group has increasing mismatch scores, which presumably means it is widening the range of hands it considers acceptable to include in the simulation.
And then at the end, a conclusion that averages out the double dummy results over all of the deals:
S3: -24.70 -> 2.45
DJ: -27.70 -> 2.34
S9: -24.70 -> 2.15
D4: -26.70 -> 2.07
HA: -119.20 -> 0.95
HJ: -270.80 -> -0.83
C5: -275.70 -> -0.99
H5: -289.10 -> -1.10
H7: -289.10 -> -1.10
CT: -280.70 -> -1.45
CK: -339.70 -> -2.39
I play S3
I expect the DJ got a slight boost due to signalling, but so far it's all making sense.
There's just one small catch.
Some of the earlier deals in the set have question marks after some of the double dummy results:
deal 0:
        S A Q T 7 2
        H 4 2
        D Q 9 3
        C 9 7 2
S 9 8 3         S 5
H A J T 7 5     H 9 6 3
D J 4           D K T 8 7 6 2
C K T 5         C A J 6
        S K J 6 4
        H K Q 8
        D A 5
        C Q 8 4 3
West to lead; S trumps
CK: 300? CT: 400? C5: 400? DJ: 400? D4: 400? HA: 300? HJ: 400? H7: 400? H5: 400? S9: 400? S3: 400?
In this case, the first 32 deals have ? after all results, and the remaining 68 have none - though on other occasions, some deals have ? for some candidate cards and not for others.
And most importantly - some of the scores with ? are incorrect.
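If you want to tally these yourself, here is a rough Python sketch for pulling the per-card scores and their question-mark flags out of a result line. The regex is my own guess at the format based purely on the output shown in this post, not on any documentation.

```python
import re

# A card (suit letter + rank), a colon, a signed integer, and an optional "?".
SCORE_RE = re.compile(r"\b([SHDC][2-9TJQKA]):\s*(-?\d+)(\?)?")

def parse_scores(line):
    """Return {card: (score, flagged)} for every 'XX: n' or 'XX: n?' in line."""
    return {
        card: (int(score), bool(flag))
        for card, score, flag in SCORE_RE.findall(line)
    }

print(parse_scores("HA: -1430? HJ: 100?"))
# {'HA': (-1430, True), 'HJ': (100, True)}
print(parse_scores("mismatch 16.00 HA: 100 HJ: -1430"))
# {'HA': (100, False), 'HJ': (-1430, False)}
```

Counting how many deals carry a flag for each card is then a short loop over the trace.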
Look at what happens when we get to the crucial card at trick 10 in gwnn's case.
The first simulated deal:
deal 0:
        S J
        H Q 8
        D ---
        C ---
S ---           S ---
H A J T         H 4
D ---           D 6
C K             C Q
        S ---
        H 9 6
        D ---
        C J
West to play to H2, H3, HK; S trumps
N/S have taken 9 tricks
HA: -1430? HJ: 100?
The question-marked figures say that playing the heart Ace will allow the slam to make - and the J will cause it to go down!
In fact, the first 44 simulated hands all have the same conclusion.
On deal #44 (0-indexed!), it gets it right for the first time:
deal 44:
        S J
        H Q 8
        D ---
        C ---
S ---           S ---
H A J T         H 9
D ---           D 6
C K             C Q
        S ---
        H 6 4
        D ---
        C J
West to play to H2, H3, HK; S trumps
N/S have taken 9 tricks
mismatch 16.00
HA: 100 HJ: -1430
Note that there is nothing special about this deal that separates it from the others - the exact same hand, with East left holding 9-6-Q, appeared several times in the first 44.
On deal 45 it's also correct, but on 8 of the next 19 hands it has the incorrect figures, before getting all the rest correct.
So in the final tally, on 52 of the 100 hands it believes ducking is required to beat the contract - when that isn't true even once. Combining 52*100 and 48*-1430 gives -63440, or -634.40 per deal - which is exactly what it reports for the HJ in its final output:
HJ: -634.40 -> 0.68 HA: -695.60 -> -0.68 I play HJ
Oops.
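For anyone who wants to check the arithmetic, here it is in a few lines of Python, using only the counts and scores quoted above (the helper name is mine):

```python
def average(counts_and_scores):
    """Weighted average of (number_of_deals, score) pairs."""
    total_deals = sum(n for n, _ in counts_and_scores)
    return sum(n * score for n, score in counts_and_scores) / total_deals

# What GIB used: 52 deals with the wrong scores, 48 with the right ones.
hj_seen = average([(52, 100), (48, -1430)])
ha_seen = average([(52, -1430), (48, 100)])
print(round(hj_seen, 2), round(ha_seen, 2))   # -634.4 -695.6

# What it would have had if all 100 deals were scored correctly.
hj_true = average([(100, -1430)])
ha_true = average([(100, 100)])
print(hj_true, ha_true)                       # -1430.0 100.0
```

So the bad scores turn a 1530-point difference in favour of the ace into a 61-point difference in favour of the jack.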
I took a second example, posted by bixby a few months ago, where it throws away its high card on trick 10 in a no-win, rarely-tie, mostly-lose scenario.
Is this because it was unlucky and every hand it simulated resulted in the equals case?
On my run, it found the equals case just 5 times - no question marks:
HK: -690 HT: -690
On 19 occasions, it had a definitive value for the heart T, but thought - with a question mark - that throwing the king would get a *better* score:
HK: -630? HT: -660
On the other 76, it came up with the right values:
HK: -690 HT: -660
In this case, that was enough to weight it to making the correct play (it chooses 8 among equals after the analysis):
HT: -661.50 -> 0.57 HK: -678.60 -> -0.57 I play H8
But given it's capable of including completely wrong double dummy analysis scores in its calculations, it's no longer surprising that with a smaller / different set of hands, the incorrect ones could end up biasing the results enough to play the wrong card.
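Here is a quick Python sketch of how sensitive that is, using the three score patterns from my run. The 5/19/76 split is what I actually saw; the second split is hypothetical, just to show the tipping point.

```python
def averages(n_equal, n_wrong, n_right):
    """Average HK and HT scores for a given mix of the three patterns above."""
    total = n_equal + n_wrong + n_right
    hk = (n_equal * -690 + n_wrong * -630 + n_right * -690) / total
    ht = (n_equal * -690 + n_wrong * -660 + n_right * -660) / total
    return hk, ht

# The mix from my run: matches the -678.60 / -661.50 in the output above.
hk, ht = averages(5, 19, 76)
print(round(hk, 2), round(ht, 2))    # -678.6 -661.5

# A hypothetical unluckier mix: more wrong deals than right ones.
hk_bad, ht_bad = averages(5, 60, 35)
print(round(hk_bad, 2), round(ht_bad, 2))    # -654.0 -661.5
```

Each wrong deal favours the king by 30 and each right deal favours the T by 30, so the king wins the average as soon as the wrong deals outnumber the right ones.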
--
Note that it's quite possible that BBO have improved the play engine of GIB since v21, which is the one tested here, though all reports have been that they haven't touched it other than forcing it to lead an ace against 7NT.
Conclusion: I don't know why GIB's "double dummy" analysis gives correct scores for some cards and incorrect scores for others. Clearly this is a deliberate part of the program, given that it is marking potentially wrong figures with a question mark (they're not all incorrect). I don't mean that it is intentionally making mistakes - I assume it is running some sort of optimization that speeds up the double dummy calculations rather than guaranteeing correct output.
If this is required for some reason, I also don't know why it doesn't at least switch to guaranteed correct output later in the play, when this should be fast.
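To show how cheap an exact answer is this late in the hand, here is a naive brute-force minimax in Python over the deal 44 position from gwnn's hand. This is my own sketch and has nothing to do with GIB's actual solver. It assumes North led the H2 (so East played the H3 and South the HK), that the 9 tricks already played all went to N/S, and it takes the -1430 / 100 scores from the trace.

```python
ORDER = "NESW"            # clockwise seating
RANKS = "23456789TJQKA"

def trick_winner(trick, trump):
    """Seat that wins a completed trick of (seat, card) plays."""
    led = trick[0][1][0]
    def strength(play):
        suit, rank = play[1][0], play[1][1]
        if suit == trump:
            return (2, RANKS.index(rank))
        if suit == led:
            return (1, RANKS.index(rank))
        return (0, 0)      # a discard can never win
    return max(trick, key=strength)[0]

def ns_tricks(hands, trick, seat, trump):
    """Tricks N/S take from here on with best play by both sides."""
    if len(trick) == 4:
        winner = trick_winner(trick, trump)
        return (winner in "NS") + ns_tricks(hands, [], winner, trump)
    if not hands[seat]:
        return 0
    led = trick[0][1][0] if trick else None
    # Must follow suit if possible, otherwise any card is legal.
    legal = [c for c in hands[seat] if c[0] == led] or sorted(hands[seat])
    next_seat = ORDER[(ORDER.index(seat) + 1) % 4]
    outcomes = []
    for card in legal:
        rest = dict(hands)
        rest[seat] = hands[seat] - {card}
        outcomes.append(ns_tricks(rest, trick + [(seat, card)], next_seat, trump))
    return max(outcomes) if seat in "NS" else min(outcomes)

# Deal 44 from the trace, West to play fourth to the trick.
hands = {
    "N": frozenset({"SJ", "HQ", "H8"}),
    "E": frozenset({"H9", "D6", "CQ"}),
    "S": frozenset({"H6", "H4", "CJ"}),
    "W": frozenset({"HA", "HJ", "HT", "CK"}),
}
trick = [("N", "H2"), ("E", "H3"), ("S", "HK")]

def score_for_ew(card):
    rest = dict(hands)
    rest["W"] = hands["W"] - {card}
    extra = ns_tricks(rest, trick + [("W", card)], "N", "S")
    return -1430 if 9 + extra >= 12 else 100   # slam makes, or goes one down

results = {card: score_for_ew(card) for card in ("HA", "HJ", "HT")}
print(results)   # {'HA': 100, 'HJ': -1430, 'HT': -1430}
```

With only a dozen or so cards left there are very few lines of play to search, so even this unoptimized version runs instantly, and it agrees with the unmarked figures on deal 44.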
But at least I know more than I did.