Post by Lorne
Post by KWSchneider
Post by KWSchneider
1) Dummy has exactly QJ.J6.KJ95.KJ752
2) Opener has 12-14 HCP - no constraints were placed on tens.
3) Opener is balanced with no 5+cM
Note that the DD and SD results were run concurrently on the same hands.
Sim1 - DD = 41.7% (8.04 trick avg); SD = 57.0% (8.62)
Sim2 - DD = 40.6% (7.98 trick avg); SD = 57.1% (8.62)
I found it interesting that the SD results were remarkably consistent across the two sims. Since these take 10 hours to run, I intend to repeat the sims 3 more times to yield 5000 results and hopefully report tomorrow night (EST).
And in view of the 57% SD outcome, I feel vindicated in my position that the result in the posting was more bad luck than poor decision making.
On the other hand, I'm concerned that the DD results were not closer to Travis' 38%. I'd be interested in seeing what the differences in simulation constraints were. Perhaps over 10,000 sims mine will average out to 38%, although they are not trending in that direction.
If anyone would like more detail, please let me know. I have a spreadsheet available for each sim, with the 1000 deals, each hand, and the DD/SD result.
Summary of Results (1000 deals each)
Sim1 - DD = 41.7% (8.04 trick avg); SD = 57.0% (8.62)
Sim2 - DD = 40.6% (7.98 trick avg); SD = 57.1% (8.62)
Sim3 - DD = 39.4% (8.00 trick avg); SD = 58.0% (8.66)
Sim4 - DD = 39.2% (8.03 trick avg); SD = 56.9% (8.62)
Sim5 - DD = 37.6% (7.94 trick avg); SD = 56.0% (8.58)
Over 5000 deals
DD = 39.7% (8.00); SD = 57.0% (8.62)
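
As a rough check on the pooled figures, here is a minimal sketch that recomputes the 5000-deal averages from the five rows above and attaches a sampling error to each make rate. It assumes the deals are independent and that a simple binomial standard error applies; the names `SIMS`, `pooled` and `standard_error` are illustrative only and have nothing to do with the simulation software.

```python
import math

# (DD make %, DD avg tricks, SD make %, SD avg tricks), copied from the
# summary above. Each sim is 1000 deals.
SIMS = [
    (41.7, 8.04, 57.0, 8.62),
    (40.6, 7.98, 57.1, 8.62),
    (39.4, 8.00, 58.0, 8.66),
    (39.2, 8.03, 56.9, 8.62),
    (37.6, 7.94, 56.0, 8.58),
]
DEALS_PER_SIM = 1000


def pooled(column):
    """Equal-weight mean of one column (valid because all sims are the same size)."""
    return sum(row[column] for row in SIMS) / len(SIMS)


def standard_error(pct, n):
    """Binomial standard error, in percentage points, of a make rate over n deals."""
    p = pct / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / n)


total_deals = DEALS_PER_SIM * len(SIMS)
dd_pct, sd_pct = pooled(0), pooled(2)
dd_se = standard_error(dd_pct, total_deals)
sd_se = standard_error(sd_pct, total_deals)

print(f"DD = {dd_pct:.1f}% +/- {1.96 * dd_se:.1f} (95%), {pooled(1):.2f} tricks")
print(f"SD = {sd_pct:.1f}% +/- {1.96 * sd_se:.1f} (95%), {pooled(3):.2f} tricks")
```

Under those assumptions the standard error over 5000 deals is about 0.7 percentage points for either rate, so roughly +/-1.4 points at 95%. That covers sampling noise only; it says nothing about differences in simulation constraints.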
Kurt
How confident are you in the SD code? My understanding is that all SD
calcs are done using Monte Carlo sims with DD calcs for each sim. This
can lead to odd results in certain circumstances, e.g. it is well known
that most SD software will play a trump suit of AKQ10xx opposite xx by
cashing the ace, crossing to dummy, and leading to the K, taking an
unnecessary risk of a ruff when crossing to dummy.
My concern in this sim is that real defenders look at partner's lead and
make assumptions, e.g. the lead of a K vs NT suggests KQ10xx most of the
time, while the lead of a low card denies such a holding (plus some other
touching-honour situations) and, for most, promises at least one high
card. But does the software take this into account when picking deals to
use in the sim for SD plays for a defender? I can see a situation where
failure to make the assumptions available from a lead, or from the lack
of a certain lead, will cause the SD result to be inferior to real play
for the defenders.
PS: if you are wondering why SD software misplays AKQ10xx opposite xx, it
is because each individual sim uses DD play to pick the best card(s),
and DD you can pick up all 3-2 breaks + Jxxx onside. Since DD you can
make more contracts if you cross to dummy after the ace (the risk of a
ruff is smaller than the chance of Jxxx onside), the sim will say that is
the best play, but then, having got to dummy, the next sim says low to
the K is better than a finesse. It is all because the DD test for each
Monte Carlo sim will always know when to finesse and when not to after
crossing to dummy.
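
The size of that DD advantage can be estimated by enumerating how the five missing trumps divide. The sketch below is only an illustration under stated assumptions: East sits in front of the long hand (so East is "onside"), no trump is lost only if the jack drops under A-K-Q or East holds exactly Jxxx, and ruff risk, spot cards and bidding inferences are all ignored. The function names are made up for this example.

```python
from itertools import product
from math import comb

SUIT_CARDS = ["J", "x", "x", "x", "x"]   # the five trumps the defenders hold
OTHER_CARDS = 26 - len(SUIT_CARDS)       # defenders' cards outside the suit


def layout_weight(west_len):
    """Probability of one specific West/East assignment of the five trumps."""
    return comb(OTHER_CARDS, 13 - west_len) / comb(26, 13)


def success_rates():
    """Return (top_down, omniscient) chances of losing no trump trick."""
    top_down = omniscient = 0.0
    for owners in product("WE", repeat=len(SUIT_CARDS)):
        west_len = owners.count("W")
        weight = layout_weight(west_len)
        jack_owner = owners[0]            # SUIT_CARDS[0] is the jack
        jack_len = west_len if jack_owner == "W" else len(SUIT_CARDS) - west_len
        drops = jack_len <= 3             # jack falls under A, K, Q
        finessable = jack_owner == "E" and jack_len == 4   # Jxxx onside
        if drops:
            top_down += weight
        if drops or finessable:           # DD always knows which case applies
            omniscient += weight
    return top_down, omniscient


top_down, omniscient = success_rates()
print(f"A-K-Q from the top: {top_down:.1%}")
print(f"DD (sees the cards): {omniscient:.1%}")
print(f"difference (Jxxx onside): {omniscient - top_down:.1%}")
```

On these assumptions the gap is about 11 percentage points, all of it from Jxxx onside. A DD solver collects that on every sample, so crossing to dummy looks right whenever the ruff risk is smaller than this.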
Indeed, the software uses an intelligent Monte Carlo method - it allocates cards based on bidding (both pairs), lead, type of game (IMPS/MP), and of course what's been played, and then falls back on DD for the eventual line decision. For example, based on the contract, the program determines a number of potential lines via Monte Carlo (allocating high cards and distribution based on bidding or lack thereof), and then plays each line 50-250 times DD (depending on 'skill' level). It scores each line (success %).
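
The general shape of that procedure (sample layouts consistent with the bidding, test each candidate line on every sample, keep the best success rate) can be sketched as follows. This is a toy, not the program's actual code: the layout is reduced to the location of one missing king, the 80% bias toward West stands in for a hypothetical inference from the auction, and every name here is invented for illustration.

```python
import random


def sample_layout(rng, p_west_has_king=0.8):
    """Toy sampler: West is assumed to have bid, so gets the king more often."""
    return {"king": "W" if rng.random() < p_west_has_king else "E"}


def play_west_for_king(layout):
    """This candidate line succeeds only when West holds the king."""
    return layout["king"] == "W"


def play_east_for_king(layout):
    """This candidate line succeeds only when East holds the king."""
    return layout["king"] == "E"


LINES = {
    "play_west_for_king": play_west_for_king,
    "play_east_for_king": play_east_for_king,
}


def score_lines(lines, n_samples, rng):
    """Success rate of each line over the same n_samples sampled layouts."""
    layouts = [sample_layout(rng) for _ in range(n_samples)]
    return {
        name: sum(line(layout) for layout in layouts) / n_samples
        for name, line in lines.items()
    }


def best_line(scores):
    """Name of the line with the highest success rate."""
    return max(scores, key=scores.get)


rng = random.Random(0)                 # fixed seed so the run is repeatable
scores = score_lines(LINES, 250, rng)  # 250 = top of the 50-250 range quoted
print(scores, "->", best_line(scores))
```

In the real program each sample would be judged by a DD solver rather than a one-line test, which is exactly where the double-dummy bias discussed in this thread comes in.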
Here is typical output for a specific declarer line in 3N (conditions listed in decreasing order of frequency):
03C 05D 08H 17S - DA DQ -> 9
03C 55D 07H 17S + DA D8 D7 D6 D2 - DQ DT -> 9
11C 55D 44H 33S + HA HK H9 - HT -> 9
11C 66D 33H 33S + HA HK -> 9
This line makes whenever any of the following holds: LHO has 0-3 clubs, 0-5 diamonds, 0-8 hearts and 1-7 spades, with RHO holding the DAQ; OR LHO has 0-3 clubs, exactly 5 diamonds, 0-7 hearts and 1-7 spades including DA8762, with RHO holding the DQT; OR LHO has 3=4=5=1 distribution and the HAK9, with RHO holding the HT; OR, finally, LHO has 3=3=6=1 distribution and the HAK.
In this case, based on the bidding and other conditions, the line was scored at 96%. Ultimately the program compares the lines and goes with the best success percentage.
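
For anyone who wants to work with that output in a script, here is a minimal parsing sketch. It rests on my reading of the format as explained above, which is an assumption rather than a documented spec: each `03C`-style token is LHO's minimum and maximum length in a suit (single digits), cards after `+` are held by LHO, cards after `-` are held by RHO, and the number after `->` is the trick count. `Condition` and `parse_condition` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Condition:
    lho_ranges: dict = field(default_factory=dict)  # suit -> (min, max) for LHO
    lho_cards: list = field(default_factory=list)   # cards LHO must hold
    rho_cards: list = field(default_factory=list)   # cards RHO must hold
    tricks: int = 0                                  # tricks taken (assumed)


def parse_condition(line):
    """Parse one output line such as '03C 05D 08H 17S - DA DQ -> 9'."""
    head, _, tricks = line.partition("->")
    cond = Condition(tricks=int(tricks))
    target = None                      # None while still reading suit ranges
    for token in head.split():
        if token == "+":
            target = cond.lho_cards
        elif token == "-":
            target = cond.rho_cards
        elif target is None:
            # e.g. '03C' -> clubs, between 0 and 3 cards with LHO
            cond.lho_ranges[token[2]] = (int(token[0]), int(token[1]))
        else:
            target.append(token)
    return cond


example = parse_condition("03C 55D 07H 17S + DA D8 D7 D6 D2 - DQ DT -> 9")
print(example)
```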
As I indicated in my post, inference is the area where SD falls 'short'. But that is made up for by 'always' knowing the long game and, frankly, by not making mistakes.
If someone has a large electronic set of 'human played hands' which could be automated to be played SD, it would give us a quantitative basis for comparison. But outside of that, I have full faith in the software and have yet to find a hand where the results were inconsistent with quality human play.
And finally, if you'd like to give me a few hands where you expect the SD results to be suspect, please let me know and I'll contact you offline. Also, we would need to quantify how often (in the course of all hands) this particular set of circumstances arises. Remember, in the scheme of things, I believe we only need to be within +/-2% to have confidence in the method. I doubt that these few anomalies would account for more than a fractional percentage of all hands.
Kurt