BBO Discussion Forums: System performance metrics

System performance metrics

#21 helene_t

  • The Abbess
  • Group: Advanced Members
  • Posts: 17,253
  • Joined: 2004-April-22
  • Gender:Female
  • Location:Copenhagen, Denmark
  • Interests:History, languages

Posted 2025-March-04, 02:44

awm, on 2025-March-03, 16:13, said:

I feel like there's something weird about the way you're evaluating this, such that opening lighter is virtually always better (to the degree that a one point difference in 1NT opening massively moves the needle and EHAA looks like by far the best of the systems).

I think it is OK that EHAA works well on these metrics. It does get a lot of information across in the first bid, and if playing with a partner who always makes insufficient bids and bars me from the rest of the auction, I think I would rather play EHAA than a more reasonable system.

EHAA may be a bad system overall because the 2-openings leave too little space to sort out anything, but the metrics I have looked at so far don't address this (other than the aggressiveness in the scatter plot).

On the other hand, that it made such a big difference to take the balanced 11-counts out of the IMPrecision 1 opening is indeed weird. I think part of it is down to the walrus mentality of my scoring system, but I will try to look a bit deeper into it.
The world would be such a happy place, if only everyone played Acol :) --- TramTicket

#22 awm

  • Group: Advanced Members
  • Posts: 8,450
  • Joined: 2005-February-09
  • Gender:Male
  • Location:Zurich, Switzerland

Posted 2025-March-04, 05:43

Part of the problem is the emphasis on “know you have game.”

If opponents preempt and I have say 14 points, it’s really helpful to me if partner opened because now I know we have game values, whereas if partner passed it’s tougher.

But if I have say 19, I expect a game even opposite a pass. If partner opened it is not so much help to me (actually it might make me think we have slam, so a marginal opening by partner could actually hurt here).

I think you are giving credit for opening the balanced 11 any time partner has 14+, whereas I think it really only helps when partner is in the borderline range of like 14–16. Against that, you are behind when partner has 12-13 because partner could’ve made a better decision if you had a higher minimum.
Adam W. Meyerson
a.k.a. Appeal Without Merit

#23 DavidKok

  • Group: Advanced Members
  • Posts: 2,702
  • Joined: 2020-March-30
  • Gender:Male
  • Location:Netherlands

Posted 2025-March-04, 06:17

I think a better metric is 'uncertainty that we have game', i.e. the entropy of the binary yes/no question, conditional on our hand, partner's first call and the preempt.

#24 helene_t

  • The Abbess
  • Group: Advanced Members
  • Posts: 17,253
  • Joined: 2004-April-22
  • Gender:Female
  • Location:Copenhagen, Denmark
  • Interests:History, languages

Posted 2025-March-05, 13:09

Something I find perplexing, when looking at entropy of the opening bid versus what we can call "information effectiveness", i.e. entropy divided by the maximum entropy that is possible given the system's level of aggressiveness (average opening bid height):
[Image: scatter plot of opening-bid entropy versus information effectiveness]
There is a clear negative correlation. In other words, the more aggressive systems such as EHAA do transmit more information with the opening bid but not as much as they "ought" to do. It doesn't have to be that way - one could in principle design an aggressive system that also transmitted lots of information with the opening bid, but it would have to be something different from the kind of systems I have explored. Maybe something like Todd & Atul's Dejeuner system.
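To make the "information effectiveness" ratio concrete, here is a minimal sketch of how it could be computed. It rests on assumptions that are not confirmed in the post: calls are numbered pass = 0, 1C = 1, 1D = 2 and so on, the ladder of calls is treated as unbounded, and the maximum entropy for a given mean height is taken to be that of a geometric distribution. The frequency table is made up purely for illustration.

```python
import math

def entropy_bits(freqs):
    """Shannon entropy (in bits) of a frequency table over opening calls."""
    total = sum(freqs)
    return -sum((f / total) * math.log2(f / total) for f in freqs if f > 0)

def mean_height(freqs):
    """Average opening height, counting pass = 0, 1C = 1, 1D = 2, ..."""
    total = sum(freqs)
    return sum(k * f for k, f in enumerate(freqs)) / total

def max_entropy_bits(mean):
    """Entropy of the geometric distribution on 0, 1, 2, ... with this mean.

    Assumption: the ladder of calls is unbounded, which matters little
    when the mean is only a few steps above pass.
    """
    if mean <= 0:
        return 0.0
    return (1 + mean) * math.log2(1 + mean) - mean * math.log2(mean)

def effectiveness(freqs):
    """Entropy divided by the maximum entropy available at the same mean height."""
    return entropy_bits(freqs) / max_entropy_bits(mean_height(freqs))

# Hypothetical opening frequencies (pass, 1C, 1D, 1H, 1S, 1NT, 2C), in percent.
made_up_system = [52, 9, 8, 7, 7, 14, 3]
print(round(entropy_bits(made_up_system), 2), round(effectiveness(made_up_system), 2))
```

Under these assumptions the ratio can never exceed 1, and a system only reaches 1 if its opening frequencies fall off geometrically.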

We see that Moscito does strike a good trade-off here, which is maybe not so surprising given that Marston had a bit of the same obsession as I have, namely to design a system that is "optimal" in a very theoretical sense. It is also funny to see Norwegian Standard doing well here.

It is related to another scatterplot, namely entropy versus the probability that the opening bid already takes us beyond the safety level from responder's point of view:
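[Image: scatter plot of opening-bid entropy versus the probability that the opening bid takes us beyond responder's safety level]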
Norwegian Standard and Cottontail Club do well here, both 4-card major systems with strong NT.

This is of course still quite crude. I would like to develop something closer to the Useful Space Principle. For example, it is not so bad that a weak 2 opening often leaves no space below the safety level, because responder can usually just pass. It is a bigger problem that a Polish 2 opening often leaves no space below the safety level.

Also, with respect to Adam's comment about bidding game with 19 points: maybe, instead of defining safety as 100% safety, I could define it as e.g. 95% safety in uncontested auctions and 75% safety after an enemy preempt. This also has the advantage that the 25th percentile is statistically more stable than the minimum, so I would need fewer sims.
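The stability claim can be checked with a small experiment. This is only a sketch: the "strength" values are drawn from a made-up normal distribution (mean 25, standard deviation 3), not from real deals, and the run sizes are arbitrary.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

def one_sim(n_deals):
    """Return (minimum, 25th percentile) of a made-up 'strength' sample."""
    sample = [random.gauss(25.0, 3.0) for _ in range(n_deals)]
    q1 = statistics.quantiles(sample, n=4)[0]  # first quartile
    return min(sample), q1

# Repeat the same small simulation many times and see how much each statistic moves.
runs = [one_sim(200) for _ in range(300)]
spread_min = statistics.stdev([r[0] for r in runs])
spread_q1 = statistics.stdev([r[1] for r in runs])
print(f"spread of minimum: {spread_min:.2f}, spread of 25th percentile: {spread_q1:.2f}")
```

With a continuous, unbounded distribution like this one, the quartile varies far less between runs than the minimum does. If the quantity being measured is bounded and discrete, the minimum may be more stable than this sketch suggests.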

#25 DavidKok

  • Group: Advanced Members
  • Posts: 2,702
  • Joined: 2020-March-30
  • Gender:Male
  • Location:Netherlands

Posted 2025-March-05, 13:22

Maximum entropy up to a given level is attained with a uniform distribution. I think that is why EHAA scores high - the entropy should be strongly correlated with using high opening bids relatively often, even conditional on the mean opening level.

If you were guaranteed to have the auction to yourself, you wouldn't go too far wrong by maximising information density (though there are trade-offs with safety levels and information leakage, so this is by no means a safe assumption!). Traditional thinking has it that natural systems under this condition want to reduce the frequency of each subsequent call by approximately 50% compared to the one before it, e.g. half of all hands pass, a quarter open 1♣, an eighth open 1♦, etc., while relay systems have a theoretical (less information-dense) limit of a factor of 1.618.
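The two rules of thumb can be written out as frequency ladders. This is a rough sketch, assuming "a factor of 1.618" means each call is 1.618 times rarer than the one before it, in the same way that halving means a factor of 2. The function name `ladder_frequencies` is just for illustration.

```python
import math

def ladder_frequencies(ratio, n_calls):
    """Frequencies of the first n_calls calls when each call is `ratio`
    times rarer than the one before it (a geometric distribution)."""
    q = 1.0 / ratio
    return [(1 - q) * q**k for k in range(n_calls)]

calls = ["pass", "1C", "1D", "1H", "1S", "1NT"]
golden = (1 + math.sqrt(5)) / 2  # about 1.618

for name, ratio in [("halving", 2.0), ("factor 1.618", golden)]:
    freqs = ladder_frequencies(ratio, len(calls))
    print(name, [f"{c}: {f:.1%}" for c, f in zip(calls, freqs)])
```

On this reading, halving passes 50% of hands, while the 1.618 ladder passes only about 38% and spreads the rest over more calls.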

Conversely, if I have it in enforceable legal writing that my LHO is about to bid 3 over my opening, regardless of what I do, I maximise the information shared by picking a uniform frequency distribution from pass to 2NT inclusive (and some smidgen assigned to 3 and up).
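As a worked number for this extreme case, here is a small sketch that counts the calls from pass to 2NT inclusive and computes the entropy of a uniform distribution over them. It ignores the "smidgen" assigned to the 3-level and above.

```python
import math

# Calls available from pass up to 2NT inclusive.
strains = ["C", "D", "H", "S", "NT"]
calls = ["pass"] + [f"{level}{s}" for level in (1, 2) for s in strains]

# A uniform distribution gives every call the same frequency.
uniform = [1 / len(calls)] * len(calls)
entropy = -sum(p * math.log2(p) for p in uniform)

print(len(calls), round(entropy, 3))
```

That is 11 calls, so the ceiling is log2(11), about 3.46 bits, for the opening call in this scenario.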

Put differently, if the opponents don't jump the auction, we have more space after cheaper bids, so we want more hands in it (to entangle later).

In practice, not only are the frequency arguments too simple to be of much use for system design, but the lack of knowledge about which type of auction we are about to enter also suggests something between these extremes. I am not convinced that entropy of the opening distribution conditional on the mean opening measures much other than level of aggression. Instead the uncertainty in partner's decisions conditional on our information is probably of more interest.

helene_t, on 2025-March-05, 13:09, said:

Also, with respect to Adam's comment about bidding game with 19 points: maybe, instead of defining safety as 100% safety, I could define it as e.g. 95% safety in uncontested auctions and 75% safety after an enemy preempt. This also has the advantage that the 25th percentile is statistically more stable than the minimum, so I would need fewer sims.

This is where entropy is useful. Responder can look at their hand, at the preempt, at the opening, and reason something like "there is an 85% chance that we have game" (e.g. based on a double dummy simulation, or on a less expensive evaluation metric such as "either we have 25 HCP, or a major suit fit and something that re-evaluates to 25 HCP"). The entropy of that yes/no question is a good quantitative criterion of the difficulty of deciding whether or not to go to game.
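For reference, the entropy of a yes/no question is the standard binary entropy function. A minimal sketch, with the probabilities chosen only as examples:

```python
import math

def binary_entropy(p):
    """Entropy in bits of a yes/no question answered 'yes' with probability p."""
    if p in (0.0, 1.0):
        return 0.0  # no uncertainty left
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

for p in (0.5, 0.85, 0.99):
    print(p, round(binary_entropy(p), 2))
```

A coin-flip decision costs a full bit, an 85% chance of game is about 0.61 bits, and a 99% chance is about 0.08 bits. Averaging this value over responder's possible hands, weighted by how often each occurs, would give a single number per system.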

#26 helene_t

  • The Abbess
  • Group: Advanced Members
  • Posts: 17,253
  • Joined: 2004-April-22
  • Gender:Female
  • Location:Copenhagen, Denmark
  • Interests:History, languages

Posted 2025-March-05, 14:48

DavidKok, on 2025-March-05, 13:22, said:

Conversely, if I have it in enforceable legal writing that my LHO is about to bid 3 over my opening, regardless of what I do, I maximise the information shared by picking a uniform frequency distribution from pass to 2NT inclusive (and some smidgen assigned to 3 and up).

Yes, EHAA is still not quite that extreme but indeed, this very high entropy relates to its good performance when opps preempt.

The relatively poor performance of the more aggressive systems in these two scatterplots is, I thought, partly related to too low frequency of the 1-of-a-suit openings relative to pass. But the numbers don't support this idea. EHAA passes 33% of hands while the geometric distribution corresponding to its aggressiveness would pass 23% of hands. The numbers for Cottontail are 50% versus 36% (Cottontail is the only system in my basket that opens all balanced 11-counts and also has a weak two in diamonds so it passes slightly less than other normal systems). So the ratios are about the same.
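The arithmetic behind "about the same" is quick to check. The sketch below also backs out the mean opening height implied by each geometric benchmark. That step assumes calls are numbered pass = 0, 1C = 1 and so on, and that the ladder is unbounded, which is a guess at how the benchmark was built and not something stated in the post.

```python
# Pass frequencies quoted above: (actual, geometric benchmark).
quoted = {"EHAA": (0.33, 0.23), "Cottontail": (0.50, 0.36)}

summary = {}
for system, (actual, benchmark) in quoted.items():
    ratio = actual / benchmark
    # For a geometric distribution on 0, 1, 2, ..., P(pass) = 1 / (1 + mean),
    # so the benchmark pass rate implies a mean height (assumed numbering).
    implied_mean = 1 / benchmark - 1
    summary[system] = (ratio, implied_mean)
    print(f"{system}: ratio {ratio:.2f}, implied mean height {implied_mean:.2f}")
```

The ratios come out at roughly 1.43 for EHAA and 1.39 for Cottontail.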

#27 DavidKok

  • Group: Advanced Members
  • Posts: 2,702
  • Joined: 2020-March-30
  • Gender:Male
  • Location:Netherlands

Posted 2025-March-05, 15:39

helene_t, on 2025-March-05, 14:48, said:

(Cottontail is the only system in my basket that opens all balanced 11-counts and also has a weak two in diamonds so it passes slightly less than other normal systems)
Some systems with a Kamikaze or Chicken (i.e. variable) notrump also have these properties. Notably Dutch Doubleton or Swedish/Polish club with a 10-13 or 9-12 1NT opening, even if it depends on vulnerability. Ironically these systems regularly don't have room to open a bunch of unbalanced 11-counts, even though they regularly open balanced 10- or even 9-counts. Though I think your basket of systems already contains a lot of options, I just wanted to mention it as a curiosity.
