J. A. Nairn's false claims about being first to calculate split EVs for finite decks

**k_c** · 07-06-2021, 05:40 PM

Originally Posted by ericfarmer

I realize that I may have used misleading language in my phrasing of my question, or rather, by phrasing it as a question at all. That is, those two expected values *are* equal-- I was "asking" to clarify whether the disagreement on this point, that existed back in that 2003 bjmath.com thread, still exists today.

And we can even take this further. For example, instead of conditioning on the case where we split exactly two hands, instead condition on the case where we split exactly *three* hands (so that the pattern of dealing to the splits, using notation that I think MGP started with, is the combination of either NPNN or PNNN). Let random variables X_1, X_2, X_3 be the outcome of the first, second, and third of these hands, respectively. Then E[X_1]=E[X_2]=E[X_3]. Note that this isn't a computational simplification, it's a correct mathematical statement. (Extending this further, in the context of your algorithm description, the "repairing" at the endpoint is essentially really "renaming" of the computed values to reflect an expectation conditioned on a different subset of possible outcomes.(*) This renaming is what seemed to me to be the heart of the bjmath.com discussion. By conditioning as described here, we can skip the repairing, so to speak, and furthermore, generalize easily to arbitrarily large maximum numbers of split hands, using the two Catalan-ish summations discussed back in that bjmath.com thread.)

(*) I should emphasize, as I tried to do back then as well, that you're right that "there is more than one path to the final EV." I'm not trying to argue that anyone's algorithm is incorrect. But I did argue then about the interpretation/understanding of the mathematics-- the "labeling," so to speak-- underlying these algorithms.

E

I can intuitively see that you are right and probably have the simplest approach, but I don't think I have enough of a math background to implement it competently, so I'm left to do the best I can algorithmically. What I have done is avoid the EVn headache by eliminating it completely!

spData[] holds data for 0,1,2 remaining splits
xh[] holds EVx multipliers for 0,1,2,3,4 pair cards removed (need more for more splits)
ph[] holds EVPair_p multipliers for 0,1,2,3,4 pair cards removed (need more for more splits)
p0, p1, p2, p3 are the probabilities of drawing a pair card with 0, 1, 2, 3 pair cards removed

splitEV[remaining splits] = spData[].xh[]*EVx[]+spData[].ph[]*EVPair_p[]
for relevant pair cards removed & remaining splits

I can only compute as a function of probability of successive pair cards removed for up to 3 splits. Otherwise I have to get multipliers algorithmically using this method.

Thank you very much for your input by the way.

```cpp
// SPL1
spData[0].totHands = 2;

spData[0].xh[0] = 2;

// SPL2
spData[1].totHands = 2 + p0 * (2 - p1);

spData[1].xh[0] = 2;
spData[1].xh[1] = 4 * p0;
spData[1].xh[2] = -2 * p0 * p1;

spData[1].ph[1] = -spData[1].xh[1] / 2;
spData[1].ph[2] = -spData[1].xh[2] / 2;

// SPL3
spData[2].totHands = 2 + p0 * (2 + 4 * p1 - 6 * p1 * p2 + 2 * p1 * p2 * p3);

spData[2].xh[0] = 2;
spData[2].xh[1] = spData[1].xh[1];
spData[2].ph[1] = spData[1].ph[1];

spData[2].xh[2] = 8 * p0 * p1;
spData[2].xh[3] = -12 * p0 * p1 * p2;
spData[2].xh[4] = 4 * p0 * p1 * p2 * p3;

spData[2].ph[2] = -spData[2].xh[2] / 2;
spData[2].ph[3] = -spData[2].xh[3] / 2;
spData[2].ph[4] = -spData[2].xh[4] / 2;
```

k_c

**lij45o6** · 07-08-2021, 01:57 AM

MGP, any updates on getting a/any correction(s)?

**MGP** · 07-10-2021, 07:24 AM

Dogman, what do you mean by corrections?

I posted a long time ago I agree with Eric's analysis regarding the EV's being equal once they're dealt out. All I was saying was that there is an alternative, mathematically not-equivalent interpretation of hands that gets the same values. It's not just renaming. EV(N-N) is not the same as EV(N).

About the local minima, the thing is that when we determine a CDZ- strategy, we are taking the optimal decision at every card with the given shoe. So in order for the overall strategy to change the post-split hands need to:

1) Be different than the CDZ- strategy (not frequent but it can happen)
2) Have enough of an advantage AND a high enough probability of occurrence among ALL post-split hands that the effect of removal of the N and P cards that the two things together overcome the EV advantage of the CDZ- strategy. Note that x cards don't have any effect on the EVs.

I am also not positive that Eric and I are talking about the same problem. I am talking about basic strategy which does not deal with depleted shoes. If we are talking about the possibility of having a better strategy with a depleted shoe, that's obvious or counting wouldn't work. But the overall strategy doesn't change if the depleted cards are unknown. That's a mathematical fact so that is why basic strategy is always the best strategy to play when averaged over all depleted shoes starting with the same main shoe and unknown/random card removals.

**lij45o6** · 07-11-2021, 01:41 AM

Originally Posted by MGP

Dogman, what do you mean by corrections?

Whoops! I meant: "MGP, any updates on getting a/any correction(s) to Nairn's paper?"

As in: have you reached out to get the citations that you and others are owed in his paper?

As an aside: do you know anyone who has worked on any algorithm(s) that improve computing dealer probabilities? Eric's algo is fantastic! If I remember correctly, he stated there may be a better way than all currently known methods. That is, we may be able to compute the probabilities faster than any known method does. I only half looked at the problem, and I have very limited math knowledge to tackle it.

**ericfarmer** · 07-11-2021, 03:40 AM

Originally Posted by MGP

I posted a long time ago I agree with Eric's analysis regarding the EV's being equal once they're dealt out. All I was saying was that there is an alternative, mathematically not-equivalent interpretation of hands that gets the same values. It's not just renaming. EV(N-N) is not the same as EV(N).

Can you provide a link/reference to this long-ago post?

Originally Posted by MGP

About the local minima, the thing is that when we determine a CDZ- strategy, we are taking the optimal decision at every card with the given shoe. So in order for the overall strategy to change the post-split hands need to:

1) Be different than the CDZ- strategy (not frequent but it can happen)
2) Have enough of an advantage AND a high enough probability of occurrence among ALL post-split hands that the effect of removal of the N and P cards that the two things together overcome the EV advantage of the CDZ- strategy. Note that x cards don't have any effect on the EVs.

I am also not positive that Eric and I are talking about the same problem. I am talking about basic strategy which does not deal with depleted shoes. If we are talking about the possibility of having a better strategy with a depleted shoe, that's obvious or counting wouldn't work. But the overall strategy doesn't change if the depleted cards are unknown. That's a mathematical fact so that is why basic strategy is always the best strategy to play when averaged over all depleted shoes starting with the same main shoe and unknown/random card removals.

I think you've misunderstood my previous example. Given *any* subset of cards, your CA (and mine, and k_c's, etc.) can compute CDZ- strategy and corresponding overall expected return for a round dealt from that subset. And your algorithm (and mine, and k_c's, etc.) doesn't do anything "special," different, or otherwise magically better or more optimal if the input shoe subset just happens to be of the form (4d, 4d, 4d, 4d, 4d, 4d, 4d, 4d, 4d, 16d), right? My point is that the CDZ- strategy that we all know how to compute, and the corresponding overall expected return, is not (in general) the *best* strategy, because it's not the *largest* overall expected return, among all possible zero-memory strategies.

I think the confusion perhaps stemmed from the unfortunate fact that I provided an example of this where the input shoe *just happened* to not be a full shoe of that special form (4d, 4d, 4d, 4d, 4d, 4d, 4d, 4d, 4d, 16d). I only provided that depleted example shoe because I didn't have to look very hard for it. That is, as explained in the earlier linked thread, I merely stumbled across that example in the course of mostly unrelated analysis. I already had it in my back pocket, so to speak.

At any rate, that example was apparently not getting the point across. So let me try again, with a different example that doesn't suffer the complication of being a depleted shoe:

Consider 1D, S17, DAS, SPL1, no surrender. We all agree on how to compute the zero-memory strategy that we call CDZ-, and we all agree that the corresponding overall expected return from a round, dealt from the top of the single deck, is 0.00153119996 (in fraction of initial wager). We also agree that when presented with a 6-2 vs. dealer 5, this CDZ- strategy dictates that we should hit (all the time, no matter whether we encounter this hand in the initial deal, or after splitting 2s or 6s).

But I claim that there is at least one better zero-memory strategy out there, that yields a better overall expected return: we should double down on 6-2 vs. dealer 5 instead of hitting-- again, *all the time*, per the constraint of specifying a zero-memory strategy-- yielding an improved overall EV of about 0.00153372.

I admit I'm confused about the confusion, so to speak-- this is really just a concrete data point (a second such data point at that) demonstrating why we qualify our "label" on this algorithm with that minus sign, which I thought we all agreed upon and understood way back when?

E

**lij45o6** · 07-11-2021, 04:00 AM

Originally Posted by ericfarmer

Consider 1D, S17, DAS, SPL1, no surrender. We all agree on how to compute the zero-memory strategy that we call CDZ-, and we all agree that the corresponding overall expected return from a round, dealt from the top of the single deck, is 0.00153119996 (in fraction of initial wager). We also agree that when presented with a 6-2 vs. dealer 5, this CDZ- strategy dictates that we should hit (all the time, no matter whether we encounter this hand in the initial deal, or after splitting 2s or 6s).

But I claim that there is at least one better zero-memory strategy out there, that yields a better overall expected return: we should double down on 6-2 vs. dealer 5 instead of hitting-- again, *all the time*, per the constraint of specifying a zero-memory strategy-- yielding an improved overall EV of about 0.00153372.

E

This is the CDZ strategy you alluded to before, correct? If I am getting you correctly, we are computing a *new* post-split strategy that is dependent on the depleted shoe state post-split. That is, we are computing the CD perfect play on the hand [2, 6] vs 5 being aware of the extra 6 missing, making your strategy different from the pre-split strategy we encountered earlier.

**ericfarmer** · 07-11-2021, 06:22 AM

Originally Posted by dogman_1234

This is the CDZ strategy you alluded to before, correct? If I am getting you correctly, we are computing a *new* post-split strategy that is dependent on the depleted shoe state post-split. That is, we are computing the CD perfect play on the hand [2, 6] vs 5 being aware of the extra 6 missing, making your strategy different from the pre-split strategy we encountered earlier.

No! This is a great question, that I think highlights the complexity of even specifying, let alone evaluating, strategy involving pair splits.

Let's clarify some definitions. First, by "zero-memory," I mean (although others may mean something different, for the purpose of this discussion I want to clarify what *I* mean) a strategy that specifies whether to stand/hit/double/split/surrender, as a function of the current player hand and dealer up card... and *only* the current hand and up card. That is, a zero-memory strategy only "knows" the cards in the current hand, not whether that hand was part of the initial deal, or is one of multiple split hands, etc.

Second, "CDZ-" is just one common conventionally-understood notation referring to a *particular* zero-memory strategy, that is determined as follows: (1) compute the composition-dependent strategy that optimizes EV... temporarily ignoring/prohibiting the possibility of pair splitting. (Note that by prohibiting pair splitting, we can all agree on an efficient means of computing this strategy, and furthermore, we can all truly claim that the corresponding EV is actually optimal among all possible composition-dependent strategies.) Then (2) compute the EVs for splitting all possible pairs, assuming that we (a) split and resplit at every opportunity, and (b) use the strategy already computed in step (1) for any other hands encountered "post-split." And finally (3) compute overall EV for the round, where for each initially dealt hand we choose the playing strategy that maximizes the EV computed in step (1) or (2) as appropriate.

Coming back to this 1D example, in the CDZ- strategy we hit 6-2 vs. dealer 5. That CDZ- strategy also specifies player actions for all other possible hands and dealer up cards. Now, let's call "CDZ*" (note the asterisk) the strategy that dictates, "Follow CDZ- in all situations... except that when you encounter 6-2 vs. dealer 5, always double down instead of always hitting." (Note that CDZ* is not any sort of standard notation, I just made it up for the purpose of this discussion.)

We can efficiently compute the overall expected return from playing this modified CDZ* strategy. And we happen to find that this expected return is greater than the expected return from CDZ-. That is, abusing notation somewhat, the above example demonstrates that, for this shoe and these rules, E[CDZ-]<E[CDZ*] (note that inequality is strict).

However, finally getting to your question, there is yet another third strategy of interest, called CDZ (note there are no minus signs, asterisks, or other qualifiers), that is common conventionally-understood notation referring to the playing strategy that yields the maximum possible overall expected return (for the given shoe subset, in this example a full single deck)-- that is, maximum EV subject to the constraint that it is zero-memory.

What is this CDZ strategy? I don't know. In this specific single-deck S17 example, maybe it's CDZ*. But maybe not-- how do we know that we can't further improve overall EV by making *two* changes to CDZ-, or three, or four, etc., instead of just the *single* modification to strategy with 6-2 vs. 5? For example, I searched for examples like this one by evaluating CDP1 strategy (details are for another post, but essentially relaxing the zero-memory constraint to allow a different strategy pre vs. post-split), and applying individual differences to CDZ- (post- *and* pre-split). But instead of just trying *singleton* subsets of this collection of candidate modifications, it's possible that other subsets of modifications might "collaborate" to improve overall EV further still.

In other words, in this case, we know E[CDZ-]<E[CDZ*] (from explicit calculation), and we know E[CDZ*]<=E[CDZ] (by definition, that is, E[S]<=E[CDZ] for *all* possible zero-memory strategies S), and so by transitivity we know that E[CDZ-]<E[CDZ] (that is, all of the available CAs that we know about are suboptimal, hence the minus sign). But we *don't* know whether E[CDZ*]=E[CDZ].
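For reference, the same chain written out with the two figures quoted earlier in the thread for 1D, S17, DAS, SPL1. The gap is just the difference of those two quoted values; E[CDZ] itself remains unknown.

```latex
\begin{align*}
E[\text{CDZ-}] &= 0.00153119996 \\
E[\text{CDZ*}] &\approx 0.00153372 \\
E[\text{CDZ*}] - E[\text{CDZ-}] &\approx 2.52 \times 10^{-6} \\
E[\text{CDZ-}] < E[\text{CDZ*}] &\le E[\text{CDZ}]
\end{align*}
```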

E

**DSchles** · 07-11-2021, 05:18 PM

Is 6,2 vs. 6 in SD H17 also better to double for CDZ*?

Don

**k_c** · 07-11-2021, 07:09 PM

Originally Posted by ericfarmer

No! This is a great question, that I think highlights the complexity of even specifying, let alone evaluating, strategy involving pair splits.

Let's clarify some definitions. First, by "zero-memory," I mean (although others may mean something different, for the purpose of this discussion I want to clarify what *I* mean) a strategy that specifies whether to stand/hit/double/split/surrender, as a function of the current player hand and dealer up card... and *only* the current hand and up card. That is, a zero-memory strategy only "knows" the cards in the current hand, not whether that hand was part of the initial deal, or is one of multiple split hands, etc.

Second, "CDZ-" is just one common conventionally-understood notation referring to a *particular* zero-memory strategy, that is determined as follows: (1) compute the composition-dependent strategy that optimizes EV... temporarily ignoring/prohibiting the possibility of pair splitting. (Note that by prohibiting pair splitting, we can all agree on an efficient means of computing this strategy, and furthermore, we can all truly claim that the corresponding EV is actually optimal among all possible composition-dependent strategies.) Then (2) compute the EVs for splitting all possible pairs, assuming that we (a) split and resplit at every opportunity, and (b) use the strategy already computed in step (1) for any other hands encountered "post-split." And finally (3) compute overall EV for the round, where for each initially dealt hand we choose the playing strategy that maximizes the EV computed in step (1) or (2) as appropriate.

Coming back to this 1D example, in the CDZ- strategy we hit 6-2 vs. dealer 5. That CDZ- strategy also specifies player actions for all other possible hands and dealer up cards. Now, let's call "CDZ*" (note the asterisk) the strategy that dictates, "Follow CDZ- in all situations... except that when you encounter 6-2 vs. dealer 5, always double down instead of always hitting." (Note that CDZ* is not any sort of standard notation, I just made it up for the purpose of this discussion.)

We can efficiently compute the overall expected return from playing this modified CDZ* strategy. And we happen to find that this expected return is greater than the expected return from CDZ-. That is, abusing notation somewhat, the above example demonstrates that, for this shoe and these rules, E[CDZ-]<E[CDZ*] (note that inequality is strict).

However, finally getting to your question, there is yet another third strategy of interest, called CDZ (note there are no minus signs, asterisks, or other qualifiers), that is common conventionally-understood notation referring to the playing strategy that yields the maximum possible overall expected return (for the given shoe subset, in this example a full single deck)-- that is, maximum EV subject to the constraint that it is zero-memory.

What is this CDZ strategy? I don't know. In this specific single-deck S17 example, maybe it's CDZ*. But maybe not-- how do we know that we can't further improve overall EV by making *two* changes to CDZ-, or three, or four, etc., instead of just the *single* modification to strategy with 6-2 vs. 5? For example, I searched for examples like this one by evaluating CDP1 strategy (details are for another post, but essentially relaxing the zero-memory constraint to allow a different strategy pre vs. post-split), and applying individual differences to CDZ- (post- *and* pre-split). But instead of just trying *singleton* subsets of this collection of candidate modifications, it's possible that other subsets of modifications might "collaborate" to improve overall EV further still.

In other words, in this case, we know E[CDZ-]<E[CDZ*] (from explicit calculation), and we know E[CDZ*]<=E[CDZ] (by definition, that is, E[S]<=E[CDZ] for *all* possible zero-memory strategies S), and so by transitivity we know that E[CDZ-]<E[CDZ] (that is, all of the available CAs that we know about are suboptimal, hence the minus sign). But we *don't* know whether E[CDZ*]=E[CDZ].

E

CDZ- forces pre-split strategy to post split hands. I only use CDZ- as a basic strategy option (i.e. player is forced to define a basic strategy for a cd hand vs. up card without considering any post split removals. I don't consider any other removals either, since I am just using this to define a basic full shoe strategy.)

The way I get a strategy more optimal than this is to adopt the optimal post split strategy of the first split hand for subsequent split hands. So for 2-6 versus 5 the full shoe strategy is to hit if the hand is not a split hand and double if the hand is a result of splitting either a 2 or a 6 (if DAS.) I believe you have named this strategy either CDP or CDP1 and I know your CA can adopt it. The full shoe single deck S17 DAS SPL3 overall EVs are ~+.1819% for CDZ- and ~+.1831% for using optimal strategy of first split hand. I think that a lot of the difference is due to 2-6 versus 5 post split. It doesn't look like there's a lot of room to improve CDZ- that much by forcing a strategy that is contrary to the original pre-split strategy.

Originally Posted by DSchles

Is 6,2 vs. 6 in SD H17 also better to double for CDZ*?

If H17, full single deck,
2-6 versus 6 dealt from top of deck - cd strategy is hit
If 2-6 versus 6 is a result of splitting 6-6, cd strategy is to hit using optimal strategy of first split hand.
If 2-6 versus 6 is a result of splitting 2-2, cd strategy is to double using optimal strategy of first split hand (if DAS.)

I'll leave it to others to determine which strategy is best if used for all occasions, pre and post split.

k_c

**DSchles** · 07-12-2021, 08:55 AM

Originally Posted by k_c

If H17, full single deck,
2-6 versus 6 dealt from top of deck - cd strategy is hit
If 2-6 versus 6 is a result of splitting 6-6, cd strategy is to hit using optimal strategy of first split hand.
If 2-6 versus 6 is a result of splitting 2-2, cd strategy is to double using optimal strategy of first split hand (if DAS.)

I'll leave it to others to determine which strategy is best if used for all occasions, pre and post split.

Kind of thought this would be a tight one! And, frankly, guys, I don't think there are going to be any other examples.

The closest BS play in all of BJ is to double A,2 vs. 5 in eight-deck. But the logic we're applying doesn't permit any post-split changes for that play.

Don

**ericfarmer** · 07-12-2021, 03:22 PM

Originally Posted by k_c

If H17, full single deck,
2-6 versus 6 dealt from top of deck - cd strategy is hit
If 2-6 versus 6 is a result of splitting 6-6, cd strategy is to hit using optimal strategy of first split hand.
If 2-6 versus 6 is a result of splitting 2-2, cd strategy is to double using optimal strategy of first split hand (if DAS.)

I'll leave it to others to determine which strategy is best if used for all occasions, pre and post split.

k_c

To respond to both this and Don's question, for 1D H17 DAS SPL1, we can't improve CDZ- by doubling down on 2-6 vs. dealer 6. The overall expected return from CDZ- is about -0.000374835, while the return from this single strategy modification is worse, about -0.000380571.

Note that in making this hypothetical strategy change, we have to *always* double down, no matter whether we encounter 2-6 in the initial deal, after splitting 2s, or after splitting 6s. As k_c points out, there are other options, like CDP1 (that k_c and my CAs support), as well as CDP (which mine does as well as MGP's) or CDPN (which only MGP's does). But the contrast in those other cases is that they aren't zero-memory, at least per my earlier definition: they allow strategy to vary depending on knowledge of whether we're in a post-split hand. (Indeed, I argue that CDP as well as CDPN are really of only academic interest, since it isn't actually possible for a player to *realize* those playing strategies at the table. CDP1, at least, would be merely more complex to memorize.)

But even focusing on trying to improve CDZ-, note that we need not restrict attention to modifying strategy just for two-card hands. There are roughly one *thousand* post-split strategy modifications in CDP1, most of which involve multi-card (>2) hands. It's possible-- but computationally unpleasant (to say the least) to either verify, or refute, that not only *one* of those strategy modifications might alone be an improvement, but that some *subset* of them might. (<conjecture>Indeed, intuitively I would expect to *need* multiple of those multi-card hand strategies to be modified to have a chance at yielding improvement, since we need to "touch" multiple hands to affect not just EV "directly," but also indirectly by affecting the *weighting*, i.e. the probabilities of encountering those multi-card hand situations, and those downstream as well.</conjecture>)

E

**ericfarmer** · 07-12-2021, 03:28 PM

Forgot to mention-- if others are interested in experimenting with this, below is an example of using my CA to (1) compute CDZ- strategy and expected return, then (2) compute expected return for the modified strategy to force doubling down on 2-6 vs. dealer 5 (in the original motivating example of 1D S17):

Code:

```cpp
#include "blackjack.h"
#include <iostream>

// Wraps an existing strategy and overrides a single decision.
class Strategy : public BJStrategy
{
public:
    Strategy(BJStrategy *strategy) : BJStrategy(), strategy(strategy) {}
    virtual ~Strategy() {}

    virtual int getOption(const BJHand & hand, int upCard, bool doubleDown,
        bool split, bool surrender)
    {
        // Always double down on the two-card hand 2-6 vs. dealer 5...
        if (upCard == 5 && hand.getCards() == 2 &&
            hand.getCards(2) == 1 && hand.getCards(6) == 1)
        {
            return BJ_DOUBLE_DOWN;
        }
        // ...otherwise defer to the wrapped strategy.
        return strategy->getOption(hand, upCard, doubleDown, split, surrender);
    }
    BJStrategy *strategy;
};

int main()
{
    BJShoe shoe(1); // single deck
    BJRules rules(false, true, true, true, false, true, false, false, false);
    BJStrategy cdz;
    BJProgress progress;
    // (1) CDZ- strategy and its expected return.
    BJPlayer *cdz_player = new BJPlayer(shoe, rules, cdz, progress);
    // (2) The same strategy with the forced double on 2-6 vs. 5.
    Strategy test(cdz_player);
    BJPlayer *test_player = new BJPlayer(shoe, rules, test, progress);
    std::cout << "E[CDZ-] = " << cdz_player->getValue() << std::endl;
    std::cout << "E[CDZ*] = " << test_player->getValue() << std::endl;
    delete cdz_player;
    delete test_player;
}
```

**iCountNTrack** · 07-12-2021, 04:15 PM

Originally Posted by DSchles

Kind of thought this would be a tight one! And, frankly, guys, I don't think there are going to be any other examples.

The closest BS play in all of BJ is to double A,2 vs. 5 in eight-deck. But the logic we're applying doesn't permit any post-split changes for that play.

Don

5,A vs 5 is much tighter (8D, H17): hit is 0.04515323499 ± 0.9734323482, double is 0.05391245176 ± 1.948456553. 5,A is not really practical for this problem, though. The next tight one is 4,A vs 5: hit is 0.06931192942 ± 0.9699537217, double is 0.05780555482 ± 1.948401338.