Recently, I have taken an interest in building a blackjack simulator for personal use and analysis. The long-term goal is to get it to compute accurate indices for both a single-parameter system and any accompanying side-count parameters (multi-parametric analysis). For a while, it seemed like it would be an easy endeavour: wrong!

So, right now, I am trying to figure out how to approach a simple hit/stand algorithm. Yeah, something as basic as generating a TD strategy chart for hitting and standing is not working for me right now. The way I was approaching the hit/stand table generation was to use a vacillating procedure. Basically, the program starts by assuming we will stand on all player totals for every dealer up card. The player stands, and the dealer draws. If the overall expectation for that round is positive, the computer assumes we will stand again the next time that specific player total and dealer up card combo appears. However, if the overall expectation is negative, the computer assumes that we must hit, and switches the preferred action from 'S' to 'H', so that the next time we get that combo, we will hit.

*Reference: Assumes single deck, stand on all 17's. Reshuffle after every round.*
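To make the "player stands, dealer draws" step concrete, here is a minimal sketch in Python under the rules stated above (single deck, dealer stands on all 17s, fresh shuffle every round). The names `hand_total`, `stand_result` and `stand_ev` are made up for illustration, and naturals are ignored to keep it short. Note that one round only ever returns +1, 0 or -1; the EV is the average over many rounds.

```python
import random

# Single deck as card values: ace = 1, every ten-valued card = 10.
DECK = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10] * 4


def hand_total(cards):
    """Best total, counting one ace as 11 when that does not bust."""
    total = sum(cards)
    if 1 in cards and total + 10 <= 21:
        return total + 10
    return total


def stand_result(player_cards, upcard, rng):
    """Play one round where the player stands; return +1, 0 or -1."""
    player = hand_total(player_cards)
    if player > 21:
        return -1.0
    deck = DECK.copy()
    for card in player_cards + [upcard]:
        deck.remove(card)            # take the known cards out of the fresh deck
    rng.shuffle(deck)
    dealer = [upcard, deck.pop()]    # up card plus hole card
    while hand_total(dealer) < 17:   # dealer stands on all 17s, soft included
        dealer.append(deck.pop())
    house = hand_total(dealer)
    if house > 21 or player > house:
        return 1.0
    if player == house:
        return 0.0
    return -1.0


def stand_ev(player_cards, upcard, trials=20_000, seed=0):
    """Average result of standing over many independent rounds."""
    rng = random.Random(seed)
    total = sum(stand_result(player_cards, upcard, rng) for _ in range(trials))
    return total / trials
```

For example, `stand_ev([5, 6], 6)` estimates the EV of standing on 5 6 vs a dealer 6. That hand can only win when the dealer busts, so the estimate should come out somewhat negative.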

When the computer hits, one of three things will happen: 1.) we draw again; 2.) we stand; 3.) we bust. Once the player finally stands or busts, the overall expectation of that round is carried back to every earlier draw decision in that round. To better illustrate what I was doing:

Round 0:

Player

5 6

Dealer

6 T

*dealer draws

Player

5 6

Dealer

6 T 5

Since the dealer beats the player, standing on 11 vs 6 has a current EV of -1.0.

Round 1:

Player

5 6

Dealer

6 T

*player draws

Player

5 6 5

Dealer

6 T

The player will stand and the dealer will draw a card

Player

5 6 5

Dealer

6 T A

Since the dealer beats the player, standing on 16 vs 6 has a current EV of -1.0.

Since hitting 11 vs 6 produces a loser because of the subsequent hand, hitting 11 vs 6 also has a current EV of -1.0.

Now, over time, the EVs for both standing and hitting on all player/dealer match-ups should converge to their global EVs and produce the correct strategy chart when we compare the standing and hitting EVs. Correct? If so, the above method proved to be... er, incorrect? What was happening was that the chart would indicate that standing on player hard 5 and 6 vs dealer 4, 5, 6 is optimal, that standing on hard 12 vs dealer 2, 3 is optimal, and that standing on all pairs of twos against every dealer up card is optimal. This is absurdly wrong and cannot be accurate when compared to currently computed TD strategies.
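For reference, the "compare standing and hitting EVs" bookkeeping could look something like the sketch below. It is only an illustration, not the code from my simulator: all the names (`new_tallies`, `credit_round`, `mean_ev`, `best_action`) are hypothetical. Each round's final result is credited to every (player total, dealer up card, action) decision made in that round, and the chart entry comes from comparing the two running averages.

```python
from collections import defaultdict


def new_tallies():
    # (player_total, dealer_upcard, action) -> [sum of results, rounds seen]
    return defaultdict(lambda: [0.0, 0])


def credit_round(tallies, decisions, result):
    """Credit the final result of a round to every decision made in it."""
    for key in decisions:
        tallies[key][0] += result
        tallies[key][1] += 1


def mean_ev(tallies, total, upcard, action):
    """Average result so far, or None if that action was never sampled."""
    result_sum, rounds = tallies.get((total, upcard, action), (0.0, 0))
    return result_sum / rounds if rounds else None


def best_action(tallies, total, upcard):
    """Action with the larger sampled mean; None until both were sampled."""
    stand = mean_ev(tallies, total, upcard, "S")
    hit = mean_ev(tallies, total, upcard, "H")
    if stand is None or hit is None:
        return None
    return "H" if hit > stand else "S"


tallies = new_tallies()
credit_round(tallies, [(11, 6, "S")], -1.0)                # Round 0 above
credit_round(tallies, [(11, 6, "H"), (16, 6, "S")], -1.0)  # Round 1 above
```

After those two rounds, both actions on 11 vs 6 sit at -1.0, which says nothing yet. The comparison only starts to mean something once each action has a large number of samples behind it.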

----------------------------------------

So now, I am at a point where I am trying to figure out *how* to accurately compute the correct EVs for hitting and standing. Using the algorithm outlined in Griffin's ToBJ was an idea, but that is for combinatorial analysis, not simulation. I could try to compute the EVs for standing on all player hands, and then compute hitting as a weighted sum: the probability of each draw card times the expectation of the optimal action for the resulting player total. Similar to a CA, but with a Monte Carlo method.
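A rough sketch of that weighted-sum idea, with heavy assumptions: it is not Griffin's algorithm, it only handles hard totals (aces count as 1), and `stand_ev` is a hypothetical callable that returns an already-estimated standing EV for a given total against one fixed dealer up card (for example, from a simulation). The function names `draw_probs` and `best_ev` are made up as well.

```python
def draw_probs(remaining):
    """Probability of each card value, given remaining counts {value: count}."""
    n = sum(remaining.values())
    if n == 0:
        return {}
    return {value: count / n for value, count in remaining.items() if count > 0}


def best_ev(total, remaining, stand_ev):
    """Return (ev, action) for a hard total, taking the better of S and H.

    Hit EV = sum over draw cards of P(card) * EV of the best action
    on the resulting total, with the drawn card removed from the deck.
    """
    if total > 21:
        return -1.0, "bust"
    stand = stand_ev(total)
    probs = draw_probs(remaining)
    if not probs:                      # no cards left, so hitting is impossible
        return stand, "S"
    hit = 0.0
    for card, p in probs.items():
        after_draw = dict(remaining)
        after_draw[card] -= 1          # the drawn card leaves the deck
        ev, _ = best_ev(total + card, after_draw, stand_ev)
        hit += p * ev
    return (hit, "H") if hit > stand else (stand, "S")
```

As a toy check, with a deck of nothing but tens and a made-up `stand_ev` that pays +1 on 17 or more and -1 otherwise, `best_ev(11, {10: 4}, stand_ev)` picks 'H': the only possible draw makes 21, and standing there beats hitting again.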

Does anyone have an idea of what the best approach is for getting past this hurdle? I am feeling rather limited in possible solutions.