Thread: MJ: Standard Error in Simulations - Are You Being Fooled?

  1. #1
    MJ
    Guest

    MJ: Standard Error in Simulations - Are You Being Fooled?

    If you run two simulations that are EXACTLY identical for a given counting system, you will notice that the results still differ slightly from one run to the next. How much they differ depends upon sample size (# rounds simulated). Standard error gives the user an idea of the magnitude of error associated with the simulation. Put another way, standard error is a means of quantifying the level of precision associated with a simulation.
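    As a rough illustration of the paragraph above, here is a minimal sketch that runs the same toy "simulation" twice with different random seeds and reports the standard error of each run. The game (win or lose one unit with a hypothetical 50.5% win probability), the function name `simulate_mean`, and the round count are all assumptions made for the example; this is not a blackjack simulator.

```python
import random
import statistics

def simulate_mean(rounds, seed):
    """Toy stand-in for a sim: each round wins or loses exactly one unit."""
    rng = random.Random(seed)
    # Hypothetical game with a 1% player edge (win probability 0.505).
    results = [1.0 if rng.random() < 0.505 else -1.0 for _ in range(rounds)]
    mean = statistics.fmean(results)
    # Standard error of the mean = sample SD / sqrt(number of rounds).
    se = statistics.stdev(results) / rounds ** 0.5
    return mean, se

# Identical settings, different seeds: the means will generally not match.
run_a = simulate_mean(100_000, seed=1)
run_b = simulate_mean(100_000, seed=2)
print("run A: mean %.5f, SE %.5f" % run_a)
print("run B: mean %.5f, SE %.5f" % run_b)
```

    The SE shrinks with the square root of the number of rounds, so four times as many rounds only halves it.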

    The concept of standard error is important for several reasons. When comparing two different count systems with similar performance, then it might be possible for one system to outperform another purely by chance (luck)! Let us look at a simple example.

    Suppose we simulate 2 Billion Rounds on a simulator.
    Spread, Rules, etc. are all identical. The only difference is the counting systems themselves.

    Results:
    Count system A: SCORE $51
    Count system B: SCORE $50

    Can we now conclude that system A is the stronger system? Well, maybe, maybe NOT!! You have to remember there will always be standard error associated with any simulation(s). The question we must ask ourselves is "What is the standard error associated with the SCORES for the aforementioned results"?

    Suppose the theoretical mean SCORE for system A is $50 with a standard error of $1. Thus, in the simulation, the SCORE for system A worked out to be 1 standard error to the right of the mean (51-50=1).

    Suppose the theoretical mean SCORE for system B is $51 with a standard error of $1. Hence, in the simulation, the SCORE for system B was 1 standard error to the left of the mean (50-51=-1).

    The consequence of all this is that the simulations yielded BOGUS results due to standard error! Judging by the theoretical mean SCORES, system B actually outperforms system A by $1!

    How often can inaccurate results such as these occur? Well, the probability of a simulation landing 1 SE or more to the right of the mean is 15.87%, and the same holds for 1 SE or more to the left. Assuming the two simulations are independent, 15.87% x 15.87% = 2.52%. In other words, for every 100 pairs of simulations we run, roughly 2.5 of them will produce SCORES at least as misleading as the results given above!
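    The arithmetic above can be checked with the normal tail probability. The sketch below assumes the simulated SCOREs are normally distributed and the two sims are independent; the helper name `upper_tail` is made up for the example. It also computes a related figure not stated in the post: the chance that the observed ordering is reversed at all when B truly leads by $1 and each sim has an SE of $1, which works out to roughly 24%.

```python
import math

def upper_tail(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# One sim lands at least 1 SE on the "wrong" side of its true mean.
p_one = upper_tail(1.0)

# Both sims do so at once (assumes the two sims are independent).
p_both = p_one ** 2

# Any reversal of the ordering: the difference of two independent results
# has SE sqrt(1^2 + 1^2), and the true gap is $1, so z = 1 / sqrt(2).
p_reversed = upper_tail(1.0 / math.sqrt(2))

print(f"one sim off by >= 1 SE: {p_one:.4f}")
print(f"both off by >= 1 SE:    {p_both:.4f}")
print(f"ordering reversed:      {p_reversed:.4f}")
```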

    Standard error can also give us CONFIDENCE in our results. Suppose system A and system B each has a standard error of only $0.10 for each simulation. Now, even if system A's result were 3 SEs to the left of a $51 mean, that gives a SCORE of $50.70. Similarly, if system B's result were 3 SEs to the right of a $50 mean, the SCORE would be $50.30. There is still such a wide gap in the results that we can be virtually assured that the results observed are NOT due to chance.
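    One way to put both scenarios on the same footing is to divide the observed gap by the standard error of the difference. This is a sketch under the assumption that the two sims are independent; the function name `difference_z` is hypothetical.

```python
import math

def difference_z(score_a, se_a, score_b, se_b):
    """How many standard errors separate two independent sim results."""
    # SEs of independent estimates combine in quadrature.
    se_diff = math.hypot(se_a, se_b)
    return (score_a - score_b) / se_diff

# $1 gap with a $1 SE on each sim: well under one SE of the difference.
z_wide = difference_z(51.0, 1.0, 50.0, 1.0)

# Same $1 gap with a $0.10 SE on each sim: about seven SEs apart.
z_tight = difference_z(51.0, 0.10, 50.0, 0.10)

print(f"SE = $1.00: z = {z_wide:.2f}")
print(f"SE = $0.10: z = {z_tight:.2f}")
```

    A gap of well under one standard error could easily be luck; a gap of several standard errors almost certainly is not.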

    In closing, I would urge all simulator writers to include the standard error for various performance statistics including SCORE, WR, and EV. Omitting this valuable information is effectively rendering the user blind, leaving him with no idea of the magnitude of error associated with the simulation. Simply saying, "just simulate a large # of rounds and hope for the best" is grossly inadequate. While some may argue my aforementioned examples are unrealistic, they were just intended to MAKE A POINT, and there are many variations on the theme.

    MJ

  2. #2
    Double21
    Guest

    Double21: Re: Standard Error in Simulations - Are You Being Fooled?

    This looks to me like you are attempting to measure with a micrometer what was cut off with an axe.

  3. #3
    MJ
    Guest

    MJ: Re: A better analogy

    > This looks to me like you are attempting to measure
    > with a micrometer what was cut off with an axe.

    Suppose I wanted to measure the length of my driveway to the nearest meter. I use a measuring stick and determine it to be around 15 meters. Assuming I round to the nearest meter, I realize the actual length of the driveway is in the range of 14.5 to 15.49 meters. In short, the measurement cannot be flawed by more than +/- 1/2 meter.

    I simply want some idea of the level of precision associated with a simulation so I can better understand the range of possibilities for the performance stats.

    Make sense?

    MJ

  4. #4
    Sun Runner
    Guest

    Sun Runner: Excellent analogy !

    > This looks to me like you are attempting to measure
    > with a micrometer what was cut off with an axe.

    I basically asked the same thing around here several years ago. My question was more along the lines of what was the point of extending some calcs out to say four or five decimal places!? The answer given, and I think it a fair and valid one was "because we can." I think DS said that.

    Leave the math guys alone. They are happy coming up with solid, reliable, playable answers to questions I never could have and I'm quite happy to ride along in relative ignorance while the only real thanks they get is other math guys grinding their gears if they make a mistake.

    Your example, for guys like me, is right on. But I appreciate more than words will ever tell those that toil away grinding this stuff out .. to four decimal places .. with micrometers.

  5. #5
    21forme
    Guest

    21forme: Re: Excellent analogy !

    A 2% difference in SCORE from a billion hand simulation, even without SE numbers, is insignificant compared with the number of hands you will play in your lifetime.

  6. #6
    Norm Wattenberger
    Guest

    Norm Wattenberger: Accuracy and presentation

    > I basically asked the same thing around here several
    > years ago. My question was more along the lines of
    > what was the point of extending some calcs out to say
    > four or five decimal places!? The answer given, and I
    > think it a fair and valid one was "because we
    > can." I think DS said that.

    I've talked about this before. When you present a number to 20 decimal places, that should mean you believe the number to be correct to 20 decimal places. Now that is true, for example, with numbers like Cacarulo's CA calculations since they are exact. Sim results cannot be that accurate. That's why we use sims only when CA calculations are not possible. So, when a sim result is presented with, say, 8 decimals, the answer is inaccurate if it cannot be known close to that many places. My physics teacher in tenth grade was more direct. He said that a number displayed to more accuracy than is possible by the measurement technique is "wrong."

    Now I used to give only two decimal places on calculated stats that were only accurate to two places. I've slowly been convinced to show one or two more decimal places so people can see movement as sims progress. But I don't provide numbers to ten decimals that I know are only accurate to two. Gotta draw a line somewhere.

  7. #7
    Norm Wattenberger
    Guest

    Norm Wattenberger: Some thoughts

    1. You do realize that Standard Error is an estimate of an estimate?

    2. I've never seen SE of SCORE presented by anyone. Does this mean all SCORE sims are not to be trusted?

    3. I've always found humor in political polls when you see two polls one which claims 50% and another 40% and they both claim +/- 3% based on standard error.

    Apologies; I'm finished making fun of your request. I understand how this would increase your comfort level. When we used to use EV as the measure of a sim, we often quoted SE. And SE of EV is presented by CVData. And SE of EV by TC used to be displayed by CVCX. But it was always zero and we thought there was better use of the space. But I'm not sure how we would go about creating an "accurate" estimate of SE for SCORE since SCORE itself is based on standard deviation. Since the mean of BJ is near zero, standard error calcs of EV are greatly simplified. But the mean of SCOREs is hopefully not zero.

    Suggestions on calcing the SE of SCORE without two passes are welcome.
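    One generic technique that fits the single-pass constraint is the batch-means method: split the run into batches as it goes, compute the statistic per batch from running sums, and take the spread of the batch values. The sketch below is only an illustration of that general idea, not a method proposed in this thread or used by any CV product. The `score_like` statistic (squared mean over variance, scaled) is a hypothetical stand-in for SCORE, and the toy win/lose game is made up.

```python
import math
import random
import statistics

def score_like(mean, variance):
    """Hypothetical SCORE-like statistic: squared mean over variance, scaled."""
    return 1_000_000 * mean * mean / variance

def batched_score_se(results, n_batches):
    """Estimate a SCORE-like value and its SE from batch means in one pass."""
    size = len(results) // n_batches
    batch_scores = []
    for b in range(n_batches):
        total = total_sq = 0.0
        # Each result is visited once; only running sums are kept per batch.
        for x in results[b * size:(b + 1) * size]:
            total += x
            total_sq += x * x
        mean = total / size
        variance = total_sq / size - mean * mean
        batch_scores.append(score_like(mean, variance))
    estimate = statistics.fmean(batch_scores)
    # SE of the average of the batch values = SD across batches / sqrt(batches).
    se = statistics.stdev(batch_scores) / math.sqrt(n_batches)
    return estimate, se

rng = random.Random(7)
# Toy rounds: win or lose one unit with a hypothetical 50.5% win probability.
toy_results = [1.0 if rng.random() < 0.505 else -1.0 for _ in range(200_000)]
estimate, se = batched_score_se(toy_results, n_batches=10)
print(f"SCORE-like estimate {estimate:.1f} +/- {se:.1f}")
```

    Because the statistic squares the mean, per-batch values are biased upward when batches are small, so this is better read as a gauge of run-to-run spread than as an exact SE.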

  8. #8
    MJ
    Guest

    MJ: Re: Some thoughts

    > 1. You do realize that Standard Error is an estimate
    > of an estimate?

    I do realize that SCORE is an estimate and will always have a bit of variance from one sim to another. I suppose SE is a means of estimating the precision of the estimated SCORE.

    > 2. I've never seen SE of SCORE presented by anyone.

    Weak point. Prior to CVCX 4.0, I never saw a post-sim calculator that could switch between 1 and 2 hands. Does that make it 'wrong' to include this feature?

    > Does this mean all SCORE sims are not to be trusted?

    Man, you're missing the point! Did you read what I wrote?
    SE is like a double edged sword. It can cast DOUBT on any conclusions drawn when comparing performance or it can be used to VALIDATE the conclusions drawn when comparing performance. Here is an excerpt from what I wrote:

    "Standard error can also give us CONFIDENCE in our results. Suppose system A and system B each has a standard error of only $0.10 for each simulation. Now, even if system A's result were 3 SEs to the left of a $51 mean, that gives a SCORE of $50.70. Similarly, if system B's result were 3 SEs to the right of a $50 mean, the SCORE would be $50.30. There is still such a wide gap in the results that we can be virtually assured that the results observed are NOT due to chance".

    So, it really depends upon the size of the SE for each sim and the difference in performance between the sims.

    All I am trying to say is that a good scientist/researcher understands the limitations of his measurements. No measurement is perfect. SE can give us some idea of the imprecision of our measurements.

    > Apologies; I'm finished making fun of your request.

    Uncalled for. These condescending remarks will only deter your customers from making requests in the future. But hey, whatever floats your boat.

    I didn't limit my suggestions to just SCORE. What about WR?
    If you can come up with SE for EV, why can't you do so for WR? Isn't WR based upon EV?

    MJ

  9. #9
    Norm Wattenberger
    Guest

    Norm Wattenberger: Re: Some more thoughts

    My comments were certainly not meant to be condescending. Well maybe to the poll takers that are often far more inaccurate than they claim. I was merely trying to point out that your posts leave the impression that all the sims run by me, Cacarulo, MathProf, Karel for years are all suspect. CVData pumps out 60,000 stats. I can't add 60,000 SEs. There is also a question as to how one would go about calculating SE for WR and SCORE. EV SE is easy. WR and SCORE are problematic.

    Having said that, I have an idea on estimating SE for WR and SCORE and will kick it around with Don.

  10. #10
    Sun Runner
    Guest

    Sun Runner: Re: Accuracy and presentation

    You are making my point precisely, and thank God you do what you do.

    > My physics teacher in tenth grade
    > was more direct. He said that a number displayed to
    > more accuracy than is possible by the measurement
    > technique is "wrong."

    Totally agree and it doesn't take a physics teacher to know it.

    > Now I used to give only two decimal places on
    > calculated stats that were only accurate to two
    > places. I've slowly been convinced to show one or two
    > more decimal places so people can see movement as sims
    > progress.

    I guess I'd ask .. why? You already said they were not accurate. Who wants to see movement in inaccurate numbers?

    And further, for me and the rest of us dorks, who cares about seeing movement in the 5th decimal place? I'm playing with an at best (smallest) $5 fixed unit of measure. I bet there are not ten people who could, and do, TC past one decimal. Me!? forget about it man.

    But again, thanks for what you do and please don't stop! If you do I'll be back to betting "one for bad, two for good."

    Take care.

  11. #11
    Sun Runner
    Guest

    Sun Runner: Re: Some thoughts

    >> Apologies; I'm finished making fun of your request.

    > Uncalled for. These condescending remarks will only
    > deter your customers from making requests in the
    > future. But hey, whatever floats your boat.

    Lighten up dude, a simple 'apology accepted' seems adequate.


  12. #12
    MJ
    Guest

    MJ: Re: SE of WR

    > My comments were certainly not meant to be
    > condescending. Well maybe to the poll takers that are
    > often far more inaccurate than they claim. I was
    > merely trying to point out that your posts leave the
    > impression that all the sims run by me, Cacarulo,
    > MathProf, Karel for years are all suspect.

    That certainly was not the intended purpose. I was just trying to give folks an example where SE can throw a curve ball. Judging by responses I have received, it seems like SE for 1 billion rounds is probably under a buck for SCORE. ETF says a $1 SE for SCORE is far too high in a billion round sim.

    While there may not be an established method to calculate SE for SCORE, ETF says SE of Hourly WR can be calculated in a matter of seconds with a simple calculator; whether this is correct or not I do not know.

    SE WR/Hr = SD of (WR/Hr)/(sq. root #rounds simulated)
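    For what it is worth, here is one way the formula above might be applied, as a sketch only. The usual standard-error-of-the-mean rule needs the SD and the sample count in the same units, so this version takes the per-round SD, divides by the square root of the rounds simulated, and then scales to an hourly figure. The function name and all of the input numbers (a $30 per-round SD, 100 rounds per hour) are hypothetical.

```python
import math

def hourly_wr_se(sd_per_round, rounds_simulated, rounds_per_hour):
    """SE of the hourly win rate from a sim, using per-round inputs."""
    # SE of the per-round win rate: per-round SD / sqrt(rounds simulated).
    se_per_round = sd_per_round / math.sqrt(rounds_simulated)
    # Hourly WR is the per-round WR times rounds per hour, so its SE scales too.
    return se_per_round * rounds_per_hour

# Hypothetical inputs: $30 SD per round, one billion rounds, 100 rounds/hour.
se_wr = hourly_wr_se(sd_per_round=30.0,
                     rounds_simulated=1_000_000_000,
                     rounds_per_hour=100)
print(f"SE of hourly WR: ${se_wr:.4f}")
```

    If the SD plugged in is an hourly SD instead, the divisor would need to be the square root of the number of hours simulated rather than the number of rounds.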

    > CVData
    > pumps out 60,000 stats. I can't add 60,000 SEs. There
    > is also a question as to how one would go about
    > calculating SE for WR and SCORE. EV SE is easy. WR and
    > SCORE are problematic.

    All I'm saying is consider adding 1 optional column called SE/WR next to the WR column in CVCX. Ideally, it should be dynamic like all the other columns. Forget SE/SCORE, there are better things to work on. Unfortunately, I get the impression this is purely academic.

    Also, when I get around to it there is a lot to be said regarding the widgets, most of which do not take expenses into account. These should definitely be updated to give users an accurate assessment of how expenses affect all of the other widgets, and even their optimal betting ramp.

    Unless you travel for free or live in a casino, there are expenses associated with your play. That one expense widget is fantastic; unfortunately, all of the other widgets seem to disregard it. I still can't believe how many extra trips I need to make in order to double my bank just b/c of expenses!! If you ever figure out how to integrate expenses into the other widgets and/or optimal betting ramp, that would be very useful.

    MJ

  13. #13
    Norm Wattenberger
    Guest

    Norm Wattenberger: Re: SE of WR

    Why would anyone start a sentence with the words "ETF says"? :-) If you had listened to him, you wouldn't even be running my software since ET Fan said that you risk damage to your PC by simply installing it. He also says I use a poor RNG I've never used in my 40 years programming, that CV products don't support dozens of things they do and I've said and thought dozens of things I've never said or thought. Of course I'm barred from responding to his posts and "articles."

    On Expenses; it is on my list to expand this concept. But I have yet to determine an accurate method.

