tossing biased cyber coins



I decided to test this comment to the previous blog post with a numerical simulation and the result was quite surprising (at least to me).

I wrote a simple C program, which generates n-state HMMs randomly (*) and then runs them N times (generating a sequence HTHH...), followed by another N cyber coin tosses.

The question was if a 'simple strategy' can predict from the first sequence the bias of the second sequence. In the following, I performed the experiment for n = 10, 20 ...
with N = 100.

The 'simple strategy' predicts a bias for Head if the number of Heads in the first sequence exceeds 60 = 0.6*N and I registered a success of the prediction if the
number of Heads was indeed larger than 50 = 0.5*N in the following sequence.
The experiment was repeated 10000 times for each n and the graph below shows the success rate as a function of n.







Notice that the success rate is well above 50% for n < N, but even for n > N it seems that the first sequence can predict the bias of the second to some extent.
This is quite amazing, considering that for n > N the first sequence has not even encountered all the possible states of the HMM.

Obviously, as n increases (and N stays fixed) the success rate approaches 50%
and if one does not know n this leads us back to the questions raised in the previous post. But the success rate for n < N and even n > N is much higher than
I would have expected.

The next task for me is to double check the result (e.g. against the literature) and to do some more experiments.



------



(*) An HMM is really described by two tables. The one stores the probabilities for H vs. T in each state s = 1, ..., n. The program initializes these probabilities with uniform random numbers between 0 and 1.

The other table stores the n x n transition probabilities and the way my program assigns these probabilities is to first assign uniformly distributed random numbers and then normalizing probabilities by multiplying the probilities p(i,j) to get from state i to j so that sum_j ( p(i,j) ) = 1. There is some ambiguity in this procedure
and I guess one could choose a different measure.


No comments: