Martingale Approach to Achieve Certain Occurrence of Sequence Patterns
A classic probability problem: repeatedly toss a coin until consecutive pattern HTHT shows up, what’s the expected number of tosses?
The normal solution to this problem is to use markov chains to calculate the expected time of the pattern to show up. However, there’s another more efficient way. This way was first came out in 1980 by Shuo-Yen Robert Li in his paper A Martigale Approach to the Study of Occurrence of Sequence Patterns in Repeated Experiments. This post will introduce Li’s idea briefly.
In Li’s paper, to solve this problem, he constructed a gambler game and I will use the HTHT pattern to explain his approach.
Let’s assume there’s a gambler with 1 dollar coming to a casino and his goal is to have pattern HTHT:
- On his first day, he bet his first toss to be H with his 1 dollar. If he wins, he could stay in the game and double his entrance fee, which is 2 dollars now. If he loses, he loses all of his money and leaves the game.
- On his second day, he bet his second toss to be T with all of his money. The rule stays the same: double his money to 4 dollars or lose everything.
- On his third day, he bet his third toss to be H with all of his money. The rule stays the same.
- On his fourth day, he bet his third toss to be T with all of his money. The rule stays the same.
The essence of this game is that, every day for the casino and the gambler, the game is fair. The expectation of the game is 0 because everyday, gambler has $\frac{1}{2}$ probability of doubling his money and $\frac{1}{2}$ probability of losing all of his money.
Here, Li’s idea is, there’s a gambler team and everyday, they send a person to the casino and each game is the same as the game we described above. Every gambler is independent of each other. We use ${X_n,n=0,1,2,\ldots}$ to represent after $n$th day the net income of the gambler team, and $X_0=0$. Since the game is fair, we know ${X_n}$ is a martingale.
If $\tau$ is the expected number of tosses, then $\tau$ is a stopping time. According to the Doob’s Identity, if ${X_n, n=0,1,2,\ldots}$ is a martingale and $\tau$ is a stopping time, then $EX_{\tau \wedge n}$ = $EX_0$ for any $n$. So, for our question here, $EX_{\tau}$ = $EX_0$ = 0. Obviously, we could get: $X_{\tau} = W - \tau$, where $W$ is the sum of all the money all the gamblers have at the end of $\tau$th day and $\tau$ represents the entrance fees the team brought into the casino.
Back to the HTHT game, we could calculate the sum of the money of all the gamblers who are still in the game in order to know the expectation of $\tau$. We know that only $\tau - 3$th day to $\tau$th day are still in the game since $\tau$th day is the first time when the pattern HTHT shows up. The gambler who came to the casino is the luckiest guy since he’s got all the tosses correct so he has 16 dollars; for the gambler who comes to the casino on the $\tau - 1$th day got his first two tosses correct so he has 4 dollars. For gamblers who comes to the casino on the $\tau - 2$th day and $\tau$th day, they came and tossed T first so their tosses didn’t meet with the pattern and they lost all of their money. In all, there are 20 dollars in the gamblers’ hands so $E\tau$ = 20, which represents the expected number of tosses we need to get the correct pattern.
It may not seem that obvious that this approach is more efficient but for longer pattern such as 6Hs, this approach shows its edge. If we use markov chain to do this problem, it is long and tedious but if we use this way, $E\tau$ = $2^6 + 2^5 + 2^4 + 2^3 + 2^2 + 2$ = 126.