Excellent question. This delves into the dynamics of Bayesian learning in a continuous-time setting, a core topic in filtering theory. Your approach using Itô's lemma is the right one, but there seems to be a subtle error in your derivation that led to the confusing conclusion about "orthogonality".
Let's break down the problem and address your questions systematically.
First, let's establish the standard framework and notation to ensure clarity.
States and Prior: Two states, which we label $1$ and $2$, with associated drift values $\mu_1$ and $\mu_2$. Let $\theta$ be the random variable for the true state. $P(\theta = 1) = \pi_0$, $P(\theta = 2) = 1 - \pi_0$.
Observations: The observed process is $dY_t = \mu_\theta\,dt + \sigma\,dW_t$, where $\mu_\theta$ is either $\mu_1$ or $\mu_2$ depending on the true state $\theta$. The process $W$ is a standard Brownian motion under the true measure.
Posterior Probability: Let $\pi_t = P(\theta = 1 \mid \mathcal{F}_t^Y)$ be the posterior probability of state $1$ given the observations up to time $t$, where $\mathcal{F}_t^Y = \sigma(Y_s : s \le t)$. As you correctly noted, $\pi_t$ can be expressed using the likelihood ratio $L_t = \exp\!\big(\tfrac{\mu_1 - \mu_2}{\sigma^2}\big(Y_t - \tfrac{\mu_1 + \mu_2}{2}\,t\big)\big)$. The formula is indeed:
$$\pi_t = \frac{\pi_0\,L_t}{\pi_0\,L_t + (1 - \pi_0)}.$$
The Mixture Measure: The expectation in your question should be taken with respect to the mixture measure (also called the prior predictive measure), which we can denote $\mathbb{Q} = \pi_0\,\mathbb{P}_1 + (1 - \pi_0)\,\mathbb{P}_2$. Under this measure, the process $Y$ is a mix of the two possibilities: it has drift $\mu_1$ with probability $\pi_0$ and drift $\mu_2$ with probability $1 - \pi_0$. A key property, as you used, is that the posterior probability process $\pi_t$ is a martingale under $\mathbb{Q}$, so $\mathbb{E}^{\mathbb{Q}}[\pi_t] = \pi_0$ for all $t$.
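To make the martingale property concrete, here is a minimal Monte Carlo sketch (the drift values, noise level, prior, and grid sizes below are made-up illustrative choices): it draws the state from the prior, simulates $Y$, computes the exact posterior from the likelihood ratio, and checks that the sample mean of $\pi_t$ stays near $\pi_0$.

```python
# Sketch: the posterior pi_t is a Q-martingale, so E_Q[pi_t] should stay at pi_0.
import numpy as np

rng = np.random.default_rng(0)
mu1, mu2, sigma = 1.0, 0.0, 1.0          # hypothetical drifts and noise level
pi0, T, n, paths = 0.3, 1.0, 200, 20_000
dt = T / n

# Sampling under the mixture measure Q = draw theta from the prior, then simulate Y.
theta_is_1 = rng.random(paths) < pi0
mu = np.where(theta_is_1, mu1, mu2)
dY = mu[:, None] * dt + sigma * np.sqrt(dt) * rng.standard_normal((paths, n))
Y = np.cumsum(dY, axis=1)
t = dt * np.arange(1, n + 1)

# Exact posterior via the likelihood ratio and Bayes' rule.
L = np.exp((mu1 - mu2) / sigma**2 * (Y - 0.5 * (mu1 + mu2) * t))
pi = pi0 * L / (pi0 * L + (1 - pi0))

print("E_Q[pi_t] at t = 0.25, 0.5, 1.0:", pi[:, [49, 99, 199]].mean(axis=0))
print("prior pi0:", pi0)                  # the three averages should be close to pi0
```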
Correcting the SDE for the Posterior
The most direct way to analyze the evolution of functions of $\pi_t$ is to use the Kushner-Stratonovich equation, which gives the SDE for the posterior probability $\pi_t$. For this problem, the equation is:
$$d\pi_t = \frac{\mu_1 - \mu_2}{\sigma}\,\pi_t(1 - \pi_t)\,d\bar{W}_t,$$
where $\bar{W}_t$ is the innovations process, defined as:
$$\bar{W}_t = \frac{1}{\sigma}\left(Y_t - \int_0^t \big(\mu_1 \pi_s + \mu_2(1 - \pi_s)\big)\,ds\right).$$
The innovations process $\bar{W}$ is a standard Brownian motion under the mixture measure $\mathbb{Q}$. The SDE for $\pi_t$ elegantly shows that it is a martingale under $\mathbb{Q}$, because it has no drift term.
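As a quick sanity check on this SDE (again a sketch with made-up parameters), one can compute the exact posterior along a simulated observation path, form the innovations increments, and verify that the increments of $\pi_t$ match the Euler step of the driftless SDE up to a discretization error that vanishes as the time step shrinks.

```python
# Sketch: path-by-path, d(pi) ~= ((mu1 - mu2)/sigma) * pi*(1 - pi) * d(innovations).
import numpy as np

rng = np.random.default_rng(1)
mu1, mu2, sigma = 1.0, 0.0, 1.0
pi0, T, n = 0.3, 1.0, 100_000
dt = T / n

theta_is_1 = rng.random() < pi0           # draw the true state from the prior
mu = mu1 if theta_is_1 else mu2
dY = mu * dt + sigma * np.sqrt(dt) * rng.standard_normal(n)
Y = np.cumsum(dY)
t = dt * np.arange(1, n + 1)

# Exact posterior on the grid (with pi_0 prepended).
L = np.exp((mu1 - mu2) / sigma**2 * (Y - 0.5 * (mu1 + mu2) * t))
pi = np.concatenate(([pi0], pi0 * L / (pi0 * L + (1 - pi0))))

# Innovations increments and the SDE's prediction for the increment of pi.
mu_hat = mu1 * pi[:-1] + mu2 * (1 - pi[:-1])
dWbar = (dY - mu_hat * dt) / sigma
dpi_sde = (mu1 - mu2) / sigma * pi[:-1] * (1 - pi[:-1]) * dWbar

# The gap is the Euler discretization error; it shrinks as dt -> 0.
print("max |exact increment - SDE increment|:", np.max(np.abs(np.diff(pi) - dpi_sde)))
```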
Question 1: How does the expected posterior variance evolve?
Let's define the quantity you're interested in: the posterior variance of the unknown drift,
$$v_t := \operatorname{Var}\!\big(\mu_\theta \mid \mathcal{F}_t^Y\big).$$
We can express this in terms of the posterior $\pi_t$:
$$v_t = (\mu_1 - \mu_2)^2\,\pi_t(1 - \pi_t).$$
Comparing this with your expression, we see that, up to the constant factor $(\mu_1 - \mu_2)^2$, everything is driven by $\pi_t(1 - \pi_t)$.
Therefore, the question is equivalent to finding the time evolution of $\mathbb{E}^{\mathbb{Q}}[\pi_t(1 - \pi_t)]$.
Let's find the evolution of $\mathbb{E}^{\mathbb{Q}}[\pi_t^2]$. We can apply Itô's lemma to $\pi_t^2$ using the Kushner-Stratonovich SDE:
$$d(\pi_t^2) = 2\pi_t\,d\pi_t + d\langle \pi \rangle_t.$$
Using $d\pi_t = \kappa\,\pi_t(1 - \pi_t)\,d\bar{W}_t$, where $\kappa := \frac{\mu_1 - \mu_2}{\sigma}$, the quadratic variation is $d\langle \pi \rangle_t = \kappa^2\,\pi_t^2(1 - \pi_t)^2\,dt$.
So,
$$d(\pi_t^2) = 2\kappa\,\pi_t^2(1 - \pi_t)\,d\bar{W}_t + \kappa^2\,\pi_t^2(1 - \pi_t)^2\,dt.$$
Now, take the expectation under $\mathbb{Q}$. The expectation of the Itô integral term (the $d\bar{W}_t$ term) is zero:
$$\frac{d}{dt}\,\mathbb{E}^{\mathbb{Q}}[\pi_t^2] = \kappa^2\,\mathbb{E}^{\mathbb{Q}}\big[\pi_t^2(1 - \pi_t)^2\big].$$
This gives us a differential equation for the second moment of the posterior.
Now, let's look at your quantity $\mathbb{E}^{\mathbb{Q}}[v_t] = (\mu_1 - \mu_2)^2\,\mathbb{E}^{\mathbb{Q}}[\pi_t(1 - \pi_t)]$. We can write
$$\mathbb{E}^{\mathbb{Q}}[\pi_t(1 - \pi_t)] = \mathbb{E}^{\mathbb{Q}}[\pi_t] - \mathbb{E}^{\mathbb{Q}}[\pi_t^2] = \pi_0 - \mathbb{E}^{\mathbb{Q}}[\pi_t^2].$$
Since $\mathbb{E}^{\mathbb{Q}}[\pi_t] = \pi_0$ is constant, its time derivative is zero. Substituting our result for the evolution of $\mathbb{E}^{\mathbb{Q}}[\pi_t^2]$:
$$\frac{d}{dt}\,\mathbb{E}^{\mathbb{Q}}\big[\pi_t(1 - \pi_t)\big] = -\kappa^2\,\mathbb{E}^{\mathbb{Q}}\big[\pi_t^2(1 - \pi_t)^2\big],
\qquad
\frac{d}{dt}\,\mathbb{E}^{\mathbb{Q}}[v_t] = -(\mu_1 - \mu_2)^2\,\kappa^2\,\mathbb{E}^{\mathbb{Q}}\big[\pi_t^2(1 - \pi_t)^2\big].$$
This is the explicit description of how $\mathbb{E}^{\mathbb{Q}}[v_t]$ evolves. It is a differential equation, but it is not "closed": the derivative of a quantity built from the second moment ($\mathbb{E}^{\mathbb{Q}}[\pi_t^2]$) depends on a fourth moment ($\mathbb{E}^{\mathbb{Q}}[\pi_t^2(1 - \pi_t)^2]$). This hierarchy of moment equations is very common in nonlinear filtering. While we don't have a simple closed-form solution, this equation accurately describes the dynamics.
Note that the right-hand side is always non-positive, meaning $\mathbb{E}^{\mathbb{Q}}[v_t]$ is a decreasing function of time (strictly decreasing as long as $\pi_0 \in (0, 1)$ and $\mu_1 \neq \mu_2$). This makes intuitive sense: as we collect more informative data, our average uncertainty about the state should decrease.
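This moment equation is easy to check numerically in integrated form. The sketch below (made-up parameters; Euler steps of the driftless SDE under the mixture measure, with $\kappa = (\mu_1 - \mu_2)/\sigma$) compares $\mathbb{E}^{\mathbb{Q}}[\pi_T(1-\pi_T)] - \pi_0(1-\pi_0)$ with $-\kappa^2\int_0^T \mathbb{E}^{\mathbb{Q}}[\pi_t^2(1-\pi_t)^2]\,dt$.

```python
# Sketch: Monte Carlo check of
#   E[pi_T(1-pi_T)] - pi_0(1-pi_0) = -kappa^2 * integral_0^T E[pi_t^2 (1-pi_t)^2] dt.
import numpy as np

rng = np.random.default_rng(2)
mu1, mu2, sigma = 1.0, 0.0, 1.0
pi0, T, n, paths = 0.3, 1.0, 400, 100_000
dt, kappa = T / n, (mu1 - mu2) / sigma

pi = np.full(paths, pi0)
integral = 0.0
for _ in range(n):
    integral += np.mean(pi**2 * (1 - pi)**2) * dt        # left-endpoint quadrature
    dWbar = np.sqrt(dt) * rng.standard_normal(paths)     # innovations are a Q-Brownian motion
    pi = np.clip(pi + kappa * pi * (1 - pi) * dWbar, 0.0, 1.0)   # Euler step, clipped to [0, 1]

print("E[pi_T(1-pi_T)] - pi_0(1-pi_0):", np.mean(pi * (1 - pi)) - pi0 * (1 - pi0))
print("-kappa^2 * time integral     :", -kappa**2 * integral)
```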
Question 2: Does $v_t$ have a Bayesian interpretation?
Yes, it has a very important and direct interpretation. As shown above:
$$v_t = (\mu_1 - \mu_2)^2\,\pi_t(1 - \pi_t).$$
The term $\pi_t(1 - \pi_t)$ is the posterior variance of the state indicator. Let $Z = \mathbf{1}\{\theta = 1\}$ be an indicator variable which is 1 if $\theta = 1$ and 0 if $\theta = 2$. Then:
The posterior mean of $Z$ is $\mathbb{E}[Z \mid \mathcal{F}_t^Y] = \pi_t$.
The posterior variance of $Z$ is $\operatorname{Var}(Z \mid \mathcal{F}_t^Y) = \pi_t - \pi_t^2 = \pi_t(1 - \pi_t)$.
This quantity measures our uncertainty about the true state after observing the data up to time $t$. It is maximized at $\pi_t = 1/2$ (maximum uncertainty) and is zero at $\pi_t = 0$ or $\pi_t = 1$ (certainty).
So, your quantity $v_t$ is simply the posterior variance of the state indicator, scaled by the constant $(\mu_1 - \mu_2)^2$; equivalently, it is the posterior variance of the unknown drift $\mu_\theta$. Consequently, $\mathbb{E}^{\mathbb{Q}}[v_t]$ is the (scaled) expected posterior variance. This is a central quantity in areas like Bayesian experimental design, where one aims to design an experiment to minimize this value.
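Here is a tiny numeric check of that identity (with made-up numbers): the posterior variance of the drift, computed directly from the two-point posterior distribution, equals $(\mu_1 - \mu_2)^2\,\pi_t(1 - \pi_t)$.

```python
# Sketch: for a posterior probability p, Var(mu_theta | data) = (mu1 - mu2)^2 * p * (1 - p).
mu1, mu2, p = 1.0, 0.0, 0.7

post_mean = mu1 * p + mu2 * (1 - p)
post_var = p * (mu1 - post_mean) ** 2 + (1 - p) * (mu2 - post_mean) ** 2
print(post_var, (mu1 - mu2) ** 2 * p * (1 - p))   # both equal 0.21 (up to floating point)
```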
Question 3: Why should it be orthogonal to $1 - \pi_t$?
This conclusion appears to stem from a calculation error. In your derivation, you arrived at (in the notation above) a relation of the form:
$$\frac{d}{dt}\,\mathbb{E}\big[\pi_t\big] = \kappa^2\,\mathbb{E}\big[\pi_t(1 - \pi_t)^2\big].$$
And since $\mathbb{E}^{\mathbb{Q}}[\pi_t] = \pi_0$ is constant in time, you concluded that the expectation on the right-hand side must be zero.
The error lies in the derivation of this equation from the Itô formula: the drift and quadratic variation terms for $\pi_t$ depend on which measure you are working under. The drift $\kappa^2\,\pi_t(1 - \pi_t)^2$ is the drift of $\pi_t$ under $\mathbb{P}_1$ (i.e. conditional on state $1$), while the martingale property $\mathbb{E}[\pi_t] = \pi_0$ holds under the mixture measure $\mathbb{Q}$, under which $\pi_t$ has no drift at all; the two facts cannot be combined. When the calculation is done consistently under the mixture measure (as shown above), no such relationship emerges.
Let's look at the quantity you claim is zero in expectation:
$$\mathbb{E}^{\mathbb{Q}}\big[\pi_t(1 - \pi_t)\cdot(1 - \pi_t)\big] = \mathbb{E}^{\mathbb{Q}}\big[\pi_t(1 - \pi_t)^2\big].$$
Since $\pi_t$ is a random variable between 0 and 1, the integrand $\pi_t(1 - \pi_t)^2$ is non-negative. Its expectation will be zero only in trivial cases (e.g., if $\pi_0 = 0$ or $\pi_0 = 1$, in which case the posterior is stuck at 0 or 1 forever). In general, this expectation is strictly positive.
The "orthogonality" you discovered is an artifact of an incorrect application or formula in the Itô calculus step. The posterior variance is not, in general, "orthogonal" to the posterior probability of the other state in this sense.