No – a small amount of mutual information does not force the joint support to contain a large Cartesian product, and, in the equivalent graph‐theoretic language, a dense bipartite graph need not contain a large complete bipartite sub-graph (biclique).
A single probabilistic construction already shows this.
Write
N := 2ⁿ (the size of {0,1}ⁿ).
We build a bipartite graph G = (L,R,E) with |L| = |R| = N and then let
(X , Y) := a uniformly random edge of G.
Because an edge is chosen uniformly, the distribution of (X,Y)
• is supported on E,
• has X uniform on L , Y uniform on R iff G is regular, and
• satisfies
I(X;Y) = log₂ ( N / d ) (1)
where d is the common degree of every vertex (because P(x)=P(y)=1/N and each edge
has probability 1/(dN)).
Hence, as soon as we can produce a dense regular graph in which every d = Θ(N) (so the edge density is a constant, say 1/2), we automatically get
I(X;Y) = log₂ (N/d) = log₂ 2 = 1 (2)
– just one bit of mutual information, certainly ≤ √n for all large n.
Take d = ⌈N/2⌉ and choose a d-regular bipartite graph uniformly at random.
(Existence of such graphs for every d ≤ N is classical; see,
e.g., Alon–Spencer, The Probabilistic Method, Chap. 7.)
Let k = ⌈3 log₂ N⌉ = Θ(n).
Standard second–moment/union–bound calculations show that with
probability 1 – o(1)
G contains no K_{k,k}. (3)
Indeed, if X_{A,B} is the indicator of the event “A ⊆ L, B ⊆ R, |A| = |B| = k form a complete bipartite graph”, then
E[ X_{A,B} ] = (d/N)^{k²} = 2^{-k²} (since d = N/2).
The number of pairs (A,B) is binom(N,k)² ≤ N^{2k}.
For k = 3 log₂ N we get
E[ total K_{k,k} ] ≤ N^{2k}·2^{-k²} = 2^{2k log₂ N – k²} = 2^{-k²/3} < 1/N.
A Markov bound then gives (3).
Thus with high probability a random (N/2)-regular bipartite graph is
• 50 % dense,
• regular (hence yields uniform marginals), and
• contains no biclique larger than Θ(log N)=Θ(n) on each side.
Fix one such graph G and keep it for the rest of the discussion.
Define (X,Y) to be a uniform random edge of this G.
Then:
• X and Y are exactly uniform over {0,1}ⁿ;
• I(X;Y) = 1 bit (by (2));
• nevertheless every product set A×B contained in supp(X,Y)=E satisfies
min{|A|,|B|} ≤ Θ(log₂ N)=Θ(n). (4)
In other words, even though the two variables share only one bit of information, their joint support is so “shattered’’ that no Cartesian rectangle of size larger than ≈ n × n can be found inside it.
Taking n large, an “n × n’’ rectangle is utterly negligible compared with the 2ⁿ possible values of each variable, so the requested “large” product set does not exist.
Statement (4) is exactly the graph–theoretic counterpart:
A bipartite graph on N + N vertices with ½ N² edges (density 1/2) can be
constructed so that the largest complete bipartite subgraph has only
Θ(log N) vertices on each side.
This is the classical behaviour predicted by the Zarankiewicz problem;
random graphs are essentially extremal in this regard.
Hence:
Small mutual information (even a constant number of bits) does not
force the existence of large sets 𝓧,𝓨 with 𝓧×𝓨 ⊆ supp(X,Y);
equivalently, high edge density in a bipartite graph does
not guarantee a large biclique.
Random regular bipartite graphs give explicit counter-examples.
Was this answer helpful?
version: o3-2025-04-16
Status: UQ Validated
Validated: 8 months ago
Status: Needs Human Verification
Loading model reviews...
Loading reviews...