Testing random variables for independence and identity

Given access to independent samples of a distribution $A$ over $[n] \times [m]$, we show how to test whether the distributions formed by projecting $A$ to each coordinate are independent, i.e., whether $A$ is $\varepsilon$-close in the $L_1$ norm to the product distribution $A_1 \times A_2$ for some distributions $A_1$ over $[n]$ and $A_2$ over $[m]$. The sample complexity of our test is $\widetilde{O}(n^{2/3} m^{1/3} \,\mathrm{poly}(\varepsilon^{-1}))$, assuming without loss of generality that $m \leq n$. We also give a matching lower bound, up to $\mathrm{poly}(\log n, \varepsilon^{-1})$ factors. Furthermore, given access to samples of a distribution $X$ over $[n]$, we show how to test whether $X$ is $\varepsilon$-close in the $L_1$ norm to an explicitly specified distribution $Y$. Our test uses $\widetilde{O}(n^{1/2} \,\mathrm{poly}(\varepsilon^{-1}))$ samples, which nearly matches the known tight bounds for the case when $Y$ is uniform.
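
To make the quantity being tested concrete, the following minimal Python sketch computes the L1 distance between a fully known joint distribution and the product of its own marginals. It is an illustration only and is not the sublinear-sample tester described in the abstract: it assumes the whole probability table is available rather than samples. The function names and the example tables are hypothetical. Since the product of the marginals is itself a product distribution, the computed value upper-bounds the distance from the joint distribution to the closest product distribution.

```python
def marginals(joint):
    """Return the two coordinate marginals of a joint table over [n] x [m]."""
    n, m = len(joint), len(joint[0])
    a1 = [sum(joint[i][j] for j in range(m)) for i in range(n)]
    a2 = [sum(joint[i][j] for i in range(n)) for j in range(m)]
    return a1, a2


def l1_to_product_of_marginals(joint):
    """L1 distance between the joint table and the product of its marginals."""
    a1, a2 = marginals(joint)
    return sum(
        abs(joint[i][j] - a1[i] * a2[j])
        for i in range(len(a1))
        for j in range(len(a2))
    )


# Hypothetical example tables (assumptions for illustration only).
# Product of (0.3, 0.7) and (0.4, 0.6): the coordinates are independent.
independent = [[0.12, 0.18],
               [0.28, 0.42]]
# All mass on the diagonal: the coordinates are fully correlated.
correlated = [[0.5, 0.0],
              [0.0, 0.5]]

if __name__ == "__main__":
    print(l1_to_product_of_marginals(independent))
    print(l1_to_product_of_marginals(correlated))
```

In this sketch the independent table yields a distance of zero up to floating-point rounding, while the diagonal table is far from the product of its marginals. A tester in the sense of the abstract has to distinguish such cases from samples alone, without ever seeing the table.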

http://eprints.lse.ac.uk/31083/