Streaming Space Complexity of Nearly All Functions of One Variable on Frequency Vectors

Research paper by Vladimir Braverman, Stephen R. Chestnut, David P. Woodruff, Lin F. Yang

Indexed on: 27 Jan '16Published on: 27 Jan '16Published in: Computer Science - Data Structures and Algorithms


A central problem in the theory of algorithms for data streams is to determine which functions on a stream can be approximated in sublinear, and especially sub-polynomial or poly-logarithmic, space. Given a function $g$, we study the space complexity of approximating $\sum_{i=1}^n g(|f_i|)$, where $f\in\mathbb{Z}^n$ is the frequency vector of a turnstile stream. This is a generalization of the well-known frequency moments problem, and previous results apply only when $g$ is monotonic or has a special functional form. Our contribution is to give a condition such that, except for a narrow class of functions $g$, there is a space-efficient approximation algorithm for the sum if and only if $g$ satisfies the condition. The functions $g$ that we are able to characterize include all convex, concave, monotonic, polynomial, and trigonometric functions, among many others, and is the first such characterization for non-monotonic functions. Thus, for nearly all functions of one variable, we answer the open question from the celebrated paper of Alon, Matias and Szegedy (1996).