Postdoctoral Researcher, Graz University of Technology
Reducing the State Space while Preserving "Information"
Markov chains are attractive models for real-world random processes such as speech, natural language, chemical reactions, and the behavior of gene regulatory networks. Sometimes, though, the state space (i.e., the number of states the process can go through) is too large to admit simulation, model estimation, or model analysis: just think of the rather large number of words in an English dictionary! One way to simplify these problems is state space aggregation, i.e., clustering states - for example, words - that are similar in a well-defined sense. With information-theoretic cost functions used as measures of similarity, you arrive at the core task of information processing: remove all information you don't need, and keep only what's relevant. Aside from solving the aggregation problem, these methods and cost functions can be used for clustering (i.e., grouping data points based on their pairwise distances/similarities) and co-clustering (i.e., grouping two sets of data points simultaneously based on their relationships).
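To make the aggregation idea concrete, here is a minimal sketch in Python (with NumPy): given a transition matrix and a partition of the states, it builds the aggregated transition matrix by weighting each state within an aggregate by the stationary distribution. The chain and the two-aggregate partition are made up purely for illustration.

```python
import numpy as np

# Illustrative 4-state chain and a partition into two aggregates.
P = np.array([[0.0, 0.9, 0.1, 0.0],
              [0.8, 0.0, 0.1, 0.1],
              [0.0, 0.2, 0.0, 0.8],
              [0.1, 0.1, 0.8, 0.0]])
partition = [[0, 1], [2, 3]]

# Stationary distribution: left eigenvector of P for eigenvalue 1.
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmax(np.real(w))])
pi /= pi.sum()

# Aggregated transition matrix: Q[a, b] is the stationary-conditional
# probability of jumping from aggregate a into aggregate b.
k = len(partition)
Q = np.zeros((k, k))
for a, A in enumerate(partition):
    weight = pi[A] / pi[A].sum()          # conditional distribution within A
    for b, B in enumerate(partition):
        Q[a, b] = weight @ P[np.ix_(A, B)].sum(axis=1)

print(Q)   # 2x2 aggregated transition matrix; each row sums to 1
```

Since the within-aggregate weights sum to one and each row of P sums to one, the rows of the aggregated matrix are again proper probability distributions.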
Abstract: We consider a continuous-time Markov chain (CTMC) whose state space is partitioned into aggregates, and each aggregate is assigned a probability measure. A sufficient condition for defining a CTMC over the aggregates is presented as a variant of weak lumpability; it also guarantees that the measure over the original process can be recovered from that of the aggregated one. We show how the applicability of de-aggregation depends on the initial distribution. A major part of the article is devoted to applications: we illustrate that stochastic rule-based models of biochemical reaction networks form an important application area for the tools developed in the paper. For rule-based models, the construction of the aggregates and the computation of the distribution over the aggregates are algorithmic. The techniques are exemplified in three case studies.
Pub.: 20 Mar '13, Pinned: 18 Oct '17
Abstract: We present a sufficient condition for a non-injective function of a Markov chain to be a second-order Markov chain with the same entropy rate as the original chain. This permits an information-preserving state space reduction by merging states or, equivalently, lossless compression of a Markov source on a sample-by-sample basis. The cardinality of the reduced state space is bounded from below by the node degrees of the transition graph associated with the original Markov chain. We also present an algorithm that lists all possible information-preserving state space reductions for a given transition graph. We illustrate our results by applying the algorithm to a bi-gram letter model of an English text.
Pub.: 24 Jul '13, Pinned: 26 Sep '17
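The entropy rate preserved by such reductions is, for a stationary chain with stationary distribution pi, H = -sum_i pi_i sum_j P_ij log2 P_ij. A minimal sketch of the computation, on an illustrative three-state chain (not one from the paper):

```python
import numpy as np

# Illustrative chain: state 0 always goes to 1; state 1 goes to 0 or 2
# with probability 0.5 each; state 2 always returns to 0.
P = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [1.0, 0.0, 0.0]])

# Stationary distribution via the left Perron eigenvector.
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmax(np.real(w))])
pi /= pi.sum()

# Entropy rate: only rows with random transitions contribute.
logP = np.where(P > 0, np.log2(np.where(P > 0, P, 1.0)), 0.0)
H = -pi @ (P * logP).sum(axis=1)
print(f"entropy rate: {H:.3f} bits/symbol")   # 0.400 for this chain
```

Here only state 1 tosses a fair coin, and it is visited 40% of the time, so the rate is 0.4 bits per symbol.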
Abstract: We consider a continuous-time Markov chain (CTMC) whose state space is partitioned into aggregates, and each aggregate is assigned a probability measure. A sufficient condition for defining a CTMC over the aggregates is presented as a variant of weak lumpability; it also guarantees that the measure over the original process can be recovered from that of the aggregated one. We show how the applicability of de-aggregation depends on the initial distribution. The application section is devoted to illustrating how the developed theory aids in reducing CTMC models of biochemical systems, particularly in connection with protein-protein interactions. We assume that the model is written by a biologist in the form of site-graph-rewrite rules. Site-graph-rewrite rules compactly express that, often, only a local context of a protein (instead of a full molecular species) needs to be in a certain configuration in order to trigger a reaction event. This observation leads to suitable aggregate Markov chains with smaller state spaces, thereby providing a sufficient reduction in computational complexity. This is further exemplified in two case studies: simple unbounded polymerization and early EGFR/insulin crosstalk.
Pub.: 21 Nov '13, Pinned: 26 Sep '17
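A classical, easily checked special case of CTMC lumpability is strong lumpability: within each aggregate, every state must have the same total rate into each aggregate. The following sketch checks this condition for an illustrative generator and partition (a simpler criterion than the weak-lumpability variant developed in the paper, shown only to give the flavor of such checks):

```python
import numpy as np

# Illustrative CTMC generator (rows sum to 0) and a candidate partition.
Q = np.array([[-3.0,  2.0,  0.5,  0.5],
              [ 2.0, -3.0,  0.5,  0.5],
              [ 1.0,  1.0, -4.0,  2.0],
              [ 1.0,  1.0,  2.0, -4.0]])
partition = [[0, 1], [2, 3]]

def is_lumpable(Q, partition, tol=1e-9):
    """Strong lumpability: all states in an aggregate A must agree on
    their total rate into every aggregate B (diagonal included; this is
    consistent because the rows of Q sum to zero)."""
    for B in partition:
        into_B = Q[:, B].sum(axis=1)   # total rate of each state into B
        for A in partition:
            if np.ptp(into_B[A]) > tol:   # states in A disagree
                return False
    return True

print(is_lumpable(Q, partition))   # True for this symmetric example
```

Perturbing a single rate, or regrouping the states differently, generally breaks the condition.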
Abstract: We use results from zero-error information theory to determine the set of non-injective functions through which a Markov chain can be projected without losing information. These lumping functions can be found by clique partitioning of a graph related to the Markov chain. Lossless lumping is made possible by exploiting the (sufficiently sparse) temporal structure of the Markov chain. Eliminating edges in the transition graph of the Markov chain trades the required output alphabet size versus information loss, for which we present bounds.
Pub.: 22 Jan '16, Pinned: 26 Sep '17
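To give a flavor of the clique-partitioning idea, here is a sketch using a deliberately simplified mergeability criterion (an assumption for this sketch, not the paper's exact zero-error condition): two states may share an output label if no third state has positive transition probability to both, so the predecessor always disambiguates them. Each clique of the resulting graph then becomes one output symbol of the lumping function.

```python
import numpy as np
from itertools import combinations

# Illustrative 4-state chain (not from the paper).
P = np.array([[0.0, 0.5, 0.5, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0, 0.0]])
n = P.shape[0]

# Mergeability graph: edge (i, j) iff states i and j have no common
# predecessor, i.e., no state can transition to both.
mergeable = {(i, j): not np.any((P[:, i] > 0) & (P[:, j] > 0))
             for i, j in combinations(range(n), 2)}

# Greedy clique partition of the mergeability graph: each clique is
# assigned one output symbol.
cliques = []
for s in range(n):
    for c in cliques:
        if all(mergeable[tuple(sorted((s, t)))] for t in c):
            c.append(s)
            break
    else:
        cliques.append([s])

print(cliques)   # states 1 and 2 share a predecessor, so they separate
```

Greedy clique covering is only a heuristic; as the abstract notes, the trade-off between output alphabet size and information loss is the interesting part.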
Abstract: We consider the problem of reducing a first-order Markov chain on a large alphabet to a higher-order Markov chain on a small alphabet. We present information-theoretic cost functions that are related to predictability and lumpability, show relations between these cost functions, and discuss heuristics to minimize them. Our experiments suggest that the generalization to higher orders is useful for model reduction in reliability analysis and natural language processing.
Pub.: 16 Aug '16, Pinned: 26 Sep '17
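One natural cost function of this kind (a hedged sketch, not necessarily the paper's exact definition) is the expected KL divergence between each state's predictive distribution over aggregates and the averaged prediction of its aggregate; the cost is zero precisely when every state in an aggregate shares the same predictive distribution over aggregates.

```python
import numpy as np

# Illustrative chain and partition (made up for this sketch).
P = np.array([[0.5, 0.3, 0.1, 0.1],
              [0.1, 0.1, 0.4, 0.4],
              [0.2, 0.2, 0.3, 0.3],
              [0.3, 0.3, 0.2, 0.2]])
partition = [[0, 1], [2, 3]]

w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmax(np.real(w))])
pi /= pi.sum()

# R[i, b]: probability that state i jumps into aggregate b next.
R = np.stack([P[:, B].sum(axis=1) for B in partition], axis=1)

# Expected KL divergence of each state's aggregate-prediction from its
# aggregate's pi-weighted average prediction.
cost = 0.0
for A in partition:
    wA = pi[A] / pi[A].sum()
    QA = wA @ R[A]                    # the aggregate's averaged prediction
    for i, wi in zip(A, wA):
        mask = R[i] > 0
        cost += pi[i] * np.sum(R[i][mask] * np.log2(R[i][mask] / QA[mask]))

print(f"lumpability cost: {cost:.4f} bits")
```

For the chain above, states 0 and 1 predict the aggregates very differently, so the cost is strictly positive; minimizing such a cost over partitions is exactly the kind of combinatorial problem the heuristics in the abstract address.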