Quantcast

Covering problems for partial words and for indeterminate strings

Research paper by MaximeCrochemoreab, Costas S.Iliopoulosa, TomaszKociumakac, JakubRadoszewskiac, WojciechRytterc, TomaszWaleńc

Indexed on: 02 Nov '17Published on: 01 Jun '17Published in: Theoretical Computer Science



Abstract

Indeterminate strings are a subclass of non-standard words having non-deterministic nature. In a classic string every position contains exactly one symbol—we say it is a solid symbol—while in an indeterminate string a position may contain a set of symbols (possible at this position); such sets are called non-solid symbols. The most important subclass of indeterminate strings are partial words, where each non-solid symbol is the whole alphabet; in this case non-solid symbols are also called don't care symbols. We consider the problem of finding a shortest cover of an indeterminate string, i.e., finding a shortest solid string whose occurrences cover the whole indeterminate string. We show that this classical problem becomes NP-complete for indeterminate strings and even for partial words. The proof of this fact is one of the main results of this paper. Our other main results focus on design of algorithms efficient with respect to certain parameters of the input (so called FPT algorithms) for the shortest cover problem. For the indeterminate string covering problem we obtain an O(nk2+2kk3)<math class="math"><mi is="true">O</mi><mo stretchy="false" is="true">(</mo><mi is="true">n</mi><msup is="true"><mrow is="true"><mi is="true">k</mi></mrow><mrow is="true"><mn is="true">2</mn></mrow></msup><mo is="true">+</mo><msup is="true"><mrow is="true"><mn is="true">2</mn></mrow><mrow is="true"><mi is="true">k</mi></mrow></msup><msup is="true"><mrow is="true"><mi is="true">k</mi></mrow><mrow is="true"><mn is="true">3</mn></mrow></msup><mo stretchy="false" is="true">)</mo></math>-time algorithm, where k is the number of non-solid symbols, while for the partial word covering problem we obtain a running time of O(nk2+2O(klog⁡k))<math class="math"><mi is="true">O</mi><mo stretchy="false" is="true">(</mo><mi is="true">n</mi><msup is="true"><mrow is="true"><mi is="true">k</mi></mrow><mrow is="true"><mn is="true">2</mn></mrow></msup><mo is="true">+</mo><msup is="true"><mrow is="true"><mn is="true">2</mn></mrow><mrow is="true"><mi is="true">O</mi><mo stretchy="false" is="true">(</mo><msqrt is="true"><mi is="true">k</mi></msqrt><mi mathvariant="normal" is="true">log</mi><mo is="true">⁡</mo><mi is="true">k</mi><mo stretchy="false" is="true">)</mo></mrow></msup><mo stretchy="false" is="true">)</mo></math>. Additionally, we prove that, unless the Exponential Time Hypothesis is false, no 2o(k)nO(1)<math class="math"><msup is="true"><mrow is="true"><mn is="true">2</mn></mrow><mrow is="true"><mi is="true">o</mi><mo stretchy="false" is="true">(</mo><msqrt is="true"><mi is="true">k</mi></msqrt><mo stretchy="false" is="true">)</mo></mrow></msup><msup is="true"><mrow is="true"><mi is="true">n</mi></mrow><mrow is="true"><mi is="true">O</mi><mo stretchy="false" is="true">(</mo><mn is="true">1</mn><mo stretchy="false" is="true">)</mo></mrow></msup></math>-time solution exists for either problem, which shows that our algorithm for partial words is close to optimal. We also present an algorithm for both problems parameterized both by k and the alphabet size with a simple implementation.