### Abstract

Original language | English |
---|---|

Title of host publication | 2018 Data Compression Conference 27–30 March 2018 Snowbird, Utah, USA |

Editors | Ali Bilgin, Michael W. Marcellin, Joan Serra-Sagrista, James A. Storer |

Number of pages | 10 |

Publisher | IEEE |

Publication date | 2018 |

Pages | 315-324 |

ISBN (Print) | 978-1-5386-4884-1 |

ISBN (Electronic) | 978-1-5386-4883-4 |

DOIs | |

Publication status | Published - 2018 |

MoE publication type | A4 Article in conference proceedings |

Event | Data Compression Conference - Snowbird, United States Duration: 27 Mar 2018 → 30 Mar 2018 https://signalprocessingsociety.org/blog/dcc-2019-2019-data-compression-conference |

### Publication series

Name | DCC |
---|---|

ISSN (Electronic) | 2375-0359 |

### Fields of Science

- Data structures, Arbitrary constants
- Large alphabets
- Optimal time
- Predecessor problems
- rank select
- Run length
- State of the art
- succinct, Data compression
- 113 Computer and information sciences

### Cite this

*2018 Data Compression Conference 27–30 March 2018 Snowbird, Utah, USA*(pp. 315-324). (DCC). IEEE. https://doi.org/10.1109/DCC.2018.00040

}

*2018 Data Compression Conference 27–30 March 2018 Snowbird, Utah, USA.*DCC, IEEE, pp. 315-324, Data Compression Conference, Snowbird, United States, 27/03/2018. https://doi.org/10.1109/DCC.2018.00040

**Run compressed rank/select for large alphabets.** / Fuentes-Sepulveda, J.; Karkkainen, J.; Kosolobov, D.; Puglisi, S.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

TY - GEN

T1 - Run compressed rank/select for large alphabets

AU - Fuentes-Sepulveda, J.

AU - Karkkainen, J.

AU - Kosolobov, D.

AU - Puglisi, S.

PY - 2018

Y1 - 2018

N2 - Given a string of length n that is composed of r runs of letters from the alphabet 0,1,..,σ-1 such that 2 ≤ σ ≤ r, we describe a data structure that, provided r ≤ n/log ω(1) n, stores the string in rlog nσ/r + o(r log nσ/r) bits and supports select and access queries in O(log log(n/r)/loglogn) time and rank queries in O(log log(nσ/r)/log time. We show that r log n(σ-1)/r-O(log n/r) bits are necessary for any such data structure and, thus, our solution is succinct. We also describe a data structure that uses (1 + ϵ)r log nσ/r + O(r) bits, where ϵ > 0 is an arbitrary constant, with the same query times but without the restriction r ≤ n/log ω(1) n. By simple reductions to the colored predecessor problem, we show that the query times are optimal in the important case r ≥ 2logδ n, for an arbitrary constant δ > 0. We implement our solution and compare it with the state of the art, showing that the closest competitors consume 31-46% more space. © 2018 IEEE.

AB - Given a string of length n that is composed of r runs of letters from the alphabet 0,1,..,σ-1 such that 2 ≤ σ ≤ r, we describe a data structure that, provided r ≤ n/log ω(1) n, stores the string in rlog nσ/r + o(r log nσ/r) bits and supports select and access queries in O(log log(n/r)/loglogn) time and rank queries in O(log log(nσ/r)/log time. We show that r log n(σ-1)/r-O(log n/r) bits are necessary for any such data structure and, thus, our solution is succinct. We also describe a data structure that uses (1 + ϵ)r log nσ/r + O(r) bits, where ϵ > 0 is an arbitrary constant, with the same query times but without the restriction r ≤ n/log ω(1) n. By simple reductions to the colored predecessor problem, we show that the query times are optimal in the important case r ≥ 2logδ n, for an arbitrary constant δ > 0. We implement our solution and compare it with the state of the art, showing that the closest competitors consume 31-46% more space. © 2018 IEEE.

KW - Data structures, Arbitrary constants

KW - Large alphabets

KW - Optimal time

KW - Predecessor problems

KW - rank select

KW - Run length

KW - State of the art

KW - succinct, Data compression

KW - 113 Computer and information sciences

U2 - 10.1109/DCC.2018.00040

DO - 10.1109/DCC.2018.00040

M3 - Conference contribution

SN - 978-1-5386-4884-1

T3 - DCC

SP - 315

EP - 324

BT - 2018 Data Compression Conference 27–30 March 2018 Snowbird, Utah, USA

A2 - Bilgin, Ali

A2 - Marcellin, Michael W.

A2 - Serra-Sagrista, Joan

A2 - Storer, James A.

PB - IEEE

ER -