## Abstract

Boolean matrix factorization is a natural and popular technique for summarizing binary matrices. In this paper, we study a variant of Boolean matrix factorization in which we additionally require that the factor matrices have the consecutive ones property (OBMF). A major application of this optimization problem comes from graph visualization: standard techniques for visualizing graphs are circular and linear layouts, where nodes are ordered on a circle or on a line. A common problem with visualizing graphs is clutter due to too many edges. The standard approach to deal with this is to bundle edges together and represent them as ribbons. We show that OBMF can be used for edge bundling in combination with circular or linear layout techniques.
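To make the setting concrete, the following is a minimal sketch of a Boolean matrix product whose factors have consecutive ones under a fixed ordering. The helper names `boolean_product` and `ones_are_consecutive` and the example matrices are illustrative assumptions, not taken from the paper; the sketch also checks only the given ordering, whereas the consecutive ones property asks whether some suitable ordering exists.

```python
import numpy as np

def boolean_product(B, C):
    """Boolean product: entry (i, j) is 1 if B[i, k] = C[k, j] = 1 for some k."""
    return (B.astype(int) @ C.astype(int) > 0).astype(int)

def ones_are_consecutive(v):
    """True if the 1s of a 0/1 vector form one contiguous run (or there are none)."""
    idx = np.flatnonzero(v)
    return idx.size == 0 or idx[-1] - idx[0] + 1 == idx.size

# Hypothetical factors: each column of B and each row of C is a single interval.
B = np.array([[1, 0],
              [1, 1],
              [0, 1],
              [0, 0]])
C = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1]])

A = boolean_product(B, C)  # overlapping factors are combined with OR, not summed
factors_ok = (all(ones_are_consecutive(B[:, k]) for k in range(B.shape[1]))
              and all(ones_are_consecutive(C[k, :]) for k in range(C.shape[0])))
```

In the graph-visualization reading, each such interval pair would correspond to one ribbon connecting two contiguous ranges of nodes in the layout.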

We demonstrate that this problem is not only NP-hard but also admits no polynomial-time algorithm with a multiplicative approximation guarantee (unless P = NP). On the positive side, we develop a greedy algorithm that at each step looks for the best rank-1 factorization. Since even obtaining a rank-1 factorization is NP-hard, we propose an iterative algorithm where we fix one side and find the other, reverse the roles, and repeat. We show that this step can be done in linear time using PQ-trees. We also extend the problem to the cyclic ones property and symmetric factorizations. Our experiments show that our algorithms find high-quality factorizations and scale well.
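The greedy and alternating scheme described above can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it assumes the row and column orders are fixed in advance, so each factor is a pair of intervals, and it replaces the PQ-tree step with a maximum-sum interval search (Kadane's algorithm). All function names are hypothetical.

```python
import numpy as np

def best_interval(gain):
    """Kadane's algorithm: (start, end, total) of the max-sum run, end exclusive."""
    best = (0, 0, 0)  # the empty interval has gain 0
    cur_start, cur_sum = 0, 0
    for i, g in enumerate(gain):
        if cur_sum <= 0:  # a non-positive prefix never helps, so restart here
            cur_start, cur_sum = i, 0
        cur_sum += g
        if cur_sum > best[2]:
            best = (cur_start, i + 1, cur_sum)
    return best

def rank1_intervals(A, covered, max_iter=50):
    """Alternate: fix the column interval and find rows, then reverse the roles."""
    # +1 for a newly covered 1, -1 for a newly covered 0, 0 for cells already covered.
    W = np.where(covered, 0, np.where(A == 1, 1, -1))
    # Seed (an assumption of this sketch): the row whose best column interval gains most.
    seeds = [best_interval(row) for row in W]
    i0 = max(range(len(seeds)), key=lambda i: seeds[i][2])
    rows, cols, gain = (i0, i0 + 1), seeds[i0][:2], seeds[i0][2]
    for _ in range(max_iter):
        r0, r1, _ = best_interval(W[:, cols[0]:cols[1]].sum(axis=1))
        c0, c1, g = best_interval(W[r0:r1, :].sum(axis=0))
        if g <= gain:  # no improvement: a local optimum has been reached
            break
        rows, cols, gain = (r0, r1), (c0, c1), g
    return rows, cols, gain

def greedy_obmf(A, k):
    """Greedily add up to k interval-shaped rank-1 factors."""
    covered = np.zeros(A.shape, dtype=bool)
    factors = []
    for _ in range(k):
        rows, cols, gain = rank1_intervals(A, covered)
        if gain <= 0:
            break
        covered[rows[0]:rows[1], cols[0]:cols[1]] = True
        factors.append((rows, cols))
    return factors, covered

# Hypothetical input: two blocks that overlap in row 2.
A = np.zeros((5, 6), dtype=int)
A[0:3, 0:3] = 1
A[2:5, 3:6] = 1
factors, covered = greedy_obmf(A, k=3)
error = int(np.sum(covered.astype(int) != A))  # number of mismatched cells
```

Each alternating step can only keep or increase the gain, so the inner loop terminates, but it may stop at a local optimum. Fixing the order is the main simplification here; the paper instead works with the consecutive ones property directly via PQ-trees.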

Original language | English |
---|---|
Title of host publication | Proceedings of the 2019 SIAM International Conference on Data Mining |
Editors | Tanya Berger-Wolf, Nitesh Chawla |
Number of pages | 9 |
Publisher | Society for Industrial and Applied Mathematics |
Publication date | 2019 |
Pages | 729-737 |
ISBN (Electronic) | 978-1-61197-567-3 |
DOIs | |
Publication status | Published - 2019 |
MoE publication type | A4 Article in conference proceedings |
Event | SIAM International Conference on Data Mining (SDM19), Hyatt Regency Calgary, Calgary, Canada. Duration: 2 May 2019 → 4 May 2019 |

## Fields of Science

- 113 Computer and information sciences