Matching and Rewriting Rules in Object-Oriented Databases

NU author(s): Dr Giacomo Bergami, Ollie Fox, Professor Graham Morgan

Licence

This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).


Abstract

Graph query languages such as Cypher are widely adopted to match and retrieve data in a graph representation, due to their ability to retrieve and transform information. Even though the most natural way to match and transform information is through rewriting rules, those are scarcely or only partially adopted in graph query languages. Their inability to do so has a major impact on the subsequent way the information is structured, as it might then appear more natural to provide major constraints over the data representation to fix the way the information should be represented. On the other hand, recent works are starting to move in the opposite direction, as the provision of a truly general semistructured model (GSM) allows both representing all the available data formats (Network-Based, Relational, and Semistructured) and supporting a holistic query language expressing all major queries in such languages. In this paper, we show that the usage of GSM enables the definition of a general rewriting mechanism which can be expressed in current graph query languages only at the cost of adhering the query to the specificity of the underlying data representation. We formalise the proposed query language in terms of declarative graph rewriting mechanisms described as a set of production rules L -> R, while both restricting the characterisation of L and extending it to support structural graph nesting operations, useful to aggregate similar information around an entry-point of interest. We further achieve our declarative requirements by determining the order in which the data should be rewritten and multiple rules should be applied, while ensuring that the application of such updates on the GSM database is persisted in subsequent rewriting calls. We discuss how GSM, by fully supporting index-based data representation, allows for a better physical model implementation leveraging the benefits of columnar database storage. Preliminary benchmarks show the scalability of the proposed implementation in comparison with state-of-the-art implementations.
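
To make the idea of production rules L -> R more concrete, the following is a minimal Python sketch of ordered rule application over a tiny in-memory graph. It is not the paper's query language, the GSM data model, or the datagram-db implementation: the `Graph`, `Rule`, `apply_rule`, and `rewrite` names are hypothetical, and L is restricted here to a single labelled edge between two labelled nodes purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    # Hypothetical toy structure, not the GSM model described in the paper.
    nodes: dict = field(default_factory=dict)  # node id -> node label
    edges: set = field(default_factory=set)    # (source id, edge label, target id)


@dataclass(frozen=True)
class Rule:
    # L: an edge with this label between nodes carrying these labels.
    src_label: str
    edge_label: str
    dst_label: str
    # R: the label of the edge that replaces each match of L.
    new_edge_label: str


def apply_rule(graph, rule):
    """Rewrite every match of L into R in place; return the number of matches."""
    matches = [
        (s, lab, d)
        for (s, lab, d) in sorted(graph.edges)
        if lab == rule.edge_label
        and graph.nodes[s] == rule.src_label
        and graph.nodes[d] == rule.dst_label
    ]
    for (s, lab, d) in matches:
        graph.edges.remove((s, lab, d))
        graph.edges.add((s, rule.new_edge_label, d))
    return len(matches)


def rewrite(graph, rules):
    """Apply rules in list order; each rule sees the updates made by earlier ones."""
    return [apply_rule(graph, rule) for rule in rules]


g = Graph(
    nodes={1: "Person", 2: "Person", 3: "City"},
    edges={(1, "knows", 2), (1, "lives_in", 3)},
)
rules = [
    Rule("Person", "knows", "Person", "friend_of"),
    Rule("Person", "friend_of", "Person", "contact"),
]
counts = rewrite(g, rules)
print(counts)
print(sorted(g.edges))
```

The second rule only fires because the first rule's update is kept on the same graph, which is a small-scale analogue of updates being persisted across subsequent rewriting calls.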


Publication metadata

Author(s): Bergami G, Fox OR, Morgan G

Publication type: Article

Publication status: Published

Journal: Mathematics

Year: 2024

Volume: 12

Issue: 17

Online publication date: 28/08/2024

Acceptance date: 25/08/2024

Date deposited: 28/08/2024

ISSN (electronic): 2227-7390

Publisher: MDPI AG

URL: https://doi.org/10.3390/math12172677

DOI: 10.3390/math12172677

Data Access Statement: The datasets are available at the following repositories: https://osf.io/btjqw/?view_only=f31eda86e7b04ac886734a26cd2ce43d and https://osf.io/rpu37/. The codebase associated with the implementation of the data model and query language interpretation is available on GitHub: https://github.com/datagram-db/datagram-db/releases/tag/v2.0. All the URLs were accessed on 21 April 2024.

