Friday, May 3, 2024

Property graphs: elements, labels and properties

A property graph consists of three kinds of "things": elements, labels and properties.

Elements are the nodes and edges of the graph. They form its basic structure. An edge connects two nodes. Two nodes may be connected by multiple edges corresponding to different relationships between them.

Labels classify the elements. An element may belong to multiple classes and thus have multiple labels.

Properties are key-value pairs providing more information about an element. All the elements with the same label expose the same set of keys, or properties. A property of a given element may be exposed through multiple labels associated with that element.

Let's use the diagram here to understand the concepts better. There are three elements: two nodes (vertices), N1 and N2, and an edge connecting them. There are four labels, L1 to L4, and seven properties, P1 to P7. An arrow from a label to a property indicates that the label exposes that property. For example, label L3 exposes properties P2, P3 and P5, and property P1 is exposed by both L1 and L2. An arrow between an element and a label indicates that the label is associated with that element. N1 has labels L1 and L2, whereas the edge has just one label, L4. The properties exposed by an element are determined by the labels associated with it. For example, element N1 exposes properties P1, P2 and P4, which form the union of the properties exposed by labels L1 and L2. P1 has the same value v1 irrespective of which label is considered for this association. For example, the height of a person does not change whether that person is classified as a teacher, a businessman or a plumber. Similarly, notice that the edge exposes properties P6 and P7 since it is labelled L4.
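The rule that an element exposes the union of the properties of its labels can be sketched in a few lines of Python. This is only an illustration, not how any SQL/PGQ implementation stores a graph. The text does not say how P2 and P4 are split between L1 and L2, so the split below is an assumption, and the names `LABEL_PROPERTIES`, `ELEMENT_LABELS`, `exposed_properties` and the edge name `E` are made up for this example.

```python
# Hypothetical encoding of the diagram: which properties each label exposes.
# L3 and L4 follow the text; the split of P2 and P4 between L1 and L2 is an
# assumption (the text only says their union is P1, P2, P4 and both expose P1).
LABEL_PROPERTIES = {
    "L1": {"P1", "P2"},
    "L2": {"P1", "P4"},
    "L3": {"P2", "P3", "P5"},
    "L4": {"P6", "P7"},
}

# Labels associated with each element. "E" is a made-up name for the edge.
ELEMENT_LABELS = {
    "N1": {"L1", "L2"},
    "E": {"L4"},
}


def exposed_properties(element):
    """Return the union of the properties exposed by the element's labels."""
    exposed = set()
    for label in ELEMENT_LABELS[element]:
        exposed |= LABEL_PROPERTIES[label]
    return exposed


print(sorted(exposed_properties("N1")))  # ['P1', 'P2', 'P4']
print(sorted(exposed_properties("E")))   # ['P6', 'P7']
```

P1 appears only once in the result for N1 even though both L1 and L2 expose it, which matches the point that its value does not depend on the label through which it is reached.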

SQL/PGQ's path pattern specification language allows paths to be specified in terms of labels, ultimately exposing the properties of the individual paths that match those patterns. For example, (a IS L1 | L2)-[]->(b IS L3) COLUMNS (a.P3) returns the values of property P3 of all the nodes with label L1 or L2 that have an outgoing edge to a node labelled L3. Notice that N1 and N2 are the elements associated with L1, L2 or both. But N1 does not expose property P3, so we might expect the above query to return an error. Instead, the standard specifies that it should report NULL, quite in line with the spirit of SQL NULL, which means unknown.
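The NULL behaviour can be mimicked with a toy matcher in Python, using None in place of SQL NULL. This is a rough sketch under several assumptions: the edge is taken to run from N1 to N2, N2 is assumed to carry labels L2 and L3, the split of P2 and P4 between L1 and L2 is assumed, and all property values other than v1 are invented. The function names are hypothetical and do not correspond to any real SQL/PGQ API.

```python
# Which properties each label exposes (split of P2/P4 over L1/L2 is assumed).
LABEL_PROPERTIES = {
    "L1": {"P1", "P2"},
    "L2": {"P1", "P4"},
    "L3": {"P2", "P3", "P5"},
}

# Labels per node. N2's labels are an assumption consistent with the text.
NODE_LABELS = {"N1": {"L1", "L2"}, "N2": {"L2", "L3"}}

# Invented property values, except v1 which the text mentions for P1.
PROPERTY_VALUES = {
    "N1": {"P1": "v1", "P2": "v2", "P4": "v4"},
    "N2": {"P1": "w1", "P2": "w2", "P3": "w3", "P4": "w4", "P5": "w5"},
}

# A single directed edge; the direction N1 -> N2 is assumed.
EDGES = [("N1", "N2")]


def get_property(node, prop):
    """Return the property value, or None (NULL) if no label exposes it."""
    exposed = set().union(*(LABEL_PROPERTIES[l] for l in NODE_LABELS[node]))
    if prop not in exposed:
        return None
    return PROPERTY_VALUES[node].get(prop)


def match(a_labels, b_labels, prop):
    """Mimic (a IS ...)-[]->(b IS ...) COLUMNS (a.prop) over EDGES."""
    rows = []
    for src, dst in EDGES:
        # A label disjunction matches if the node has at least one of them.
        if NODE_LABELS[src] & a_labels and NODE_LABELS[dst] & b_labels:
            rows.append(get_property(src, prop))
    return rows


print(match({"L1", "L2"}, {"L3"}, "P3"))  # [None]
print(match({"L1", "L2"}, {"L3"}, "P1"))  # ['v1']
```

The path N1 to N2 still matches and produces a row when P3 is requested. Only the projected value is missing, which is why None comes back instead of an error or an empty result.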


The way I see it, a property cannot exist without at least one label exposing it. A label cannot exist without being associated with at least one element. But once defined, they have quite an independent existence.
