Can new Edges be added to large numbers of existing Nodes by referencing a Node's Type?

@matthewmcneely @MichelDiz

Hi Matthew and Michel. Thanks for your responses.

Does this mean that an Edge can act like a type wrapper for the Node it points to? For example, a Director Edge from a Movie Node points to a Person Node, thereby saying that this Person is a Director. Elsewhere in the graph, another Edge labelled Father could point from a different Person Node to that same Person Node. Does this mean that the same Person can be a different type within each relationship, (Movie > Director > Person) and (Person > Father > Person)?
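To make the question concrete, here is a rough Haskell sketch of the idea, where the role lives on the Edge label and the target Node stays a plain Person. All names (NodeId, EdgeLabel, Edge, rolesOf, exampleEdges) are hypothetical and only illustrate the question; this is not how Dgraph or any other graph database actually stores data.

```haskell
-- Hypothetical mini graph, for illustration only.
import Data.List (nub)

newtype NodeId = NodeId Int deriving (Eq, Show)

-- The edge label carries the role; the node it points to is just a Person.
data EdgeLabel = Director | Father deriving (Eq, Show)

data Edge = Edge
  { edgeFrom  :: NodeId
  , edgeLabel :: EdgeLabel
  , edgeTo    :: NodeId
  } deriving (Eq, Show)

-- All roles a node plays, read off the labels of its incoming edges.
rolesOf :: NodeId -> [Edge] -> [EdgeLabel]
rolesOf n edges = nub [edgeLabel e | e <- edges, edgeTo e == n]

exampleEdges :: [Edge]
exampleEdges =
  [ Edge (NodeId 1)  Director (NodeId 10)  -- Movie 1   > Director > Person 10
  , Edge (NodeId 20) Father   (NodeId 10)  -- Person 20 > Father   > Person 10
  ]
```

Under these assumptions, rolesOf (NodeId 10) exampleEdges collects both Director and Father for the same Person Node, one role per incoming relationship.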

This seems very different from my experience with Types, which comes from Algebraic Data Type (AlgDT) systems… Or is it?

A Person type that is also a Director and a Father could be expressed as a Type with Variants:

type Person
    = Director
    | Father

Depending on the function, that Person is either a Director or a Father? This is awesome! Is this why Neo4j mandates that all Edges are strictly typed?

Algebraic thinking reasons about values in terms of their types and the operations that expressions (functions) can apply to them. These types can be directly associated with domain entities.

The AlgDT type system comes from ML-style functional programming languages such as Haskell, OCaml, ReScript (ReasonML), F#, Rust, Elm and, to a limited extent, TypeScript.

Most AlgDTs are user-defined custom types, like UserProfile or UserID. The compiler and the application recognise them as semantically unique entities and give them the same level of significance as a string, integer, record or array.

Algebraic Data Types can be understood as user-constructed types made up of single or multiple values that become entities in data. This means they can be composed in all sorts of cool ways and passed from one function to another, or nested inside other unique types.
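As a small illustration of that kind of composition, here is a hedged Haskell sketch. The types UserProfile and Account and the function greeting are made up for this example; UserID is wrapped around a String only to keep things simple.

```haskell
-- Hypothetical types, for illustration only.

-- A single-value custom type wrapping a primitive.
newtype UserID = UserID String deriving (Eq, Show)

-- A multi-value type (record) that nests the custom UserID type.
data UserProfile = UserProfile
  { profileId   :: UserID
  , displayName :: String
  } deriving (Eq, Show)

-- A type with variants; one variant nests a whole UserProfile.
data Account
  = Guest
  | Registered UserProfile
  deriving (Eq, Show)

-- A function that receives the composed type and pattern matches on its variants.
greeting :: Account -> String
greeting Guest          = "Hello, guest"
greeting (Registered p) = "Hello, " ++ displayName p
```

The compiler treats UserID, UserProfile and Account as distinct types, so a bare String cannot be passed where a UserID is expected.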

AlgDTs are assigned a much higher level of uniqueness and meaning than conventional reference types, which are primitive types with a label. Here is a quote from the domain-driven design question "Is it still valid to speak about anemic model in the context of functional programming?" on Software Engineering Stack Exchange:

Suppose we need to define a type representing user IDs. An “anemic” definition would state that user IDs are strings. That’s technically feasible, but runs into huge problems because user IDs aren’t used like arbitrary strings. It makes no sense to concatenate them or slice out substrings of them, Unicode shouldn’t really matter, and they should be easily embeddable in URLs and other contexts with strict character and format limitations.

Solving this problem usually happens in a few stages. A simple first cut is to say “Well, a UserID is represented equivalently to a string, but they’re different types and you can’t use one where you expect the other.” Haskell (and some other typed functional languages) provides this feature via newtype:

newtype UserID = UserID String

This defines a UserID function which when given a String constructs a value that is treated like a UserID by the type system, but which is still just a String at runtime. Now functions can declare that they require a UserID instead of a string; using UserIDs where you previously were using strings guards against code concatenating two UserIDs together. The type system guarantees that can’t happen, no tests required.

The weakness here is that code can still take any arbitrary String like "hello" and construct a UserID from it. Further steps include creating a “smart constructor” function which when given a string checks some invariants and only returns a UserID if they’re satisfied. Then the “dumb” UserID constructor is made private so if a client wants a UserID they must use the smart constructor, thereby preventing malformed UserIDs from coming into existence.

Even further steps define the UserID data type in such a way that it’s impossible to construct one that’s malformed or “improper”, simply by definition. For instance, defining a UserID as a list of digits:

data Digit = Zero | One | Two | Three | Four | Five | Six | Seven | Eight | Nine
data UserID = UserID [Digit]

To construct a UserID a list of digits must be provided. Given this definition, it’s trivial to show that it’s impossible for a UserID to exist that can’t be represented in a URL. Defining data models like this in Haskell is often aided by advanced type system features like Data Kinds and Generalized Algebraic Data Types (GADTs), which allow the type system to define and prove more invariants about your code. When data is decoupled from behavior your data definition is the only means you have to enforce behavior.
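The quote describes the "smart constructor" step without showing code for it. A minimal Haskell sketch might look like the following. The function names mkUserID and userIDToString and the invariant (non-empty, ASCII digits only) are assumptions chosen for illustration and are not part of the quoted answer.

```haskell
-- In a real module the export list would hide the raw constructor, e.g.
--   module UserID (UserID, mkUserID, userIDToString) where
-- It is left as a comment here so the snippet loads on its own.
import Data.Char (isDigit)

newtype UserID = UserID String deriving (Eq, Show)

-- Smart constructor: only returns a UserID when the (assumed) invariant holds.
-- Assumed invariant: the string is non-empty and contains only ASCII digits.
mkUserID :: String -> Maybe UserID
mkUserID s
  | not (null s) && all isDigit s = Just (UserID s)
  | otherwise                     = Nothing

-- Read-only access to the underlying representation.
userIDToString :: UserID -> String
userIDToString (UserID s) = s
```

With the raw constructor hidden, client code can only obtain a UserID through mkUserID, so a value like "hello" is rejected with Nothing instead of becoming a malformed ID.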