Building Blocks
At the core of Artigraph are nodes and links. Nodes are the vertices of the graph,
and links are the directed edges that connect them. Each link has a source_id and a
target_id, and may carry an optional label that describes the relationship between
the nodes it connects.
Nodes that have data associated with them are called artifacts. Artifacts are the primary way you'll store information in Artigraph. You can organize artifacts into models, which give structure to the data they contain.
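To make these relationships concrete, here is a minimal standard-library sketch of the three building blocks. The names ToyNode, ToyArtifact, ToyLink, and children are hypothetical and exist only for illustration; they are not Artigraph's classes, and the real implementation may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from uuid import UUID, uuid4


@dataclass(frozen=True)
class ToyNode:
    """A vertex identified by a unique ID (hypothetical stand-in for a node)."""

    graph_id: UUID = field(default_factory=uuid4)


@dataclass(frozen=True)
class ToyArtifact(ToyNode):
    """A node that additionally carries a value."""

    value: bytes = b""


@dataclass(frozen=True)
class ToyLink:
    """A directed edge from source_id to target_id with an optional label."""

    source_id: UUID
    target_id: UUID
    label: Optional[str] = None


def children(node: ToyNode, links: List[ToyLink]) -> List[UUID]:
    """Return the IDs of nodes reached by following one outgoing link."""
    return [link.target_id for link in links if link.source_id == node.graph_id]


parent = ToyNode()
child = ToyArtifact(value=b"Hello, world!")
links = [ToyLink(parent.graph_id, child.graph_id, label="greeting")]

# Direction matters: the link goes from parent to child, not back.
assert children(parent, links) == [child.graph_id]
assert children(child, links) == []
```

Because a link refers to its endpoints only by ID, either end can be a plain node or an artifact.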
Nodes
The Node class forms the backbone of the graphs you'll create with Artigraph. It represents a single vertex in a graph. You can create a node by instantiating the class directly:
This won't immediately save the node to the database. To do that, you'll need to call write_one().
Or, if you need to write more than one node, you can pass write_many() a sequence of nodes.
Links
Link objects are the edges that connect nodes in a graph. To create one, you'll need at least two nodes. You can then pass their IDs to the constructor:
import artigraph as ag
node1 = ag.Node()
node2 = ag.Node()
link = ag.Link(source_id=node1.graph_id, target_id=node2.graph_id)
This forms the following relationship:
graph LR
n1([node1])
n2([node2])
n1 --> n2
Links can also have a label that describes the relationship between the nodes, which produces a labeled edge between them:
graph LR
n1([node1])
n2([node2])
n1 -- your-label --> n2
You can then write them all to the database:
Artifacts
An Artifact is a node in the graph that additionally has a value. Without specifying anything extra, values are limited to byte strings:
To store other types of data, you'll need to declare a serializer:
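Conceptually, a serializer is a pair of functions that turn a value into bytes and back again. The sketch below uses only the standard library; JsonBytesSerializer and its serialize and deserialize methods are hypothetical names chosen for illustration and are not Artigraph's serializer interface, which may differ.

```python
import json
from typing import Any


class JsonBytesSerializer:
    """Hypothetical serializer: converts JSON-compatible values to bytes and back."""

    def serialize(self, value: Any) -> bytes:
        # sort_keys makes the output deterministic for equal dictionaries
        return json.dumps(value, sort_keys=True).encode("utf-8")

    def deserialize(self, data: bytes) -> Any:
        return json.loads(data.decode("utf-8"))


serializer = JsonBytesSerializer()
raw = serializer.serialize({"some": "data"})  # bytes, ready to be stored
restored = serializer.deserialize(raw)        # back to a dictionary
```

The stored form is always bytes, which is why a byte-string value needs no serializer at all.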
You can then write the artifact to the database:
As with nodes, you can relate artifacts to each other or to other nodes using links:
node = ag.Node()
artifact = ag.Artifact(value=b"Hello, world!")
link = ag.Link(source_id=node.graph_id, target_id=artifact.graph_id)
ag.write([node, artifact, link])
Some data is too large to store directly in the database. For that, you can specify a storage location:
file_storage = ag.FileSystemStorage("path/to/storage/dir")
large_artifact = ag.Artifact(
    value={"really": "big", "data": "here"},
    serializer=ag.json_serializer,
    storage=file_storage,
)
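The idea behind external storage can be sketched with the standard library: the bytes go to a file, and only a short key needs to be kept alongside the artifact. ToyFileStorage and its create and read methods are hypothetical names, not the interface of FileSystemStorage, and the key scheme is an assumption made for this example.

```python
import tempfile
from pathlib import Path
from uuid import uuid4


class ToyFileStorage:
    """Hypothetical storage: keeps bytes in files and hands back a lookup key."""

    def __init__(self, directory: str) -> None:
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def create(self, data: bytes) -> str:
        # A random hex key doubles as the file name
        key = uuid4().hex
        (self.directory / key).write_bytes(data)
        return key

    def read(self, key: str) -> bytes:
        return (self.directory / key).read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    storage = ToyFileStorage(tmp)
    key = storage.create(b"a really big payload")
    # Only `key` would need to live in the database row
    payload = storage.read(key)
```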
Models
A GraphModel gives structure to the data in your artifacts. The easiest way to create
one is with Artigraph's @dataclass decorator, though other model types exist. The only
difference between this decorator and the standard library version is that it must be
used on a subclass of GraphModel, which requires a version (discussed later). With that
in mind, you can define a model like so:
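For readers unfamiliar with passing a keyword such as version in a class statement, the following standard-library sketch shows one way a base class can require it. VersionedModel and model_version are hypothetical names; this is not how GraphModel is implemented, only an illustration of the class-keyword mechanism.

```python
from dataclasses import dataclass


class VersionedModel:
    """Hypothetical base class that requires a `version` class keyword."""

    model_version: int

    def __init_subclass__(cls, *, version: int, **kwargs) -> None:
        super().__init_subclass__(**kwargs)
        # Record the version on the subclass itself
        cls.model_version = version


@dataclass
class Person(VersionedModel, version=1):
    name: str
    age: int


person = Person(name="John Doe", age=42)

# Omitting the keyword fails when the class is defined, not when it is used
try:
    class Unversioned(VersionedModel):
        pass
except TypeError:
    missing_version_rejected = True
else:
    missing_version_rejected = False
```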
You can then create an instance of the model and write it to the database:
You'll note here that you didn't have to specify a serializer for non-bytes values.
That's because, by default, the fields of a dataclass GraphModel are serialized using
the built-in JSON serializer. For fields that are not JSON-serializable or that require
external storage, you'll need to annotate them with this information using
typing.Annotated:
from datetime import datetime
from typing import TypeVar, Annotated
import artigraph as ag
T = TypeVar("T")
DateTime = Annotated[datetime, ag.datetime_serializer]
StoreFile = Annotated[T, ag.FileSystemStorage("path/to/storage/dir")]
@ag.dataclass
class Person(ag.GraphModel, version=1):
    name: str
    age: int
    birthday: DateTime
    photo: StoreFile[bytes]
person = Person(
    name="John Doe",
    age=42,
    birthday=datetime(1979, 1, 1),
    photo=b"a really big photo",
)
ag.write_one(person)
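Metadata attached with typing.Annotated can be read back at runtime, which is what lets a library discover per-field settings. The sketch below uses only the standard library; Marker and field_markers are hypothetical stand-ins for serializer and storage objects, and nothing here is claimed to match Artigraph's internals.

```python
from datetime import datetime
from typing import Annotated, TypeVar, get_args, get_origin, get_type_hints


class Marker:
    """Hypothetical stand-in for a serializer or storage object."""

    def __init__(self, name: str) -> None:
        self.name = name


T = TypeVar("T")
DateTime = Annotated[datetime, Marker("datetime-serializer")]
StoreFile = Annotated[T, Marker("file-storage")]


class Person:
    name: str
    birthday: DateTime
    photo: StoreFile[bytes]


def field_markers(cls: type) -> dict:
    """Map each field name to the names of the markers attached to it."""
    hints = get_type_hints(cls, include_extras=True)
    result = {}
    for field_name, hint in hints.items():
        if get_origin(hint) is Annotated:
            # get_args returns the base type followed by the metadata
            _base, *metadata = get_args(hint)
            result[field_name] = [m.name for m in metadata if isinstance(m, Marker)]
        else:
            result[field_name] = []
    return result


markers = field_markers(Person)
```

Fields without an annotation carry no markers, which corresponds to the default JSON behavior described above.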