arcology-fastapi/backup.org

478 lines
15 KiB
Org Mode

* Modeling the [[id:cce/org-roam][org-roam]] schema in SQLModel
Getting these to all fit together is gonna hopefully not be too fiddly. I am happy that I don't have to coerce all these primary keys and shit, I just need to be able to read from the database and walk among ORM objects to extract an Arcology Page out of it
#+begin_src python :tangle arcology/roam.py
from typing import Optional, List
from sqlmodel import Field, Relationship, SQLModel
from arcology.parse import parse_sexp
#+end_src
** Nodes have-many Links and Nodes have-many-to-many Nodes through Links
:LOGBOOK:
- State "INPROGRESS" from "NEXT" [2021-09-21 Tue 19:35]
:END:
Setting up the links here is a bit fussy; the class key =source_ref= is the database key =source= and =dest_ref= similarly maps to =dest=; this requires some extra bits to be defined to properly de-reference these, especially when using it as the LinkModel: the link models need to know to query against =source_ref= and =dest_ref!
#+begin_src sql
CREATE TABLE links (pos NOT NULL, source NOT NULL, dest NOT NULL, type NOT NULL, properties NOT NULL,
FOREIGN KEY (source) REFERENCES nodes (id) ON DELETE CASCADE);
#+end_src
#+begin_src python :tangle arcology/roam.py
class Link(SQLModel, table=True):
__tablename__ = 'links'
pos: str
source_ref: Optional[str] = Field(primary_key=True, foreign_key="nodes.id",
sa_column_kwargs=dict(name='source'))
dest_ref: Optional[str] = Field(primary_key=True, foreign_key="nodes.id",
sa_column_kwargs=dict(name='dest'))
type: str
properties: str
source: Optional["Node"] = Relationship(back_populates="outlinks",
sa_relationship_kwargs=dict(
foreign_keys="[Link.source_ref]",
lazy="joined",
))
dest: Optional["Node"] = Relationship(back_populates="backlinks",
sa_relationship_kwargs=dict(
foreign_keys="[Link.dest_ref]",
lazy="joined",
))
#+end_src
** Node is the "root" object
:PROPERTIES:
:ID: 20211203T142533.902422
:ROAM_ALIASES: arcology.roam.Node
:END:
:LOGBOOK:
- State "INPROGRESS" from [2021-09-21 Tue 18:31]
:END:
[[id:cce/org-roam][org-roam]] v2 is node-oriented, but Arcology is file-oriented.
#+begin_src sql
CREATE TABLE nodes (id NOT NULL PRIMARY KEY, file NOT NULL, level NOT NULL, pos NOT NULL,
todo , priority , scheduled text, deadline text, title , properties , olp ,
FOREIGN KEY (file) REFERENCES files (file) ON DELETE CASCADE);
#+end_src
=file_ref= is defined strangely so that the relationship can be the reasonable one to access. Similarly, there are =source= and =dest= hackery-pokery here, as well.
#+begin_src python :tangle arcology/roam.py :mkdirp yes :noweb yes
class Node(SQLModel, table=True):
__tablename__ = 'nodes'
id: Optional[int] = Field(default=None, primary_key=True)
level: int
pos: int
title: str
todo: str
priority: str
scheduled: str
deadline: str
properties: str
olp: str
<<relationships>>
<<link-relationships>>
#+end_src
It has a bunch of relationships defined against it that map to the classes defined below:
: relationships
#+begin_src python :noweb-ref relationships
file_ref: Optional[str] = Field(foreign_key="files.file",
sa_column_kwargs={'name': 'file'})
file: List["File"] = Relationship(back_populates="nodes")
aliases: List["Alias"] = Relationship(back_populates="node")
tags: List["Tag"] = Relationship(back_populates="node")
references: List["Reference"] = Relationship(back_populates="node")
#+end_src
There are four linking relationships, and I would like to add a fifth some time (select child nodes):
- *outlinks* and *outlink_nodes* are out-bound links in that the text of the link is in the *current node*
- *backlinks* and *backlink_noes* are in-bound links which exist in other nodes and link to this one.
: link-relationships
#+begin_src python :noweb-ref link-relationships
outlinks: List["Link"] = Relationship(
back_populates="dest",
sa_relationship_kwargs=dict(
viewonly=True,
foreign_keys="[Link.dest_ref]",
primaryjoin="Node.id==Link.dest_ref",
))
outlink_nodes: List["Node"] = Relationship(
back_populates="backlink_nodes",
link_model=Link,
sa_relationship_kwargs=dict(
viewonly=True,
primaryjoin="Node.id==Link.dest_ref",
secondaryjoin="Node.id==Link.source_ref",
)
)
backlinks: List["Link"] = Relationship(
back_populates="source",
sa_relationship_kwargs=dict(
viewonly=True,
foreign_keys="[Link.source_ref]",
primaryjoin="Node.id==Link.source_ref",
))
# https://github.com/tiangolo/sqlmodel/issues/89
backlink_nodes: List["Node"] = Relationship(
back_populates="outlink_nodes",
link_model=Link,
sa_relationship_kwargs=dict(
viewonly=True,
primaryjoin="Node.id==Link.source_ref",
secondaryjoin="Node.id==Link.dest_ref",
)
)
#+end_src
This thing is load-bearing, make sure it works:
#+begin_src python :session *arcology-testing*
res = session.exec(select(roam.Node).where(roam.Node.id=='"cce/cce"'))
node = res.first()
assert set([parse_sexp(bl.dest_ref) for bl in node.backlinks])
assert set([parse_sexp(bl.source_ref) for bl in node.backlinks])
#+end_src
** Nodes have-one file
:PROPERTIES:
:ID: 20211203T142617.812313
:ROAM_ALIASES: arcology.roam.File
:END:
:LOGBOOK:
- State "INPROGRESS" from [2021-09-21 Tue 18:31]
:END:
The node at level-0 is the File Properties node.
#+begin_src sql
CREATE TABLE files (file UNIQUE PRIMARY KEY, hash NOT NULL,
atime NOT NULL, mtime NOT NULL);
#+end_src
#+begin_src python :tangle arcology/roam.py :mkdirp yes
class File(SQLModel, table=True):
__tablename__ = 'files'
file: Optional[str] = Field(default=None, primary_key=True)
hash: str
atime: str
mtime: str
nodes: List["Node"] = Relationship(back_populates="file")
keywords: List["Keyword"] = Relationship(back_populates="file")
#+end_src
** Nodes have-many Aliases
:LOGBOOK:
- State "INPROGRESS" from "NEXT" [2021-09-21 Tue 19:30]
:END:
#+begin_src sql
CREATE TABLE aliases (node_id NOT NULL, alias ,
FOREIGN KEY (node_id) REFERENCES nodes (id) ON DELETE CASCADE);
#+end_src
#+begin_src python :tangle arcology/roam.py :mkdirp yes
class Alias(SQLModel, table=True):
__tablename__ = 'aliases'
node_id: str = Field(primary_key=True, foreign_key='nodes.id')
alias: str
node: Optional["Node"] = Relationship(back_populates="aliases")
def __str__(self):
return parse_sexp(self.alias)
#+end_src
*** DONE un-wrap Alias.__str__
:LOGBOOK:
- State "DONE" from "NEXT" [2021-09-21 Tue 21:22]
:END:
** Nodes have-many Tags
:LOGBOOK:
- State "INPROGRESS" from "NEXT" [2021-09-21 Tue 19:31]
:END:
#+begin_src sql
CREATE TABLE tags (node_id NOT NULL, tag ,
FOREIGN KEY (node_id) REFERENCES nodes (id) ON DELETE CASCADE);
#+end_src
#+begin_src python :tangle arcology/roam.py :mkdirp yes
class Tag(SQLModel, table=True):
__tablename__ = 'tags'
node_id: str = Field(primary_key=True, foreign_key='nodes.id')
tag: str
node: Optional["Node"] = Relationship(back_populates="tags")
def __str__(self):
return parse_sexp(self.tag)
#+end_src
*** DONE un-wrap Tag.__str__
:LOGBOOK:
- State "DONE" from "NEXT" [2021-09-21 Tue 21:22]
:END:
** Nodes have-many References
:LOGBOOK:
- State "INPROGRESS" from "NEXT" [2021-09-21 Tue 19:32]
:END:
#+begin_src sql
CREATE TABLE refs (node_id NOT NULL, ref NOT NULL, type NOT NULL,
FOREIGN KEY (node_id) REFERENCES nodes (id) ON DELETE CASCADE);
#+end_src
#+begin_src python :tangle arcology/roam.py :mkdirp yes
class Reference(SQLModel, table=True):
__tablename__ = 'refs'
node_id: str = Field(primary_key=True, foreign_key='nodes.id')
ref: str
node: Optional["Node"] = Relationship(back_populates="references")
def __str__(self):
return parse_sexp(self.ref)
#+end_src
*** DONE Unwrap Reference.__str__
:LOGBOOK:
- State "DONE" from "NEXT" [2021-09-21 Tue 21:17]
:END:
** Attach [[id:arroyo/system-cache][arroyo-db]] for Keywords
:PROPERTIES:
:ID: 20211130T223914.685132
:END:
Keywords are simple file/key/value triples which are extracted from org-mode documents' snippets matching [[help:org-re-property]]'s regexp. Basically: =#+ARROYO_MODULE: true= in =test.org= will create a row with the following emacsql encoded/quoted strings: [ test.org, ARROYO_MODULE, true ]
#+begin_src sql
CREATE TABLE keywords (file NOT NULL, keyword NOT NULL, value NOT NULL);
#+end_src
#+begin_src python :tangle arcology/roam.py :mkdirp yes :noweb yes
class Keyword(SQLModel, table=True):
__tablename__ = 'keywords'
keyword: str
value: str = Field(str, primary_key=True)
file_ref: Optional[str] = Field(foreign_key="files.file",
sa_column_kwargs={'name': 'file'},
primary_key=True)
file: List["File"] = Relationship(back_populates="keywords")
<<dunder_init>>
def __str__(self):
return {keyword: value}
#+end_src
This data is in [[id:arroyo/arroyo][Arroyo]]'s database, rather then in Org Roam.
It's attached in the [[id:arcology/sqlmodel][Arcology SQLModel Setup below]] and thus has its own schema namespace which will be crammed in by overriding it during =__init__=. It's pretty simple, and I do hope that it works.
#+begin_src python :noweb-ref dunder_init
def __init__(self):
super(self)
self.__table__.schema = "arroyo"
#+end_src
** Arcology "just" reads the [[id:cce/org-roam][org-roam]] and [[id:arroyo/system-cache][Arroyo Systems Cache]] databases
:PROPERTIES:
:ID: arcology/sqlmodel
:ROAM_ALIASES: "Arcology SQLModel Setup"
:END:
With this, we can cram the arroyo DB alongside the org-roam DB which was harder to figure out than I thought:
#+begin_src python :tangle arcology/roam.py
from sqlmodel import create_engine
from sqlalchemy import event
org_roam_sqlite_file_name = "/home/rrix/.emacs.d/org-roam.db"
arroyo_sqlite_file_name = "/home/rrix/.emacs.d/arroyo.db"
def make_engine():
engine = create_engine('sqlite:///{path}'.format(path=org_roam_sqlite_file_name), echo=True)
@event.listens_for(engine, "connect")
def do_connect(dbapi_connection, _connection_record):
dbapi_connection.execute("attach database '{ar}' as arroyo;".format(ar=arroyo_sqlite_file_name))
return engine
engine = make_engine()
#+end_src
*** NEXT make path to DB configurable
bring in a configuration framework in this?
** Validating the DB API
here's a testing session; =C-c C-c= in here and then find your =*arcology-testing*= buffer to play around:
#+begin_src python :session *arcology-testing* :results none
from sqlmodel import select, SQLModel, Session
import arcology.roam as roam
from arcology.parse import *
engine = roam.engine
session = Session(engine)
#+end_src
* INPROGRESS higher-level =arcology.Page= class
:LOGBOOK:
- State "INPROGRESS" from [2021-09-25 Sat 18:07]
:END:
The =arcology.page.Page= (aliased as =arcology.Page= elsewhere=. it's a =SQLModel= but Arcology will only use it as a dataclass and hopefully to do some smart mapping.
#+begin_src python :tangle arcology/page.py
from sqlmodel import select, Session
from pydantic import BaseModel
from typing import List, Optional
import arcology.html as html
import arcology.roam as roam
from arcology.parse import *
import itertools
from dataclasses import dataclass
@dataclass
class Page():
file_path: str
root_node: roam.Node
title: str
nodes: Optional[List[roam.Node]]
backlinks: Optional[List[roam.Link]]
outlinks: Optional[List[roam.Link]]
html: Optional[str]
backlink_html: Optional[str]
@classmethod
def from_file(cls, path: str):
keyed_path = print_sexp(path)
with Session(roam.engine) as session:
# top "root" node has level=0
q = select(roam.Node) \
.where(roam.Node.level==0) \
.where(roam.Node.file_ref==keyed_path)
root_node = session.exec(q).first()
title = root_node.title if root_node else ""
# get all the nodes on the page, and their IDs
q = select(roam.Node) \
.where(roam.Node.file_ref==keyed_path)
nodes = session.exec(q).all()
node_ids = [node.id for node in nodes]
# collect links to all nodes on page
outlinks = next(itertools.chain([node.outlinks for node in nodes]))
# collect links *from* all nodes on page
backlinks = next(itertools.chain([node.backlinks for node in nodes]))
# (ref:backlink_instantiation)
# bundle it together
return Page(
file_path=path,
title=title,
root_node=root_node,
backlinks=backlinks,
outlinks=outlinks,
nodes=nodes,
html='',
backlink_html='',
)
def make_backlinks_org(self):
if self.backlinks is None:
return ''
def to_org(link: roam.Link): # (ref:backlinks_to_org_fn)
return \
"""
,* [[{path}][{title}]]
""".format(
path=link.source.file_ref,
title=link.source.title
)
return '\n'.join([ to_org(link) for link in self.backlinks ])
def load_html(self):
self.html = html.gen_html(self.file_path)
self.backlink_html = html.gen_html_text(self.make_backlinks_org())
#+end_src
** HTML generation and caching
A very simple wrapper around =pypandoc=; this will also need to do some pre- and post-transformations.
#+begin_src python :tangle arcology/html.py
import functools
import pypandoc
@functools.lru_cache(maxsize=128)
def gen_html(input_path: str, input_format: str = 'org'):
return pypandoc.convert_file(input_path, 'html', format='org').encode('utf-8')
@functools.lru_cache(maxsize=128)
def gen_html_text(input_text: str, input_format: str = 'org'):
return pypandoc.convert_text(input_text, 'html', format='org').encode('utf-8')
#+end_src
#+begin_src python :session *arcology-testing* :tangle test.py
import arcology.roam as roam
from arcology.page import Page
from sqlmodel import Session, select
session = Session(roam.engine)
q = select(roam.Link) \
.where(roam.Link.dest_ref.in_(['"cce/cce"']))
backlinks = session.exec(q).all()
print(backlinks[0].source.file_ref)
page = Page.from_file(path="/home/rrix/org/cce/cce.org")
page.load_html()
print(page.html)
print(page.backlink_html)
#+end_src
#+results:
** NEXT Backlink collection, generation and caching
** NEXT page HTML re-writing