Compare commits

...

2 Commits

2 changed files with 34 additions and 4 deletions

View File

@ -18,8 +18,10 @@
This package exports two public functions to be used by higher level interfaces like a Django DB or SQLModel.
- =arroyo_rs.parse_file(path: str) -> Document=
- =arroyo_rs.export_html(path: str) -> str= (This hasn't been implemented yet and the shape of it might change, for example adding DB-backed link rewriting for the Arcology)
- =arroyo_rs.parse_file(path: str) -> Document= -- [[id:20240112T120658.817314][The Parser]] and [[id:20231024T105239.712407][The Parser Document Types]]
- =arroyo_rs.htmlize_file(path: str, options: ExportOptions) -> str= -- [[id:20240112T120813.386800][The HTML exporter]]
- COMING SOON =arroyo_rs.atomize_file(path: str, **kwargs) -> str= -- construct ExportOptions from kwargs
- COMING SOON =arroyo_rs.atomize_file(path: str, **kwargs) -> str= -- [[id:20240112T121157.583809][The Atom exporter]]
* Package Definitions
@ -445,6 +447,7 @@ impl Link {
** The Parser
:PROPERTIES:
:header-args:rust: :tangle src/parse.rs :mkdirp yes
:ID: 20240112T120658.817314
:END:
This code is pretty simple; there's just a lot of it. I have a fork of [[https://code.rix.si/rrix/orgize][orgize]], the Rust org-mode parser I'm using, to make it a little bit easier to work with.
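For orientation, here's roughly what driving orgize looks like, assuming the upstream 0.9-era API (=Org::parse= and =write_html=); the fork's surface may differ, and this block is not tangled into =src/parse.rs=.
#+begin_src rust :tangle no
// Orientation only, not tangled: upstream orgize 0.9's round trip from
// org text to HTML. The fork used here may expose a different surface.
use orgize::Org;

fn main() {
    // Parse a small org-mode string and render it back out as HTML.
    let org = Org::parse("* TODO Write the parser :arroyo:\nSome body text.");
    let mut html = Vec::new();
    org.write_html(&mut html).expect("HTML export failed");
    println!("{}", String::from_utf8(html).expect("exporter emits UTF-8"));
}
#+end_src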
@ -827,6 +830,7 @@ I wrote some simple unit tests for this below.
** The HTML exporter
:PROPERTIES:
:header-args:rust: :tangle src/export.rs :mkdirp yes
:ID: 20240112T120813.386800
:END:
I'm not exactly looking forward to writing this lul.
@ -1078,6 +1082,27 @@ pub fn htmlize_file(path: String, options: ExportOptions) -> Result<String> {
}
#+end_src
** NEXT The Atom exporter
:PROPERTIES:
:ID: 20240112T121157.583809
:END:
The HTML exporter turns an org-mode document into HTML.
The Atom exporter turns a set of org-mode headings into an Atom feed for serving update notifications to a handful, perhaps even a dozen, feed reader applications like [[id:20230310T155744.804329][tt-rss]] or Feedly or what have you.
For now, maybe it is easier to assume that the headings are all in one file; that's how the existing [[id:arcology/atom-gen][Arcology Feed Generator]] behaves: you can turn a page into an RSS feed with an unholy abomination of Lua and Pandoc and XML templates. Surely something better can be designed now.
The primary tension in the Arroyo library right now is that it exists solely in the realm of the Arcology project's design goals, and I need to start deciding whether supporting non-Arcology document systems is a design goal of this library.
So the first pass of this API could take a file path, extract the feed metadata from keywords, and construct an entire Atom feed, falling back to the custom HTML exporter to fill out the feed with text content. That's probably fine, and an API that other document servers could work with; a sketch of that shape follows below.
There's a step further on, where an API takes a list of headings and feed metadata and parses each heading and its subheadings to HTML, *which is an API I already want to provide to document systems*. It could take arbitrary document headings provided through the public interface and construct multi-page feeds.
Or we could just cobble together a version of [[https://github.com/tanrax/RSSingle][RSSingle]]; [[id:personal_software_can_be_shitty][Personal Software Can Be Shitty]].
Way out there: how do feed readers behave if the "feed" is just the linearized document with updated-at and whatnot applied to it? The feed would send the entire page with each update, but what if each heading could then be processed into a diff or summary of changes?
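Roughly, the first-pass shape described above could look like the sketch below. It's not tangled, the keyword scraping and the =updated= handling are stand-ins, and in the real library the HTML body would come from =htmlize_file= and the metadata from the parser rather than from naive string matching.
#+begin_src rust :tangle no
// Sketch only, not tangled: a single-entry, RSSingle-flavored Atom feed built
// from a file path plus HTML the caller already produced (e.g. via
// htmlize_file). The keyword scraping is deliberately naive, `updated` is
// expected to already be RFC 3339, and real output needs proper XML escaping.
use std::{fs, io};

fn keyword(content: &str, key: &str) -> Option<String> {
    // Pull the first "#+KEY: value" line out of the raw document text.
    let prefix = format!("#+{}:", key);
    content
        .lines()
        .find_map(|line| line.strip_prefix(prefix.as_str()))
        .map(|rest| rest.trim().to_string())
}

pub fn atomize_file(path: &str, body_html: &str, updated: &str) -> io::Result<String> {
    let content = fs::read_to_string(path)?;
    let title = keyword(&content, "TITLE").unwrap_or_else(|| path.to_string());
    // Wrap the whole page as one Atom entry.
    Ok(format!(
        r#"<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>{title}</title>
  <id>file://{path}</id>
  <updated>{updated}</updated>
  <entry>
    <title>{title}</title>
    <id>file://{path}</id>
    <updated>{updated}</updated>
    <content type="html"><![CDATA[{body_html}]]></content>
  </entry>
</feed>
"#
    ))
}
#+end_src
A single-entry feed like this is basically the RSSingle approach; the headings-based version would emit one =<entry>= per heading instead of wrapping the whole page.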
** Library definition and exports for the Python library
:PROPERTIES:
:header-args:rust: :tangle src/lib.rs :mkdirp yes
@ -1116,6 +1141,11 @@ fn arroyo_rs(py: Python, m: &PyModule) -> PyResult<()> {
}
#+end_src
*** NEXT it would be cool if the =htmlize_file= call could take =**kwargs= and construct the =ExportOptions= itself.
This would make it easy to offer the same interface for =atomize_file=.
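A rough, untangled sketch of what that binding could look like, assuming a pyo3 0.19-style API; the =include_subheadings= field is made up, and the real =ExportOptions= fields and export dispatch will differ.
#+begin_src rust :tangle no
// Sketch only, not tangled: a #[pyfunction] that accepts **kwargs and folds
// them into an options struct. `ExportOptions` here is a stand-in for the
// real struct, with a made-up field.
use pyo3::prelude::*;
use pyo3::types::PyDict;

#[derive(Default)]
struct ExportOptions {
    include_subheadings: bool, // hypothetical field for illustration
}

#[pyfunction]
#[pyo3(signature = (path, **kwargs))]
fn htmlize_file(path: String, kwargs: Option<&PyDict>) -> PyResult<String> {
    let mut options = ExportOptions::default();
    if let Some(kw) = kwargs {
        // pyo3 0.19: get_item returns Option<&PyAny>; newer releases change this.
        if let Some(val) = kw.get_item("include_subheadings") {
            options.include_subheadings = val.extract()?;
        }
    }
    // In the real library this would hand `options` to the existing exporter.
    let _ = options;
    Ok(format!("would htmlize {path} with kwargs-built options"))
}
#+end_src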
*** WAITING add =atomize_file= to the =pyfn='s
** Tests
I wrote Cargo tests for [[id:20231023T130916.139809][split_quoted_string]], and some simple parser tests.
@ -1413,7 +1443,6 @@ def persist_one_file(s: Session, path: str) -> models.Document | None:
:ID: 20231210T162813.557835
:END:
IDK why, but org-auto-tangle is struggling with files that include types from external packages; I tried just exporting the =load-path= to it, but that didn't work very well since the libraries still need to be =require='d. This code adds an Emacs hook that runs before =org-babel-tangle=, modifying the =load-path= and loading the libraries required by this document. Some [[id:9e75f369-8547-407e-a94e-6728e27d7119][Computer :(]] has to occur because =org-auto-tangle='s =async= invocation doesn't let me use lexical closures or anything like that, so some backquoting has to happen to inject the local variables into the lambda.
#+begin_src emacs-lisp

View File

@ -64,12 +64,13 @@ in {
drawingbot-premium = (dbot.override {
name = "drawingbotv3-premium";
}).overrideAttrs (old: old // {
src = /home/rrix/sync/DrawingBotV3-Premium-1.5.0-beta-all.jar;
src = /home/rrix/sync/DrawingBotV3-Premium-1.6.9-beta-linux.deb;
meta.license = lib.licenses.unfree;
});
koreader = callPackage ./pkgs/koreader.nix { unpatched = pkgs.koreader; };
homestuck = callPackage ./pkgs/homestuck-collection/default.nix {};
} // (with python3Packages; rec {
bandcamp-dl = toPythonApplication (callPackage ./pkgs/bandcamp-dl.nix {});
beetcamp = toPythonApplication (callPackage ./pkgs/beetcamp.nix {});