arroyo/arroyo-feed-cache.org

Arroyo Feed Cache Generator

This isn't being used right now, sorry.

Arroyo Feed Cache generates a list of RSS and Atom URLs from across an org-roam Knowledge Base in a format that the Universal Aggregator can use to present a Maildir cache of the feeds' contents, so that they can be viewed in Gnus. Universal Aggregator is a small but important part of my Email and News and Information Pipelines, and it's fairly simple and quite robust. Documentation for operating UA itself will be defined elsewhere; the Arroyo Feed Cache Generator puts the ggsrc file in place.

It is managed, updated, and curated in fits and bursts, whenever I want to expand my news horizon in some fashion. See YouTube Feeds for an example in use: automatically generate from a table, copy to the server, and restart UA with arroyo-feeds-flood.

Arroyo Feed Cache consumes tabular feed information from pages throughout my Knowledge Base, as well as ARROYO_FEED_URL keywords stored in the Arroyo System Cache.

Literate Programming helpers for engaging with the Feed Cache

Using this is "indirect" compared to the rest of the Arroyo systems: rather than running in the context of an org-tangle process and an external coordinator process like Arroyo Home Manager, CALL blocks must be used to create these files and read the table data in the context of Org Babel. Given a table with three columns and an invocation shaped like the source example below, CALLing that tangle-file function will collect the table data and export it to a file. This needs to be automated somehow! For now, eval it whenever the feed list changes to make sure the tangles are up to date. The tangled version of this example can be fed directly into Yasnippet, btw:

#+name: tangle-file
#+CALL: arroyo/arroyo-feed-cache.org:tangle-file[:results drawer](str=$1-ggs(), "ggs/70-$1.ggs")

#+NAME: $1-ggs
#+CALL: arroyo/arroyo-feed-cache.org:make-string(tbl=$1)

#+name: $1
| <name> | <feed> | <category> |

This cross-file CALLing is quite cool. The referenced blocks are defined here, and the references can be made absolute if Arroyo is not cloned into org-roam-directory.

(arroyo-feeds-make-string tbl)
(arroyo-feeds-write-file str path)

Code for tabular file ggsrc fragment generation

(require 'seq)
(require 'dash)
(require 's)

(defun arroyo-feeds-make-string (tbl)
  "Convert a babelized Org table TBL into a ggsrc fragment.

The table has three columns: NAME, FEED_URL, CATEGORY."
  (->> tbl
       (seq-map (lambda (line)
                  ;; One rss line per row: URL, category, then the name as a comment.
                  (format "rss \"%s\" %s # \"%s\"\n" (nth 1 line) (nth 2 line) (nth 0 line))))
       (apply #'s-concat)))

(defun arroyo-feeds-write-file (ggsrc-block export-path)
  "Write GGSRC-BLOCK to EXPORT-PATH under the cce directory.

Return an Org link to the written file plus an ARROYO_FEEDS keyword line."
  (let* ((cce-directory (expand-file-name "cce" org-roam-directory))
         (fname (expand-file-name export-path cce-directory)))
    (with-current-buffer (find-file-noselect fname)
      (erase-buffer)
      (insert ggsrc-block)
      (save-buffer)
      (format "[[%s][%s]]\n#+ARROYO_FEEDS: %s" fname export-path export-path))))
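
For a sense of what the tabular generator emits, here is a minimal sketch that uses only built-in functions rather than dash and s. The function name my/feeds-table-to-ggsrc and the two sample rows are hypothetical; only the format string is taken from the code above.

```emacs-lisp
;; Sketch only: `my/feeds-table-to-ggsrc' and the sample rows are hypothetical.
(defun my/feeds-table-to-ggsrc (tbl)
  "Return ggsrc rss lines for TBL, a list of (NAME FEED-URL CATEGORY) rows."
  (mapconcat (lambda (row)
               ;; Same layout as the format string above:
               ;; URL first, then category, then the name as a trailing comment.
               (format "rss \"%s\" %s # \"%s\"\n"
                       (nth 1 row) (nth 2 row) (nth 0 row)))
             tbl
             ""))

(my/feeds-table-to-ggsrc
 '(("Example Blog" "https://example.com/feed.xml" "Blogs")
   ("Example News" "https://example.org/atom.xml" "News")))
;; => "rss \"https://example.com/feed.xml\" Blogs # \"Example Blog\"\nrss \"https://example.org/atom.xml\" News # \"Example News\"\n"
```

Each row becomes one call to the rss shell function defined in the Universal Aggregator top-matter, with the feed name kept as a comment for readability.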

Code for Keyword Metadata ggsrc fragment generation

Of course Arroyo can query the System Cache for the key/value pairs on individual pages in the form:

#+ARROYO_FEED_URL: https://afd.fontkeming.fail/AFDSEW.xml
#+ARROYO_FEED_CATEGORY: News

arroyo-feeds--keyword-fragment returns a string of rss commands built from those keywords:

(defun arroyo-feeds--keyword-fragment ()
  "Return ggsrc rss lines for every page with an ARROYO_FEED_URL keyword."
  (->> (arroyo-db-get "ARROYO_FEED_URL")
       ;; add titles
       (-map (pcase-lambda (`(,file ,url))
               (let ((res (org-roam-db-query [:select title :from nodes
                                                      :where (= level 0) :and (= file $s1)]
                                             file)))
                 (list file url (car (car res))))))
       ;; add feed category
       (-map (pcase-lambda (`(,file ,url ,title))
               (let ((res (arroyo-db-get "ARROYO_FEED_CATEGORY" file)))
                 (list file url title (car res)))))
       ;; format and output
       (-map (pcase-lambda (`(,file ,url ,title ,category))
               (concat "rss \"" url "\" " category
                       " # " (format "[[%s][%s]]" file title))))
       (s-join "\n")))
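
The final formatting step can be tried without a database. The sketch below assumes rows already shaped as (FILE URL TITLE CATEGORY); the helper name my/feeds-format-keyword-rows, the file path, and the URL are made up for illustration.

```emacs-lisp
;; Sketch only: the helper name and the sample row are hypothetical.
(defun my/feeds-format-keyword-rows (rows)
  "Format ROWS, a list of (FILE URL TITLE CATEGORY), as ggsrc rss lines."
  (mapconcat (lambda (row)
               (pcase-let ((`(,file ,url ,title ,category) row))
                 ;; The page itself is linked in the trailing comment.
                 (concat "rss \"" url "\" " category
                         " # " (format "[[%s][%s]]" file title))))
             rows
             "\n"))

(my/feeds-format-keyword-rows
 '(("/home/user/org/weather.org" "https://example.com/feed.xml" "Weather" "News")))
;; => "rss \"https://example.com/feed.xml\" News # [[/home/user/org/weather.org][Weather]]"
```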

Generating a ggsrc file for Universal Aggregator

(arroyo-feeds-flood) will call the export functions in each file and then insert all of the fragments in to the final ggsrc configuration file.

The Feed Cache uses Arroyo System Cache to find files which can have feeds extracted from them:

(add-to-list 'arroyo-db-keywords "ARROYO_FEEDS")
(add-to-list 'arroyo-db-keywords "ARROYO_FEED_URL")
(add-to-list 'arroyo-db-keywords "ARROYO_FEED_CATEGORY")

(defcustom arroyo-feeds-ggsrc-path "/ssh:fontkeming.fail:~/Maildir/ggsrc"
  "The location to write the final collated ggsrc file.
This can be a local or TRAMP path."
  :group 'arroyo
  :group 'arroyo-feeds
  :type 'string)

(defun arroyo-feeds-flood (&optional no-restart)
  "Create a ggsrc file suitable for the grey goo spawner.
With a prefix argument (non-nil NO-RESTART), skip restarting the service."
  (interactive "P")
  (with-current-buffer (find-file arroyo-feeds-ggsrc-path)
    (erase-buffer)
    (->> (arroyo-db-get "ARROYO_FEEDS")
         (-sort #'arroyo-feeds--sort-pairs)     ;; (ref:sort-pairs)
         (-map (pcase-lambda (`(,file ,dest))
                 (arroyo-feeds--call-file file) ;; (ref:call-file)
                 dest))                         ;; done with file, return only dest
         (append (list "ggs/0-rss-command.ggsrc"))
         (arroyo-feeds--insert-files))          ;; (ref:insert-files)
    (goto-char (point-max))
    (insert (arroyo-feeds--keyword-fragment))   ;; (ref:keyword-fragment)
    (save-buffer))

  (unless no-restart
    (let ((default-directory "/ssh:fontkeming.fail:/home/rrix"))
      (async-shell-command "sudo systemctl restart ua && echo \"done\""))))

arroyo-feeds--sort-pairs is simple, except that I've gone mad with the power of pcase. It sorts the database results lexically by the path of the output file, which is generally of the form ggs/$SORT_INTEGER-name.ggsrc, so it boils down to a lexical sort on the two-digit sort integer. Use it or weep. Universal Aggregator supplies a stage-zero with commands defined in it, and I will probably move that into Arroyo itself soon.

(defun arroyo-feeds--sort-pairs (one two)
  (pcase-let ((`(,_ ,one-path) one)
              (`(,_ ,two-path) two))
    (s-less? one-path two-path)))
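
To see why the sort integer wants a fixed width, here is a small sketch of the same comparison using the built-in string-lessp. The helper name my/feeds-sort-dests and the file and fragment names are hypothetical sample data.

```emacs-lisp
;; Sketch only: the helper name and the (FILE DEST) pairs are hypothetical.
(defun my/feeds-sort-dests (pairs)
  "Return the destination paths of PAIRS, a list of (FILE DEST), sorted lexically."
  (mapcar #'cadr
          ;; `sort' is destructive, so work on a copy of the input list.
          (sort (copy-sequence pairs)
                (lambda (one two)
                  (string-lessp (cadr one) (cadr two))))))

(my/feeds-sort-dests
 '(("feeds-b.org" "ggs/70-youtube.ggsrc")
   ("feeds-a.org" "ggs/9-misc.ggsrc")
   ("feeds-c.org" "ggs/10-news.ggsrc")))
;; => ("ggs/10-news.ggsrc" "ggs/70-youtube.ggsrc" "ggs/9-misc.ggsrc")
```

The single-digit 9 sorts after 70 because the comparison is character by character; writing it as 09 would put it first.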

arroyo-feeds--call-file is also simple; the discovery of org-sbe is fortuitous.

(defun arroyo-feeds--ugly-execute-src-block (block-name)
  (save-excursion
    (goto-char (point-min))
    (while (and (search-forward-regexp
                 (org-babel-named-src-block-regexp-for-name block-name)
                 nil t)
                (org-babel-execute-src-block)))))

(defun arroyo-feeds--call-file (path)
  (with-current-buffer (find-file-noselect path)
    (arroyo-feeds--ugly-execute-src-block "tangle-file")))

arroyo-feeds--insert-files iterates over the list of file names, expands each against the cce directory, and inserts its contents at the end of the target file:

(defun arroyo-feeds--insert-files (files)
  (let ((cce-directory (expand-file-name "cce" org-roam-directory)))
    (dolist (fragment files)
      (goto-char (point-max))
      (insert-file-contents (expand-file-name fragment cce-directory)))))
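
The goto-char in the loop matters: insert-file-contents leaves point before the text it inserted, so without it later fragments would land ahead of earlier ones. Here is a self-contained sketch of the same loop using temporary files; the helper name my/feeds-concat-fragments and the fragment names are hypothetical.

```emacs-lisp
;; Sketch only: the helper name and the fragment file names are hypothetical.
(defun my/feeds-concat-fragments (files directory)
  "Return the contents of FILES, each relative to DIRECTORY, joined in order."
  (with-temp-buffer
    (dolist (fragment files)
      ;; Move to the end first so each fragment is appended, not prepended.
      (goto-char (point-max))
      (insert-file-contents (expand-file-name fragment directory)))
    (buffer-string)))

(let ((dir (make-temp-file "ggs-demo" t)))
  ;; A string as the first argument makes `write-region' write that string.
  (write-region "first\n" nil (expand-file-name "0-a.ggsrc" dir))
  (write-region "second\n" nil (expand-file-name "1-b.ggsrc" dir))
  (my/feeds-concat-fragments '("0-a.ggsrc" "1-b.ggsrc") dir))
;; => "first\nsecond\n"
```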

DONE support ARROYO_FEED_URL and ARROYO_FEED_CATEGORY

  • State "DONE" from "NEXT" [2021-08-29 Sun 21:18]

NEXT deploy the file to the server (for now with shell script, eventually deploy nixos)

Universal Aggregator "top-matter"

ROOTDIR=/data/feeds
CACHE=/data/ua-cache
default_timeout=30
rss() {
    command 2000 "rss2json \"$1\" | ua-inline | maildir-put -cache ${CACHE} -redis -redis-db 1 -root ${ROOTDIR} -folder \"$2\""
}

Footmatter

(provide 'arroyo-feeds)