home

Pushing Haunt to Its Limits

December 12, 2022 ❖ Tags: writeup, programming, lisp, guile, scheme, webdev

When I started writing this article, I didn't mean to do anything more than describe a comment system I'd written in Guile. But as often happens when I write, I soon found myself disregarding that original scope and recording the history of every line of code I've written that's ever been run by a web server. I settled on allowing this to be an article about incorporating dynamic content into a Haunt site – a use-case that Haunt probably wasn't built to support, but which works surprisingly well due to Haunt configurations being ordinary Scheme programs.

A Reason to Demarcate "Dynamic" and "Static"

A comparison that's sometimes made in discussing personal websites is "dynamic" versus "static." To me, dynamic is something that requires server-side logic or rendering, and static is something that doesn't. A website built using Jekyll and hosted solely on Neocities or GitHub pages is static, at least by my definition, and a WordPress or Drupal blog is dynamic. The picture is muddied by the existence of third-party services like Disqus or utterances, but I digress. I make the distinction in this article because I have two physically isolated machines with distinct purposes: one is a static web server, and the other runs the software that's responsible for implementing the "dynamic" capabilities.

In particular, this is all firmly seated in my old man's home network setup, which he uses to host (among other things) a website, so he has a machine running Apache. The page you're viewing now is hosted on that machine, and so is my brother's website. We're able to host them all on the same machine thanks to name-based virtual hosts. On the same network is a separate Gentoo server that's running a couple of services for myself – because I don't like the idea of running experimental garbage on a server that belongs to my dad. Apache supports operation as a forward proxy, so I'm able to have dad's server mediate traffic between the WAN and my Gentoo box. Initially, I only used this for exposing my Pleroma instance to the internet, but I later added a handler for jakob.space/api/*, which hits the Guile application that this article is about.

The Beginning: An RSVP System

Earlier this year, I wanted to have some folks over to celebrate my 22nd birthday. I did what anyone in this situation would do and wrote my own RSVP system.

If you're just inviting a dozen or so people over, and don't otherwise have industrial-grade requirements, it's a simple thing to write yourself. I think the database schema does well to summarize the workings of the system:

CREATE TABLE IF NOT EXISTS events (
  id SERIAL,
  title varchar(128) NOT NULL,
  description varchar(16384) NOT NULL,
  datetime timestamp with time zone NOT NULL,
  location varchar(128) NOT NULL,
  PRIMARY KEY (id)
);

CREATE TABLE IF NOT EXISTS invitations (
  id SERIAL,
  vanity char(12) NOT NULL,
  comments varchar(1024) NOT NULL,
  created_on timestamp with time zone default current_timestamp,
  capabilities bigint NOT NULL,
  event_id integer NOT NULL,
  PRIMARY KEY (id)
);

CREATE TABLE IF NOT EXISTS rsvps (
  id SERIAL,
  vanity char(12) NOT NULL,
  invitation_id char(12) NOT NULL,
  event_id bigint NOT NULL,
  fullname varchar(128) NOT NULL,
  email varchar(256) NOT NULL,
  guests varchar(1024) NOT NULL,
  attending varchar(32) NOT NULL,
  PRIMARY KEY (id)
);

I wanted a way to communicate information about the event (where and when) to friends, and end up with a list of attendees so I'd know how many burgers and beers to get. The former is represented by the events table, and the latter is represented by the rsvps table.

The invitations table is a bit more interesting. Invitations are "magic links" containing a unique identifier (the value in the vanity column) for the person I'm inviting. When an RSVP is submitted, I keep track of which invitation code was used, so if someone's sharing invitation links with people I didn't invite myself, I know who to get mad at.1 Also, some people get more information than others. In this particular example, my friend was graduating on the same day, so we did a single celebration for the both of us. I wanted her to be able to see the list of attendees, but I didn't want anyone else to be able to see it, so I added a capabilities column. It's a bigint, and I use bit-flags to represent individual capabilities such as whether or not a particular invitee is able to view the guest list.

For the actual implementation of the RSVP system, I wrote some JavaScript to fetch the event info (via XMLHttpRequest), display it, generate a form, and when the user presses submit, send that off as a JSON object. It's a lot of uninteresting and unsurprising code, so I won't discuss it here, but you can find it here if you'd like to see how it works.

rsvp-screenshot.png

The JavaScript needs a server to interact with, which is where Guile comes into the picture. I chose to use the built-in (web server) module over a2 web framework like Artanis. The interface is simple to use: you call run-server on a lambda that takes a request as a parameter and returns a response. In this case, I have two endpoints, GET /api/rsvp/event-info and POST /api/rsvp, with different behavior, and I defer to some "handler" depending on which endpoint is being requested.

(use-modules (ice-9 match)
             (srfi srfi-1)
             (web request)
             (web response)
             (web server)
             (web uri))

(define (not-found request)
  "Build a (somewhat) descriptive response for a non-existent resource."
  (values (build-response #:code 404)
          (string-append "Resource not found: "
                         (uri->string (request-uri request)))))

(define (handle-api-request request body endpoint)
  "Route handler for the API server."
  (let ((method (request-method request))
        (originating-ip (assoc-ref (request-headers request) 'x-forwarded-for))
        (args (uri-query (request-uri request))))
    (if args
        (format #f "~a ~a (~a) (~a)" method endpoint args originating-ip)
        (format #f "~a ~a (~a)" method endpoint originating-ip)))
  (match (cons (request-method request) endpoint)
    ...
    (('GET "rsvp" "event-info") get-event-info)
    (('POST "rsvp") post-event-rsvp)
    ..
    (_ (lambda (. args) (not-found request)))))

(define (main-request-handler request body)
  "Server entry-point; parse `request' and defer to routing system."
  (let* ((path-encoded (uri-path (request-uri request)))
         (path (split-and-decode-uri-path path-encoded)))
    (define-values (response resp-body)
      (if (string= "api" (first path))
          (handle-api-request request body (drop path 1))
          (not-found request)))
    (values response resp-body)))

(run-server main-request-handler)

Some minutiae have been scrubbed from the above snippet, like appending a Access-Control-Allow-Origin header to the response and rate-limiting endpoints. I won't annotate every part of the RSVP system, but to give you a sense of how Guile acts as the "glue" between the JavaScript code and the Postgres tables, here's the code for post-event-rsvp:

(use-modules (json)
             (squee)
             (srfi srfi-1)
             (srfi srfi-9)
             (web request)
             (web response)
             (web uri))

(define (valid-receipt-code receipt)
  "Check database to see if `receipt'."
  (and (= (string-length receipt) (base64-length (%vanity-length)))
       (positive?
        (length
         (exec-query conn "SELECT * FROM rsvps WHERE vanity = $1"
                     (list receipt))))))

(define-record-type <rsvp-update>
  (make-rsvp-update-parameters)
  rsvp-update-parameters?
  (invitation-code rsvp-update-code      set-rsvp-update-code!)
  (name            rsvp-update-name      set-rsvp-update-name!)
  (email           rsvp-update-email     set-rsvp-update-email!)
  (attending       rsvp-update-attending set-rsvp-update-attending!)
  (guests          rsvp-update-guests    set-rsvp-update-guests!))

(define (params->rsvp-update params)
  "Parse `params', an alist, into a `<rsvp-update>'."
  (let ((res (make-rsvp-update-parameters)))
    (set-rsvp-update-code!      res (assoc-ref params "update"))
    (set-rsvp-update-name!      res (assoc-ref params "name"))
    (set-rsvp-update-email!     res (assoc-ref params "email"))
    (set-rsvp-update-attending! res (assoc-ref params "rsvp"))
    (set-rsvp-update-guests!    res (assoc-ref params "guests"))
    (if (any not
             (list (rsvp-update-code res)
                   (rsvp-update-name res)
                   (rsvp-update-email res)
                   (rsvp-update-attending res)
                   (rsvp-update-guests res)))
        #f
        res)))

(define (update-event-rsvp params)
  "Handler for updating an RSVP to an event."
  (let ((params (params->rsvp-update params)))
    (unless params
      (panic "invalid form data"))
    (unless (valid-receipt-code (rsvp-update-code params))
      (panic "invalid receipt code"))
    (exec-query conn
                "UPDATE rsvps SET fullname = $2, email = $3, attending = $4, guests = $5 WHERE vanity = $1"
                (list
                 (rsvp-update-code params)
                 (rsvp-update-name params)
                 (rsvp-update-email params)
                 (rsvp-update-attending params)
                 (rsvp-update-guests params)))
    (values '((content-type . (application/json)))
            (scm->json-string
             `((receipt . ,(rsvp-update-code params)))))))

(define (post-event-rsvp request body)
  "Entry point for RSVP create/update. We dispatch on the parameters."
  (let* ((params (json-string->scm (utf8->string body))))
    (cond ((assoc-ref params "update") (update-event-rsvp params))
          ...
          (else (panic "invalid invite/update code")))))

I'm using guile-squee to reach out to the Postgres database. It's a nice library and a great poster child for Guile's dynamic FFI interface, but it's little more than a wrapper around libpq – not a high-level interface. This is where the choice to use Guile has been a bit rough around the edges: you're on your own for a lot of pretty common tasks in web development world, like generating database queries or validating that a request is well-formed. And that's precisely what we're doing here. When handle-api-request calls out to post-event-rsvp, we figure out whether we're updating a previously submitted RSVP or submitting a new one (the code for that case has been omitted in the interest of brevity). In update-event-rsvp, I have to ensure the parameters are well-formed, parse them into a record, and then interpolate those into a SQL query that I wrote myself.

At the point this code was written, I hadn't written a line of Guile in about a year, and I was thinking of this more as throw-away code rather than something I'd be writing a blog post about. I wouldn't hesitate to describe it as especially ugly. The comment system, being newer, does a marginally better job of showing off the ways that Scheme allows you to be clever and avoid boilerplate:

(use-modules (json)
             (squee)
             (srfi srfi-1)
             (srfi srfi-19)
             (srfi srfi-26)
             (web request)
             (web response)
             (web uri))

(define-json-mapping <internal-comment>
  make-internal-comment
  internal-comment?
  json->internal-comment <=> internal-comment->json
  (id        internal-comment-id)
  (name      internal-comment-name)
  (subject   internal-comment-subject)
  (email     internal-comment-email)
  (comment   internal-comment-comment)
  (url       internal-comment-url)
  (publish-time
   internal-comment-publish-time
   "publish-time"
   (lambda (x) (string->date x "~Y~m~d ~H~M~S.~N"))
   (lambda (x) (date->string x "~Y-~m-~d ~H:~M:~S.~N")))
  (reactions internal-comment-reactions))

(define (get-comments-by-slug slug)
  "Internal function for querying the approved comments on a post

This interface exists for dynamically generating the comment view from Haunt."
  (define (make-internal-comment~ . args)
    (let* ((approved (first (take-right args 2)))
           (approved (string->date approved "~Y~m~d ~H~M~S.~N"))
           (reactions (last args))
           (reactions (if reactions
                          (with-input-from-string reactions read)
                          '())))
      (apply make-internal-comment
             (append (drop-right args 2) (list approved reactions)))))
  (let* ((query "SELECT id, name, subject, email, comment, url, approved, reactions
                 FROM comments WHERE slug = $1 and approved IS NOT NULL")
         (result (exec-query conn query (list slug))))
    (map (cut apply make-internal-comment~ <>) result)))

(define (get-comments request body)
  "API endpoint handler for querying for the comments on a particular post

This is a wrapper around `get-comments-by-slug'."
  (define (normalize-record record)
    (json-string->scm (internal-comment->json record)))
  (let* ((query-string (uri-query (request-uri request)))
         (params (if query-string
                     (decode-form query-string)
                     '()))
         (slug (assoc-ref params "p")))
    (unless slug (panic "missing `slug' query parameter"))
    (values '((content-type . (application/json)))
            (scm->json-string
             (list->vector
              (map normalize-record (get-comments-by-slug (car slug))))))))

Here, leveraging the fact that the result of exec-query is "close enough" to the parameters we would want to pass to the record constructor, and deferring to apply. Of course, this example has warts as well. I'm having to call normalize-record because, even though I've defined a JSON mapping for <internal-comment>, scm->json-string doesn't know how serialize the record – we can only serialize it if we call internal-comment->json. So, for every record, I serialize it into a JSON object and then immediately deserialize it into an alist – effectively to erase the type information and yield something that scm->json-string knows how to deal with. I have to call scm->json-string in this case because I'm dealing with a list of records.

There might be a better way to do this, but I haven't figured it out yet! And I think that's a fair summary of my experience writing this in Guile. Unlike other languages that see a lot of use in the web development world, in Scheme, there isn't a clear-cut "best way" or "best library" to do these sorts of things. It provides you with all of the tools you'd need to do things in a way that's beautiful and easily-understood, but if you're in a rush, and too lazy to sit down and generalize your problem, the code that you end up writing can be a bit hard on the eyes.

Had I not chosen Guile, the language I would have reached for is Rust, which has a nice ORM library (Diesel) for interacting with databases, and most Rust web frameworks will take care of parsing form data into a struct (and rejecting if the request is ill-formed.) There, you're afforded similar facilities for writing your own abstractions, but the community has given you some enough cookie cutters that you don't have to think about the basic things if all you want to do is write a web application.

One last thing I'd like to mention: this isn't entirely specific to Scheme. Some things I complained about above, like the dance I had to do with serializing JSON, might be easier to deal with if I were using a statically-typed language, but I'm certain the rest of it comes down to Guile having a smaller community. In other words: I don't think this is the fault of Scheme or Guile. With that, I hope I've nerd sniped someone into making a great new web framework for Guile.

But I think that's all there is to say about the RSVP system. It was an interesting-enough proof-of-concept for me to experiment with running more things on the server-side.

The Sequel: A Picture Gallery

Despite the rough edges in the code outlined above, from the perspective of my non-technical friends, the RSVP system worked flawlessly and I was able to throw a huge party. We ate a ton of food and drank a lot of beer. My friend Aaron was also nice enough to take the DSLR from me and take some pictures! I wanted to put them somewhere for everyone who went to be able to see, so I put them on my website and used the "magic link" approach again.

CREATE TABLE IF NOT EXISTS galleries (
  id SERIAL,
  vanity char(12) NOT NULL,
  title varchar(128),
  description varchar(4096) NOT NULL,
  datetime timestamp with time zone NOT NULL,
  PRIMARY KEY (id)
);

CREATE TABLE IF NOT EXISTS images (
  id SERIAL,
  vanity char(12) NOT NULL,
  title varchar(128),
  filename varchar(64) NOT NULL,
  thumb_filename varchar(64) NOT NULL,
  datetime timestamp with time zone NOT NULL,
  PRIMARY KEY (id)
);

The "gallery" system has the same architecture as the "RSVP" system: there's some JavaScript to fetch info about the gallery and render it client-side, and there's some Guile code running on my server for the JavaScript to interact with. The only part that was different is that now I was dealing with images.

(define (image-exists? file-name)
  (define (string/= a b) (not (string= a b)))
  (and (string/= file-name ".")
       (string/= file-name "..")
       (member file-name (scandir (%gallery-image-directory)))))

(define (read-image file-name)
  (let* ((ext (string-downcase (last (string-split file-name #\.))))
         (mime (cond ((string= ext "jpg") 'image/jpeg)
                     ((string= ext "png") 'image/png)
                     (else (error "Unknown MIME type.")))))
    (values `((content-type . (,mime)))
            (call-with-input-file (format #f "~a/~a" (%gallery-image-directory) file-name)
              (lambda (port)
                (get-bytevector-all port))))))

(define (get-image request body)
  (let* ((query-string (uri-query (request-uri request)))
         (params (if query-string
                     (decode-form query-string)
                     '()))
         (file-name (car (assoc-ref params "name"))))
    (unless (image-exists? file-name) (panic "invalid filename"))
    (read-image file-name)))

I have a legitimate complaint about the (web server) module – I cannot, for the life of me, figure out how to respond with a binary payload without reading the entire blob into memory first. This is a problem, because the photos coming off of the DSLR are massive and my server process was literally OOM'ing. Unable to resolve it in Guile, I eventually gave up and used Rust for the "hosting the images" part.

use ascii::AsciiString;
use std::fs;
use std::path::Path;

extern crate ascii;
extern crate tiny_http;

const BASE_DIR: &'static str = "/opt/gallery-images/";

fn get_content_type(path: &Path) -> &'static str {
    let extension = match path.extension() {
        None => return "text/plain",
        Some(e) => e,
    };

    match extension.to_ascii_lowercase().to_str().unwrap() {
        "gif" => "image/gif",
        "jpg" => "image/jpeg",
        "jpeg" => "image/jpeg",
        "png" => "image/png",
        "pdf" => "application/pdf",
        "htm" => "text/html; charset=utf8",
        "html" => "text/html; charset=utf8",
        "txt" => "text/plain; charset=utf8",
        _ => "text/plain; charset=utf8",
    }
}

fn main() {
    let server = tiny_http::Server::http("0.0.0.0:8069").unwrap();

    loop {
        let rq = match server.recv() {
            Ok(rq) => rq,
            Err(_) => break,
        };

        println!("{:?}: {:?}", chrono::offset::Local::now(), rq);

        let url = rq.url().to_string();
        let path = Path::new(BASE_DIR);
        let append = Path::new(&url);

        if let Ok(stripped) = append.strip_prefix("/static-ext/") {
            let path = path.join(stripped);
            if !path
                .canonicalize()
                .map(|x| x.starts_with(BASE_DIR))
                .unwrap_or(false)
                || path.is_dir()
            {
                let rep = tiny_http::Response::new_empty(tiny_http::StatusCode(404));
                let _ = rq.respond(rep);
            } else {
                let file = fs::File::open(&path);

                if file.is_ok() {
                    let response = tiny_http::Response::from_file(file.unwrap());

                    let response = response.with_header(tiny_http::Header {
                        field: "Content-Type".parse().unwrap(),
                        value: AsciiString::from_ascii(get_content_type(&path)).unwrap(),
                    });

                    let _ = rq.respond(response);
                } else {
                    let rep = tiny_http::Response::new_empty(tiny_http::StatusCode(404));
                    let _ = rq.respond(rep);
                }
            }
        } else {
            let rep = tiny_http::Response::new_empty(tiny_http::StatusCode(404));
            let _ = rq.respond(rep);
        }
    }
}

I added a second VirtualHost for a /static-ext/ path, which hits a process running that Rust snippet instead of the main Guile program. It may have been more sensible to just store the images on dad's server, since they're static content, but I've got a much bigger disk attached to my machine and didn't want to fill up his with a couple hundred DSLR photos.

That's all. The gallery was a simple extension of the code that I'd already written for keeping track of RSVPs. The whole file is under a hundred LOC.

Comments

That's a lot of words about two systems you're unlikely to ever interact with unless you're one of the half-dozen or so people I still hang out with. The more prominent change for regular readers is the re-introduction of a comment system. There have been comment systems on jakob.space (and its predecessors) in the past, but they predate the first commit in the blog repository, so I'm devoting a few sections to talking about them; the lessons learned (however few) from previous iterations are responsible for some of the decisions made in the current implementation.

The Previous Self-Hosted Comment System

Before Haunt, I used Hugo. And before that, I wasn't using a static site generator at all. From 2015 to 2018, my personal website was running on top of Flask and SQLite. It used to look like this.

Of course, I didn't have an especially compelling reason for my website to be running on Python. This was when I was picking up Python for a second time, and I heard that "building a website with Python" was an option. Using Flask was merely my attempt to understand what that meant. What I developed was a basic content management system: I'd write my content in HTML, commit it to the database by copy/pasting it into an SQLite GUI, and use Jinja2 to shoehorn that into some hand-crafted HTML templates.

This workflow is frankly better suited to a static site generator, which is why I eventually dropped my Flask codebase for Hugo. But picking up Hugo wasn't my first response to realizing that my choice of tech stack was overkill – what I did, instead, was take advantage of the power afforded to me. I developed a comment system.

I tried my damned hardest to find the code for it, but when I pulled the repository from a backup, I was reminded of how bad my git hygiene used to be.

commit 0ca36ee37903be95f915bddcd620f5e2786a67f7 (HEAD -> master)
Author: jakob <[redacted]>
Date:   Fri Oct 27 17:12:54 2017 -0400

    Last commit before redesign

commit 914ca5965e77bfce2642fed7cc2d14ab265cc714
Author: jakob <[redacted]>
Date:   Sat Jul 22 20:25:33 2017 -0400

    Redid showcases and blog format

commit 39fa9c5760617931eae73bc9a57e9f1d60945024
Author: jakob <[redacted]>
Date:   Sun Dec 4 16:57:42 2016 -0500

    Initial commit.

I'm convinced it's lost to time. What I do remember is that I wrote a crappy captcha system for it with Pillow; any form you put on the internet is inevitably going to get attention from many kinds of web spiders, so I did the bare minimum for taping it off. The algorithm was:

  • Pick n random characters
  • Write each character to a fixed-size canvas at a fixed x offset and a randomized y offset
  • Add some "noise" by drawing nonsensical lines across the canvas

Intuition tells me that this isn't a great approach to thwarting bots, but I can't really conclude that it was ineffective because I have no data: in the time that it was actively deployed, no one had commented on any of my posts. Not even spammers. I don't recall exactly why I discontinued the comment system, but I know that it happened prior to my switch to Hugo – the first post I made that received significant traffic3 was made before the switch to Hugo, and if I had kept it to the EOL of the Flask codebase, I think I would have remembered receiving a comment or two.4

Webmentions

There was no immediate replacement to the original comment system. For a while, my website was just the post archive and "about me" page. I eventually built a spiritual successor in my attempt at a Webmention integration. Though, this is functionally quite different: rather than being stored in a central database that I manage, Webmentions are distributed across web.

Generally speaking, Webmention is "just" a protocol for indicating to a website that you've linked to it, or otherwise mentioned it elsewhere. There tends to be some metadata associated with the content of the "mention," such as the name and website of the author, and what kind of interaction it is (a response, a bookmark, etc.) Furthermore, there are services like webmention.io to handle the protocol on your behalf, making it a fairly enticing option for a static website.

One could draw comparison to other services like Disqus which I am averse to as they require the user to load non-free JavaScript. Webmention, on the other hand, is an open standard, and services that interface with the protocol typically provide a readily-queried API. In the case of webmention.io, the service itself is free software.

I first implemented Webmention support circa 2019 over a weekend when I was taking a break at my parents' summertime cottage. The protocol was getting some attention on lobste.rs and HackerNews at the time, so I figured it would be a fun project. I was able to figure out just about everything I needed to know from Aaron Parecki's article. If you dig through that page enough, you'll find the first (and only?) Webmention reply-to that I've sent.

What I won't talk about here is the code I wrote to have Haunt generate a Webmention "outbox" for comments I wrote, because I don't think it's particularly interesting. If you're really curious, the source code is here.

I'd prefer to talk about being a consumer of Webmentions. My first attempt at an "integration" was this:

/*
 * webmention.js -- Fetch and display mentions from webmention.io.
 * Copyright © 2019 Jakob L. Kreuze [REDACTED]
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License as
 * published by the Free Software Foundation; either version 3 of the
 * License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program. If not, see
 * <http://www.gnu.org/licenses/>.
 */

function buildApiUri() {
  const snip = /https?:\/\/.*?\//.exec(window.location.href);
  const page = window.location.href.substring(snip[0].length);

  // TODO: Build up from aliases, etc.
  const aliases = [page];

  return Array.concat(
    ["https://webmention.io/api/mentions.jf2?"],
    aliases.flatMap((alias) => {
      const ref = "jakob.space";
      return [
        `target[]=http://${ref}/${alias}`,
        `target[]=https://${ref}/${alias}`
      ];
    }).join("&")
  ).join("");
}

function getData(url, callback) {
  if (fetch) {
    fetch(url).then(function(response) {
      if (response.status >= 200 && response.status < 300) {
        return Promise.resolve(response);
      } else {
        return Promise.reject(new Error(`Request failed: ${response.statusText}`));
      }
    }).then(function(response) {
      return response.json();
    }).then(callback);
  } else {
    let xhr = new XMLHttpRequest();
    xhr.onload = function(data) {
      callback(JSON.parse(data));
    }
    xhr.onerror = function(error) {
      throw new Error(`Request failed: ${error}`);
    }
  }
}

function makeComment(comment) {
  const strip = (uri) => {
    const sep = "://";
    return uri.substring(uri.indexOf(sep) + sep.length);
  };

  const element = (type, attributes) => {
    let res = document.createElement(type);
    for (const attribute in attributes) {
      res.setAttribute(attribute, attributes[attribute]);
    }
    return res;
  };

  // Create the h-card section.
  let avatar = element("img", {
    "class": "u-photo",
    "src": comment.author.photo
  });

  let authorName = element("a", {
    "class": "p-name u-url",
    "href": comment.author.url
  });
  authorName.innerHTML = comment.author.name;

  let authorURI = element("a", {
    "class": "author_url",
    "href": comment.author.url
  });
  authorURI.innerHTML = strip(comment.author.url);

  let hCardContainer = element("div", {
    "class": "p-author h-card author"
  });
  hCardContainer.appendChild(avatar);
  hCardContainer.appendChild(authorName);
  hCardContainer.appendChild(authorURI);

  // Create the content section.
  let contentContainer = element("div", {
    "class": "e-content p-name comment-content",
  });
  contentContainer.innerHTML = comment.content.text;

  // Create the metaline section.
  let time = element("time", {
    "class": "dt-published",
    "datetime": comment.published
  });
  time.innerHTML = (new Date(comment.published)).toString();

  let linkBack = element("a", {
    "class": "u-url",
    "href": comment.url
  });
  linkBack.appendChild(time);

  let metalineContainer = element("div", {
    "class": "metaline"
  });
  metalineContainer.appendChild(linkBack);

  // Put it all together.
  let wrapper = element("li", {
    "class": "p-comment h-cite comment",
  });
  wrapper.appendChild(hCardContainer);
  wrapper.appendChild(contentContainer);
  wrapper.appendChild(metalineContainer);

  return wrapper;
}

getData(buildApiUri(), (json) => {
  json.children
    .map(makeComment)
    .forEach((elem) => {
      document.getElementById("webmention-container").appendChild(elem);
    })});

This isn't anything to write home about. Vanilla JavaScript with no framework – the majority of the code is DOM manipulation. You're free to use it under the constraints of the license, if you really want to.

Given that so much is offloaded to the third-party service, having the little Webmention display at the bottom of my posts hasn't demanded much maintenance. For the most part, it was "set it and forget it." Though, I did come back to the script every once in a while to hack on it. My biggest issue, running this for a couple of years, was the noise. The wide variety of interaction types that can be represented as Webmentions means that only a fraction of pingbacks contribute to a dialog. Lots end up being likes or reposts or favorites, so I eventually made changes to separate the chaff from the wheat, showing those sorts of interactions as attenuated compared to the ones that carry meaningful information.

old-webmention-screenshot.png

This was certainly an improvement, but I was still unhappy with the signal-to-noise ratio, so I'm now hiding these sorts of interactions entirely. As nice as it is to be able to show off how popular you are, if it doesn't meaningfully contribute to the experience of the reader, I'd rather not clutter the page with it.

Another issue is that writing a Webmention is likely untenable for a lot of people who would be reading my website5, so I ended up leveraging another external service, commentpara.de, which hosts anonymous contents that then get syndicated via Webmention. I added a link to it above the Webmention section of every post, not expecting anyone to use it. But I turned out to be wrong about that. I've gotten some great comments through it, and that's been part of my motivation to work on the comment system I'm writing about below.

The last thing I'd like to say regarding Webmention support is that the JavaScript code described above has since been superseded. In preparation for the new comment system, I rewrote the rendering code in Guile so that the Webmention view could be generated statically – partly to reduce the load I was presumably putting on webmention.io, and partly so that folks who have JavaScript disabled could still see it.

A New Comment System

And so, motivated by a desire for more control, I started on a month-long side project to implement a comment system in Guile.

High-Level Architecture

Assuming that JavaScript is enabled was fine for something that'd only be used by my "real life" friends, but I'd like the comment system to be usable by folks who might have it disabled. Also, it was literally slow enough to warrant a loading animation. So for the comment system, I've tried to cut client-side rendering out of the picture as much as possible. Haunt now generates a comment form at the bottom of each article which is hidden by default. If JavaScript is enabled, it's shown to the user along with a button to fetch a captcha challenge using XMLHttpRequest. Otherwise, it's replaced by a link, leading to a server-side rendered page with the captcha challenge "baked in."

Why? Because, as we'll see, the captcha generation procedure is somewhat expensive. Considering the ratio of of "reader" traffic to "commenter" traffic, I didn't want to have to generate one on every page load.

web-server-traffic.jpg

Unlike the forms in the systems I described previously, there's no JavaScript involved in sending a comment to the API server. It's a regular HTML form that POSTs some application/x-www-form-urlencoded data on submission. The Guile API server receives that and inserts it into the Postgres database.

CREATE TABLE comments(
    id SERIAL PRIMARY KEY,
    approved TIMESTAMP,
    submitted TIMESTAMP NOT NULL,
    slug VARCHAR(100) NOT NULL,
    name VARCHAR(50) NOT NULL,
    subject VARCHAR(100),
    email VARCHAR(100),
    url VARCHAR(100),
    comment VARCHAR(1024) NOT NULL,
    reactions VARCHAR(1024)
);

Comments are rendered by Haunt – the builder queries the Postgres database at build-time and generates a list of alists.

(define (format-comment comment)
  "Format `comment', an alist, as SXML for a comment-type interaction"
  (define (strip uri)
    "Attempt to remove any sort of protocol specification from `uri'"
    (let* ((needle "://")
           (index (string-contains uri needle)))
      (if index
          (strip (substring uri (+ index (string-length needle))))
          uri)))
  (let* ((author-name (assoc-ref comment 'name))
         (author-url (assoc-ref comment 'url))
         (author-photo (gravatar-url (assoc-ref comment 'email)))
         (publish-datetime (assoc-ref comment 'publish-time))
         (content-text (assoc-ref comment 'comment))
         (content-reactions (assoc-ref comment 'reactions)))
    `(li (@ (class "p-comment h-cite comment comment-source-internal"))
         (img (@ (class "comment-source-identifier")
                 (alt "Icon for comments posted on jakob.space")
                 (src "/static/image/lambda.svg")))
         (div (@ (class "p-author h-card author"))
              (img (@ (class "u-photo") (src ,author-photo)))
              (span (@ (class author-name)) ,author-name)
              ,@(if author-url
                    `((a (@ (class "author-url")
                            (href ,author-url))
                         "(" ,(strip author-url) ")"))
                    `()))
         (div (@ (class "e-content p-name comment-content"))
              ,@(chain content-text
                       (safe-markdown->sxml _)))
         (div (@ (class "metaline"))
              (time (@ (class "dt-published")
                       (datetime ,publish-datetime))
                    ,(date->string
                      (string->date publish-datetime "~Y~m~d ~H~M~S.~N")
                      "~B ~e, ~Y at ~H:~M")))
         (ul (@ (class "comment-reactions"))
             ,@(map (match-lambda
                      ((emote . count)
                       `(li ,(format #f "~a (~a)" emote count))))
                    content-reactions)))))

(define (render-comment-view response)
  "Render `response', the output of `fetch-webmentions', as SXML"
  (map format-comment response))

Doing this at build-time has an obvious limitation, but I find it's a feature in my case: new comments don't appear until the site is re-built. In this comment system, though, individual comments aren't visible unless I've manually approved them (i.e, approved field is non-null), so I just re-build the site whenever I approve a comment.

I anticipate a relatively low-volume of usage, so manually approving comments should be feasible, and I trust myself to act in good faith with moderating comments. I basically just don't want spam or hate speech.

As it turns out, I'm not the first to take this approach for implementing comments on a static site. The main difference between Derek's implementation and mine is that, because I'm approving comments manually, there isn't a need for the PostgreSQL trigger. Also, I rebuild the entire website instead of using XHR or a server-side include to update the comments view.

Avatars

I'm taking the usual approach for the little avatar pictures that appear next to each comment: if you provide an email address when submitting a comment, my site will prod the Gravatar API for whatever icon you may have linked to it.

(define (gravatar-url email)
  (chain email
         (string-downcase _)
         (string-trim-both _)
         (string->bytevector _ "utf8")
         (bytevector-hash _ (lookup-hash-algorithm 'md5))
         (bytevector->base16-string _)
         (format #f "https://www.gravatar.com/avatar/~a" _)))

Otherwise, you get one of a couple fallback icons that I generated with Stable Diffusion. The prompt was something like "avatar for anonymous user." I don't remember what it was exactly since I generated them with my friend's beefy gaming computer after we'd had a few drinks.

Markdown

I wanted to support rich text in comments, so I pulled in guile-commonmark. I'm filtering out images because that's ripe for abuse – more so than links, in my experience.

(define (safe-markdown->sxml text)
  "Convert TEXT to an sxml form filtering out any unsafe entities"
  (define (sanitize sexp)
    (cond ((and (list? sexp)
                (positive? (length sexp))
                (eqv? 'img (car sexp)))
           #f)
          ((list? sexp)
           (filter identity (map sanitize sexp)))
          (else sexp)))
  (sanitize (commonmark->sxml text)))

guile-commonmark is packaged in my Gentoo overlay since it isn't in the upstream Gentoo repos.

Captcha System

In some sense, I was motivated to work on the comment system because I thought that implementing a captcha system would be fun.

For the first half of this year, I was still living at home with my parents, and my brother was still in high school. I was teaching him the AP calculus curriculum that he wasn't learning in school (mainly series stuff), and I was helping him to get good at calculating definite integrals, which I realized was mechanical enough to make for a good captcha system. Having just completed the symbolic differentiation exercises in SICP, I decided to hack one together as a proof-of-concept, which I shelved since I had to focus on more important things.

This is one of those ideas that I know isn't original, but I can't exactly remember where I heard about it. Years ago, maybe, I read somewhere about an engineering forum that required users to calculate some sort of calculus problem before they could register for an account. I know that the DoomWorld forums at some point had a topical captcha problem where you had to provide the name of a randomly-chosen Doom sprite to register an account. Those are the two likely influences I had in coming up with this.

The algorithm generates a random function and evaluates it on some randomly-chosen boundary. To generate the challenge, it symbolically differentiates the function, renders it with LaTeX, and sends that to the user. If their answer is "close enough," then they're allowed to post.

(define (random-term)
  (match (random 5)
    (0 `(* ,(+ 1 (random 10)) x))
    (1 `(* ,(+ 1 (random 10)) (expt x ,(random 10))))
    (2 `(* ,(+ 1 (random 10)) (exp x)))
    (3 `(* ,(+ 1 (random 10)) (cos x)))
    (4 `(* ,(+ 1 (random 10)) (sin x)))))

(define (sexp->latex sexp)
  (match sexp
    (('+ rest ...) (string-join (map sexp->latex rest) " + "))
    (('* rest ...) (string-join (map sexp->latex rest) " \\cdot "))
    (('sin term) (format #f "\\sin(~a)" (sexp->latex term)))
    (('cos term) (format #f "\\cos(~a)" (sexp->latex term)))
    (('expt term n) (format #f "~a^{~a}" (sexp->latex term) (sexp->latex n)))
    (('exp term) (format #f "e^{~a}" (sexp->latex term)))
    ('x "x")
    (n (cond ((and (number? n) (positive? n)) (format #f "~a" n))
             ((and (number? n) (negative? n)) (format #f "(~a)" n))
             ((number? n) "0")
             (else (error "Do not know how to convert to latex." n))))))

(define (differentiate-sexp sexp)
  (match sexp
    (('+ rest ...) `(+ ,@(map differentiate-sexp rest)))
    (('* coeff term) (if (number? coeff)
                         `(* ,coeff ,(differentiate-sexp term))
                         (error "Do not know how to differentiate.")))
    (('sin term) `(* ,(differentiate-sexp term) (cos ,term)))
    (('cos term) `(* -1 ,(differentiate-sexp term) (sin ,term)))
    (('exp term) `(* ,(differentiate-sexp term) (exp ,term)))
    (('expt term n) `(* ,n (expt ,term ,(- n 1))))
    ('x 1)
    (n (if (number? n)
           0
           (error "Do not know how to differentiate." n)))))

(define (simplify-sexp sexp)
  (match sexp
    (('+ rest ...) `(+ ,@(map simplify-sexp rest)))
    (('* 1 term) (simplify-sexp term))
    (('* 1 rest ...) (simplify-sexp `(* ,@rest)))
    (('sin term) `(sin ,(simplify-sexp term)))
    (('sin term) `(cos ,(simplify-sexp term)))
    (('exp term) `(exp ,(simplify-sexp term)))
    (('expt term 1) (simplify-sexp term))
    (('expt term n) `(expt ,(simplify-sexp term) ,(simplify-sexp n)))
    (term term)))

(define (random-expression)
  (let ((n-terms (+ 2 (random 3))))
    `(+ ,@(map (lambda (x) (random-term)) (iota n-terms)))))

(define (new-captcha!)
  (let* ((lower-bound (random 10))
         (upper-bound (+ lower-bound 1 (random 9)))
         (expression (random-expression))
         (latex-src (sexp->latex (simplify-sexp (differentiate-sexp expression))))
         (solution (- (local-eval expression (let ((x upper-bound)) (the-environment)))
                      (local-eval expression (let ((x lower-bound)) (the-environment)))))
         (id (dequeue-id! tex-challenge-id-queue)))
    (hash-set! tex-challenges id solution)
    (values id
            (latex->image (format #f "\\int_{~a}^{~a} ~a \\, dx"
                                  lower-bound
                                  upper-bound
                                  latex-src)))))

(define (validate-captcha! user-answer id)
  (define epsilon 0.01)
  (let ((solution (hash-ref tex-challenges id))
        (id-allocated (not (member id (id-queue-free tex-challenge-id-queue)))))
    ;; FIXME: The predictable IDs means that its' easy for someone to screw with
    ;; someone elses' captcha challenge (by invalidating it before they can
    ;; submit it). Given the combination of our reaping algorithm and
    ;; rate-limiting, does it make sense to only release the ID when the
    ;; response is correct?
    (format #t "Here: Solution was ~a and id-allocated is ~a~%" solution id-allocated)
    (when (and solution id-allocated)
      (release-id! id tex-challenge-id-queue))
    (and solution
         id-allocated
         (<= (/ (abs (- solution (string->number user-answer)))
                solution)
             epsilon))))

I was originally checking to see that the distance between the user's solution and the correct solution was less than epsilon, but a mathematically-inclined friend suggested using relative distance, which I've found to be much more reliable.

Of course, I might have someone who wants to comment, but is visually impaired! So I have an alternative captcha system based loosely on the "proof-of-work" system that's used by Bitcoin.

;; How many zeroes the SHA-256 hash has to be prefixed by to be a valid proof of work.
(define %hardness 8)

(define (new-proof-of-work-challenge!)
  (let ((id (dequeue-id! pow-challenge-id-queue))
        (challenge (base64-encode (gen-random-bv 32))))
    (hash-set! pow-challenges id challenge)
    (values id challenge)))

(define (validate-proof-of-work! prefix challenge-id)
  (define zero-prefix (string-join (map (lambda (_) "0") (iota %hardness)) ""))
  (when (member challenge-id (id-queue-free pow-challenge-id-queue))
    (panic "No such challenge ID"))
  (let* ((challenge (hash-ref pow-challenges challenge-id))
         (hash-value (chain (list prefix challenge)
                            (string-concatenate _)
                            (string->bytevector _ "utf8")
                            (bytevector-hash _ (lookup-hash-algorithm 'sha256))
                            (bytevector->base16-string _))))
    ;; Invariant from `unless' form:
    ;; (not (member challenge-id (id-queue-free pow-challenge-id-queue)))
    (when challenge
      (release-id! challenge-id pow-challenge-id-queue))
    (and (= 32 (string-length prefix))
         (string-prefix? zero-prefix hash-value))))

(define (make-pow-challenge! request body)
  "API endpoint handler for requesting a proof-of-work challenge"
  (let-values (((challenge-id nonce) (new-proof-of-work-challenge!)))
    (values '((content-type . (application/json)))
            (scm->json-string
             `((hardness . ,%hardness)
               (challenge-id . ,challenge-id)
               (nonce . ,nonce))))))

Regardless of whether you're using the client-side rendered form or the server-side rendered form, this requires JavaScript.

function makeid(length) {
  let result = '';
  let alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
  for (let i = 0; i < length; i++) {
    result += alphabet.charAt(Math.floor(Math.random() * alphabet.length));
  }
  return result;
}

async function digestMessage(message) {
  const msgUtf8 = new TextEncoder().encode(message);
  const hashBuffer = await crypto.subtle.digest('SHA-256', msgUtf8);
  const hashArray = Array.from(new Uint8Array(hashBuffer));
  const hashHex = hashArray.map((b) => b.toString(16).padStart(2, '0')).join('');
  return hashHex;
}

async function findPrefix(hardness, nonce) {
  while (true) {
    let prefix = makeid(32);
    let digestHex = await digestMessage(prefix + nonce);
    if (digestHex.startsWith("0".repeat(hardness))) {
      return prefix;
    }
  }
}

function raceEndpoint() {
  return new Promise(function (resolve, reject) {
    makeRequest("GET", "/api/challenge/proof-of-work")
      .then(function (data) {
        let challengeData = JSON.parse(data);
        findPrefix(challengeData.hardness, challengeData.nonce)
          .then((prefix) => { resolve([prefix, challengeData["challenge-id"]]) } );
      })
      .catch(reject);
  });
}

Though, I suppose it is theoretically solvable without JavaScript. I could come up with a Perl or Python one-liner that would be attached to the form, and you'd copy into the terminal to do the proof-of-work, and then you could copy/paste the output into a form element. This is more work than I'm interested in doing at the moment, so I'm just hopeful that someone isn't visually impaired and averse to enabling JavaScript on my website.

I had previously planned to implement this in a completely stateless fashion. For the integral problem, the "answer" would be encrypted and MAC'd with gcrypt. Though, the guile-gcrypt bindings don't expose the symmetric cipher interface, so I had the option of either hacking on guile-gcrypt or generating some other bindings. I went with the other option, and wrote some bindings for Nettle.

;;; Copyright © 2019 - 2022 Jakob L. Kreuze <zerodaysfordays@sdf.org>
;;;
;;; This program is free software; you can redistribute it and/or
;;; modify it under the terms of the GNU General Public License as
;;; published by the Free Software Foundation; either version 3 of the
;;; License, or (at your option) any later version.
;;;
;;; This program is distributed in the hope that it will be useful,
;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
;;; General Public License for more details.
;;;
;;; You should have received a copy of the GNU General Public License
;;; along with this program. If not, see
;;; <http://www.gnu.org/licenses/>.

(define-module (jakob dynamic aes)
  #:use-module (gcrypt random)
  #:use-module (system foreign)
  #:use-module (rnrs bytevectors)
  #:export (new-captcha))

(define %libnettle (dynamic-link "libnettle"))
(define %aes256-rounds 14)

(define %aes-key-size (/ 256 8))
(define %aes-block-size (/ 128 8))
(define %tex-aes-key (gen-random-bv %aes-key-size))

;; Assume `data' is a bytevector of length `%aes-block-size'.
(define (aes-256-encrypt key data)
  (let ((dest (make-bytevector %aes-block-size))
        (%ctx (make-c-struct (map (lambda (_) uint32) (iota (* 4 (+ 1 %aes256-rounds))))
                             (map (lambda (_) 0) (iota (* 4 (+ 1 %aes256-rounds))))))
        (%aes-256-encrypt
         (pointer->procedure void
                             (dynamic-func "nettle_aes256_encrypt" %libnettle)
                             (list '* size_t '* '*)))
        (%aes-256-set-encrypt-key
         (pointer->procedure void
                             (dynamic-func "nettle_aes256_set_encrypt_key" %libnettle)
                             (list '* '*))))
    (%aes-256-set-encrypt-key %ctx (bytevector->pointer %tex-aes-key))
    (%aes-256-encrypt %ctx %aes-block-size (bytevector->pointer dest) (bytevector->pointer data))
    dest))

(define (aes-256-decrypt key data)
  (let ((dest (make-bytevector %aes-block-size))
        (%ctx (make-c-struct (map (lambda (_) uint32) (iota (* 4 (+ 1 %aes256-rounds))))
                             (map (lambda (_) 0) (iota (* 4 (+ 1 %aes256-rounds))))))
        (%aes-256-decrypt
         (pointer->procedure void
                             (dynamic-func "nettle_aes256_decrypt" %libnettle)
                             (list '* size_t '* '*)))
        (%aes-256-set-decrypt-key
         (pointer->procedure void
                             (dynamic-func "nettle_aes256_set_decrypt_key" %libnettle)
                             (list '* '*))))
    (%aes-256-set-decrypt-key %ctx (bytevector->pointer %tex-aes-key))
    (%aes-256-decrypt %ctx %aes-block-size (bytevector->pointer dest) (bytevector->pointer data))
    dest))

;; (utf8->string (aes-256-decrypt %tex-aes-key (aes-256-encrypt %tex-aes-key (string->utf8 "AAAABBBBC hello!"))))
;; => "AAAABBBBC hello!"

Obviously this isn't incredibly secure (we're loading private keys into regular memory, for example) but this is a fairly low-stakes cryptographic application.

That said, I ended up dropping the cryptography entirely, because there's no way to implement this correctly without state. You can imagine some sort of "expiry" date after which a captcha is invalidated that you can MAC, but that leaves a period of time between when the user (or adversary) solves the captcha and when it expires that they are free to submit as many comments as the want. Some state is necessary for realizing if we've seen a captcha before. So I maintain a hash table mapping challenge numerical IDs to solutions.

(define-record-type <id-queue>
  (make-id-queue mutex min-free-threshold free-ids allocated-ids)
  id-queue?
  (mutex              id-queue-mutex)
  (min-free-threshold id-queue-min-free-threshold)
  (free-ids           id-queue-free      set-id-queue-free!)
  (allocated-ids      id-queue-allocated set-id-queue-allocated!))

(define* (make-queue n #:key (min-free-threshold 32))
  "Construct a stateful queue for tracking captcha IDs

The parameter N specifies how many free IDs should initially be allocated. The
optional keyword argument MIN-FREE-THRESHOLD specifies when `dequeue-id!' should
iterate through the allocated list and free anything exceeding an
internally-defined `time-to-live-seconds'."
  (make-id-queue (make-mutex) min-free-threshold (iota n) (list)))

(define (append-to-free-queue! id queue)
  "Add ID to the end of the free list of QUEUE"
  (set-id-queue-free!
   queue
   (append! (id-queue-free queue) (list id))))

(define (remove-from-free-queue! id queue)
  "Remove ID from the free list of QUEUE"
  (set-id-queue-free! queue (delete! id (id-queue-free queue))))

(define (append-to-allocated-queue! id queue)
  "Add ID to the end of the allocated list of QUEUE"
  (set-id-queue-allocated!
   queue
   (append! (id-queue-allocated queue) (list (list id (current-time))))))

(define (remove-from-allocated-queue! id queue)
  "Remove ID from the allocated list of QUEUE"
  (set-id-queue-allocated!
   queue
   (filter! (lambda (x) (not (equal? id (car x))))
            (id-queue-allocated queue))))

(define (release-id! id queue)
  "Release ID to the free list of QUEUE"
  (with-mutex (id-queue-mutex queue)
    (assert (find (lambda (x) (equal? id (car x))) (id-queue-allocated queue)))
    (assert (not (member id (id-queue-free queue))))
    (append-to-free-queue! id queue)
    (remove-from-allocated-queue! id queue)))

(define (dequeue-id! queue)
  "Draw a random ID from QUEUE and mark it as allocated"
  (define time-to-live-seconds (* 20 60))
  (with-mutex (id-queue-mutex queue)
    ;; Initial pass to "unintrusively" free any stale IDs.
    (when (< (length (id-queue-free queue))
             (id-queue-min-free-threshold queue))
      (for-each
       (match-lambda
         ((id created-time)
          (when (>= (- (time-second (current-time))
                       (time-second created-time))
                    time-to-live-seconds)
            (remove-from-allocated-queue! id queue))))
       (list-copy (id-queue-allocated queue))))
    ;; If we're still over the threshold, we'll need to be more intrusive.
    ;; Ideally, this is avoided by rate-limiting.
    (when (< (length (id-queue-free queue))
             (id-queue-min-free-threshold queue))
      (let* ((to-take (- (id-queue-min-free-threshold queue)
                         (length (id-queue-free queue))))
             (to-free (map car (take (id-queue-allocated queue) to-take))))
        (set-id-queue-allocated! queue (drop (id-queue-allocated queue) to-take))
        (set-id-queue-free! queue (append! (id-queue-free queue) to-free))))
    (let* ((n (random (length (id-queue-free queue))))
           (id (list-ref (id-queue-free queue) n)))
      (remove-from-free-queue! id queue)
      (append-to-allocated-queue! id queue)
      id)))

And this solves most of the problems. Of course, you can adversarially fill the hash table with junk in hopes of evicting someone else's challenge, but my hope is that rate-limiting will prevent that from happening, in general.

The Future of Comments

This is just the first implementation of the system. Seeing how it performs in practice will probably motivate some changes later on. Already, though, there are a few "to do" items that I have in mind.

The ability to reply to other comments is the main feature I think I overlooked. I've added this to the database schema so that I don't have to deal with migrations when I come back to it. That'll require that I render comments as a tree rather than a list, which I expect to be a nontrivial change.

There's also a reaction capability, which is implemented, but right now it's only exposed via an API endpoint – there's no nice button that you can click on to react to a comment. Between now and when I get around to implementing the UI, you can manually hit the endpoint all you want. I don't care. Though, it is severely rate-limited.

Finally, the captcha system is pretty subpar. It was a good idea in my head, but my algorithm isn't smart enough to choose constants that are nice to work with. In particular, to actually evaluate the integral, it's likely that you need to calculate awful expressions like $\sin(7)$. I'll either write a better algorithm, or drop the calculus idea entirely and opt for something a little less rote. Perhaps identifying equal angles in a diagram, or pwning a vulnerable binary running in a v86 virtual machine.

On The Choice to Use Postgres

I already had a database instance for Pleroma. If it wasn't already there on my server, I would have considered SQLite.

Fitting This Together with Haunt

For the same performance reason I gave for having the captcha behind a button that you need to click, I wanted to keep Haunt around. I also just think it's a great piece of software. This site is still generated by Haunt – there just happens to now be a separate component for implementing a bit server-side logic, and Apache will forward traffic to that separate component if it's a request for something besides the static content that makes up the majority of my site.

The nice thing about Haunt is that "sites," or configurations in Haunt, are just Guile programs. Because I chose Guile for the "dynamic" part of my website, I was able to import most of the code I'd written for my Haunt configuration. So far, I've only used this in one place: the server-side rendered comment form. I'm able to import the function I use in Haunt to add the navigation bar, footer, and CSS, and use that for the form, as well as all of the SXML utility functions I wrote.

To give a very underwhelming conclusion: being able to import Haunt stuff and use it to generate the SXML that's output by a Guile (web server) program is pretty cool.

Footnotes:

1

Of course, the times someone's shown up uninvited, they didn't RSVP in the first place. In retrospect, it was a bit silly to think that this would work the way I expected it to.

2

The.. only?

3

According to Apache logs, at least. I don't use any sort of analytics. If you're really curious, it was my first post about Rust.

4

As an anecdote, I had folks reach out about the post via email, which I see this as far less convenient than filling out a comment form. Hence, I conclude that it's likely I removed the comment functionality by this point.

5

If you read Aaron's article that I linked above, you'll see that writing a Webmention is very technical. You need a website of your own, and enough patience to generate a bunch of complicated HTML, just to write a comment.

Comments for this page

  • Icon for comments posted on jakob.space
    Peter(127.0.0.1)

    best captcha ever!

    • Icon for comments posted on jakob.space
      ckie()

      the comment form looks quite overwhelming to me. needs less input fields and buttons! also it'd be neat if the captcha appeared once the user started interacting with a form field. overall, nice post (: it'd be neat if this was a resizeable textarea, and I just clicked the captcha and it's hilarious :D

      • 👆 (1)
      • 😄 (1)
    • Icon for comments posted on jakob.space
      Pard()

      The react box is too small to even shows a single row of emojis completely. It'd be nice if it had a handle like textboxes do to make the box larger.

      • Icon for comments posted on jakob.space
        me()

        I think the no-js commenting is broken, I am getting: Resource not found: /api/comment-form/pushing-haunt-to-its-limits on the page I am redirected to (https://jakob.space/api/comment-form/pushing-haunt-to-its-limits). As for the article, interesting read :) For the images, one possible solution without introducing rust could be X-Sendfile header.

        • Icon for comments posted on jakob.space via I2P;
The I2P logo belongs to The I2P Project, and is licensed under the CC BY 4.0
          Jakob(mtgqjdrqqpovyh...dd3hnq.b32.i2p/)

          Also I2P!

          • Icon for comments posted on jakob.space via Tor;
The Tor logo belongs to The Tor Project, Inc. and is licensed under the CC BY 3.0 US
            Jakob(jakobyallfrbd3...yiwp3adid.onion)

            The comment endpoints are now available via Tor.

            • Icon for comments posted on jakob.space
              theorytoe(theoryware.net)

              That captcha is hilarious, was laughing for a solid minute. Ive been wanting to incorporate a comments system on my site for some time now. I found out about remark42 some time ago, but that really did not work for my needs. However with a similar system I might be able to get something working.

              • 🦧 (1)
              • 🦃 (1)
              • 😂 (1)

            Click here to write a comment on this post.