Blog :: Eschewing An Established Convention Because We Can
Nearly every programming language has some flavor of "structure" declaration, where the programmer can conjure a new type representing the composition of other primitive types or structures, neatly organized into named fields. While the concept is practically universal, the way the programmer defines and interacts with these types can vary dramatically between languages. In Scheme, such structures are called "records". I found that the existing facilities left a lot to be desired, so I wrote a few macros to provide an experience more akin to Common Lisp.
There are two standards we care about when it comes to how records ought to look in Scheme. R⁶RS has a very comprehensive way of defining records, which goes as far as to specify inheritance mechanisms. Despite Guile supporting R⁶RS, I don't think I've run into any Guile code that uses them. I've only seen programmers reach for the SRFI-9 API, which is what R⁷RS chose to standardize on. SRFI-9 is very minimal and, as a result, very verbose. For every record definition, in addition to the name of the record type and the names of its individual fields, the programmer has to spell out:
- The name of the constructor
- The name of the predicate function
- The names of the accessor procedures for every field
- The names of the mutator procedures for every field that should be modifiable
A record containing two fields looks like this:
(define-record-type <person>
  (make-person height weight)
  person?
  (height person-height)
  (weight person-weight))
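To make the list above concrete, here is a minimal sketch of the same record with a mutator added for one field, followed by calls to the generated procedures. The name set-person-weight! and the sample values are assumptions made for illustration; they are not part of the original program.

```scheme
(use-modules (srfi srfi-9))

;; Same record as above, with a hypothetical mutator for `weight`.
(define-record-type <person>
  (make-person height weight)
  person?
  (height person-height)
  (weight person-weight set-person-weight!))

(define p (make-person 180 75))

(person? p)                 ; predicate
(person-height p)           ; accessor
(set-person-weight! p 80)   ; mutator
(person-weight p)           ; reads the updated field
```

Every one of those procedure names had to be spelled out by hand in the definition.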
There is some elegance in this, which arises from the fact that this record system is very easily realized in R⁵RS Scheme. Including comments, the reference implementation provided with the SRFI is fewer than 200 lines of code. This is where the elegance ends, however, as use of this interface quickly devolves into a great deal of repeating oneself. Changing a field involves updating the constructor, updating the corresponding field specification, and replacing 2-3 symbols across my program. My primary criticism, however, is how inconvenient it is to actually use an instance of a record. In writing my RSVP app, I made several efforts to ameliorate the deficiencies of the SRFI-9 records interface, none of which I was particularly happy with at the time. To demonstrate, I've rewritten a small snippet of the program.
(define-record-type <rsvp-create>
  (make-rsvp-create-parameters)
  rsvp-create-parameters?
  (invitation-code rsvp-create-code)
  (name rsvp-create-name)
  (email rsvp-create-email)
  (attending rsvp-create-attending)
  (guests rsvp-create-guests))

(define (create-new-event-rsvp params)
  "Handler for RSVP'ing to an event."
  (let* ((receipt-code (generate-vanity-code))
         (invitation-code (rsvp-create-code params))
         (name (rsvp-create-name params))
         (email (rsvp-create-email params))
         (attending (rsvp-create-attending params))
         (guests (rsvp-create-guests params))
         (event-id (car (invitation->event-id invitation-code))))
    (exec-query conn "INSERT INTO rsvps (vanity, invitation_id, event_id,
fullname, email, attending, guests) VALUES ($1, $2, $3, $4, $5, $6, $7)"
                (list receipt-code invitation-code event-id name email
                      attending guests))
    (values '((content-type . (text/html)))
            (sxml->html-string
             (theme #:title "Thanks for RSVPing!"
                    #:content (render-event-rsvp-receipt receipt-code))))))
Serializing a record to a database is something that arises frequently in any sort of "back end" type code, in which I often find myself referencing all or most of the fields in the record. Writing code like the above is as unpleasant as writing code which destructures a list with car, cadr, and the other nonsensical amalgamations of the alphabet, except that I find it even more bothersome. I'm repeating the name of every field twice, repeating the name of the type, and repeating the name of the variable I'm trying to destructure.
In contemporary Scheme, as well as in many of its successor languages, syntactic destructuring is the abstraction of choice to dispel tortuous use of car and cdr. Guile provides (ice-9 match) for this, which conveniently also supports the destructuring of records.[1]
(define (create-new-event-rsvp params)
  "Handler for RSVP'ing to an event."
  (match params
    ;; One pattern variable per declared field, in declaration order.
    (($ <rsvp-create> invitation-code name email attending guests)
     (let ((receipt-code (generate-vanity-code))
           (event-id (car (invitation->event-id invitation-code))))
       (exec-query conn "INSERT INTO rsvps (vanity, invitation_id, event_id,
fullname, email, attending, guests) VALUES ($1, $2, $3, $4, $5, $6, $7)"
                   (list receipt-code invitation-code event-id name email
                         attending guests))
       (values '((content-type . (text/html)))
               (sxml->html-string
                (theme #:title "Thanks for RSVPing!"
                       #:content (render-event-rsvp-receipt receipt-code))))))))
This is certainly much easier to type than explicit invocation of the accessor functions, but I still find it somewhat unwieldy for all the logic to live in the match arm. Also, this procedure will only be called with an instance of <rsvp-create>, and I would like to make that immediately clear. At least when I read code, I see a match expression and expect it to represent a logical branch, rather than simply accessing the fields of something.
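The record pattern can be tried in isolation. The following is a minimal sketch using the two-field <person> record from earlier; describe-person is a hypothetical helper introduced only for this example.

```scheme
(use-modules (srfi srfi-9) (ice-9 match))

(define-record-type <person>
  (make-person height weight)
  person?
  (height person-height)
  (weight person-weight))

;; ($ <type> pat ...) matches an instance of <type> and binds its
;; fields positionally, in the order they were declared.
(define (describe-person p)
  (match p
    (($ <person> height weight)
     (list height weight))))

(describe-person (make-person 180 75))
```

Because the binding is positional, the pattern needs one variable per field, in declaration order.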
Common Lisp, specifically the Common Lisp Object System (CLOS), provides a macro called with-accessors to do something similar to what we did with the let form, albeit in more concise notation. Thanks to Scheme's flexibility, we can define our own crude approximation of the CL macro.
(define-syntax with-accessors
  (syntax-rules ()
    ((with-accessors rtd (field ...)
       body ...)
     ;; Evaluate the record expression once and reuse the result.
     (let ((rtd-eval rtd))
       (let ((field ((record-accessor (record-type-descriptor rtd-eval) 'field) rtd-eval))
             ...)
         body ...)))))

(define-syntax call-with-accessors
  (syntax-rules ()
    ((call-with-accessors rtd
       (lambda (field ...)
         body ...))
     (let ((rtd-eval rtd))
       (let ((field ((record-accessor (record-type-descriptor rtd-eval) 'field) rtd-eval))
             ...)
         body ...)))))
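As a quick check of how the two macros are meant to be used, here is a minimal sketch against a throwaway record type. The macros are restated so the block runs on its own, and the <point> type and its values are assumptions made for illustration.

```scheme
;; Restated here so the block stands alone.
(define-syntax with-accessors
  (syntax-rules ()
    ((_ record (field ...) body ...)
     (let ((r record))
       (let ((field ((record-accessor (record-type-descriptor r) 'field) r))
             ...)
         body ...)))))

(define-syntax call-with-accessors
  (syntax-rules ()
    ((_ record (lambda (field ...) body ...))
     (with-accessors record (field ...) body ...))))

;; Hypothetical record type, built with Guile's low-level record API.
(define <point> (make-record-type "point" '(x y)))
(define make-point (record-constructor <point>))
(define p (make-point 3 4))

(define squared-length
  (with-accessors p (x y)
    (+ (* x x) (* y y))))

(define coordinate-sum
  (call-with-accessors p
    (lambda (x y)
      (+ x y))))
```

In both forms the field names double as the local variable names, so each one must be an actual field of the record's type.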
Thus, we arrive at our desired procedure.
(define (create-new-event-rsvp params)
  "Handler for RSVP'ing to an event."
  ;; Only actual fields of <rsvp-create> can be listed here.
  (with-accessors params (invitation-code name email attending guests)
    (let ((receipt-code (generate-vanity-code))
          (event-id (car (invitation->event-id invitation-code))))
      (exec-query conn "INSERT INTO rsvps (vanity, invitation_id, event_id,
fullname, email, attending, guests) VALUES ($1, $2, $3, $4, $5, $6, $7)"
                  (list receipt-code invitation-code event-id name email
                        attending guests))
      (values '((content-type . (text/html)))
              (sxml->html-string
               (theme #:title "Thanks for RSVPing!"
                      #:content (render-event-rsvp-receipt receipt-code)))))))
I learned Common Lisp before I learned Scheme[2], so I first learned about macros through defmacro. Macro definitions with defmacro are essentially functions that treat Scheme syntax as lists.
(defmacro when (cond exp . rest)
  `(if ,cond
       (begin ,exp . ,rest)))
This is a simple example, as we're just substituting variables into a list, but the body of a defmacro can contain almost arbitrary logic for constructing the list. The result is treated as syntax and substituted into the s-expression tree of the program where the macro invocation occurs. Guile does have defmacro, but the facilities standardized in RⁿRS are syntax-rules and syntax-case, syntax-rules being the simpler of the two. It looks a bit like match, but the body of each match arm is the s-expression syntax you want the macro invocation to expand to, written out as it would be in source code. In that sense, you can also think of it as a function of syntax, but with limited support for doing anything more complicated than substituting symbols into a form already known to Scheme.
(define-syntax when
  (syntax-rules ()
    ((when cond exp ...)
     (if cond
         (begin exp ...)))))
The philosophy underpinning such a limited facility is that Scheme is concerned with "hygiene." Because defmacro treats syntax as merely s-expressions containing symbols, the programmer can produce syntax with arbitrary symbols that may not correspond to anything within the macro invocation. This introduces new variables into the scope that the end-user may not expect, potentially shadowing variables in the enclosing scope. The implementation of with-accessors that follows, for example, introduces a new rtd-eval variable which ostensibly exists in the scope in which body is evaluated. Guile actually prevents us from shooting ourselves in the foot, though, and recognizes that rtd-eval is from us, not the user of the macro, and replaces it with (gensym), which generates a new symbol that has never been interned by the interpreter. Common Lisp makes little attempt to determine whether a symbol is an artifact of the macro logic or an input from the end-user, so (gensym) is a common sighting in CL macros.
(defmacro with-accessors (rtd fields . body)
  `(let ((rtd-eval ,rtd))
     (let ,(map (lambda (f)
                  `(,f ((record-accessor (record-type-descriptor rtd-eval) ',f) rtd-eval)))
                fields)
       ,@body)))
syntax-rules and syntax-case avoid this because they don't treat syntax as lists where the macro programmer can introduce arbitrary symbols, but as "syntax objects" which carry information about the scope that the macro was invoked in. defmacro does this under the hood as well, which is why the defmacro definition of with-accessors is actually perfectly fine in Scheme. A direct translation to CL would be erroneous.
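The capture problem that hygiene guards against is easiest to see in a small, self-contained case. Here is a minimal sketch with syntax-rules; my-or is a hypothetical macro written only for this illustration.

```scheme
(define-syntax my-or
  (syntax-rules ()
    ((_ a b)
     (let ((tmp a))       ; tmp is introduced by the macro, not the caller
       (if tmp tmp b)))))

;; The caller happens to use the same name, tmp, for its own variable.
(define result
  (let ((tmp 5))
    (my-or #f tmp)))
```

The expander keeps the macro's tmp and the caller's tmp distinct, so result is the caller's value, 5. If the two had been treated as the same symbol, the caller's tmp would have been shadowed by the macro's binding.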
This unfortunately prevents us from having something like (define-struct <request> method uri query) expand into
(define <request> (make-record-type "request" '(method uri query)))
(define make-request (record-constructor <request>))
(define request? (record-predicate <request>))
The logic for such an expansion can absolutely be represented using syntax-case, but the hygiene system in Guile will replace the names of the top-level definitions with (gensym), so we can't actually use these definitions outside of the macro definition. We can re-think this a little, though.
(use-modules (ice-9 regex)) ; for string-match and match:substring

(define-syntax define-struct
  (lambda (stx)
    (syntax-case stx ()
      ((_ name field* ...)
       #'(define name
           (let* ((type-string (symbol->string (quote name)))
                  ;; Strip the angle brackets: <request> becomes "request".
                  (type-representation (match:substring
                                        (string-match "[^<> ]+" type-string))))
             (make-record-type type-representation
                               (quote (field* ...)))))))))

(define (make-struct type . args)
  (apply (record-constructor type) args))
The difference here is that the name for the generated definition comes from the macro invocation. That's perfectly fine; it's only a hygiene problem when arbitrary symbols sneak in. There's a post on Jakub Jankiewicz's blog where he expresses a similar position on Guile records and provides a macro definition similar to what we started with, but I think it comes from a time before the current hygiene system, as I wasn't able to get it to work in Guile 3.0.11.[3]
What I've thus far neglected to mention is my choice of representation for records, which is based on the low-level interface to records provided by Guile. Standard Scheme implementations won't necessarily expose procedures like record-accessor or record-constructor. I'm only using this for Guile, though, so I'm okay with that. I also want something simple; I don't need all the bells and whistles that come from the Guile Object Oriented Programming System (GOOPS) library.
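For readers who haven't used the low-level interface, here is a minimal sketch of the Guile procedures this post leans on. The <request> type mirrors the earlier example; request-uri, set-request-uri!, and the sample values are assumptions made for illustration.

```scheme
(define <request> (make-record-type "request" '(method uri query)))

(define make-request (record-constructor <request>))
(define request? (record-predicate <request>))

;; Accessors and modifiers are looked up by field name at run time.
(define request-uri (record-accessor <request> 'uri))
(define set-request-uri! (record-modifier <request> 'uri))

(define req (make-request 'GET "/rsvp" '()))

(request? req)
(request-uri req)
(set-request-uri! req "/rsvp/new")

;; An instance can be mapped back to its type, which is what lets
;; with-accessors work without being told the record type.
(eq? (record-type-descriptor req) <request>)
(record-type-fields <request>)
```

Nothing here generates names on the programmer's behalf, which is why it sidesteps the hygiene issue described above.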
Besides syntax-rules and syntax-case, Guile has one other kind of macro called a "reader macro", which allows us to extend the lexer/parser part of the language and create new syntax that doesn't look like an s-expression. So far, we've defined ways to define new record types and to destructure them, but since we didn't put anything in the define-struct macro to generate accessor functions like SRFI-9 does, we're missing a nice way to access just one particular field in a record. Let's do that with a reader macro.
(define* (ref obj #:rest keys)
  (match keys
    (() obj)
    ((key rest ...)
     (apply ref
            ((record-accessor (record-type-descriptor obj) key) obj)
            rest))))

(read-hash-extend #\.
  (lambda (chr port)
    (let* ((sym (read port))
           (str (symbol->string sym))
           (parts (string-split str #\.)))
      (let ((symbols (map string->symbol parts)))
        (let ((base-obj (car symbols))
              (fields (cdr symbols)))
          (let ((quoted-fields (map (lambda (f) `',f) fields)))
            `(ref ,base-obj ,@quoted-fields)))))))
read-hash-extend is just like defmacro, in the sense that it's a function that returns an s-expression representing syntax. Now we can write #.x.y.z, which behaves the way x.y.z would in C or Rust. I think with-accessors is more readable, though.
(define (create-new-event-rsvp params)
  "Handler for RSVP'ing to an event."
  (let ((receipt-code (generate-vanity-code))
        (event-id (car (invitation->event-id #.params.invitation-code))))
    (exec-query conn "INSERT INTO rsvps (vanity, invitation_id, event_id,
fullname, email, attending, guests) VALUES ($1, $2, $3, $4, $5, $6, $7)"
                (list receipt-code
                      #.params.invitation-code
                      event-id
                      #.params.name
                      #.params.email
                      #.params.attending
                      #.params.guests))
    (values '((content-type . (text/html)))
            (sxml->html-string
             (theme #:title "Thanks for RSVPing!"
                    #:content (render-event-rsvp-receipt receipt-code))))))
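To make the reader macro less magical, it helps to look at what it expands to: #.x.y.z is read as (ref x 'y 'z). Here is a minimal sketch of ref walking a nested record without the reader extension installed. The <guest> and <address> types and their values are assumptions made for illustration.

```scheme
(use-modules (ice-9 match))

;; Same definition as above, restated so the block runs on its own.
(define* (ref obj #:rest keys)
  (match keys
    (() obj)
    ((key rest ...)
     (apply ref
            ((record-accessor (record-type-descriptor obj) key) obj)
            rest))))

;; Hypothetical nested records.
(define <address> (make-record-type "address" '(city)))
(define <guest> (make-record-type "guest" '(name address)))
(define make-address (record-constructor <address>))
(define make-guest (record-constructor <guest>))

(define g (make-guest "Ada" (make-address "London")))

;; What the reader would produce for #.g.address.city
(define city (ref g 'address 'city))
```

Each key is resolved against the descriptor of whatever record the previous step returned, so the chain can pass through records of different types.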
—
Footnotes:
[1] Unfortunately, I discovered the destructuring capability of match long after I wrote my RSVP system.
[2] At least, to a level that I felt I could effectively organize a program of moderate complexity. Structure and Interpretation of Computer Programs was my first introduction to the Lisp family, but my compulsion to complete every single exercise, combined with having very finite time outside of academics or my profession, meant that I never made it past the third chapter.
[3] Dated May 17, 2010, which was several months before the release of Guile 2.0. It's very possible that those macros were written for Guile 1.8.