Frogtoss Labs
*

A Convention For Fragment Parsers in C

Fri 09 August 2024
Michael Labbe
#code 

Sometimes you want to parse a fragment from a string and all you have is C. Parsers for things like rfc3339 timestamps are handy, reusable pieces of code. This post suggests a convention for writing stack-based fragment parsers that can be easily reused or composed into a larger parser.

It’s opinionated, but tends to work for most things so adopt or adapt to your needs.

The Interface

The idea is pretty simple.

    // can be any type
    typedef struct {
        // fields go here
    } type_t;

    int parse_type(char **stream, size_t len, type_t *out);

Pass in stream, a pointer to a pointer into a null-terminated string. On return, *stream points to the location of an error, or just past the end of the parse on success. This means that it can point to the null terminator.

Pass in the length of the string to parse. This avoids a call to strlen, and it lets the caller indicate that a successful parse should end before the null terminator.

The return type can be an int as depicted, or an enum of parse failure reasons. The key thing is that zero is success. This allows multiple parses to OR their results together and test for an error once, which keeps the calling code trivial.

That’s the whole interface. You can compose a larger parser out of smaller versions of these. So, if you want to parse a float (a deceptively hard thing to do) in a document, or key value pairs with quotes or something, you can build, test and reuse them by following this convention.
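As a rough sketch of that composition, the example below builds a "YYYY-MM-DD" parser from two smaller fragment parsers and ORs their results. The names parse_digits, parse_char, parse_date and date_t are hypothetical and not from any library, and the sketch uses const char ** for the stream, which is an assumption rather than part of the interface above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fragment parser: read exactly `len` decimal digits into *out.
   Zero is success; on failure *stream is left at the offending character. */
static int parse_digits(const char **stream, size_t len, int *out) {
    assert(stream && *stream && out);
    int value = 0;
    for (size_t i = 0; i < len; i++) {
        char ch = **stream;
        if (ch < '0' || ch > '9')
            return 1;
        value = value * 10 + (ch - '0');
        (*stream)++;
    }
    *out = value;
    return 0;
}

/* Hypothetical fragment parser: consume one required character. */
static int parse_char(const char **stream, char ch) {
    if (**stream != ch)
        return 1;
    (*stream)++;
    return 0;
}

typedef struct {
    int year, month, day;
} date_t;

/* Composite parser for "YYYY-MM-DD", built from the fragments above. */
int parse_date(const char **stream, size_t len, date_t *out) {
    assert(stream && *stream && out);
    if (len < 10)
        return 1;   /* not enough input for a full date */

    /* Separate statements keep the calls in order; OR the results
       and test for an error once. */
    int err = 0;
    err |= parse_digits(stream, 4, &out->year);
    err |= parse_char(stream, '-');
    err |= parse_digits(stream, 2, &out->month);
    err |= parse_char(stream, '-');
    err |= parse_digits(stream, 2, &out->day);
    return err;
}
```

One trade-off of ORing: fragments after the first failure still run, so *stream may end up past the first error. If the exact error location matters, chaining the calls with || stops at the first failure instead.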

Helping with Implementation

When you implement a fragment parser you end up needing the same few support functions. This suggests a convention.

Testing whether the stream was fully parsed works well with a macro containing a single expression:

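    #include <stddef.h>  // size_t, ptrdiff_t

    // Relies on `stream`, `start` and `len` being in scope where it is used.
    #define did_fully_parse_stream \
        (*stream - start == (ptrdiff_t)len)

    int parse_type(char **stream, size_t len, type_t *out) {
        char *start = *stream;

        // ... parse the fragment here, advancing *stream ...

        if (!did_fully_parse_stream)
            return 1;

        return 0;
    }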
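To make the macro concrete, here is a minimal sketch of a complete parser that uses it. parse_u32 is a hypothetical name; it reads decimal digits, never looks past len bytes, and fails unless exactly len bytes were consumed. The const qualifier on the stream is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define did_fully_parse_stream \
    (*stream - start == (ptrdiff_t)len)

/* Hypothetical parser: read decimal digits into *out and require that
   exactly `len` bytes were consumed. Zero is success. */
int parse_u32(const char **stream, size_t len, uint32_t *out) {
    assert(stream && *stream && out);
    const char *start = *stream;
    uint32_t value = 0;

    while ((size_t)(*stream - start) < len &&
           **stream >= '0' && **stream <= '9') {
        uint32_t digit = (uint32_t)(**stream - '0');
        if (value > (UINT32_MAX - digit) / 10)
            return 1;   /* overflow; *stream marks the digit that did not fit */
        value = value * 10 + digit;
        (*stream)++;
    }

    /* An empty fragment or a stray non-digit inside it is an error. */
    if (len == 0 || !did_fully_parse_stream)
        return 1;

    *out = value;
    return 0;
}
```

Because len bounds the parse, the fragment can sit in the middle of a larger string: parsing "1234, rest" with len 4 succeeds and leaves *stream on the comma.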

Token Walking

Test the next token for a match:

    static int is_token(const char **stream, char ch) {
        return **stream == ch;
    }

Test the next token and bypass it if it matches. By convention, use this if a token failing to match is not an error.

    static int was_token(const char **stream, char ch) {
        if (is_token(stream, ch)) {
            (*stream)++;
            return 1;
        }
        return 0;
    }

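Expect the next token to be ‘ch’ and bypass it if it is. Following the zero-is-success convention, this returns zero on a match and non-zero otherwise, so the result can be ORed with other parse results. While this performs the same test as was_token, it is semantically useful to use it to mean an error has occurred if the token does not match.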

    static int expect_token(const char **stream, char ch) {
        return !was_token(stream, ch);
    }
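A small sketch of how the three helpers might be used together: parsing a UTC offset of the form "Z", "+HH:MM" or "-HH:MM", similar to the tail of an rfc3339 timestamp. parse_2digits and parse_utc_offset are hypothetical names, and this simplified version does not range-check the hours or minutes.

```c
#include <assert.h>
#include <stddef.h>

static int is_token(const char **stream, char ch) {
    return **stream == ch;
}

static int was_token(const char **stream, char ch) {
    if (is_token(stream, ch)) {
        (*stream)++;
        return 1;
    }
    return 0;
}

static int expect_token(const char **stream, char ch) {
    return !was_token(stream, ch);
}

/* Hypothetical helper: parse exactly two decimal digits. Zero is success. */
static int parse_2digits(const char **stream, int *out) {
    int value = 0;
    for (int i = 0; i < 2; i++) {
        if (**stream < '0' || **stream > '9')
            return 1;
        value = value * 10 + (**stream - '0');
        (*stream)++;
    }
    *out = value;
    return 0;
}

/* Parse "Z", "+HH:MM" or "-HH:MM" into minutes east of UTC. */
int parse_utc_offset(const char **stream, size_t len, int *out_minutes) {
    assert(stream && *stream && out_minutes);
    (void)len;  /* unused: this sketch relies on the null terminator */
    int sign = 1, hours = 0, minutes = 0;

    /* was_token: failing to match 'Z' or '-' is not an error. */
    if (was_token(stream, 'Z')) {
        *out_minutes = 0;
        return 0;
    }
    if (was_token(stream, '-'))
        sign = -1;
    else if (expect_token(stream, '+'))
        return 1;   /* expect_token: a sign is required at this point */

    /* || stops at the first failure, so *stream stays on the error. */
    if (parse_2digits(stream, &hours) ||
        expect_token(stream, ':') ||
        parse_2digits(stream, &minutes))
        return 1;

    *out_minutes = sign * (hours * 60 + minutes);
    return 0;
}
```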

Token Classification

Token classification is very easy to implement using C99’s designated initializers. A zero-filled lookup table can be used to test token class and to convert tokens to values.

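    // Each entry stores the digit's value plus one, so that '0' is
    // distinguishable from the zero-filled entries of non-digits.
    static const char digits[256] = {
        ['0'] = 1,  ['1'] = 2,  ['2'] = 3,  ['3'] = 4,  ['4'] = 5,
        ['5'] = 6,  ['6'] = 7,  ['7'] = 8,  ['8'] = 9,  ['9'] = 10,
    };

    void func(const char **stream)
    {
        // is it a digit? (cast so a signed char never indexes negatively)
        if (digits[(unsigned char)**stream]) {
            // yes, convert token to stored integral value
            int value = digits[(unsigned char)**stream] - 1;
            (void)value;
        }

        // skip token stream ahead to first non-digit
        while (digits[(unsigned char)**stream]) (*stream)++;
    }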
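One way a classification table can drive a whole fragment parser is sketched below for hexadecimal digits. The table stores each digit's value plus one, which is an assumption of this sketch, so that the token '0' stays distinguishable from the zero-filled entries of non-digit characters. hex_digits and parse_hex are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Value plus one for each hex digit; every other entry is zero-filled. */
static const unsigned char hex_digits[256] = {
    ['0'] = 1,  ['1'] = 2,  ['2'] = 3,  ['3'] = 4,  ['4'] = 5,
    ['5'] = 6,  ['6'] = 7,  ['7'] = 8,  ['8'] = 9,  ['9'] = 10,
    ['a'] = 11, ['b'] = 12, ['c'] = 13, ['d'] = 14, ['e'] = 15, ['f'] = 16,
    ['A'] = 11, ['B'] = 12, ['C'] = 13, ['D'] = 14, ['E'] = 15, ['F'] = 16,
};

/* Hypothetical parser: read a run of hex digits (at most `len` bytes)
   into *out. Zero is success. Overflow wraps silently in this sketch. */
int parse_hex(const char **stream, size_t len, unsigned *out) {
    assert(stream && *stream && out);
    const char *start = *stream;
    unsigned value = 0;

    /* The unsigned char cast keeps the index in 0..255 where char is signed. */
    while ((size_t)(*stream - start) < len &&
           hex_digits[(unsigned char)**stream]) {
        value = value * 16 + (hex_digits[(unsigned char)**stream] - 1u);
        (*stream)++;
    }

    if (*stream == start)
        return 1;   /* no hex digit at the start of the fragment */

    *out = value;
    return 0;
}
```

The same table answers both questions in one lookup: a non-zero entry means the token is in the class, and the entry minus one is its value.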
*

The Coming Egalitarian Wave of Computing

Fri 02 August 2024
Michael Labbe
#rant 

Recently I had a conversation with a composer who was planning on buying a $5,499 Mac Studio to record music. “It’s the only computer I’ll need to run all of my VSTs and play back all of my tracks”, he remarked. With 24 cores and 64GB of RAM, it sure seemed likely to me. “Are you sure you couldn’t do that on a MacBook Air?” I prompted, genuinely curious about where the resources were going. He seemed taken aback that it might even be a possibility.

Whether or not he needed the extra headroom (and you can make the argument that you would weigh down a lighter recording computer with VSTs and track layering), it was a good reminder that marketing like Apple’s makes people equate professional significance with higher-end devices. Today, most base model CPUs are good enough for most people. Most professionals do not quantify their computing needs before making a purchase, and so many computers being sold are unnecessarily overpowered. Device marketing encourages this. Over the past couple of decades we benefitted from each generation’s performance gains, but I don’t believe that is true anymore for many tasks if you choose the right software.

A hobby of mine is to achieve my intended computing result with the least amount of computing power and dollars I reasonably can. For example, this blog post is being written on a refurbished $250 Thinkpad humming along on a Linux Mint MATE desktop running only Emacs.

It is refreshing to not be precious about an expensive laptop, and to be able to just toss it in a bag. Unlike modern buggy gaming laptops that lack ports, it also wakes from sleep with 100% consistency. I still own a high-end workstation, but I am finding many computing needs can be covered by devices that shipped 5+ years ago: browsing, editing, messaging, some music composing and even coding smaller-scale (read: not Unreal) projects.

There Will be Plenty

Microsoft has announced that Windows 10 will reach end of life on October 14, 2025. Many highly capable computers including Threadrippers, Dell XPS laptops and other older high-end configurations will be unable to run Windows 11 in a secure, supported way. That said, the Steam Hardware Survey, as of this writing, has Windows 10 counted as more popular than Windows 11.

This situation has created a rising tension, and one of two things is likely to happen:

  1. Microsoft is playing chicken, and will extend the free support of Windows 10 for a year or two, or
  2. There will be a massive selloff of computers that no longer run any supported version of Windows

Next year will be a great opportunity to pick up refurbished hardware that can do most computing tasks after doing an install of FreeBSD or a Linux distribution. Linux Mint is my preference: it feels supportive of the user like Windows 2000 did, has no obvious subversive agenda, is Ubuntu package compatible, and is entirely snappy on low-end hardware that is slated for deprecation by Microsoft.

Turn Microsoft’s e-waste into your next workhorse computer.

The Consumer AI Wave is Building and I’m Keeping Dry

On-device AI is being shoehorned where it has no business going because it is perceived as being able to push tech company valuations. It is being foisted on consumers whether they understand it or not. Meanwhile, we are being told we have to upgrade to new processors and operating systems to receive these fun new experiences.

There is a lot to be said about AI, but as far as my computing device goes — I’m totally fine with staying on the beach while the corporate agendafied first wave hits everybody who jumps in the water. Consumer on-device AI is not going to be a part of my professional workflows until the waters have settled and the hype has passed.

The refurbished device market stands to become very saturated if AI features motivate users to abandon their existing computers. Buy the dip!

By using a refurbished device on Linux you are virtually guaranteed to avoid the first generation of consumer on-device AI which is likely to involve annoying or even dangerous missteps.

It Is Freeing Not To Love

Users coming from Apple to other operating systems seem to demonstrate a sensibility: they want to love their new PC, tablet or phone. This is because the device is the nexus of the experience in the Apple world. The hardware is second to none and you can end up experiencing entirely bug-free workdays if you stay on a well-manicured path. Jumping from a MacBook Pro to an unconfigured Thinkpad on Ubuntu would be like going from an Americano with cream to gritty camping coffee prepared with a hangover.

A shift in mindset about what matters is helpful. I have found it productive to not focus on the device so much as I focus on getting the result I am looking for in my work. Loving the hardware is not the point, and it can be freeing to find the workflow that gets you the result you need outside of loving a device. Imagine how much you can achieve outside the binds of device love!

October 2025 is looking like a great time to pick up a dirt cheap first generation 16-core Threadripper, install Linux on it and have it perform phenomenally for a decade or longer. Now, if Microsoft could just deprecate some of those previous-gen GPUs…

*

Housekeeping: RSS is Improved

Sat 22 June 2024
Michael Labbe
#meta 

Just a quick note to say I improved Pelican’s RSS generator for Labs. You can now read full articles in your RSS reader if you subscribe to this blog. Previously they were truncated, which forced users to go to the site. Now you can read the posts anywhere you want.

I also cropped the number of posts in RSS down to five so RSS readers will not need to mark a ton of really old posts as read. There has never been a better time to subscribe. :)
