For Good Measure

How to displace JS

by Colby Russell. 2019 March 6.

JS has gotten everywhere. It drives the UI of most of the apps created to run on the most accessible platform in the world (the web browser). It has been uplifted into Node and Electron for widespread use on the backend, on the command-line, and on the desktop. It's also being used for mobile development and to script IoT devices.

So how did we get here? Let's review history, do some programmer anthropology, and speculate about some sociological factors.

JS's birth and (slightly delayed) ascent begins roughly contemporaneous with its namesake—Java. Java, too, has managed to go many places. In the HN comments section in response to a recent look back at a 2009 article in IEEE Spectrum titled "Java’s Forgotten Forebear", user tapanjk writes:

Java is popular [because] it was the easiest language to start with https://news.ycombinator.com/item?id=18691584

In the early 2000s in particular, this meant that you could expect to find tons of budding programmers adopting Java on university campuses, owing to Sun's intense campaign to market the language as a fixture in many schools' CS programs. Also around this time, you could expect its runtime—the JRE—to be already installed on upwards of 90% of prospective users' machines. This was true even when the systems running those machines were diverse. There was a (not widely acknowledged) snag to this, though:

As a programmer, you still had to download the authoring tools necessary for doing the development itself. So while the JRE's prevalence meant that it was probably already present on your own machine (in addition to those of your users), its SDK was not. The result is that Java had a non-zero initial setup cost for authoring even the most trivial program before you could get it up and running and putting its results on display.

Sidestepping this problem is where JS succeeded.

HTML and JS—in contrast to not just Java, but most mainstream programming tools—were able to drive the setup cost down to zero. When desktop operating systems were still the default mode of computing, you could immediately go from non-developer to developer without downloading, configuring, or otherwise wrangling any kind of SDK. This meant that in addition to being able to test the resulting code anywhere (by virtue of a free browser preinstalled on upwards of 97% of consumer and business desktops), you could also get started writing that code without ever really needing anything besides what your computer came with out of the box.


You might think that the contemporary JS dev ecosystem would leverage this—having started out on good footing and then having a couple decades to improve upon it. But weirdly, it doesn't work like that.

JS development today is, by far, dominated by the NodeJS/NPM platform and programming style. There's evidence that some people don't even distinguish between the NodeJS ecosystem and JS development more generally. For many, JS development probably is the NodeJS ecosystem and the NodeJS programming style is therefore seen as intrinsic to the way and form of JS.

In the NodeJS world, developers working in this mindset have abandoned one of the original strengths of the JS + browser combo and more or less replicated the same setup experience that you deal with on any other platform. Developer tunnel vision might trick a subset of the developers who work in this space into thinking that this isn't true, but the reality is that for NPM-driven development in 2019, it is. Let's take a look. We'll begin with a rough outline of a problem/goal, and observe how we expect to have to proceed.

Suppose there's a program you use and you want to make changes to it. It's written for platform X.

With most languages that position themselves for general purpose development, you'll start out needing to work through an "implicit step 0", as outlined above in the Java case study. It involves downloading an SDK (even if that's not what it's called in those circles), which includes the necessary dev tools and/or maybe a runtime (subject to the implementation details of that platform).

After finding out where to download the SDK and then doing exactly that, you might then spend anywhere from a few seconds or minutes to what might turn out to be a few days wrestling with it before it's set up for your use. You might then try to get a simple "hello, world"-style program on the screen, or you might skip that and dive straight into working on the code for the program that you want to change.

Contemporary JS development really doesn't look all that different from this picture—even if the task at hand is to do "frontend" work meant to run in the browser—which was the predominant use of JS early in its lifetime, when it still did have zero setup cost.


I have a theory that most people conceptualize progress as this monotonically increasing curve over time, but progress is actually punctuated. It's discrete. And the world even tolerates regress in this curve. If engaged directly on this point, we'd probably find that for the most part any given person will readily acknowledge that this is the true character of that curve, but when observed from afar we'll see that most are more likely to appear as if in a continual motte-and-bailey situation with themselves—that their thoughts and actions more closely resemble that of a person who buys into the distorted version of progress, despite the ready admission of the contrary.

Steve Klabnik recently covered the idea of discrete and punctuated progress in his writeup about leaving Mozilla:

at each stage of a company’s growth, they have different needs. Those needs generally require different skills. What he enjoyed, and what he had the skills to do, was to take a tiny company and make it medium sized. Once a company was at that stage of growth, he was less interested and less good at taking them from there. https://words.steveklabnik.com/thank-u-next

The corollary to Steve's boss's observation is that there's stuff (people, practices, et cetera) present in later phases and to which we can directly attribute the success and growth during that phase, but that these things could have or would have doomed the earlier phases. It seems that this is obviously true for things like platforms and language ecosystems and not just companies.


To reiterate: JS's inherent zero-cost setup was really helpful in the mid-to-late 2000s. That was its initial foot in the door, and it was instrumental in helping JS reach critical mass. But that property hasn't carried over into the phase where devs have graduated to working on more complex projects, because as the projects have grown in complexity, the tooling and setup requirements have grown, too.

So JS, where it had zero setup costs before, now has them for any moderate-to-large-scale project. And the culture has changed such that its people are now treating even the small projects the same as the complex ones—the first thing a prospective developer trying to "learn to code" with JS will encounter is the need to get past the initial step zero—for "setting up a development environment". (This will take the form of explicit instructions if the aspiring developer is lucky enough to catch the ecosystem at the right time and maybe with the help of a decent mentor, or if the aspiring developer is unlucky and the wind isn't blowing in a particularly helpful way, then it may be an implicit assumption that they will manage to figure things out.) And developers who have experience working on projects at the upper end of the complexity spectrum also end up dealing with the baggage of implicit step zero for their own small projects—usually because they've already been through setup and are hedging with respect to a possible future where the small project grows wildly successful and YAGNI loses to the principle of PGNI (probably gonna need it).

Jamie Brandon in 2014 gave some coverage to this phenomenon on the Light Table Blog (albeit from the perspective of Clojure) in a post titled "Pain We Forgot".

To pull an example from the world of JS, let's look at the create-react-app README, which tells you to run the commands:

npx create-react-app my-app
cd my-app
npm start

What assumptions does it make? First, that you're willing to, able to, and already have downloaded and installed NodeJS/NPM to your system; secondly, that you've gone through the process of actually running npm install create-react-app, and that you've waited for it to complete successfully.

(You might interject to say that I'm being overly critical here—that by the time you're looking at this README, then you're well past this point. That's the developer tunnel vision I referred to earlier.)

Additionally I'll note that if we suppose that you've started from little more than a blank slate (with a stock computer + NodeJS/NPM installed), then creating a "hello, world" app by running the following command:

npm install create-react-app && npx create-react-app foo

... will cost you around 1.5 minutes (in the best cases) while you wait for the network and around half a GB of disk space.


If JS's early success in the numbers game is largely a result of a strength that once existed but is now effectively regarded as non-existent, does that open up the opportunity for another platform to gobble up some easy numbers and bootstrap its way to critical mass?

Non-JS, JS-like languages like Haxe and Dart have been around for a while and are at least pretending to present themselves as contenders, vying for similar goals (and beyond) as what JS is being used for today. And then there are languages nothing like JS, like Lua, which has always touted its simplicity. And then there is the massive long-tail of other languages not named here (and that possibly haven't even been designed and implemented yet).

What could a successful strategy look like for a language that aimed to displace JS?

If you come from a JS background, you might argue that you still have the option of forgoing all the frameworks and tooling which have negated JS's zero setup strength. As I alluded to before, though: rarely does anyone actually run a project like that. So while "hello, world" is still theoretically easy, the problem is twofold:

Which is to say that, in either case, as a developer you're going to run into this stuff, because it's what people are pushing in this corner of the world. And therefore the door is wide open for a contender to disrupt things.

So the question is whether it's possible to contrive a system (a term I'll use to loosely refer to something involving a language, an environment, and a set of practices) built around the core value that zero-cost setup is important—even if the BDFL and key players only maintain that stance up to the point where the ecosystem has reached a similar place as contemporary JS in its development arc. Past this point, it would be a free option to abandon that philosophy, or—in order to protect the ecosystem from disruption by others—to maintain it. It would be smart for someone with these ambitions to shoot for the latter option from the beginning and take the appropriate steps early to maintain continuity through all its phases, rather than having to bend and make the same compromises that JS has.

I didn't set out when writing this post to offer any solutions or point to any existing system, as if to say, "that's the one!". The main goal here is to identify problems and opportunities and posit, Feynman-style (cf There's Plenty of Room at the Bottom), that there's low-hanging fruit here, money on the table, etc.

What happened in January?

by Colby Russell. 2019 February 15.

Unlaunched

Last summer, I began work on a collection of projects with a common theme. The public face of those efforts was and is meant to be triplescripts.org.

But wait, if you type that into the address bar, it's empty. (If you're reading from the future, here's an archive.org link of the triplescripts front page as it existed today.) So what's the deal?

In December, I set the triplescripts.org launch date for January 7. This has been work that I'm genuinely excited about, so I was happy to have a date locked down for me to unveil it. (Although, as I mentioned on Fritter, I've been anticipating that it will succeed through tenacity—a slow burn, rather than an immediate smash hit.)

Starting in the final week before that date, a bunch of real life occurrences came along that ended up completely wrecking my routine. Among these—and the main thing that is still the most relevant issue as I write this now—is that I managed to get sick three times. That's three distinct periods with three different sets of symptoms, and separate, unambiguous (but brief) recoveries in between. So it's now a month and a week after the date I had set for launch, and triplescripts.org has no better face than the blurby, not-even-a-landing-page that I dumped there a few months back, and these ups and downs have me fairly deflated. Oh well for now. Slow burn.


Unloading a month's thoughts

The title of this post is a reference to Nadia Eghbal's monthly newsletter, which has been appearing in the inbox under the title "Things that happened in $MONTH". I like that. Note, in case you're misled by bad inference on the title, that the newsletter is about ideas, not autobiographical details.

I've seen some public resolutions, or references to resolutions by others, to publish more on personal sites and blogs in 2019 (such as this Tedium article on blogging in 2019). I don't make New Year's resolutions, so I was not among them. But I like the tone, scope, and monthly cadence in the idea behind "Things that happened". So on that note alone—and not motivated by a tradition of making empty promises for positive change when a new year begins—I think I will commit to a new outlook and set of practices about writing that follows in the vein of that newsletter.

The idea is to publish once a month, at minimum, everything that I considered that month as having been "in need of a good writeup", and to do so regardless of the state it actually reaches by the end of the month—so something on the topic will go out even if it never made it to draft phase. Like continuous integration for written thought.

Although when you think about it, what's with all the self-promises, of, you know, writing up a thorough exegesis on your topic in the first place? Overwhelming public sentiment is that there's too much longform content. As even the admirable and respectable Matt Baer of write.as put it, "Journalism isn't dead, it just takes too damn long to read." (Keep in mind this is from the mouth (hand) of a man whose main business endeavor at the moment hinges on convincing people to write more.) And this is what everyone keeps saying is the value proposition of Twitter, anyway, right? High signal, low commitment, and low risk that you'll end up snoring.

Ideas are what matter, not personal timelines. I mentioned above that Nadia's newsletter is light on autobiographical details, as it should be. Sometimes I see that people aren't inspired to elaborate on any particular thought, but find themselves in a context where they're writing—maybe as a result of some feeling of obligation—so they settle into relaying information about how they've spent themselves over some given time period—information that even their future self offset a couple months down the line wouldn't find interesting. So these monthly integration dumps will remain light on autobiographical details, except in circumstances where those details fulfill some supporting role to set the scene or otherwise better explain the idea that's in focus.


Unlinked identity

I'm opposed to life logs in general. I hate GitHub contribution graphs, for example, because they're just a minor variation of the public timeline concept from any social network, and I've always disliked those. This is one reason I never fully got on board with Keybase.

Keybase's social proofs are pretty neat, the addressing based on them is even neater, and in general I feel some goodwill and positive thought toward what I perceived as Keybase's aspirations towards some sort of yet-to-be-defined integration point as your identity provider. But I realized a thing a few months after finally signing up for Keybase, which is that their implementation violates a personal rule of mine: participation in online communities originates from unlinked identities, always.

When I was throwing my energy into Mozilla (and Mozilla was throwing its weight in the direction of ideals it purported to be working for), Facebook Connect was the big evil. The notion that the way to participate—or, as in the worst cases, even just to consume—could happen only if you agreed to "Sign in with Facebook" (and later, with Google; nowadays it's Twitter and GitHub), was a thing unconscionable. BrowserID clearly lost, but the arguments underpinning its creation and existence in the first place are still valid.

I'm not sold by the pitch of helping me remember how I spend my time. I'm not interested in the flavor of navelgazing that you get from social networks giving you a look back at yourself N months or years down the line. And we should all be much less interested, further still, in the way that most social networks' main goal is to broadcast those things to help others get that kind of a look at you, too.

Look at it like this: if you and I work together—or something like that—then that's fine. You know? That's the context we share. If I go buy groceries or do something out in public and we happen to run into each other, that would be fine, too. But if one of my coworkers sat outside my place to record my comings and goings, and then publicized that info to be passively consumed by basically anyone who asked for it, then that would not be okay.

My point is, I like the same thing online. If I'm contributing to a project, for example, I'm happy to do so with my real name. If you're in that circle (or even just lurking there) and as a result of some related interest you run into my name in some other venue, then, hey, happy coincidence. But I'm less interested in giving the world a means to click through to my profile and find a top-level index of my activity—and that's true without any desire to hide my activity or, say, my politics, as I've seen in some cases. After all, if that were the goal, it would be much easier just to use a pseudonym.

So I say this as a person with profoundly uninteresting comings and goings—but I realize that giving coverage to this topic from this angle will probably trigger the "what are you trying to hide?" reflex. Like I said, I use my real name. My email address and cell phone number are right there in the middle of the colbyrussell.com landing page, which is more than you can say for most people. (I've mentioned before how weird it is that 25 years ago, you could look anyone up in the phonebook, but today having something like that available seems really intrusive.) Besides, not even the Keybase folks themselves buy the pitch; at this time, the most recent post to their company blog is the introduction of Keybase Exploding Messages. And Snapchat's initial popularity says something about how the general public truly feels about the value of privacy, despite how often the "if you have nothing to hide…" argument shows up.

So in the case of Keybase, keep the social proofs and keep the convenient methods of addressing, but also keep all those proofs and identities unlinked. I don't need a profile. Just let me create the proofs, the same principle in play when I prove everywhere else online that I control the email I used to sign up, but it need not tie into anything larger than that single connection. Just let my client manage all the rest, locally.


Unacknowledged un-

Sometimes an adage is trotted out that goes roughly like this:

Welp, it's not perfect, but it's better than nothing!

And sometimes that's true. It's at least widely understood to be true, I think.

What I don't see mentioned, ever, is that sometimes "it's better than nothing" is really, really not true. In some cases, something is worse than nothing.

My argument:

Voids are useful, because when they exist you can point to them and trivially get people to acknowledge that they exist. There's something missing. A bad fix for a real problem, though, takes away the one good thing about a void.

For example, consider a fundraising group that (ostensibly) exists to work on a solution towards some cause—something widely accepted to be a real problem. Now consider if, since first conception, and in the years intervening, it's more or less provable that the group is not actually doing any work to those ends, or at least not doing very good work when measured against some rubric.

Briefly: we could say that the org is some measure of incompetent and/or ineffective.

The problem now is that our hypothetical organization's mere existence is sucking all the air out of the room and hampering anyone who might come along and actually change things.

That is, even though we can argue rationally that their activity is equivalent to a void, it's actually worse than a void, since—once again—you can point to voids and say, "Look, we really need to do something about this!", but it's harder to do that here. Say something about the underlying problem—the one that the org was meant to solve—and you'll get railroaded in the direction of the org.

So these phenomena are a sort of higher order void. They're equivalent with respect to their total lack of contribution to forward progress on the issue we care about, but then what they also do is disguise their existence and act like sinks, so not even the potential energy stored nearby ever gets put to effective use.


Underdeveloped

Other stuff from January that requires coverage here, but doesn't exist in longform:

Mozilla and feedback loops

by Colby Russell. 2018 October 11.

My "coming of age" story as a programmer is one where Mozilla played a big part and came at a time before the sort of neo-OSS era that GitHub ushered in. It's been a little over 5 years, though, since I decided to wrap things up in my involvement and called it quits on a Mozilla-oriented future for various reasons.

More recently—but still some time ago, compared to now—in a conversation about what was wrong with the then-current state of Mozilla, I wrote out a response with my thoughts but ultimately never sent it. Instead, it lingered in my drafts. I'm posting it here now both because I was reminded of it a few weeks ago from a very unsatisfying exchange with a developer still at Mozilla when a post from his blog came across my radar, and because, as I say below, it contains a useful elaboration on a general phenomenon not specific to Mozilla, and I find it worthwhile to publish. I have edited it from the original.

It should also be noted that the message ends on a somewhat anti-cynical note, with the implication of a possibility left open for a brighter future, but the reality is that the things that have gone on under the Mozilla banner since then amount to a sort of gross shitshow—the kind of thing jwz would call "brand necrophilia". So whatever residual hope I had five years ago, or at the time I first tried to write this, is now fairly far past gone, and the positivity sounds a little misplaced. Nonetheless, here it is.


in-reply-to: [REDACTED]

Developer's Lazyweb

by Colby Russell. 2018 January 24.

Given the churn induced by social coding sites like GitHub, we need a place to consult whose purpose is to stem the NIH tide. Like the opposite of a real life Lazyweb, posts are not desperate, hail mary requests to be spoonfed solutions; the implication is instead, "Hey, I'm very definitely about to go off and implement this unless someone speaks up. So if it already exists, let me know so that I don't end up creating something that the world didn't actually need any more of."

The contributor's dilemma, or the patch paradox

by Colby Russell. 2017 August 6.

You know from history that open source has always been shaped far more by the people who showed up with a working implementation than by those who wrote a comment that says, "I think we should do it like this". This is the "patches speak louder than words" school of thought.

At the same time, you know it's a good idea to confirm beforehand that there's an acknowledgement from upstream of the problem and an agreement about the general approach for the solution, so you don't waste your time.

GOTO 10

Nobody wants to work on infrastructure

by Colby Russell. 2017 June 14.

I read a piece once from someone on the theme of "things I know, but that no one else seems to". Briefly, here's one from me:

Nobody wants to work on infrastructure. This means that if you get an infusion of cash that leaves you with a source of funding for your project, and if you have any aspirations at all of attracting a community of volunteers—that is, people who will put in work to help out, despite having no obligation to you or your project—then the absolute first thing you should start throwing money at is making sure all the boring stuff that no one wants to work on is taken care of.*

Not realizing that you need to remove the roadblocks that prevent you from scaling up the number of unpaid contributors and contributions is like finding a genie and not checking to see if your first wish could be for more wishes.

This is a topic that would benefit from case studies. Examples (of projects that get this wrong) aren't scarce, but I'll save that writeup for another time.

*Note that "boring stuff" includes not just building and keeping things running, but also the boring job of continuously casting a critical eye at the contribution process itself to figure out what those things even are.

Novel ideas for programming language design

by Colby Russell. 2017 February 16.

Short variable names prohibited by grammar

Naming things using a single letter is consistently identified as a bad practice, and is even acknowledged as such by those who admit to sometimes "slipping up" and doing it themselves. So why not solve this by eliminating single-letter names in the grammar altogether?

Many languages adopt a rule that says, roughly, "identifiers must start with a letter which can be followed by one or more letters and digits". (Some allow for special characters like _ and $, too.) Or, in EBNF:

ident = letter { letter | digit };

Initially, we might suggest changing the rule to "identifiers must start with a letter which must be followed by one or more letters, digits, or symbols", which means the minimum length for a valid identifier is 2. With two-letter identifiers, though, single-letter programmers will likely end up throwing in another consonant or tacking on an underscore, thereby satisfying the language's rules, but subverting their spirit. I think the tipping point is 3. With a minimum length of 3, the ridiculousness of trying to thwart the rules without actually increasing the readability of the code becomes apparent even to the stalwarts, which should result in few holdouts.
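To make the difference between the rules concrete, here is a minimal sketch in Python. It assumes ASCII letters and digits only and leaves out the optional _ and $. The names CLASSIC_IDENT, MIN3_IDENT, and is_valid are made up for illustration and aren't taken from any real compiler.

```python
import re

# Classic rule: a letter followed by zero or more letters or digits.
# (Assumption: ASCII only; `_` and `$` are left out for brevity.)
CLASSIC_IDENT = re.compile(r"[A-Za-z][A-Za-z0-9]*")

# Proposed rule: a letter followed by at least two more letters or digits,
# so the shortest valid identifier has length 3.
MIN3_IDENT = re.compile(r"[A-Za-z][A-Za-z0-9]{2,}")


def is_valid(name, rule):
    """Return True if the whole of `name` matches `rule`."""
    return rule.fullmatch(name) is not None


if __name__ == "__main__":
    for name in ["V", "vw", "msg", "viewer"]:
        print(name, is_valid(name, CLASSIC_IDENT), is_valid(name, MIN3_IDENT))
```

Under the stricter pattern, V and vw are rejected while msg and viewer pass. Only the repetition count changes, so a change like this should cost a real grammar very little.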

Considerations

Type-named objects

Consider the following snippet:

PROCEDURE PassFocus* (V: Viewer);
  VAR M: ControlMessage;
BEGIN
  M.id := defocus;
  FocusViewer.handle(M);
  FocusViewer := V;
END PassFocus;

(This is Oberon. It has flaws—annoying ones. Oberon is not my favorite language. I'm comfortable presenting the examples here in Oberon, however, because this snippet should be more or less understandable even to those who've never seen its syntax, and if I'm going to present any example, I'm going to do it in a dead language that no one really uses, so as not to play favorites and put undue focus on the one chosen.)

Note the use of the single-letter identifier V in the parameter list and the local variable M. Our V can be easily changed to viewer, and that would probably be the prescription in most code reviews where the initial naming would be seen as a problem. However, we're now running afoul of an awful lot of repetition, which is a frequent criticism of many languages with static type systems. It's often pointed out with classic Java for example that almost any time you do something, you end up repeating yourself, sometimes up to three times. E.g.:

FrobbedFoo frobbedFoo = new FrobbedFoo(bar);

This is why C#'s var keyword is seen as an improvement, and JVM languages have by now adopted similar constructs.

It's also said that naming things is one of the hardest things in CS. The line above raises other questions, too. For our frobbedFoo should we perhaps be giving the local variable another name that describes it as something else? We're obviously dealing with a FrobbedFoo, and it is redundant to refer to it as such, so should we prefer to name it after its purpose in this context, i.e., what its role is in the procedure, rather than what kind of thing it is?

With type-named objects, we answer this hand-wringing by acknowledging that in many cases, the type alone is sufficient—not merely sufficient for the machine, but for the human reader, too. In languages with support for type-named objects, we therefore need not always give an object an explicit name. Instead we unambiguously refer to it in the local context using its type.

For example, one approach to designing a language with type-named objects would be to disambiguate with the keyword the. The example above becomes:

PROCEDURE PassFocus* (Viewer);
  VAR ControlMessage;
BEGIN
  (the ControlMessage).id := defocus;
  FocusViewer.handle(the ControlMessage);
  FocusViewer := the Viewer;
END PassFocus;

Compared to our single-letter identifiers in the preceding snippet, this results in more typing, but the programmer isn't pressed to stop and think of intermediate names to give to the two objects local to the procedure. This will allow for maintaining an uninterrupted train of thought, and despite the higher demand for "human IO", type-named objects should be more productive and viewed as a programmer convenience.
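One way to picture how a compiler might resolve "the T" is a per-procedure scope keyed by type name. The Python sketch below rests on an assumption that isn't spelled out above: an unnamed declaration is only legal when it is the sole unnamed object of its type in that scope, which is what keeps "the ControlMessage" unambiguous. The names TypeNamedScope, declare, the, and AmbiguousTypeName are hypothetical.

```python
class AmbiguousTypeName(Exception):
    """Raised when two unnamed objects of the same type share a scope."""


class TypeNamedScope:
    """Maps a type name to the single unnamed object of that type."""

    def __init__(self):
        self._objects = {}

    def declare(self, type_name, value=None):
        # Assumption: a second unnamed object of the same type is an error,
        # since `the T` could no longer pick one out.
        if type_name in self._objects:
            raise AmbiguousTypeName(
                "more than one unnamed " + type_name + " in scope"
            )
        self._objects[type_name] = value

    def the(self, type_name):
        # Resolve `the T`; a KeyError means no such object was declared.
        return self._objects[type_name]


if __name__ == "__main__":
    # Mirrors PassFocus: one Viewer parameter, one local ControlMessage.
    scope = TypeNamedScope()
    scope.declare("Viewer", "some viewer")
    scope.declare("ControlMessage", {})
    scope.the("ControlMessage")["id"] = "defocus"
    print(scope.the("ControlMessage"))
```

A procedure that needs two viewers would have to fall back to explicit names for at least one of them. That seems like an acceptable trade, but it's a design choice of this sketch, not a settled part of the idea.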

Considerations

Inverted selectors

Many languages have a receiver.member selector syntax, to select slot member of receiver. This is used both to access fields of records/structs/objects and to reference functions or other procedures—i.e., methods. Here we discuss an "inverted" selector syntax, so that the receiver.member above can become member @ receiver. This on its own is probably no significant benefit, but consider it in the context of a subroutine, paired with language support for type-named objects:

PROCEDURE PassFocus* (Viewer);
  VAR ControlMessage;
BEGIN
  id @ the ControlMessage := defocus;
  FocusViewer.handle(the ControlMessage);
  FocusViewer := the Viewer;
END PassFocus;

This @-notation is generalizable. I've wondered before why I don't see many (any?) languages offer a "passive" form to refer to members.
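Since the inverted form is only a reordering, a front end could treat it as sugar for the conventional selector. Here is a rough Python sketch of that rewrite, working on source text for simplicity. It assumes a single, unchained selector (how a chain of @ selectors should associate is left open here), and the names desugar and _INVERTED are invented for the example.

```python
import re

# One member name, an `@`, then everything after it as the receiver.
# Assumption: only a single, unchained selector is handled here.
_INVERTED = re.compile(r"\s*(\w+)\s*@\s*(.+?)\s*")


def desugar(expression):
    """Rewrite `member @ receiver` as `(receiver).member`.

    Expressions without an `@` selector are returned unchanged.
    """
    match = _INVERTED.fullmatch(expression)
    if match is None:
        return expression
    member, receiver = match.groups()
    return "(" + receiver + ")." + member


if __name__ == "__main__":
    print(desugar("id @ the ControlMessage"))
```

The output for the line from PassFocus is (the ControlMessage).id, the same form used in the earlier type-named version of the procedure. A real implementation would do this on the parse tree, not on strings.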

If the culture of the language under discussion is one that involves an overall pursuit to avoid magic symbols (e.g., Python, Ada, and Wirth languages like Pascal), then the keyword from might be used, viz.

id from the ControlMessage

Considerations

The from keyword, if not already present in the language grammar (for use in some other context), may be problematic—it's hard to add keywords to a language, because it can end up making code that worked in version n-1 suddenly invalid (a reserved word used as an identifier). Contrast this with the suggestion regarding the keyword the for discriminating type-named objects—I expect use of the as an identifier in the wild to be rare. So in the case of from, a semantically similar word like of might be used in its place. Failing that, for wouldn't be a completely inappropriate choice, although it reads slightly awkwardly, and it's likely to already be a reserved word. The language designers just need to be comfortable allowing it to appear in two constructs, with a completely different meaning in each.

Feed

by Colby Russell. 2017 February 15.

Welp. Stuff happened in the last year. When I last wrote, I mentioned cleaning up drafts from my personal notes to be published. They're still all in my notes, and none are here.

I have concrete plans over the next two weeks to post specific writeups. There is now a content feed by request.

Schtickle

by Colby Russell. 2016 February 18.

The pages here are now generated with schtickle, a static site generator written using JS with TypeScript.

Until last month, these pages were generated by Jekyll, but since I'm not a rubyist, I was never overwhelmed with excitement about the dependency on that ecosystem.

So when I found out about Marijn Haverbeke's Heckle, it made me happy. That the whole thing lived within a couple hundred lines, more or less, made me even happier. But there were a few issues Heckle had in dealing with my existing simple Jekyll-style site that prevented me from switching over, even after converting the templates to use Mold. They were easy enough to fix, but I had already decided I wanted to start making more use of TypeScript. Heckle's simplicity meant that something similar in scope would be a good candidate, so I wrote schtickle as a clone in TypeScript.

Schtickle is so heavily inspired by Heckle that when it came time to take care of the first order of business—outlining its data structures and function interfaces—I essentially just cribbed Heckle's design, which you can see from schtickle's initial commit. When fleshing out schtickle's implementation to achieve acceptable parity with Jekyll and Heckle, I made sure the problems I had were fixed in schtickle until it was working well enough for my use. And the codebases of each are so simple that, even though I'm not using Heckle, it was straightforward enough to go ahead and provide similar fixes for it, too.

(The amount of time between the first fix and the last is actually a matter of months—when I realized my templates weren't going to work in Heckle, I put the whole thing on the back burner last summer to deal with more pressing matters. When I began thinking about adding some new posts here last month, I picked schtickle back up from where I had started, finished filling it out for my needs, and finally transitioned away from Jekyll completely.)

I've got several of those generic, unbranded spiral notebooks. About one and a half are filled with entries spanning the last three years, and a third or so of those entries have content that's suitable for publishing here. Now that content will probably start getting revised and begin showing up.

Keeping a low profile on GitHub and staying active

by Colby Russell. 2016 February 13.

There's more than one reason that I try to avoid GitHub. This post is about one of them.

Avoiding GitHub can be difficult, because it seems like almost everybody is using it. Fortunately, there are enough projects that don't include "not having a GitHub account" as a barrier to entry that if you're just looking for somewhere to participate, then you've got choices. Bonus points: in the world of open source, GitHub is relatively new in the grand scheme of things, and many of the aforementioned projects that don't revolve around GitHub are that way because they predate it. So if you're contributing to one of them, it's likely that your contributions are going towards something that has shown it has staying power.

Unfortunately, there are times when you're not "looking for somewhere to participate", but instead "looking to fix something in project X"―and it turns out that project X is on GitHub.

So even though I'd like to avoid it, I still frequently find myself needing to use GitHub in order to participate. Really frequently. As in, like, daily.

The especially problematic thing here is that one of the biggest issues I have with GitHub is how it doesn't give you a choice about whether you want to opt in to the social network side of the site. If you have an account and you're participating in a project in any way through github.com, you're part of its social network. In fact, even if you have an account but never, ever use it to log in or touch anything, you're usually still part of the GitHub social network because your commits are probably getting linked to your account through your email address.

Since there is no way to select a GitHub-without-the-social-network "plan" when creating an account, I've adopted a set of routines to approximate it. Here are some things that anyone can do to keep a low profile on GitHub while staying active and contributing to projects hosted there:

As I mentioned, these are all a part of the routine that I end up practicing every day. You might make different choices. For example, I have used GitLab to host some publicly accessible forks because I see having some presence there as less problematic than what happens at GitHub.

As far as wildcard addresses go, ideally, every commit would be using a unique address, but I haven't done anything to automate that. As it happens, there is some address reuse among the commits I push out.
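A minimal sketch of what that automation could look like, assuming a catch-all domain you control (example.org below is a placeholder) and relying on git's -c option, which sets a configuration value for a single invocation. The function names and the address format are hypothetical, and this is not something I actually run.

```python
import secrets


def unique_commit_address(domain, prefix="commit"):
    """Return a fresh address under a wildcard (catch-all) domain."""
    token = secrets.token_hex(8)  # 16 random hex characters per call
    return f"{prefix}-{token}@{domain}"


def commit_command(message, domain):
    """Build a `git commit` argument list that overrides user.email once."""
    address = unique_commit_address(domain)
    # `git -c name=value` applies the setting to this invocation only,
    # so the repository's stored configuration is left untouched.
    return ["git", "-c", f"user.email={address}", "commit", "-m", message]


# Example: the resulting list could be handed to subprocess.run().
command = commit_command("Fix typo in README", "example.org")
```

Note that environment variables such as GIT_AUTHOR_EMAIL take precedence over user.email, so the sketch assumes they are unset.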

And I haven't had to do it up to this point, but if things get especially onerous, I would consider whipping something up using the GitHub API or a browser extension to help out with batching my activity.

With that all said, here are some things not to do when trying to maintain a low profile on GitHub:

Don't write a script to automate account resets. It may be tempting, especially if you find yourself doing it a lot. However, registering an account through "automated methods" is against the GitHub terms of service.

Don't just create one account for each contribution you plan to make through github.com, e.g., so that you don't have to worry about deleting them. Unless you're paying for all those accounts, this is also against the GitHub terms.

Reblogging "Open Source is not enough"

by Colby Russell. 2016 January 19.

I don't know Adam Spitz, but I know that a few years back he wrote an excellent post titled "Open Source is not enough", and my reaction to reading it was vigorous, excited agreement. That URL is dead now and isn't archived by the Wayback Machine, so I decided to preserve it here. (Turns out, it's the most recent post, and the text can be recovered by visiting the front page on the Wayback Machine, but I'm copying it here, anyway.)

Open Source is not enough

The Open Source movement is great, but it doesn’t go far enough.

When I first tried Smalltalk, one thing that really struck me about it was that not only was the source “open”, but it was right there in front of me. If I wanted to see the source code for one of the classes in the Smalltalk standard library, I didn’t have to go to the web and find the project’s source-code repository and download the code. I just clicked on the class’s name in the Class Browser, and there it was. Making changes or additions to the standard library was as easy as making changes to my own code – everything was right there in the Class Browser, and changes took effect immediately.

The Morphic user-interface system, originally created for Self and later ported to Squeak and then Lively Kernel, took things even further. With Morphic, I could right-click on anything I saw on the screen and ask to see the source code for it. If I pressed a button and it did something neat and I wanted to see how it worked, I could find out with just a few clicks. If I wanted to make a second button that did something similar, I just right-clicked the first button and said Duplicate.

Convenience matters. When I feel the Urge To Tinker, only rarely does it feel like a loud voice shouting in my brain with enough energy to propel me to find the website and download the source code and figure out how to find the part of the code that corresponds to the thing I’m looking at on the screen and make the change and restart the program and retrace my steps. Much more often it’s just a quiet voice mumbling, “Hey, it’d be kinda neat if…” and then I think, “Well, it’s Open Source, I guess I could go download the source code… but… meh, it’s so far out of my way, not worth it,” and the urge fizzles out. I think that a lot of potential human creativity is being wasted this way.

Adam Spitz.
"Open Source is not enough". 2011 May 05.

Git and its hub

by Colby Russell. 2015 May 31.

Historically, I've avoided GitHub. I'm one of those people who agree with the position that you should be conscious of the risks you run with monocultures, plus I just don't think GitHub is actually all that great. I do make concessions, of course. Skip to the bottom if you just want details about my current revision control habits.

Forewarning: I don't think I'm about to say anything that hasn't been said before. I'm writing only because it occurs to me that if someone were to say, "I've tried to avoid using GitHub", then it's entirely possible that there exist people who haven't thought much about it and would have no idea why someone would take a stance like that.

One problem with GitHub is Git itself. See, this isn't limited to GitHub; I've also avoided Git where possible. When comparing the problems of monoculture around Git and a monoculture around GitHub, lots of the problems go away—GitHub is a centralized service, and Git is not—but some of them remain arguably relevant. One is the competition argument. That is, you don't want to encourage a scenario where something, whether it be a product or a service, has no competition, because competition leads to good things, and lack of competition is thought not to drive improvements, at least not as effectively. This may not be a terribly convincing argument in the world of revision control systems, and I'm not sure that I totally agree with it myself. The fact that a near-monoculture oriented around GitHub is capable of advancing something that's approaching a monoculture around Git itself may be proof of impotence in the competition argument: in adoption and usage, Git is pretty much trouncing Mercurial. Indeed, the benefits of an industry mostly unified around one system, particularly when the system is an open one like Git, very well may outweigh any advantages that competition brings.

Git users always point out how Git is so much more powerful than Mercurial. Recent versions of Mercurial are supposed to have made many of these comparisons obsolete, but ignoring this, I would still accept that the Git advocates are right, but here's the kicker: Even so, Mercurial is still a better system. It all comes down to usability.

Here's a thing that happens frequently: someone mentions that they find Git confusing, and someone comes along to share a link that's supposed to explain the concepts behind Git. "I found Git confusing, too, until I understood it conceptually", they say. The resource they link to is almost always trying to nudge the reader away from a CVS/SVN mindset. Here's the thing: I already understand Git on a conceptual level. I understand the underpinnings of DVCSs. And as a matter of fact, it's not that I've got an SVN background clouding my thinking, because I don't. (Funnily enough, I always avoided SVN for exactly the reasons Linus gave for avoiding it.) So you can stop trying to sell us on the idea of a DVCS workflow. I understand the concepts. If I ever say anything that sounds like I'm saying Git is confusing in some way, it means exactly one thing: I'm coming at this with an exasperation for the way fundamental Git concepts map onto its infuriatingly obtuse UI.

Here's another thing that happens: someone gives an example of confusing Git output and/or documentation, then someone else comes along to say, "It's like this. Simple." I suspect there's something else at play, and this touches on a broader social theory that I've been working on at the back of my mind for a while. The idea is that familiarity smooths over any rough spots. It goes like this: there's this terrain with all these cracks on the ground liable to trip a person up, then there's this thing called "familiarity" which some people are able to use, and it oozes forth in the path ahead of them, filling in the cracks and smoothing the rough patches, like those appetizing visuals that you always see in ads for facial creams. The result is that the rough spots for them become a total non-issue. But it's a little more subtle than that, because if you were to ask them about all the rough spots, they'd tell you that they don't know what you're talking about and can't even see any. And they'd be right.

There's a reason, though, why these two things exist:

My claim is that Git's UI and its documentation suffer from a particular problem, which is that of being an artifact created by those who already understand what's going on. That's not the entirety of it, because all documentation is written like that. It has to be. But if you ask someone to document something there are two things that can result: docs that are understandable to both the experts and those unfamiliar, and then docs that explain in perfectly clear language only to those already familiar while being otherwise completely baffling to anyone else.

(I guess there's a third possible result, too, which would be categorized as "just unadulterated crap", but I was trying to focus on the subtlety of the other two here.)

So my claim is really that Git's documentation and UI tend to be of the second type.

Mercurial is plagued in some ways by this, too, I'm sure. In fact, if I think back, I very definitely remember instances where I encountered pitfalls due to Mercurial's UI, but I'd be unable to tell you now exactly what they were. So Mercurial suffers from it, too, absolutely. It's just that it suffers from it a lot less.

There may be a good reason for this. Mercurial is just a lot simpler, by which I mean it has a less featureful core. In contrast to Git, with Mercurial you only pay for the features you use.

A few years back, when GitHub really began taking off, I remember pushing for Git within my team for our capstone project and for my team in another course that I was taking concurrently, when the other option on the table was to use no revision control at all. Mozilla had just settled on Mercurial a couple years before, back when it wasn't clear it was going to lose. My rationale at the time was, "Hey, I've got Mercurial covered, and I'm seeing more projects using GitHub every day. Let's get on that." Bad idea. The index was baffling. Not just for me, but for everyone. I think by pushing for Git, I may have inflicted on my teammates a wholesale fear of revision control outright, and I know it wouldn't have been a problem if my suggestion had been to use Mercurial instead.

Some people love Git's staging area. It comes up all the time. They think it's great. They couldn't work without it. Here's where we see the difference in approach for Mercurial and Git. With Git, you have to pay the cost of interacting with the staging area whether you want it or not. In Mercurial, this would exist as an extension that provides that extra layer of indirection only if you enable it. And it does exist, in the record extension. I think. I wouldn't know. I see the staging area as a completely pointless level of indirection and have no use for trying to emulate it in Mercurial.

Now, on to GitHub itself.

For starters, it's Git-only, so everything above concerning Git simultaneously affects GitHub. Then there's the issue of GitHub, as a product itself, leaving something to be desired, and that something can usually be found elsewhere. GitHub's issue tracking is a good example.

GitHub's issue tracking is more or less a capable bugtracker as far as toy bugtrackers go. Bugzilla is a good example of a tracker fit for heavy-duty workloads. Let's look at an example. If you file a bug against https://github.com/example/repo, it creates an issue that's bound to that repo for eternity. If that organization has a related repo, say https://github.com/example/otherrepo, then you're out of luck if the bug triage process reveals that it should have actually been filed against "otherrepo" instead. (Assume both "repo" and "otherrepo" are distinct components used within one product; it's conceivable the reporter would make a mistake identifying which of the two the problem actually lies in.) The best course of action for you if this happens (the best) is to close the original issue filed against "repo" and then open up a new one for "otherrepo". Any discussion, et cetera, is completely wiped clean in the new bug, and readers have to manually cross-reference the original issue. Or you can leave it open at its original site and ignore the problems that thrusts upon you, namely one of poor organization in the places where you're trying to do work.

Bugzilla, on the other hand, is meant to run as a single instance to manage all of a project's bugs, no matter where the bug lies. It has the notion of "products" and "components". You can approximate the latter with labels in GitHub, but the leaks start to become visible when you try to approximate both at the same time. Bugzilla also has the concept of bug status down pat. This isn't just about lifecycle, which you can ignore if you like, but also about bug resolutions. In GitHub, your bug is either open or closed. Again, you can approximate both Bugzilla's bug life cycle and its resolution type with labels, but by now you've fallen back to labels for all these things, and they're all just floating around in one big soup. Want to mark a bug as the equivalent of both FIXED and WONTFIX? Go ahead, they're just labels. What does it mean? Who cares, I guess.
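To make the difference concrete, here is a small sketch of the two data models. It is a deliberately simplified illustration, not the actual schema of either Bugzilla or GitHub: the class names, fields, and the rule that a bug can be resolved only once are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Set


class Resolution(Enum):
    FIXED = "FIXED"
    WONTFIX = "WONTFIX"
    DUPLICATE = "DUPLICATE"


@dataclass
class TrackedBug:
    """Structured style: product and component are fields; at most one resolution."""
    product: str
    component: str
    resolution: Optional[Resolution] = None

    def move(self, component):
        # Re-triage changes a field; the bug and its discussion stay put.
        self.component = component

    def resolve(self, resolution):
        # Simplification: refuse a second, possibly contradictory resolution.
        if self.resolution is not None:
            raise ValueError(f"already resolved as {self.resolution.value}")
        self.resolution = resolution


@dataclass
class LabeledIssue:
    """Label style: everything is a free-form tag in one set, bound to a repo."""
    repo: str
    labels: Set[str] = field(default_factory=set)


issue = LabeledIssue(repo="example/repo")
issue.labels.update({"FIXED", "WONTFIX"})  # nothing prevents the contradiction

bug = TrackedBug(product="Example", component="repo")
bug.move("otherrepo")          # filed against the wrong component? just move it
bug.resolve(Resolution.FIXED)  # a later resolve(Resolution.WONTFIX) would raise
```

The point of the sketch is only that a typed field can rule out states that a bag of labels cannot.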

And then there are all sorts of problems with the way GitHub handles code reviews. The fact that GitHub has comments that are specifically meant to be in response to a pull request is a good thing. That the pull request and the issue it's meant to fix are presented as these totally isolated things is a very bad thing. Gijs specifically calls this out in the comments to Gregory Szorc's post "Please Stop Using MQ":

github is terrible about filing a bug first and then creating a patch, because you are forced to have two issues in its tracker (you file an issue first, and your pull request will create another one), which means discussion about approach etc. gets split between the "issue" and the "pull request".

Again, the fact that comments concerning a particular pull request are organized in a way that it reflects that relationship? That's a really good thing. But what GitHub should do is aggregate all discussion into the page for the issue itself. Yes, even when there are multiple pull requests for the issue. In fact, especially when there are multiple pull requests for the issue. E.g., someone creates a pull request, the maintainer indicates they'd like more work done in some area before integrating the changes, and the requestor creates another pull request after making the changes to address those concerns. Now we have three threads of discussion, or rather, one discussion spread out amongst three pages. Bugzilla handles this by simply allowing you to mark older patches as obsolete. The patch/fork distinction deserves some comment, too.

Forks are dumb. The ability to fork is an incredibly valuable one, but forks themselves are total overkill for anybody just looking to submit a patch, which is the use case for the vast majority of contributors by an it's-not-even-close margin. Gijs nails it again. As he writes, "jquery has been forked over 7000 times at the time I'm writing this comment. The only version of jquery that's actually used [...] is under the jquery project's authority in github".

The thing about forks is not just that they're these conceptually heavyweight things that feel wrong. There's actually measurable friction involved with using them; the fork-and-PR workflow is heavyweight. "Doing things the github way takes forever", Gijs writes. When comparing it to patch submission: "[doing a patch] is a 3 step process: write code, do a diff, upload the result."

With forks, there's also a weird thing that happens. Go fork a project and then browse the repo on the Web as if you're someone else. I.e., you're unfamiliar with both whoever you are and with the project itself. Take a look at its README. If the original author wasn't careful, it now reads as if it's your project and a casual observer might mistake it for the canonical repo. This is a minor detail, but it weirds me out. I go through some effort to make sure I change the project's description to make it clear that it's just a fork of the proper project. "But wait", you might say. "If you fork a project, GitHub says that it's a fork and even links to the original project." Yeah, that's right. If you fork from within the GitHub UI. If you just create a new project on GitHub and add it as a new remote and push to it, you don't get such a warning. "So just use the GitHub UI to fork it, then." Nope. That's not possible if the original project isn't hosted on GitHub. If the original project's Git instance is self-hosted, creating a new project on GitHub and pushing to it is the only way to do it if you want your fork hosted there, and GitHub doesn't show anything in the UI to indicate that. In fact, it doesn't even show the forked-from UI if you use this workflow and the original project is hosted on GitHub. It just doesn't do that sort of detection.

In addition to manually changing the description to reflect that it is, in fact, a temporary "fork", I also make sure to only keep my fork around as long as it takes to integrate my changes. I'm aggressive with pruning forks, which is something that seems to be rare elsewhere on GitHub. The result is similar to before: you click on someone's profile and listed in their repositories are all of these non-forks that were only ever created because they wanted to contribute a patch once, or maybe every now and then. "Every now and then" may have something to do with their keeping the fork around. See, if you go fix something and the pull request gets accepted, and then you prune the fork like I do, if two weeks or two months later you want to fix something else, then you've got to go recreate that project again before submitting another pull request. So it's not even as if the "leave the fork around" mentality can be attributed to unforgivable laziness. It's that the whole forking workflow is working against you to do otherwise.

I've pretty much blown way more of my time on blogging than I originally allotted for this, and I didn't even get to the part about how Git logs are totally unreliable. (Example: I once made a trivial change to this file. See if it can be found in the file's change history. Spoiler alert: it can't.) I'm also getting a little bummed about how negative I'm coming off here, although I suppose that's just the nature of the topic I set out to tackle upfront.

So I'll stop now and leave you with a rundown of how I operate these days: I first reach for Mercurial, especially for clean-slate repos that are never going to be seen by other eyes, since I don't have to worry about how potential contributors may be uncomfortable with something that isn't Git. When I do use Git, it's always as a result of an existing project that has chosen Git for its revision control, but I still opt to refrain from hosting on GitHub, and the only time I use its features is when the original project is hosted there. My Git remotes point to GitLab, because yay for heterogeneity. The free private repos and the fact that GitLab has a (FOSS) "Community Edition" both go a long way towards helping inform that decision.

RFC 2616, you so silly

by Colby Russell. 2014 March 21.

If the message uses the media type "multipart/byteranges", and the ransfer-length is not otherwise specified, then this self- elimiting media type defines the transfer-length. This media type UST NOT be used unless the sender knows that the recipient can arse it; the presence in a request of a Range header with ultiple byte- range specifiers from a 1.1 client implies that the lient can parse multipart/byteranges responses.

Fielding, et al. RFC 2616 - Hypertext Transfer Protocol, Section 4.4.4: Message Length, p 33. IETF. 1999. (Accessed 2014 March 21).

"[…] unless the […] recipient can arse it". That's not even wrong, really.

On means becoming the ends

by Colby Russell. 2014 February 27.

The original reason to start the project which I had―which was the Germans were a danger―started me off on a process of action, which was to try to develop […] the system in Princeton then at Los Alamos to try to make the bomb work. […] With any project like that, you continue to work trying to get success, having decided to do it. But what I did―immorally, I would say―was not to remember the reason that I said I was doing it. So that when the reason changed, which was that Germany was defeated, not the singlest thought came to my mind at all about that, that that meant now that I have to reconsider why I'm continuing to do this. I simply didn't think, okay?
Richard Feynman.
"The Pleasure of Finding Things Out". Horizon. BBC, 1981.

See also: Beware anti-success.

Misheard disappointments

by Colby Russell. 2013 August 29.

Sometimes I mishear lyrics and, after finding out the correct ones, I'm disappointed. For example, in "The Day I Tried To Live", I thought it was:

The day I tried to win,
I dangled from the power lines
And left them all astretch.

It turns out, the last line is actually "[…] let the martyrs stretch", and "astretch" isn't even a real word in modern English. I still like the imagined line better.

"The results are in, and we suck!"

by Colby Russell. 2013 August 29.

You know when you see an ad, and it says that a poll has found the advertiser's product to rank ahead of all other competitors'? Isn't that kind of coincidental? Where are all the ads from the competitors who came in behind, sharing their rankings? It seems to be falling through the cracks.

(Edited from the original on 2014 April 09.)
