As I mentioned in update 6, I’ve been spending the last few weeks doing some much-needed refactoring. Here’s an update on the progress:
It’s become clearer that Elm isn’t working out very well as a tech choice for the Unison editor. I’ve started to consider my use of Elm a placeholder until I decide on a replacement (or until Elm improves to where it is usable for my purposes). Maybe I’ll do a longer experience report later, but to summarize:
Monad, or any abstraction or data type that has a type parameter whose kind is not *. Also no higher-rank types or existential types. You can sometimes survive without these things, but eventually you reach a point where you need them. I’ve hit that point.
foldp function) or by sending values to a sink and have them magically appear elsewhere in your program. The oddly popular Elm architecture is just a pattern for building up your entire app’s interactivity as the arguments to a left fold over the merged input events of your app! Because the signal world is so limited, most of your logic necessarily lives elsewhere. Not necessarily a bad thing, but the result has been that I haven’t gotten much mileage out of Elm’s version of FRP. Instead I’m doing the vast majority of the work with pure code that could easily be written in just about any other functional language.
Element static layout library, which I need for some of the things I’m doing. But it seems like a reasonable path forward might be to just write such a library or port Elm’s to whatever tech I end up using.
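To make the "left fold over the merged input events" remark concrete, here is a minimal sketch in Haskell rather than Elm. A plain list stands in for the merged event signal, and the `Event`, `Model`, `update`, `runApp` and `history` names are all hypothetical, made up for this illustration of a tiny counter app.

```haskell
import Data.List (foldl')

-- Hypothetical events for a tiny counter app (not from the Unison editor).
data Event = Increment | Decrement | Reset
  deriving (Show, Eq)

-- Hypothetical application state.
newtype Model = Model { count :: Int }
  deriving (Show, Eq)

-- One step of the fold: the current state plus one event gives the next state.
update :: Model -> Event -> Model
update (Model n) Increment = Model (n + 1)
update (Model n) Decrement = Model (n - 1)
update _         Reset     = Model 0

-- The state after a whole list of (already merged) events is a left fold.
runApp :: Model -> [Event] -> Model
runApp = foldl' update

-- Every intermediate state, which is closer to what a signal of models carries.
history :: Model -> [Event] -> [Model]
history = scanl update
```

Everything interesting lives in `update`, which is ordinary pure code. The fold itself is a single line, which is roughly the point being made above.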
This got me thinking about the concept of usability of various tech tools like programming languages.
Here is a question: is a three-note keyboard more usable than a piano or a cello? On the one hand, there’s less to learn; on the other hand, if a piece of music calls for a piano or cello, a three-note keyboard is not going to be usable at all!
At the same time, let’s consider the cello. Perhaps we could lessen the learning curve of cello by adding frets… but this comes at a cost of limiting vibrato, which is part of what makes the cello (or any bowed instrument) sound so beautiful! All right then, how about at least adding visual markers where the frets would be? That can only help, right? Not necessarily. Markers might lead learners to rely on visual cues, rather than (more rapid, accurate, scalable) use of the ear and muscle memory… but on the other hand, if a cello learner is temporarily aided by use of visual markers, and this helps them to persist in learning until they no longer need them, who can say that’s a bad thing?
As a further subtlety, there’s something of a virtuoso culture around instruments like piano and cello that have been around for a long time. The virtuoso culture prizes musicians not just (or even primarily) for their sensitive or thoughtful expression of music; it also emphasizes pure technical mastery of the instrument. And this same culture values music not just (or even mostly) for its beauty, but also for how much the music facilitates flaunting of technical mastery. If we’re being honest, we must admit that these cultural elements have some impact on who chooses to learn music, and who chooses to stick with this learning.
The point is, these issues are complicated, and there aren’t really easy answers. And that’s part of why debates about these things in the tech world never seem to go anywhere. But I’d like to offer a helpful way of thinking about usability that’s analogous to some of the ideas I posted about technical debt:
… consider the choice between receiving $500 right now or a 60% chance of $2000 a year from now. How about a million dollars now vs a 60% chance of 3 million dollars a year from now? Of course, these choices have different expected values, but also different levels of risk. As in modern portfolio theory, there is no concept of the optimal portfolio, only the optimal portfolio for a given level of risk.
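Spelling out the arithmetic behind "different expected values", under the assumption (implicit in the quote) that the remaining 40% of the time the payoff is zero:

```latex
\begin{align*}
\mathbb{E}[\text{wait}] &= 0.6 \times \$2000 = \$1200 \;>\; \$500, \\
\mathbb{E}[\text{wait}] &= 0.6 \times \$3{,}000{,}000 = \$1{,}800{,}000 \;>\; \$1{,}000{,}000.
\end{align*}
```

In both cases waiting has the higher expected value, yet many people would choose differently in the two cases, because the risk being taken on is very different.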
When it comes to usability, there is no such thing as a tool which is optimally usable, we can only talk about optimal usability with respect to a level of expressiveness. That is, we can only make a given tool more usable by decreasing the amount of work it takes to accomplish the same thing, not by restricting capabilities. If we change the capabilities of the tool and make it more or less expressive, usability comparisons become meaningless. We are comparing apples to oranges. Neither artifact dominates the other, and it comes down to other preferences.
The reason the more expressive tool doesn’t strictly dominate the less expressive one is subtle. Yes, a tool which can do less (is inexpressive) requires less learning, and a tool with more capabilities (more expressive) requires more learning. We sometimes think of this extra learning and work as only being necessary if you happen to be doing something that requires the extra capabilities. But that’s not true. This holds even for simple tasks that can in principle be addressed by either tool. With the more expressive artifact, the user has to do work to figure out what subset of its capabilities should even be used, and how they should be used in concert to achieve the desired effect. Sometimes this amount of work is nontrivial, and it requires experience to do well. Choosing among several possible ways of doing something (some of which may not work out well at all) requires understanding the tradeoffs of these approaches. And this decision-making isn’t a one-time event, it’s a continuous process, occurring at multiple levels of granularity. We might say the user has a greater burden of choice.
With the less expressive artifact, there are fewer options and the decision of how to do something is often made for you, by someone who has some expertise and has tailored the defaults and limitations of their tool accordingly.
My point is that neither option dominates the other, it depends on many factors, including one’s experience and the time horizons of investment in using the artifact. Here are just a few examples:
Now then. What are the implications of all this? Well, it means that there is tremendous value in finding ways to decrease the burden of choice when using more expressive technology. Here are some ways of doing that:
Here’s a common situation: you realize you need to make some changes to a data type used all over the place in your codebase. How do you go about doing it?
In the trivial case: It’s something as simple as an implementation change (but no change to any types), or a renaming or other transformation that can be handled via a find/replace or even the (rather limited) automated refactoring capabilities of an IDE.
In the somewhat less trivial case: You make the change you want, then go fix all the compile errors. Hopefully there aren’t too many, and if you’re making good use of static types, you can have quite a bit of confidence that once you’re done fixing the errors, the new codebase will still work. For many codebase transformations, this works perfectly fine, even if it is a bit tedious and mechanical. More on that later.
In the nontrivial case: For many interesting cases of codebase transformations, simply making the change and fixing the errors doesn’t scale. You have to deal with an overwhelming list of errors, many of which are misleading, and the codebase ends up living in a non-compiling state for long periods of time. You begin to feel adrift in a sea of errors. Sometimes you’ll make a change, and the error count goes down. Other times, you’ll make a change, and it goes up. Hmm, I was relatively sure that was the right change, but maybe not… I’m going to just hope that was correct, and the compiler is getting a bit further now.
What’s happened? You’re in a state where you are not necessarily getting meaningful, accurate feedback from the compiler. That’s bad for two reasons. Without this feedback, you may be writing code that is making things worse, not better, building further on faulty assumptions. But more than the technical difficulties, working in this state is demoralizing, and it kills focus and productivity.
All right, so what do we do instead? Should we just avoid even considering any codebase transformations that are intractable with the “edit and fix errors” approach? No, that’s too conservative. Instead, we just have to avoid modifying our program in place. This lets us make absolutely any codebase transformation while keeping the codebase compiling at all times. Here’s a procedure, it’s quite simple:
Foo__2.hs and call the module inside it Foo__2 as well. Copy over any bits of code you want from Foo.hs, then make the changes you want and get Foo__2 compiling. At this point, your codebase still compiles, but nothing is referencing the new definition of
Foo.hs. Let’s say
Bar__2.hs and call the module inside it Bar__2 as well. You can probably see where this is going. You are going to have Bar__2 depend on the newly created Foo__2. You can start by copying over the existing Bar.hs, but perhaps you want to copy over bits and pieces at a time and get them each to compile against Foo__2. Or maybe you just copy all of Bar.hs over at once and crank through the errors. Whatever makes it easiest for you, just get
Bar__2.hs, pick another module which depends on either
Bar and follow the same procedure. Continue doing this until you’ve updated all the transitive dependents of Foo. You might end up with a lot of __2-suffixed copies of files, some of which might be quite similar to their old state, and some of which might be quite different. Perhaps some modules have been made obsolete or unnecessary. In any case, if you’ve updated all the transitive dependents of your initial change, you’re ready for the final step.
__2 file, delete the original, and rename the
Foo.hs, and so on. Also do a recursive find/replace in the text of all files, replacing __2 with nothing. (Obviously, you don’t need to use __2; any prefix or suffix that is unique and unused will do fine.)
Note: I’m not claiming this is a new idea. Programmers do something like this all the time for large changes.
Notice that at each step, you are only dealing with errors from at most a single module and you are never confronted with a massive list of errors, many of which might be misleading or covering up more errors. Progress on the refactoring is measured not by the number of errors (which might not be accurate anyway), but by the number of modules updated vs the total number of modules in the set of transitive dependents of the immediate change(s). For those who like burndown charts and that sort of thing, you may want to compute this set up front and track progress as a percentage accordingly.
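Computing that set up front is a small graph traversal. The following is only a sketch, assuming you already have a map from each module to the modules it imports. The `Imports` type, the function names, and the `example` graph are all hypothetical, and nothing here parses real Haskell source files.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type Module = String

-- Hypothetical dependency graph: each module maps to the modules it imports.
type Imports = Map.Map Module (Set.Set Module)

-- Modules that directly import the given module.
directDependents :: Imports -> Module -> Set.Set Module
directDependents g m = Map.keysSet (Map.filter (Set.member m) g)

-- All modules that depend on the given module, directly or indirectly.
transitiveDependents :: Imports -> Module -> Set.Set Module
transitiveDependents g root = go Set.empty (directDependents g root)
  where
    go seen frontier
      | Set.null frontier = seen
      | otherwise =
          let seen' = Set.union seen frontier
              next  = Set.unions (map (directDependents g) (Set.toList frontier))
          in go seen' (Set.difference next seen')

-- Fraction of the affected modules that already have an updated copy.
progress :: Imports -> Module -> Set.Set Module -> Double
progress g root done
  | Set.null todo = 1
  | otherwise     = fromIntegral (Set.size (Set.intersection done todo))
                      / fromIntegral (Set.size todo)
  where
    todo = transitiveDependents g root

-- A made-up codebase for illustration.
example :: Imports
example = Map.fromList
  [ ("Foo",  Set.empty)
  , ("Bar",  Set.fromList ["Foo"])
  , ("Baz",  Set.fromList ["Bar"])
  , ("Qux",  Set.fromList ["Foo", "Baz"])
  , ("Main", Set.fromList ["Qux"])
  , ("Util", Set.empty)
  ]
```

In this made-up graph, the transitive dependents of `Foo` are `Bar`, `Baz`, `Qux` and `Main`, so once `Bar__2` compiles, `progress example "Foo" (Set.fromList ["Bar"])` is 0.25.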
If we take this good idea to its logical conclusion, we end up with a model in which the codebase is represented as a purely functional data type. (In fact, the refactoring algorithm I gave above might remind you of how a functional data structure like a tree gets “modified”: we produce a new tree and the old tree sticks around, immutable, as long as we keep a reference to it.) So in this model, we never modify a definition in place, causing other code to break. When we modify some code, we are creating a new version, referenced by no one. It is up to us to then propagate that change to the transitive dependents of the old code.
This is the model adopted by Unison. All terms, types, and type declarations are uniquely identified by a nameless, content-based hash. In the editor, when you reference the symbol identity, you immediately resolve that to some hash, and it is the hash, not the name, which is stored in the syntax tree. The hash will always and forever reference the same term. We can create new terms, perhaps even based on the old term, but these will have different content and hence different hashes. We can change the name associated with a hash, but that just affects how the term is displayed, not how it behaves! And if we call something else identity (there’s no restriction of name uniqueness), all references continue to point to the previous definition. Refactoring is hence a purely functional transformation from one codebase to another.
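A toy model of this idea might look like the following. It is a sketch under loud assumptions and not Unison's actual representation: the `Term` type is invented for the example, and `hashTerm` is a simple string hash standing in for a real cryptographic content hash (so, unlike a real one, it could in principle collide).

```haskell
import qualified Data.Map.Strict as Map
import Data.Char (ord)
import Data.List (foldl')

-- Toy stand-in for a real content hash (assumption: NOT collision resistant).
newtype Hash = Hash Int
  deriving (Show, Eq, Ord)

-- Hypothetical term language: references to other definitions are hashes, not names.
data Term
  = Lit Int
  | Var String
  | Lam String Term
  | App Term Term
  | Ref Hash
  deriving (Show, Eq)

-- Hash a term by hashing its printed form.
hashTerm :: Term -> Hash
hashTerm = Hash . foldl' step 5381 . show
  where
    step h c = (h * 33 + ord c) `mod` 1000000007

data Codebase = Codebase
  { definitions :: Map.Map Hash Term    -- a hash always denotes the same term
  , names       :: Map.Map Hash String  -- display metadata only
  } deriving (Show, Eq)

emptyCodebase :: Codebase
emptyCodebase = Codebase Map.empty Map.empty

-- Adding a definition yields a new codebase; existing definitions are untouched.
define :: String -> Term -> Codebase -> (Hash, Codebase)
define name term (Codebase ds ns) =
  (h, Codebase (Map.insert h term ds) (Map.insert h name ns))
  where
    h = hashTerm term

-- Renaming changes how a hash is displayed, never what it refers to.
rename :: Hash -> String -> Codebase -> Codebase
rename h name cb = cb { names = Map.insert h name (names cb) }
```

Defining a second, different term that is also called `identity` gives it a new hash. Anything that already holds `Ref` of the first hash keeps pointing at the original definition.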
Aside: One lovely consequence of this model is that incremental typechecking is trivial. Since definitions never change, we can cache the type associated with a hash, and just look it up when doing typechecking. Simple!
It’s a very pretty model, but it raises questions. We do sometimes want to make changes to some aspect of the codebase and its transitive dependents. Alright, so we aren’t going to literally modify the code in place, but we do still need to have a convenient way of creating a whole bunch of related new definitions, based in part on the old definitions. How do we do that?
What an interesting problem! I’ve talked to several people about it, and there’s also some interesting research on this sort of thing. Though I’m not sure what exact form it will take, it all seems very solvable. Just to sketch out some ideas:
[Int] -> [Int] to [a] -> [a]) is also trivial. Just find all the places that reference the old hash, and point them to the new hash, transitively.
foo x y = blah (g 42) (x+x). You decide that rather than hardcoding 42 you’d like to abstract over that value, so your definition becomes foo x y z = blah (g z) .. This change now generates a set of obligations. The editor now walks you through the transitive dependents of foo (out to whatever scope you want), and you have the option of either binding the additional parameter or propagating the parameter out further. Perhaps you do just want to bind 42 everywhere in the existing code, and the extra abstraction is just for some new code you’re about to write. There can be an easy way to do this kind of bulk acceptance. Or perhaps you have some way of conjuring up a number from other types that are usually in scope wherever foo is used. You write a function to do this once, include it as part of the session, and then reuse it (with approval) in lots of places. Of course, the UX for this is all TBD, but the point is that it’s a very structured, guided activity, and there are lots of opportunities to reuse work. How many times have you worked through a rather mechanical refactoring, doing essentially the same thing over and over, with no real opportunities for reuse due to limitations of applying the refactoring via a process of text munging!
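As a rough illustration of the "obligations" idea, here is a small sketch. It is not a design for the Unison editor: it uses names where Unison would use hashes, purely for brevity, and the `Expr`, `Def`, `Resolution`, `obligations`, `resolve` and `bindEverywhere` names are all hypothetical.

```haskell
import qualified Data.Map.Strict as Map

type Name = String

-- A tiny expression language, just enough to have call sites.
data Expr
  = Lit Int
  | Var Name
  | Call Name [Expr]
  deriving (Show, Eq)

data Def = Def { params :: [Name], body :: Expr }
  deriving (Show, Eq)

-- How the user chooses to discharge the obligation at one dependent definition.
data Resolution
  = BindConstant Int  -- pass a fixed value at every call site in this definition
  | Propagate Name    -- add a parameter to this definition and pass it along
  deriving (Show, Eq)

-- Does this expression call `target` anywhere?
calls :: Name -> Expr -> Bool
calls target (Call f args) = f == target || any (calls target) args
calls _ _ = False

-- Append an extra argument to every call of `target` inside an expression.
addArg :: Name -> Expr -> Expr -> Expr
addArg target extra = go
  where
    go (Call f args)
      | f == target = Call f (map go args ++ [extra])
      | otherwise   = Call f (map go args)
    go e = e

-- Definitions that call `target` directly: the obligations created by the change.
obligations :: Name -> Map.Map Name Def -> [Name]
obligations target defs =
  [ n | (n, d) <- Map.toList defs, n /= target, calls target (body d) ]

-- Apply one resolution to one dependent definition.
resolve :: Name -> Resolution -> Def -> Def
resolve target (BindConstant k) (Def ps b) = Def ps (addArg target (Lit k) b)
resolve target (Propagate p)    (Def ps b) = Def (ps ++ [p]) (addArg target (Var p) b)

-- Bulk acceptance: bind the same constant at every outstanding obligation.
bindEverywhere :: Name -> Int -> Map.Map Name Def -> Map.Map Name Def
bindEverywhere target k defs =
  foldr (Map.adjust (resolve target (BindConstant k))) defs (obligations target defs)
```

Choosing `Propagate` changes the arity of the dependent definition, so it creates fresh obligations for that definition's own callers. That is how the change can ripple outward to whatever scope you want. `bindEverywhere "foo" 42` corresponds to the bulk acceptance case, where existing code keeps its old behavior.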
Such exciting possibilities! I look forward to exploring them further, hopefully with help from some of you! When I get through this latest refactoring, I’ll feel like the code is in pretty decent shape and will be releasing it publicly. I look forward to developing Unison more out in the open, in collaboration with other folks who are as inspired by this project as I am.