This content originally appeared on raganwald.com and was authored by Reginald Braithwaite
As programmers, it is our job to build software out of abstractions. Logic gates are connected to form a “Von Neumann Computer.” An assembler creates an interface that can be programmed with instructions like `MOV 12345, 567890`. A compiler lets us write `x = y`, and so on up to `has_and_belongs_to_many :roles`.
When programming in a particular language, we often want to borrow an abstraction from another language, much as English speakers will murmur “C’est la vie” when the build breaks. In JavaScript, the Underscore library includes a function called `pluck`. Compare `_.pluck(users, 'lastName')` in JavaScript to using `Symbol#to_proc` in Ruby: `users.map(&:lastName)`. The mechanisms and syntaxes are different, but the underlying ideas are similar.
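To make the comparison concrete, here is a minimal sketch of what a `pluck`-style helper might look like in plain JavaScript. It is an illustration only, not Underscore’s actual implementation, and the `users` data is invented for the example.

```javascript
// Hypothetical stand-in for Underscore's _.pluck, written for illustration only.
function pluck(collection, propertyName) {
  return collection.map(function (item) {
    return item[propertyName];
  });
}

// Invented sample data.
var users = [
  { firstName: 'Reginald', lastName: 'Braithwaite' },
  { firstName: 'Benjamin', lastName: 'Stein' }
];

var lastNames = pluck(users, 'lastName');
//=> ['Braithwaite', 'Stein']
```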
For small idioms, other languages can be a fertile source of abstractions. But we struggle when we become ambitious and attempt to Greenspun new semantics that are a poor fit with our primary tool.
In Ruby, for example, Benjamin Stein and I wrote a little thing called andand. It emulates the `?.` operator from Groovy and CoffeeScript, known as safe navigation in Groovy and as the existential operator in CoffeeScript (and often loosely called the “elvis” operator). On the surface, it’s as simple as `Symbol#to_proc`. You can write something like `user.andand.lastName`, and if `user` is `nil`, the expression evaluates to `nil` without any exceptions being thrown.
a leaky abstraction
But let’s draw the curtain back, shall we? The `andand` method that’s mixed into all objects is not too tough to parse once you realize there’s a special case for passing a block:
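
    def andand(p = nil)
      if self
        # Truthy receiver: apply the block or proc if given, else return the receiver.
        if block_given?
          yield(self)
        elsif p
          p.to_proc.call(self)
        else
          self
        end
      else
        # Falsey receiver: short-circuit, or return a proxy that swallows the next message.
        if block_given? || p
          self
        else
          MockReturningMe.new(self)
        end
      end
    end

    # Proxy that answers every message with the original (falsey) receiver.
    class MockReturningMe < BlankSlate
      def initialize(me)
        super()
        @me = me
      end

      def method_missing(*args)
        @me
      end
    end
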
But what is this `MockReturningMe` thingummy? Well, if the receiver of `.andand` is falsey, the method `.andand` returns a special proxy object that returns the original object no matter what method you send it. This works just fine for the “normal case” of writing something like `raganwald.andand.braythwayt`, but introduces icky[1] edge cases.
In CoffeeScript, the compiler will complain if you write `object?.` instead of `object?.method`. But `object.andand` is perfectly acceptable Ruby code that returns either the receiver or one of these proxy objects. All sorts of unpleasant bugs can arise from a simple mistake, bugs that can’t be caught in a dynamically typed language like Ruby.
As Joel Spolsky would say, “andand is a leaky abstraction.”
the blockhead programmer
Implementing a programming language is an incredibly valuable exercise. Some time ago I wrote a toy Scheme, one where everything was built up from unhygienic macros and just five special forms. `let` isn’t one of those five, so I wrote a macro that rewrote

    (let ((foo 1) (bar 2))
      (+ foo bar))

into:

    ((lambda (foo bar)
       (+ foo bar))
     1 2)

If you’re somewhat familiar with JavaScript and Lisp, you’ll recognize the second expression as an Immediately Invoked Function Expression. The macro provides the illusion that `let` defines and binds local variables in my toy Scheme the way `var` does in JavaScript. But that isn’t what happens: In reality, parameters to lambdas are the only mechanism for defining variables.
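The rewrite is mechanical enough to sketch in a few lines. The following is not the macro from the toy Scheme; it is a hypothetical JavaScript function, `expandLet`, that performs the same transformation on s-expressions represented as nested arrays.

```javascript
// Hypothetical helper: rewrites ['let', bindings, body...] into
// [['lambda', names, body...], values...]. S-expressions are nested arrays.
function expandLet(form) {
  var bindings = form[1];
  var body = form.slice(2);
  var names = bindings.map(function (binding) { return binding[0]; });
  var values = bindings.map(function (binding) { return binding[1]; });

  return [['lambda', names].concat(body)].concat(values);
}

var expanded = expandLet(
  ['let', [['foo', 1], ['bar', 2]],
    ['+', 'foo', 'bar']]
);
//=> [['lambda', ['foo', 'bar'], ['+', 'foo', 'bar']], 1, 2]
```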
It’s an interesting mechanism, and it has been borrowed for the CoffeeScript language’s `do` keyword. JavaScript programmers are often tempted to use it to implement block scoping. In JavaScript, a new scope is only introduced by functions. Take this terrible code:

    function whatDoesThisDo (n) {
      var result = '';
      for (var i = 0; i < n; ++i ) {
        if (i % 2 === 0) {
          // this var does not create a new variable: it is the outer i again
          for (var i = 0; i < n; ++i ) {
            result = result + 'x';
          }
        }
      }
      return result;
    }

    whatDoesThisDo(6)
      //=> "xxxxxx"

It seems contrived for the purpose of hazing university graduates who interview for programming jobs. The key point for our purposes is that despite the `var` declaration and the fact that `for (var i = 0; i < n; ++i )` is nested inside of `if (i % 2 === 0) { ... }`, the `i` indexing the inner loop is the exact same `i` as the one that indexes the outer loop, and that is going to produce problems.
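A smaller experiment shows the same thing without the nested loops. This is a sketch, and the function name is made up for the example: a `var` declared in a `for` header belongs to the whole function, so it is still visible (and still holds its final value) after the loop ends.

```javascript
function whereDoesILive() {
  for (var i = 0; i < 3; ++i) {
    // nothing to do; we only care about i itself
  }
  // i was declared in the loop header, yet it is still in scope here.
  return i;
}

var leftover = whereDoesILive();
//=> 3
```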
Some languages have block scope: The introduction of a block like `{ ... }` introduces a new scope, and therefore you can create a new `i` that shadows the original. This is possible in Scheme with the `let` form, and if you have a taste for having one variable mean different things in different places, you can appear to create the same effect in JavaScript with an IIFE:

    function whatDoesThisDo (n) {
      var result = '';
      for (var i = 0; i < n; ++i ) {
        // the IIFE gives the inner loop its own i
        if (i % 2 === 0) (function () {
          for (var i = 0; i < n; ++i ) {
            result = result + 'x';
          }
        })();
      }
      return result;
    }

    whatDoesThisDo(6)
      //=> "xxxxxxxxxxxxxxxxxx"

By using `(function () { ... })();` instead of plain `{ ... }`, we’re creating a new JavaScript scope. Passing rapidly over the performance implications of creating a new function only to execute it once and then throw it away, have we implemented block scope as you might find in a language like C#?
Almost.
more leaks
Once again, we have a leaky abstraction. The enthusiastic programmer, having rediscovered how to implement block scope in JavaScript with IIFEs, might decide that IIFEs are the new go-to idiom for block scoping:

    function oddsAndEvens (n) {
      var result;
      if (n % 2 == 0) {
        (function () {
          for (var i = 0; i < n; ++i) {
            result = result + 'even';
          }
          return result;
        })()
      }
      else {
        (function () {
          for (var i = 0; i < n; ++i) {
            result = result + 'odd';
          }
          return result;
        })()
      }
    }

    oddsAndEvens(4)
      //=> undefined

The reasoning behind this code is beyond dubious, but the problem is apparent: The intention was that `return result` return the result from the `oddsAndEvens` function, but in reality it returns the result from the anonymous function enclosing it. In languages like Ruby and C++, blocks are lighter weight than lambdas precisely because the semantics of things like `return` are different from within a block than from within a function body.
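One way to patch this particular leak is to return the value of the IIFE explicitly from the enclosing function. This is a sketch of a workaround, not a recommendation; the name `oddsAndEvensPatched` and the initial empty string are additions for the example.

```javascript
function oddsAndEvensPatched(n) {
  var result = ''; // initialized so concatenation starts from an empty string
  if (n % 2 === 0) {
    // The inner return only leaves the IIFE, so its value must be returned again.
    return (function () {
      for (var i = 0; i < n; ++i) {
        result = result + 'even';
      }
      return result;
    })();
  }
  else {
    return (function () {
      for (var i = 0; i < n; ++i) {
        result = result + 'odd';
      }
      return result;
    })();
  }
}

var evens = oddsAndEvensPatched(2);
//=> 'eveneven'
var odds = oddsAndEvensPatched(3);
//=> 'oddoddodd'
```

It works only because each pseudo-block is the last thing the function does. A block that sometimes returns and sometimes falls through to later code would need extra bookkeeping.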
It doesn’t matter that `let` is implemented with a lambda in our toy Scheme because our toy Scheme doesn’t have a `return` form. But it matters greatly in JavaScript. Imagine, for example, that you use Esprima to write a preprocessor. You could lose all grip with reality and translate ES5 code like this:

    let(x = 1, y = 2) do {
      x + y;
    } while(true);

into:

    (function (x, y) {
      return x + y;
    })(1, 2);

Besides the vestigial `while(true)`, this will break badly whenever someone tries to use a `return` inside our pretend-let, just as we saw above. And it gets worse: What is the meaning of `this` inside our so-called blocks?
Now, many of these problems can be fixed. You could write `.call(this, 1, 2)` instead of `(1, 2)` to preserve `this`. You could even use `try` and `catch` to establish a new scope for a single variable, as let-er does. But once again, we find that when we try to bolt the features from one language onto another, we create a leaky abstraction that falls down for anything but the obvious cases.
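Both patches can be sketched briefly. The object and function names below are invented for illustration, and the `try`/`catch` version only shows the general scoping trick, not how let-er itself is implemented.

```javascript
var account = {
  base: 10,

  // A plain IIFE gets its own `this`, so the method's receiver is lost.
  plainIife: function () {
    return (function (x, y) {
      return this === account;
    })(1, 2);
  },

  // .call(this, ...) hands the receiver through to the pretend block.
  calledIife: function () {
    return (function (x, y) {
      return this.base + x + y;
    }).call(this, 1, 2);
  }
};

var lostThis = account.plainIife();
//=> false
var keptThis = account.calledIife();
//=> 13

// A catch parameter is scoped to its catch clause, so it can shadow an outer variable.
function catchScope() {
  var x = 'outer';
  var seenInside;
  try {
    throw 'inner';
  } catch (x) {
    seenInside = x;
  }
  return [seenInside, x];
}

var scopes = catchScope();
//=> ['inner', 'outer']
```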
are we doomed?
Now these examples are contrived, but they reproduce actual bugs I’ve encountered when writing code using similar mechanisms. And my experience so far is that the more complex the abstraction being borrowed from one language and Greenspunned onto another, the more annoying the abstraction leaks become. Don’t get me started on lazy collections in Ruby!
It’s natural to wonder if all such efforts should be dismissed as a bad idea. They can be unfamiliar to uniglot programmers, and if my contention is correct, they are fraught with edge cases and subtle bugs. Must we always try to “cut with the grain” and use a language’s “natural” idioms?
Personally, I embrace features from other languages. But recognizing that they have limitations, I avoid embracing them to the point that they become prevalent. Something like trampolining can be a very useful tool for surgically solving a particular problem in a language that lacks Tail Call Elimination, but trampolining all of a code base’s method calls would be an act of masochism.
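For readers who have not met the technique, here is a minimal sketch of a trampoline in JavaScript. The names `trampoline`, `sumToStep`, and `sumTo` are made up for the example, and it assumes the trampolined function never needs to return a function as its real result.

```javascript
// Runs fn, then keeps calling whatever thunk comes back
// until a non-function value appears.
function trampoline(fn) {
  return function () {
    var result = fn.apply(this, arguments);
    while (typeof result === 'function') {
      result = result();
    }
    return result;
  };
}

// Instead of calling itself in tail position, the function returns a thunk
// describing the next step, so the stack never grows.
function sumToStep(n, total) {
  if (n === 0) {
    return total;
  }
  return function () {
    return sumToStep(n - 1, total + n);
  };
}

var sumTo = trampoline(sumToStep);

var big = sumTo(100000, 0);
//=> 5000050000
```

A directly recursive version of the same sum may well exhaust the call stack at this depth on an engine without tail call elimination. That is the surgical use; wrapping every function this way is the masochism.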
And perhaps, we can learn from this that languages are limited. While it might seem like we can implement any feature from another language we like, the reality is that we can write code that looks like another language’s code, but under the hood it’s still our original language, and trouble awaits those who blindly embrace the abstraction.
If another language’s abstractions are so convenient, maybe we should consider switching rather than writing an ad hoc, informally-specified, bug-ridden, slow implementation of half of a better idea.
[1] The quality of being as obscure as Ick, an obscure Ruby library.
Reginald Braithwaite | Sciencx (2013-08-01T00:00:00+00:00) Leaky Greenspunned Abstractions. Retrieved from https://www.scien.cx/2013/08/01/leaky-greenspunned-abstractions/