Update: A Word About The Title. If you came here from Twitter, the line I used - "not unsound, incomplete" - referred to the original title of the article "Incomplete by design," which was based on my misunderstanding of the dominant terminology in static analysis (see updates, and thanks David and Sam in the comments for pointing out my wrongness). When I realized this, I renamed the article "Deliberate decisions," but that wasn't an interesting title. When I read Rafaël Garcia-Suarez's take on Dart this morning, Why Dart is not the language of the future, I decided to support his proposal of calling types as a "lint-type development aid, not a language feature" feeble typing, and renamed the article again. You should read Rafaël's post and this one; it is possibly the most agreement you will ever find between a Perl blogger and a Carnegie Mellon University programming languages graduate student.
...anyway...
There's a lot of feedback, and a non-trivial amount of snark, going around the internet based on the release of Dart, a proposed Javascript-killer by Google. My primary experience with Javascript is that people tell me it's the worst compiler target language that is widely used as a compiler target language, so I basically have nothing invested in the language, but was interested by the discussions it brought up.
The snark about Dart has centered around the following line on Page 72 of the language spec, which I believe was pointed out to the Twitterverse by Debasish Ghosh.
The type system is unsound, due to the covariance of generic types. This is a deliberate choice (and undoubtedly controversial). Experience has shown that sound type rules for generics fly in the face of programmer intuition. It is easy for tools to provide a sound type analysis if they choose, which may be useful for tasks like refactoring.But what does it mean for a type system to be unsound? I really think that the most illustrative snippet about Dart types came on the following page of the language spec was not the one that Debasish retweeted, but one that came on the next page:
A Dart implementation must provide a static checker that detects and reports exactly those situations this specification identifies as static warnings. However:This, for me, clarified what was going on substantially. Let me tell you a parable.
- Running the static checker on a program P is not required for compiling and running P.
- Running the static checker on a program P must not prevent successful compilation of P nor may it prevent the execution of P, regardless of whether any static warnings occur
How to anger students in an undergraduate PL course
Step 1
\[\infer {{\tt if}~e~{\tt then}~e_1~{\tt else}~e_2 \mapsto {\tt if}~e'~{\tt then}~e_1~{\tt else}~e_2 } {e \mapsto e'} \] \[\infer {{\tt if}~{\tt true}~{\tt then}~e_1~{\tt else}~e_2 \mapsto e_1} {} \qquad \infer {{\tt if}~{\tt false}~{\tt then}~e_1~{\tt else}~e_2 \mapsto e_2} {} \] \[ \infer {e_1 + e_2 \mapsto e_1' + e_2} {e_1 \mapsto e_1'} \qquad \infer {{\tt num}(n_1) + e_2 \mapsto {\tt num}(n_1) + e_2'} {e_2 \mapsto e_2'} \] \[ \infer {{\tt num}(n_1) + {\tt num}(n_2) \mapsto {\tt num}(n_1 + n_2)} {} \]
The implementation was to be an ML function step with type expr -> expr option, and the specification was that step e = SOME e' if there existed an e' such that e \(\mapsto\) e', and that step e = NONE otherwise (for instance, \(\tt true\) obviously can't take a step according to these rules).
Step 2
Describe how they can type-checking the language, by defining a type system like this. Have them implement this type checker as an ML function with type expr -> typ option, same idea.
\[ \infer {{\tt true} : {\tt bool}} {} \qquad \infer {{\tt false} : {\tt bool}} {} \qquad \infer {{\tt if}~e~{\tt then}~e_1~{\tt else}~e_2 : \tau} {e : {\tt bool} & e_1 : \tau & e_2 : \tau} \] \[ \infer {{\tt num}(n) : {\tt number}} {} \qquad \infer {e_1 + e_2 : {\tt number}} {e_1 : {\tt number} & e_2 : {\tt number}} \]
Step 3
Have students prove the theorem that this type system does something. The theorem statement goes as follows, and the proof is by the by-now standard technique of safety-via-progress-and-preservation.
Theorem (Safety): If \(e : \tau\) and \(e \mapsto \ldots \mapsto e'\), then \(e' : \tau\) and also there either exists some \(e''\) such that \(e' \mapsto e''\) or else \(e'\) is of the form \(\tt true\), \(\tt false\), or \({\tt num}(n)\).
Step 4
Test their ML code from Step 1 on expressions like if 4 then true else 9, breaking many of the students implementations of the dynamic semantics which were prepared only to handle well-typed inputs.
Analysis: is this fair?
Think about the perspective of the student who complained about the fact that their interpreter either crashed (or maybe returned SOME(Num 9)!) after being handed if 4 then true else 9. On one hand, they clearly violated the spirit of the assignment: Step 1 was a perfectly well-defined assignment all on its own, and they didn't fulfill the specification of that particular assignment. But on the other hand, they proved the theorem in Step 3, and perhaps feel as if they should get something out of the fact that they proved that theorem: the ability to only have to reason about the behavior of well-typed programs: why should it be surprising that garbage-in produced garbage-out?
Compiler writers actually get to think like that; indeed it's almost essential that they be allowed to. On a 32-bit machine, most values are compiled to plain ol' 32-bit words, and so the representation of \(\tt false\) might have the same in-memory representation as, say, \({\tt num}(0)\). Or it might have the same in-memory representation as \({\tt num}(1)\)! It doesn't matter, because, for the compiler writer, the safety theorem has already given a guarantee that the language has canonical forms - that if \(e : {\tt bool}\) and \(e\) eventually steps to an irreducible expression, then that irreducible expression must either be \(\tt true\) or \(\tt false\).
This means that the compiler writer need not worry about how if 1 then true else 9 and if 0 then true else 9 might behave - they may raise an exception, return (the memory representation of) true, or return (the memory representation of) false. The only programs upon which the compiler promises to behave the same way as the language definition are those that pass the typechecker, and the type safety theorem is a critical component of that promise.
In this way of looking at the world, the representation independence given by a type system is really quite important, and it means that an unsound type system could cause very very bad things to happen: if you're allowed to mess around with the representation of, say, a pointer, by adding things to it, then you have introduced buffer overflow errors to a language, which would be pretty awful if you allowed code written in this language to execute in a privileged way in your browser. That (hopefully!) is not at all what Dart means when they mean their programming language is unsound.
Theorems versus bugs
I think the problem here is that, as a broad overgeneralization, there are two ways to look at what people are doing with types in the first place. On one hand, there is the view that types are a tool to provably preclude certain classes of errors - like the possibility that you might end up with the expresion if 1 then true else 9 which is "stuck" according to the defined operational semantics. On the other hand, there is the idea that types are a language feature that is aimed at helping document code and reduce the number of bugs in a program without necessarily precluding any particular errors. In the academic circles I travel in, a type system is understood as a technical definition about the former: if you can't prove a theorem about what kind of errors your type system precludes, then it is something else besides a type system. I think it's quite fair to both type systems and static analysis researchers to call the latter notion of types a simple form of static analysis.
[Updated] There's nothing wrong, per se, with such a static analysis, though I think it's fair to call it an unsound static analysis instead of an unsound type system. To use the language of Ciera Jaspan's recent thesis, start with a particular class of error (gets-stuck, raises-a-certain-exception, divides by zero, whatever). An analysis is sound if it never passes a program with a particular sort of error (permitting a safety theorem about that analysis!) and complete if it fails only programs that will actually manifest the error at runtime.1 A sound but incomplete analysis is called conservative; the type checkers of ML and Java represent such analyses. An analysis that is neither sound nor complete is called pragmatic by Jaspan, as there aren't any theorems to be proved about such an analysis: they can be justified only by their utility in practice.
I can certainly think of situations where I'd want a pragmatic analysis. In the past, I have had occasion to write Python, which I will admit I have a certain fondness for. However, I have also complained about how my complicated Python program ran for half an hour and then failed with some cast exception that, upon inspection of the code, was always, statically, evidently going to happen no matter what the complicated bits of the code did and why couldn't it have warned me that it was going to do that before running for half an hour. Even if I implemented an analysis to generate such a warning, Python is and would remain a (hopefully) safe, strongly-typed programming language with exactly one type - the type of tagged data that causes an exception if you try to use an object as an integer (or whatever). The static analysis is trying to prove a different kind of theorem - one that says "you have indicated that raising ClassCastException is undesirable, and here's a proof that your current program will raise ClassCastException". If the static analysis can't prove that theorem (thus demonstrating a bug), I'm no worse off than I was when I used Python without that static analysis. A type safety theorem, however, would have the form "if the program passes the typechecker, then ClassCastException will not be raised."
And with my current understanding, the "unsound type system" of Dart is just such a "pragmatic analysis" as described by Jaspan. I hope my examples explain what might still be wrong with such a language - if you can't static preclude certain classes of errors, you must either allow "unspecified behavior" (and that way lies buffer overruns and security violations) or else you must be able and willing to check, at execution time, for the occurrence of those errors, which is not efficient (and for some classes of errors may be impossible). You're back in the world of our hypothetical angry student: you've got to be able to handle all the ill-formed programs and obey the language definition on a wider class of programs.
[Updated] ...you could argue, of course, that you're no worse off than you were when you finished Step 1. On some level, you're certainly right; my hypothetical Python+tool-to-catch-a-couple-of-errors is better, in my humble opinion, than Python without (and this view has the virtue of honesty). If you want to make that argument, however, I encourage you to read Chung-chieh Shan's related blog post about covariant generics, which argues from a less practical-compiler-optimizations and more philosophical point of view that I also find quite convincing. The point he makes is that the traditional view of types is important because types should mean things - unless you want to choke on pencils!
Conclusion
In summary: type systems are useful because of type safety theorems: a type safety theorem means that certain things can't happen. One reason this is nice is because the complier writer, the programmer, and the language designer needn't worry about what happens in the case that such an impossible thing happens.
Types are also a form of documentation, and they're a regular form of documentation that a complier can then take, generating warnings or errors about certain classes of bugs without actually promising to preclude those bugs. A static analysis that uses type information to preclude some (but not all) errors of a particular type is probably better termed an "pragmatic type-based analysis" than an "unsound type system." Garcia-Suarez called it "feeble typing" and I renamed my post accordingly. It's a bit more pejorative than the tone I was originally trying to take in the article, but I like it anyway.
I think that's a fair way of looking at things, and it puts decidable type systems, on a practical level, as a member of a larger class of type based-static analyses.2 In a type system, we must be able to ask a question about what sort of theorem is proved about programs that are well-typed, and if there's no such theorem, then the analysis is still within the class of type-based static analyses, but isn't so much a type system. At the end of the day, of course, English has no central authority, so asking people to distinguish "type-based analyses" from "type systems" may be a fool's errand,3 but I think it's worthwhile to delineate the difference, and I don't think my delineation significantly departs from current usage (apart from "feeble typing" which was, I think, coined the day after I originally posted this).
Something I started off wanting to talk about before this post got too long was why it is the case that "sound type rules for generics fly in the face of programmer intuition" (which is pretty clearly, in my opinion, missing the addendum "in languages with subtyping"), because two of the reasons why I think this is the case are quite interesting on their own. One of them has to do with polynomial data types and persistent data, and the other has to do with base types and refinement types as explored by William Lovas in his thesis. Neither of these ideas are adequately represented in existing programming languages, though they are more-or-less theoretically understood at this point. Another day, perhaps.
Last word
I noticed that one of the principal language designers was quoted as follows
You have to understand that the types are interface types, not implementation types – so the runtime cannot use the type information for anything. The types are simply thrown away during compilation.That is to say, insofar as this article goes, I think I've only said factual things that the language designers would essentially agree with; in particular, they seem to recognize that their use of the word "types" seems bound to confuse (or troll) others. But calling a "feeble type" an interface types and a "type" an implementation type seems to just be making up words. And, as made-up-words go, I really dislike "interface types" as a neologism (certainly it has nothing to do with Java interface or anything that comes up when I search for "interface type"). The theorist's critique of Dart is precisely that things you call "type" should define inviolate interfaces and not mere suggestions that are neither enforced nor checked. Calling them interface types makes them sound like contracts, which are not thrown away by the compiler. "Suggestion type" might be a less pejorative version of "feeble type," perhaps? And "implementation types" is a terrible term to use to describe types in a a type system, types that (by way of a theorem about canonical forms) can be relied upon by both the programmer and the implementation.
1 Godefroid, Nori, Rajamani, and Tetal call the sound analyses may analyses (though it should perhaps be may not, as a sound analysis precludes a certain behavior) and call the complete analyses must analyses (the error must happen) in the paper "Compositional may-must analysis."
2 This view really isn't fair to the reason that we are interested in type systems and believe they're supposed to be useful, but that has to do with Curry-Howard and the unreasonable effectiveness of mathematics and with other things Shan talked about in his linked post.
3 Honestly, it probably won't help the fool's errand if I try to call the other group's type based analyses "feeble typing," but here I am, doing it anyway.