Thursday, January 9, 2014

Life update, a TA blog

It's been over a year and a half since I last posted! In that time I've defended my dissertation and begun a teaching-track position at Carnegie Mellon. I'm mostly focused on teaching the freshman/sophomore-level Principles of Imperative Computation course. This semester I'll have a bit of a change of pace, teaching Foundations of Programming Languages, a course designed by Bob Harper and based on his book Practical Foundations for Programming Languages.

Some time ago Ron Garcia asked me if I would be blogging about my teaching here. I'd certainly like to, but thus far I haven't made the time. However, I'm pleased that one of my excellent and recurring Principles of Imperative Computation TAs, Nivedita Chopra, plans to blog about her experiences this semester. I'm excited to read along myself!

Thursday, April 19, 2012

Charity and trust

A while back, I complained about the ACM's membership in the AAP, a publishing trade organization that supported some odious legislation like SOPA and the Research Works Act. The ACM, to their credit, responded to the concerns that the community at large raised (I was far from the only one raising them). ACM's Director of Group Publishing Scott Delman did so in this space, and ACM President Alain Chesnais did so on his blog. (Side note: in that post, and especially in the comments to it, Chesnais seemed to come really close to saying something like "The ACM does not and will not support the RWA, SOPA, or the AAP's lobbying, because the ACM doesn't take political stances one way or the other as a matter of policy," but he also seemed to take really great pains not to say anything that clear and specific, which was baffling to me.)

One of the arguments I used in that previous post was about Author-izer, the new ACM service that allows you to provide the official version of your papers for free from your personal webpage. The reason I brought that into the argument is that Author-izer is basically a good idea insofar as the CS community trusts the ACM to act in their interests. I still support the argument that the ACM's membership in the AAP runs directly counter to those interests and thereby necessarily weakens that trust. Certainly, though, life's complicated, and everyone gets to decide as an individual whether they want to be a member (and thereby a supporter) of the ACM despite its affiliation with the AAP, just as the ACM as an institution gets to decide whether it (we!) want to be a member (and thereby a supporter) of the definitely-lobbying AAP despite having a no-lobbying policy.

Andrew Appel, at the Freedom to Tinker blog, has a series of three blog posts presenting four ways that academics can think about their relationship to copyright. I had fun reading them, and they provide an interesting alternative answer to why trusting the ACM to act in our interests is important. The model Appel says he subscribes to treats signing away one's valuable research papers as a significant charitable contribution. He's comfortable giving that contribution to both the ACM and the IEEE, presumably because he broadly supports them and their roles, and because the cost to the world as a whole, relative to just making the papers truly free, is maybe not all that large. (I could note that reviewing papers also fits well within this model of charitable contribution.)

Appel suggests that adopting this view is a reasonable compromise, and that seems right to me, so long as we remain sympathetic (and potentially responsive) to concerns "that ACM and IEEE have fallen behind the curve." I certainly try to donate to organizations I believe in and trust! I would therefore certainly pause before making an in-kind charitable contribution to a highly profitable Dutch multinational that spearheads legislation like the RWA and lends credibility to homeopathy, but that reflects my values and doesn't need to reflect yours.

Tuesday, April 3, 2012

Differently-higher-order-focusing

I was toying with an idea last weekend, but it passed the "this is taking too much of your time, go back to your thesis" threshold, so I'm going to jot down what I figured out here, link to some Agda code, and maybe return to it in the future.

I'm going to use intuitionistic multiplicative, additive linear logic (without the exponential \({!}A^-\)) in this note. Doing it this way is the easiest way to explain everything, though it does mean I'm assuming some familiarity with intuitionistic linear logic. I'm also not going to use shifts; the Agda formalizations do use shifts, but the difference isn't important.

In polarized linear logic without shifts, there's one big syntactic class \(A\) with all the propositions from linear logic, but they are sorted into positive and negative propositions based on their topmost connectives. Atomic propositions can be given arbitrary polarity \(p^+\) or \(p^-\) as long as that polarity is consistent.

  • \(A^+ ::= p^+ \mid {\bf 0} \mid A \oplus B \mid {\bf 1} \mid A \otimes B\)
  • \(A^- ::= p^- \mid \top \mid A ~\&~ B \mid A \multimap B\)

Quick review of higher-order focusing

Higher-order focusing as originally defined by Noam takes the definition of connectives to be given by a pattern judgment. Pattern judgments for positive types track the construction of terms; a reasonable way of reading \(\Delta \Vdash A^+\) is that it says "given all the stuff in \(\Delta\), I can construct a proof of \(A^+\)."

\[ \infer {p^+ \Vdash p^+} {} \qquad \infer {A^- \Vdash A^-} {} \]\[ \infer {\Delta \Vdash A \oplus B} {\Delta \Vdash A} \qquad \infer {\Delta \Vdash A \oplus B} {\Delta \Vdash B} \]\[ \infer {\cdot \Vdash {\bf 1}} {} \qquad \infer {\Delta_A, \Delta_B \Vdash A \otimes B} {\Delta_A \Vdash A & \Delta_B \Vdash B} \]

Similarly, the negative pattern judgment tracks how negative types are used. A reasonable reading of the pattern judgment \(\Delta \Vdash A^- > C\) is that it says "given all the stuff in \(\Delta\), I can use a proof of \(A^-\) to prove \(C\)."

\[ \infer {\cdot \Vdash p^- > p^-} {} \qquad \infer {\cdot \Vdash A^+ > A^+} {} \]\[ \infer {\Delta \Vdash A ~\&~ B > C} {\Delta \Vdash A > C} \qquad \infer {\Delta \Vdash A ~\&~ B > C} {\Delta \Vdash B > C} \qquad \infer {\Delta, \Delta' \Vdash A \multimap B > C } {\Delta \Vdash A & \Delta' \Vdash B > C} \]

One of the nicest things about higher-order focusing, in my opinion, is that it removes altogether the question of what order the invertible rules get applied in: when we come to a point where invertible rules must be applied, they all happen at once, by logical reflection over the pattern judgment.1

\[ \infer {\Delta, [A^+] \vdash C} {\forall(\Delta' \Vdash A^+) \longrightarrow \Delta, \Delta' \vdash C} \qquad \infer {\Delta \vdash [A^-]} {\forall(\Delta' \Vdash A^- > C) \longrightarrow \Delta, \Delta' \vdash C} \]

In particular, this means that a derivation of \(\Delta, [p^+ \oplus (A^- \otimes q^+) \oplus {\bf 1}] \vdash C\) has three immediate premises, corresponding to the three possible patterns: one is a derivation of \(\Delta, p^+ \vdash C\), one is a derivation of \(\Delta, A^-, q^+ \vdash C\), and one is a derivation of \(\Delta \vdash C\).

This "do all the inversion all at once" idea is a strengthening of the idea that appears in Andreoli's original presentation of focusing, in the definition of CLF, in Kaustuv Chaudhuri's focused linear logic, and in my work on structural focalization. All of these aforementioned presentations have a separate, ordered inversion context that forces you to decompose positive propositions into contexts and negative propositions in one arbitrary-but-fixed order, which kind of lets you pretend they all happen at once, since you can't make any choices at all until you're done. This isn't the only way to look at it, though. In Miller and Liang's LJF, in Frank Pfenning's course notes on Linear Logic, and in probably a bunch of other places, the inversion steps can happen in any order, though for a fully focused system you have to do get through all the possible inversion steps before you focus again. In order to get the strength of a Andreoli/Chaudhuri/me-focused systems in an LJF-like setting, one must prove a Church-Rosser-esque property that all ways of doing inversion are the same.

The reason I think the use of higher-order formulations is particularly nice is that it blows away the distinction between systems that fix the inversion order (like Structural Focalization and its many predecessors do) and systems that allow nondeterminism in the inversion order (like LJF and its kin). There's no question about the presence or absence of inessential nondeterminism when all the nondeterminism seriously happens in one rule that has as many premises as necessary to get the job done.

One of the less-awesome things about higher-order focusing is that it requires this extra substitution judgment, a complication that I won't go into here but that gets increasingly annoying and fiddly as we move to more and more interesting substructural logics, which are of particular interest to me. (This is not to say that it doesn't work; I wrote a note on higher-order focusing for ordered logic way back two years ago, when Request For Logic was a series of Unicodey notes instead of a blog.) Taus and Carsten's Focused Natural Deduction is the only system I know of that tries to get the best of both worlds, using the pattern judgments just for the negative introductions and positive eliminations while keeping the familiar structure of negative eliminations and positive introductions.

I want to play the same game Taus and Carsten play, but without using the pattern judgments at all. I think I have a better idea, but it ended up being more work than I have time for to actually prove that it's better (or to conclude that I'm wrong and it's not).

Alternate formulation

If \(P(\Delta)\) is a judgment on linear contexts \(\Delta\), then \(I^+_{A^+}(\Delta. P(\Delta))\) - which we can mathematically \(\eta\)-contract and write \(I^+_{A^+}(P)\) where convenient - is also a judgment, defined inductively on the structure of the positive proposition \(A^+\) as follows:

  • \(I^+_{p^+}(\Delta. ~ P(\Delta)) = P(p^+) \)
  • \(I^+_{A^-}(\Delta. ~ P(\Delta)) = P(A^-) \)
  • \(I^+_{\bf 0}(\Delta. ~ P(\Delta)) = \top \)
  • \(I^+_{A \oplus B}(\Delta. ~ P(\Delta)) = I^+_{A}(P) \times I^+_{B}(P) \)
  • \(I^+_{\bf 1}(\Delta. P(\Delta)) = P(\cdot)\)
  • \(I^+_{A \otimes B}(\Delta. ~ P(\Delta)) = I^+_{A}(\Delta_A. ~ I^+_{B}(\Delta_B. ~ P(\Delta_A, \Delta_B))) \)
This definition has a certain obvious correspondence to the pattern judgment from "Noam-style" higher-order focusing, but it's very direct: given a property \(P\) of a linear context, \(I^+_{A^+}(P)\) asserts that property of every context that can construct a proof of \(A^+\).

Given this judgment, we can use it pretty straightforwardly to define the inversion judgment. \[ \infer {\Delta, [A^+] \vdash C} {I^+_{A^+}(\Delta_{A}. ~~\Delta, \Delta_{A} \vdash C)} \] The result is exactly the same as the formulation that used the higher-order pattern judgment, but this formulation is, in my mind at least, rather more direct while being just as appropriately higher-order.
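As a sanity check, we can unfold this definition on the example proposition from before: taking \(P(\Delta_A)\) to be \(\Delta, \Delta_A \vdash C\), as in the rule above, the definition computes exactly the three premises described earlier. \[ I^+_{p^+ \oplus (A^- \otimes q^+) \oplus {\bf 1}}(\Delta_A. ~ \Delta, \Delta_A \vdash C) ~=~ (\Delta, p^+ \vdash C) \times (\Delta, A^-, q^+ \vdash C) \times (\Delta \vdash C) \]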

The story for negative types is the same. If \(P(\Delta,C)\) is a judgment on linear contexts \(\Delta\) and conclusions (conclusions are positive propositions \(A^+\) and negative atomic propositions \(p^-\)), then \(I^-_{A^-}(P)\) is defined inductively on the structure of types as follows:

  • \(I^-_{p^-}(\Delta,C. ~ P(\Delta,C)) = P(\cdot,p^-)\)
  • \(I^-_{A^+}(\Delta,C. ~ P(\Delta,C)) = P(\cdot,A^+)\)
  • \(I^-_{A ~\&~ B}(\Delta,C. ~ P(\Delta,C)) = I^-_{A}(P) \times I^-_{B}(P)\)
  • \(I^-_{A \multimap B}(\Delta,C. ~ P(\Delta,C)) = I^+_{A}(\Delta_A.~I^-_{B}(\Delta,C. ~ P((\Delta_A,\Delta), C)))\)
The right-inversion rule follows the same pattern as the left-inversion rule: \[ \infer {\Delta \vdash [ A^- ]} {I^-_{A^-}(\Delta',C. ~~ \Delta, \Delta' \vdash C)}\]

Where's the Beef?

I think this crazy new formulation of focusing is somewhat more elegant than the pattern-judgment-using formulation Taus and Carsten gave, even though it is otherwise rule-for-rule the same. However, that's not an argument that will convince anyone besides myself; there needs to be some advantage to using this formulation for it to have value. In particular, this formulation should make it easier and/or more beautiful to prove cut, identity expansion, and the completeness of focusing - when formalized in Agda, it should have at least the same grace as the Agda proof of structural focalization. It turns out that the identity expansion theorem can be done with this new formulation in Agda, and it's beautiful. (When I showed it to Dan Licata he started muttering something something Yoneda Lemma, so it's at least likely that identity expansion has some interesting and well-known categorical structure.) It's certainly obvious that there's some interesting deep structure at work.

Furthermore, it seems like, given this formulation, it should be possible to get the cut admissibility theorem working as an obviously turned-around version of identity expansion, which is the kind of interesting structure I've had my eye out for ever since I observed it in the process interpretation of linear logic. I haven't made that work yet, though, so I'm going to stick with inversion contexts for now and maybe return to poking at this post-thesis, if nobody else has figured it out in the meantime - or, indeed, if nobody else has already figured it out in the past.

Here are some Agda fragments for dealing with this idea in rigid logic (a non-associative, non-unit-law-having version of ordered logic that is useless except as a way of thinking about substructural logics in Agda, as far as I can tell), and some other Agda fragments for persistent logic. This whole blog post should serve as a disclaimer that these are all quite half-baked.


1 An aside on "higher-order"

This is the part of higher-order focusing that gives it the name "higher-order," in the sense that the derivation constructors take functions as arguments. I can't quite figure out at this moment how this usage fits into Chris's discussion of what should and should not be called higher order.

Noam called the paper introducing pattern judgments to the PL world "Focusing and higher-order abstract syntax" (POPL 2008). Noam now sees that naming as unfortunate, and I agree; it blurs the definition of "higher-order abstract syntax" (HOAS) in a different direction than it gets blurred elsewhere in the literature. In my idiolect, which of course I can't force on everyone else, I reserve HOAS for the use of substitution functions to encode term structure - in other words, I use it as a rough synonym of what I and others also call "abstract binding trees." (I'm not sure who coined "abstract binding tree"; I associate it with Bob Harper.)

The whole point of higher-order focusing is that this function - the one from the pattern judgment \(\Delta' \Vdash A\) to the provability judgment \(\Delta, \Delta' \vdash C\) - is a function that case analyzes its argument (to figure out which pattern derivation was given to the function). Putting our set-theorist hats on, it's a map, a potentially infinite list of (pattern derivation, provability derivation) tuples such that every pattern derivation is associated with exactly one provability derivation. Then we take off our set-theorist hats 'cause we dislike those hats; it's a real (pure, terminating, effect-free) function, like in the sense of functional programming. Or just programming. I personally try to use the phrase logical reflection to emphasize this pattern-matching higher-order-ness; Noam now prefers abstract higher-order syntax, which is fine too.

Monday, March 5, 2012

What does focusing tell us about language design?

This post is largely based on some conversations I've had about Polarized Logic/Focusing/Call-By-Push-Value recently, especially with Neel Krishnaswami and Wes Filardo, though it was originally prompted by Manuel Simoni some time ago.

I think that one of the key observations of focusing/CBPV is that programs are dealing with two different things - data and computation - and that we tend to get the most tripped up when we confuse the two.

  • Data is classified by data types (a.k.a. positive types). Data is defined by how it is constructed, and the way you use data is by pattern-matching against it.
  • Computation is classified by computation types (a.k.a. negative types). Computations are defined by their eliminations - that is, by how they respond to signals/messages/pokes/arguments.

There are two things I want to talk about, and they're both recursive types: call-by-push-value has positive recursive types (which have the feel of inductive types and/or algebras and/or what we're used to as datatypes in functional languages) and negative recursive types (which have the feel of recursive, lazy records and/or "codata," whatever that is, and/or coalgebras and/or what William Cook calls objects). Both positive and negative recursive types are treated by Paul Blain Levy in his thesis (section 5.3.2) and in the Call-By-Push-Value book (section 4.2.2).

In particular, I want to claim that Call-By-Push-Value and focusing suggest two fundamental features that should be, and generally aren't (at least not simultaneously), present in modern programming languages:

  1. Support for structured data with rich case analysis facilities (up to and beyond what are called views)
  2. Support for recursive records and negative recursive types.

Minor addition to "core Levy"

I'll be working with an imaginary extension to Bauer's Levy language in this post.1 I've mucked around with Levy before, of course, and it would probably help to review that previous post before reading this one. I want to make one addition to Levy before we begin making big, interesting ones. The derived form that I want to add - e1 orelse e2 - is computation code of type F bool, provided e1 and e2 both have type F bool. This is definable as syntactic sugar, where x is selected to not be free in e2:

   e1 to x in 
   if x then return true 
   else e2
One other aside - do remember that, at the cost of potential confusion, I modified Bauer's original Levy implementation to make force a prefix operator that binds more tightly than application - force g z y can be written with parentheses as (((force g) z) y).

Positive types, positive recursive types

The functional languages have the construction of positive types just about right: they're made up of sums and products. I used a special LLF-ish construction for datatypes in my exploration of Levy#, but the more traditional way of introducing datatypes is to say that they are recursive types μt.T(t). The recursive types we're used to thinking about are naturally interpreted as positive types, because they are data and are inspected by pattern matching.

There's a tendency in programming language design to shove positive recursive types together with labeled sum types to get a familiar datatype mechanism.2 I will go along with this tendency and merge labeled sums and recursive types, writing them as μt.[L1: T1(t), L2: T2(t), ..., Ln: Tn(t)].

Here are datatype definitions for trees of ints and lists of ints:

   type+ tree = 
   μtree. 
   [ Leaf: int, 
     Node: tree * tree ]

   type+ list = 
   μlist. 
   [ Nil, 
     Cons: int * list ]
Note from the definition of lists that we also allow labels to take no arguments: it's possible to treat the definition of Nil as syntactic sugar for Nil: unit. The associated value is Nil, which is syntactic sugar for Nil ().
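(For readers who want something that actually loads into a REPL, here is a rough OCaml rendering of the same two declarations. This is just a sketch of the correspondence: the μ and the labeled sum get rolled into a single datatype declaration, and I rename list to ilist to avoid clashing with the built-in type.)

   type tree = Leaf of int | Node of tree * tree
   type ilist = Nil | Cons of int * ilist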

There are infinitely many trees and lists as we've defined them. In fact, it's only a convention about how we write programs down that leads us to think of positive types as finite sums. Even though we can't write it down as a datatype declaration, it's perfectly reasonable to think of a built-in type of infinite-precision integers as being defined as follows:

   type+ int = 
   μint. 
   [ 0, 
     1,
     ~1,
     2,
     ~2,
     3,
     ... ]
The same goes for built-in strings and the like - think of built-in types being positive enumerations that were provided specially by the language since the language didn't give us the ability to write down the full declaration containing all possible constants.

Powerful views

A point that has not, in my humble opinion, been made simply enough is that the perspective of focusing says that we should think about integrating much more powerful case analysis mechanisms into our programming languages. I learned about this point from Licata, Zeilberger, and Harper's 2008 Focusing on Binding and Computation, but their setting had enough other interesting stuff going on that it probably obscured this simple point.

Standard ML (and, to the best of my knowledge, all other functional languages circa-1997) only provides a limited form of case analysis - arbitrary finite views into the outermost structure of the datatype:

   case x of
    | Leaf 9 => ...
    | Leaf y => ...
    | Node (Node (Leaf 5, y), z) => ...
    | Node (y, z) => ...
This limitation comes with a nice tradeoff, in that we can pretty effectively estimate how much work compiled code needs to do to handle a pattern match. However, the structure of focusing suggests that any mathematical function from a value's structure to computation is fair game. One well-studied modernization of pattern matching is views, which allows us to group computations in other ways. One use case would be allowing us to take an integer variable x and say that it is even or odd:
   case x of
    | 0 => return true
    | x is even => return false
    | ~1 => return false
    | x is odd => return true
The focusing-aware view of pattern matching suggests that what a pattern match is actually doing is defining a case individually for each value structure - if we could write a countably-infinite-sized program, then we could expand the view-based program above to the no-longer-view-based, but infinitely long, program below:
   case x of
    | 0 => return true
    | 1 => return true
    | ~1 => return false
    | 2 => return false
    | ~2 => return false
    | 3 => return true
    | ...
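As an aside, in a language without built-in views, one standard way to approximate this is to write the view as an ordinary effect-free function into a small datatype and dispatch on its result. Here's a rough OCaml sketch of that encoding (the names parity, parity_of, and classify are mine, not part of anything standard):

   type parity = Even | Odd

   (* An effect-free, total "view function" from the value to its view. *)
   let parity_of (x : int) : parity =
     if x mod 2 = 0 then Even else Odd

   (* The even/odd example from above, with the view simulated by a guard. *)
   let classify (x : int) : bool =
     match x with
     | 0 -> true
     | _ when parity_of x = Even -> false
     | -1 -> false
     | _ -> true   (* everything left over at this point is odd *)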
So: the focusing perspective says that any effect-free (effects include nontermination!) mathematical function we can write from value structure to computations is fair game for case analysis; views are just one known way of doing this in an interesting way. In principle, however, we can consider much richer case-analysis machinery to be fair game. For instance, there is a mathematical function \(f\) from integer values int to computation code of type F bool with the variable coll free:
  • If the integer is 1, the result is return true
  • If the integer is less than 1, the result is return false
  • Otherwise, if the integer is divisible by 2 and the result of dividing the integer by 2 is i, then the result is force coll i
  • Otherwise, let j be 3 times the integer, plus one. Then the result is force coll j
Given this mathematical function, we have the core case analysis at the heart of the Collatz function. If we expand out this case analysis into an infinitely-long mapping as before, it would look like this:
   rec coll: int -> F bool is
     fn x: int =>
       case x of
        | 0 => return false
        | 1 => return true
        | ~1 => return false
        | 2 => force coll 1
        | ~2 => return false
        | 3 => force coll 10
        | ~3 => return false
        | 4 => force coll 2
        | ~4 => return false
        | 5 => force coll 16
        | ...
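For comparison, here is the finite version of that case analysis written as an OCaml sketch, with a when-guard standing in for the divisibility view (and with the usual caveat that whether this function terminates is exactly the open question):

   (* The core Collatz case analysis from the bullet points above. *)
   let rec coll (x : int) : bool =
     match x with
     | 1 -> true
     | _ when x < 1 -> false
     | _ when x mod 2 = 0 -> coll (x / 2)
     | _ -> coll (3 * x + 1)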
The important question here is: what is the right way to think about how we write down total functions from value structures? Are views about right, or do we need something different, clearer/more/less powerful? I don't have a sense for the answers, but I'm pretty confident that's the right question.

Views are only over the structure of values

One critical caveat: I used the phrase value structure in the previous discussion repeatedly on purpose. Because computations can always contain effects, we cannot poke at the computations at all when we think about views - the value structure is just the part of the value that we can inspect without traversing a thunk. Therefore, if we have this definition...

   type+ somefn = 
   μsomefn. 
   [ BoolFn: U (bool -> F bool), 
     IntFn: U (int -> F int) ]
...then a case analysis on a value of type somefn can have at most two branches - one for the BoolFn case and one for the IntFn case. We can't pattern-match into computations, so even though it would be reasonable to say that there are many, many values of type somefn, there are only two value structures that can be inspected by case analysis.
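Rendered as an OCaml sketch (where thunks of computations just become ordinary functions, so the U and F annotations disappear), the point is that case analysis stops at the two constructors; the closures inside are opaque:

   type somefn =
     | BoolFn of (bool -> bool)
     | IntFn of (int -> int)

   (* Only two branches are possible; we cannot match inside the functions. *)
   let describe (f : somefn) : string =
     match f with
     | BoolFn _ -> "a bool -> bool function"
     | IntFn _ -> "an int -> int function"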

Negative types, negative recursive types

In call-by-push-value, the computation types (a.k.a. the negative types of polarized logic/focusing) are defined by the way we use them. Computation code of computation type does stuff in response to being used - a function does stuff in response to getting an argument, and so the most well-understood negative type is implication.

Conjunction (the product or pair type) is a bit slippery, says the CBPV/focusing methodology. We've already used conjunction as a positive type (t1 * t2), but we can also think of conjunction as being a negative type. This is because it's equally valid to think of a pair being positively defined by its construction (give me two values, that's a pair) and negatively as a suspended computation that awaits you asking it for its first or second component.

CBPV/focusing makes us choose: we can have either type of conjunction, or we can have both of them, but they are different. It will be convenient if we keep the normal tuples as positive types, and associate negative types with records - that is, with named products. Records are defined by how you project from them, and just as we tie labeled sums in with positive recursive types, we'll tie labeled products in with negative recursive types.

Negative recursive types are naturally codata in the same way that positive recursive types are naturally data: the canonical example is the infinite stream.

   type- stream = 
   μstream. 
   { head: F int, 
     tail: stream }
It's natural to define particular streams with fixedpoints:
   val Ones = thunk rec this: stream is
   { head = return 1, 
     tail = force this }

   val CountUp = thunk rec this: int -> stream is
   fn x: int =>
   { head = return x, 
     tail = 
       force plus x 1 to y in
       force this y }
Say I wanted to get the fourth element of an infinite stream str of type U stream. The ML-ish way of projecting from records would write this as #head (#tail (#tail (#tail (force str)))), but I will pointedly not use that notation in favor of a different record notation: (force str).tail.tail.tail.head. It nests better, and works better in the case that record elements are functions. (Because functions are negative types, like records, it is very natural to have functions be the arguments to records.)
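If it helps to have something executable, here is a loose OCaml approximation of streams as lazy records; OCaml's lazy is doing the work of both U and the negative record here, so the correspondence is only approximate.

   type stream = { head : int; tail : stream Lazy.t }

   (* Rough analogues of Ones and CountUp. *)
   let rec ones = { head = 1; tail = lazy ones }

   let rec count_up (x : int) : stream =
     { head = x; tail = lazy (count_up (x + 1)) }

   (* The fourth element: project tail three times, then head. *)
   let fourth (s : stream) : int =
     (Lazy.force (Lazy.force (Lazy.force s.tail).tail).tail).head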

Negative recursive types as Cook's objects

Here's a bit of a case study to conclude. The entities that William Cook names "Objects" in his paper On Understanding Data Abstraction, Revisited are recognizably negative recursive types in the sense above.3 Cook's examples can be coded in Standard ML (see here), but the language gets in the way of this encoding in a number of places.4 To see what this would look like in a language with better support, I'll encode the examples from Cook's paper in my imaginary CBPV implementation. The recursive negative type looks much like a stream, and encodes the interface for these set objects.

   type- iset = 
   μiset. 
   { isEmpty: F bool, 
     contains: int -> F bool, 
     insert: int -> iset, 
     union: U iset -> iset }
One difficulty: the Insert and Union operations that Cook uses involve mutual recursion. I don't have mutual recursion, so rather than pretending to have it (I've abused my imaginary language extension enough already) I'll code it up using records. The record type insertunion I define isn't actually recursive; it's just a record - this is analogous to using ML's datatype mechanism (which, as we've discussed, makes positive recursive types) to define a simple enumeration.
   type- insertunion = 
   μinsertunion.
   { Insert: U iset * int -> iset,
     Union: U iset * U iset -> iset }

   val InsertUnion = thunk
     rec x: insertunion is
     { Insert = fn (s, n) =>
       (force s).contains n to b in
       if b then force s else 
       rec self: iset is
       { isEmpty = return false,
         contains = fn i => (force equal i n) orelse (force s).contains i,
         insert = fn i => (force x).Insert (self, i),
         union = fn s' => (force x).Union (self, s') },

       Union = fn (s1, s2) => 
       rec self: iset is
    { isEmpty = (force s1).isEmpty to b in
                if b then (force s2).isEmpty else return false,
         contains = fn i => (force s1).contains i orelse (force s2).contains i,
         insert = fn i => (force x).Insert (self, i),
         union = fn s' => (force x).Union (self, s') } }

   val Insert: U (U iset * int -> iset) = thunk ((force InsertUnion).Insert)

   val Union: U (U iset * U iset -> iset) = thunk ((force InsertUnion).Union)
We've defined union of sets and insertion into a set, but we haven't actually defined any sets yet! Once we give the implementation of an empty set, however, we can manipulate these sets with a Java-esque method invocation style:
   val Empty = thunk rec self: iset is
   { isEmpty = return true,
     contains = fn x => return false,
     insert = fn i => force Insert (self, i),
     union = fn s' => force s' }

   val JustOne = thunk (force Empty).insert 1
   (force Empty).insert(3).union(JustOne).insert(5).contains(4)
As Cook noted, it's also very natural to talk about infinite sets in this style, such as the set of all even numbers and the set of all integers:
   val Evens = thunk rec self: iset is 
   { isEmpty = return false,
     contains = fn i => 
       force mod i 2 to x in
       (case x of 
         | 0 => return true
         | _ => return false),
     insert = fn i => force Insert (self, i),
     union = fn s' => force Union (self, s') }

   val Full = thunk rec self: iset is
   { isEmpty = return false,
     contains = fn i => return true,
     insert = fn i => force self,
     union = fn s' => force self }
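To make the comparison with footnote 4 concrete, here is a rough OCaml sketch of the same set interface as a record of closures. Effects and the U/F distinction are erased, and Insert and Union become ordinary mutually recursive functions rather than fields of a record (the names here are mine), so this is only an approximation of the structure above:

   type iset = {
     is_empty : bool;
     contains : int -> bool;
     insert   : int -> iset;
     union    : iset -> iset;
   }

   let rec insert_set (s : iset) (n : int) : iset =
     if s.contains n then s
     else
       let rec self = {
         is_empty = false;
         contains = (fun i -> i = n || s.contains i);
         insert   = (fun i -> insert_set self i);
         union    = (fun s' -> union_set self s');
       } in
       self

   and union_set (s1 : iset) (s2 : iset) : iset =
     let rec self = {
       is_empty = s1.is_empty && s2.is_empty;
       contains = (fun i -> s1.contains i || s2.contains i);
       insert   = (fun i -> insert_set self i);
       union    = (fun s' -> union_set self s');
     } in
     self

   let rec empty = {
     is_empty = true;
     contains = (fun _ -> false);
     insert   = (fun i -> insert_set empty i);
     union    = (fun s' -> s');
   }

   (* Java-esque chaining, as in the example above. *)
   let _ = ((((empty.insert 3).union (empty.insert 1)).insert 5).contains 4)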


1 If anyone is interested in helping me implement a toy language along these lines, I'm all ears. Also, the disclaimer that since I don't have an implementation there are surely bugs bugs bugs everywhere is strongly in effect.

2 As a practical matter, this makes it easier for the language to know where to put implicit roll/unroll annotations, so that programmers don't have to write these explicitly, which would be a pain.

3 Cook does give the disclaimer that this definition is "recognized as valid by experts in the field, although there certainly are other valid models." I carefully call these things "Cook objects" rather than "objects" - my claim is that negative recursive types correspond to what William Cook names objects, not that they correspond to what you call objects. Note, furthermore, that I could be wrong about what William Cook calls objects! I only have his paper to go on and I have been known to be a few cookies short of a haystack.

4 There are two issues with the encoding in SML. The most basic is that we have to encode the recursive types using SML's recursive type mechanism, which is biased towards the positive encoding of recursive types as finite sums. The other issue is that the way SML ties together (mutual) recursion with closures gets in the way - the standard way of writing a "thunk" as a function from unit is all over the place, and it wasn't possible to define EmptySet so that it was mutually recursive with the definition of Insert and Union as a result. So, I'm certainly not arguing that SML facilitates programming in (what Cook calls) an object-oriented style in a strong sense - it's unnecessarily painful to code in that style - but the minimal semantic building blocks of what Cook presented and named Object are present and accounted for. [Update:] As gasche discusses in the comments, Ocaml's records, and the way it deals with laziness, make it quite a bit better at encoding the example, but it's still not ideal.

Thursday, January 12, 2012

Structural focalization updated

I've uploaded to both ArXiV and my webpage a significantly revised draft of the paper Structural focalization, which I've spoken about here before. Feedback is welcome!

One of the points I make about the structural focalization technique is that, because it is all so nicely structurally inductive, it can be formalized in Twelf. As part of a separate project, I've now also repeated the whole structural focalization development in Agda! The code is available from GitHub. While a structural focalization proof has some more moving parts than a simple cut-and-identity proof, it also has one significant advantage over every Agda proof of cut admissibility that I'm aware of: it requires no extra structural metrics beyond normal structural induction! (My favorite structural metric is the totally nameless representation, but there are other ways of threading that needle, including, presumably, these "sized types" that everyone seems to talk about.)

In regular, natural-deduction substitution, you can get away without structural metrics by proving the statement that if \(\Gamma \vdash A\) and \(\Gamma, A, \Gamma' \vdash C\) then \(\Gamma, \Gamma' \vdash C\); the extra "slack" given by \(\Gamma'\) means that you operate by structural induction on the second given derivation without ever needing to apply weakening or exchange. Most cut-elimination proofs are structured in such a way that you have to apply left commutative and right commutative cuts on both of the given derivations, making this process tricky at best; I've never gotten it to work at all, but you might be able to do something like "if \(\Gamma, \Gamma' \longrightarrow A\) and \(\Gamma, A, \Gamma'' \longrightarrow C\) then \(\Gamma, \Gamma', \Gamma'' \longrightarrow C\)." If someone can make this work let me know!

A focused sequent calculus, on the other hand, has three separate phases of substitution. The first phase is principal substitution, where the type gets smaller and you can do whatever you want to the derivations, including weakening them. The second phase is rightist substitution, which acts much like natural-deduction substitution, and where you can similarly get away with adding "slack" to the second derivation. The third phase is leftist substitution, and you can get by in this phase by adding "slack" to the first derivation: the leftist cases read something like "if \(\Gamma, \Gamma' \longrightarrow A\) and \(\Gamma, A \longrightarrow C\) then \(\Gamma, \Gamma' \longrightarrow C\)."

In Structural focalization, I note that the structural focalization technique could be seen as a really freaking complicated way of proving the cut and identity for an unfocused sequent calculus. But in Agda, there's a reason you might actually want to do things the "long way" - not only do you have something better when you finish (a focalization result), but you get cut and identity without needing an annoying structural metric.

Friday, January 6, 2012

Response from ACM's Scott Delman

In the comments to my last post ("Why does the ACM act against the interests of scholars?") ACM's Director of Group Publishing, Scott Delman, left a multiple-comment response. It's a response both to the views I expressed and to the views of others that I summarized. He agreed to have his comments posted as a post here; I'll leave my own thoughts for a separate post or the comments.

Ok, here's the other side of this, which I feel compelled to throw out there after reading Rob's post and a few of the related comments.

Like most things in life, things are not always as black and white as some would lead us to believe. In this case, I think there is a basic misunderstanding of the ACM and the AAP (which is incidentally an organization that does a great deal of good work on behalf of both publishers and the scientific community).

Let's start with the ACM....which is a non-profit organization founded in 1947 by members of the computing community with the primary mission of advancing the field of computing. The Association is organized as a 501(c)3 corporation with daily operations run by a small staff of approximately 75 individuals who ultimately take their direction from a volunteer leadership of hundreds of dedicated scientists, scholars, educators, practitioners, and students who graciously donate a significant amount of their time to direct the Association forward in a way that benefits the computing community as a whole. It is important to point this out, because there is an implication in the original post that the ACM is an entity that is in some way acting against the scholarly community, when in fact the ACM is an organization that is literally run by the scholarly community.

Keeping this in mind, we are left with a situation in which either the scholarly community is acting against itself by the policies it sets and supports (such as ACM's copyright policy and ACM's subscription model) or something else is going on here. Since it doesn't seem logical or even practical that the top-decision makers at ACM (such as the ACM Publications Board of Volunteers or the ACM Executive Committee of Volunteers, who oversee all major strategic decisions of the Association) would support policies that actively work against the interests of their own community, I think it is more reasonable to suggest that what is going on here is that the issues are not as cut and dry or as simplified as some advocates of "immediate and unrestricted" open access to all scholarly literature would lead us to believe.

Whenever I discuss the topic of open access with colleagues and friends, I think it is useful to try to imagine what the world would look like if the US Federal Government or other Foreign Governments decided to pass legislation that required all scholarly material that is in some way supported by public funding be made instantly open and freely available to the world without any paywalls of any sort. Well, as ACM's publisher and someone who is intimately aware of the tangible costs of publishing and disseminating high quality scholarly literature, I can tell you without a shadow of a doubt that the end result of this sort of legislation would be catastrophic for the scientific community and scholarly publishers alike. If in a blink of an eye, organizations like ACM were required to simply open up our archive of articles (the ACM DL) without the ability to recoup the costs of publishing and disseminating those articles (or all of the technically sophisticated services built around that content inside the ACM DL), ACM would cease to be the sustainable organization it is today and would eventually shutter its doors at some point in the future, instead of continuing to be the sustainable force for good that it is today. If this sounds like PR-dribble, I apologize, but I really do believe this!

What's more, the senior volunteers who are most familiar with ACM's activities and who sit on ACM's various committees and boards recognize and understand the tradeoffs that are necessary to maintain a sustainable organization. Over the past few years, I have participated in meetings with our Publications Board, which is the governing body for publications related strategy and decisions at ACM, where the issues of open access and alternative business models have been repeatedly discussed, and when all of the facts have been taken into consideration it has been overwhelmingly clear to these members of the community that ACM's approach is in the best longterm interests of the scholarly community. In fact, the ACM Author-Izer service, which is written about in the above post, was conceptualized at one of these meetings as the result of an in-depth discussion about how to balance the legitimate need of our authors to make the "archival versions" of their articles openly available while at the same time preserving the revenue stream that ACM relies heavily on to do its good work. ACM's pre-existing copyright policy already addressed the issue of posting "accepted versions" of an author's work, but ACM's volunteers decided that it was even more beneficial for the community if the "archival versions" could be made available from the author's site using the "Author-Izer" perpetual link. In general, while Author-Izer is still relatively new, the initial responses have been extremely positive and there is widespread recognition (including Rob's above) that this is a step in the right direction....

Let me briefly address the "opposing views" raised in Rob's post. First, in an instance where an author graduates, moves, or retires, it is always possible for the initial link to be replaced by a more up-to-date link. The ability to manage the URL that hosts the link is in the hands of the author, so I don't see a significant issue here and at the very least the effort on behalf of the author is no greater (and perhaps significantly less) than it would be to move their vitae or "pre-published" articles to a new website. What's more, ACM has simplified this process for authors and eliminated the confusion that is caused by having "multiple versions" of articles available on multiple websites by creating a centralized place (their ACM Author's Page, which includes all of their ACM and non-ACM publications) from which authors can update their Author-Izer links. By hosting the archival version of the article on a single and "sustainable" site, we strongly believe this is a better solution for the community.

In relation to the argument from Russell O'Connor, I reject the plausibility or even the possibility that the ACM might "go evil" for the reasons I've outlined above. Since ACM ultimately carries out the well-thought-out wishes of the community itself, and since the decision makers are the very members of the community who are impacted by those decisions, it is just not possible for such a scenario to occur. Bankrupt is another story, since it is always impossible to predict how an organization's finances will be managed in the future, even though for the record it is exactly the kind of decision making I've mentioned above that currently keeps the ACM in a very strong position. Nevertheless, contingencies are in place for this unlikely scenario, as it relates to ACM's publications and all articles in the ACM Digital Library. Several years ago, ACM established partnerships with two very well established organizations (CLOCKSS & Portico) to ensure that ACM's publications would be preserved and made available to the scientific community (at no cost) in the unlikely event that ACM ceased to exist. [Rob's note: here's a press release about that.] Both organizations take different approaches to longterm digital preservation, but both are non-profits that exist for the sole purpose of maintaining a longterm perpetual archive for the scholarly community and nearly all major scientific publishers participate in one or both of these initiatives. ACM participates in both to provide an even higher level of redundancy than most other publishers. So, it is not clear what would happen to Author-izer in the event of this doomsday scenario, but what is for certain is that ACM's archive would be made available to the scholarly community in any event.

Lastly, it is worth noting that the AAP is one of the publishing industry's primary advocates and they do an enormous amount of good work. Rather than deriding this organization that supports and protects the interests of over 300 well established publishers, including ACM, I would suggest that we focus on the spirit of what the Research Works Act represents, which is to limit the ability of the federal government to mandate decisions that would almost certainly have a longterm catastrophic impact on an industry that partners with and supports (and in our case is one and the same) the scientific community.

Respectfully,

Scott Delman
Director of Group Publishing
Assoc. Computing Machinery

Thursday, January 5, 2012

Why does the ACM act against the interests of scholars?

[Updated Jan 6, Jan 7] Some stuff has been happening! I'm delighted by two developments. First, ACM's Director of Group Publishing, Scott Delman, wrote a series of comments that is now one big post: Response from ACM's Scott Delman. Second, I've observed that many other people came to the same conclusion I did - that it's time for our professional organizations to leave the Association of American Publishers. The reason I brought up ACM Author-izer was to argue that Author-izer makes sense only insofar as the CS community trusts their professional organization; I remain of the view that membership in the AAP is incompatible with this trust. Here's Cameron Neylon saying that a bit more forcefully, and here's John Dupuis, who is also compiling a list of all the things related to RWA. (Did I mention the AAP also supports SOPA? Yep, awesome.)


I've got two logic posts queued up in the To Do list. But, dear the internet, we need to talk about the ACM. TL;DR is bold and italics at the bottom.

My friend Glenn Willen tweeted something about the Research Works Act last night. [Update: you can read the very short bill here] Basically, it would (among other things) forbid the government from requiring that publishers of federally-funded research make that research available to the public. You should really read the press release; it's just a wonderful example of the "dripping in PR" genre of literature.

This is not shocking. Awful legislation gets introduced all the time with names ("Research Works Act") that suggest the opposite of what the bill actually does (preventing research from working and acting, wildly attempting to maintain an ultimately unsustainable status quo). Frankly, I expect publishers to behave this way, and I expect there to be the usual variety of opinions about it. But then I ran through the members of the Association of American Publishers, the group which is cheering the legislation that (presumably) they wrote, hoping against hope that I wouldn't find my own professional society on the list. I was unsurprised but a bit sickened by what I saw: the Association for Computing Machinery is a member of the AAP.

I like the ACM; I am proud of my membership in the ACM and ACM SIGPLAN, the Special Interest Group on Programming Languages. I personally think that the ACM's republication policies have been pretty reasonable during the time I've inhabited this academic world. I'm also proud of my involvement with ACM through their student magazine, XRDS. I write profiles of fascinating people, all of which are available on my personal webpage, for free, through the ACM Author-izer service.

Let's talk about that Author-izer

When I publish anything through the ACM, they own the copyright. This is totally fine-by-me for things I write for XRDS (when I worked for the Daily Princetonian in college they owned copyright on my work for them as well). In my mind, it's a little more complicated when I publish stuff I wrote in an ACM conference proceedings. I want to make sure people have access to that research-ey material - my reputation and career opportunities depend on people finding, reading, and liking my work, but in ACM's Digital Library it's behind a "paywall," accessible only to ACM members and people on university networks. The ACM (unlike IEEE) provides a couple of different-sized hoops that you can jump through to provide free access to your work from your personal home page; Author-izer is the newest of these.

On a technical level, ACM Author-izer lets you, the author of a work that the ACM now has copyright to, bless a particular URL on the internet (presumably your personal home page). The ACM then gives you a special link to their Digital Library - if you're coming from the blessed URL to the special link, you get access to the research. It sounds a little goofy but it works for me in practice and I'm cautiously pleased with it. (Here's Andrew Appel talking about Author-izer if you'd like a concurring opinion.)

But there's another view that Author-izer is a step backwards - because moving a web page (upon graduation or retirement) breaks the functionality of Author-izer links, ACM gets, in the long run, more exclusive content than if people were posting semi-definitive versions of papers on their web page. This is not a crazy concern, but I feel like lots of universities archive alumni's pages in-place, so I also don't feel too worried about it.

There's another argument I've read (UPDATE: from Russell O’Connor, I'd forgotten the link but Joshua Dunfield reminded me in the comments). It's plausible, more insidious, and more long-term. The ACM might "go evil" in some way, sure, but even positing that the ACM is and will remain reasonably virtuous, what if the ACM goes bankrupt? In the bankruptcy proceedings, some copyright trolls get the rights to everything in the digital library, immediately shut down Author-izer, and start wreaking havoc on academia (threatening lawsuits and demanding money from people who posted ACM-published works to their webpages) because they're copyright trolls and that's how they roll. A lot of people are violating the letter of their agreements when they post work to their web pages - you're not allowed to post the ACM's PDF, and in fact my reading of the agreement is that you have to change the ACM copyright notice in your version of the paper to a specific other thing; most people don't do this. Of course the ACM-provided LaTeX class doesn't support this, so you have to go diddling around with .cls files to produce a PDF that looks like this - see the lower-left-hand corner of the first page. Because people are less likely to jump (correctly) through the "author's version" hoop, instead relying on Author-izer, in this hypothetical narrative the ACM's policies have, indeed, worked against the interests of ACM's members.

What does this have to do with the "Research Works Act" breaking research?

My view of Author-izer is that it requires a high level of trust: trust that the ACM will continue supporting authors, and that we'll be able to continue supporting the ACM (since if we don't or can't support the ACM, it will go bankrupt and be taken over by copyright trolls). I can overlook little things where the ACM is not acting in the interest of its members (why doesn't the standard .cls make it easy to produce an author's version?) because the world isn't perfect.

Furthermore, with experiments like Author-izer, I believe that ACM has demonstrated that it's trying to do the right thing, as opposed to IEEE, which doesn't give authors hoops to jump through to legally post their work to their webpages. (You should read IEEE's hilariously awful responses to Matt Blaze on this issue. Blaze, I should add, sees much less difference between ACM and IEEE than I do.)

However, the "Research Works Act" makes it clear that ACM's membership in the Association of American Publishers is an egregious and unacceptable instance of working against the interest of scholars and ACM members. We should be thinking about how to demand that our professional organization, the Association for Computing Machinery, do two things: 1) withdraw from the Association of American Publishers 2) take the clear position that the so-called "Research Works Act" is an unacceptable piece of legislation that is not supported by the computer science community.

We should do this even though (I suspect) the primary target of this ridiculous act is medical science. (At the present time, the NIH admirably puts a higher priority on open dissemination of research than the NSF.)


P.S. added January 7: The title of this post is intentionally re-asking the question Blaze asks in his aforementioned post, "Why do IEEE and ACM act against the interests of scholars?" I am focusing on the ACM simply because I care more about the ACM. The IEEE is also a member organization of the AAP.