Thursday, April 19, 2012

Charity and trust

A while back, I complained about the ACM's membership in the AAP, a publishing trade organization that supported some odious legislation like SOPA and the Research Works Act. The ACM, to their credit, responded to the concerns that the community at large raised (there were many others). ACM's Director of Group Publishing Scott Delman did so in this space, and ACM President Alain Chesnais did so on his blog. (Side note: in that post, and especially in the comments to it, Chesnais seemed to come really close to saying something like "The ACM does not and will not support the RWA, SOPA, or the AAP's lobbying, because the ACM doesn't take political stances in either way as a matter of policy" but he also seemed to take really great pains to not say anything that clear and specific, which was baffling to me.)

One of the arguments I used in that previous post was about Author-izer, the new ACM service that allows you to provide the official version of your papers for free from your personal webpage. The reason I brought that in to the argument is that Author-izer is basically a good idea insofar as the CS community trusts the ACM to act in their interests. I still support the argument that their membership in the AAP runs directly counter those interests and thereby necessarily weakens that trust. Certainly, though, life's complicated and everyone gets to decide as an individual if they want to be a member (and thereby a supporter) of the ACM despite their affiliation with the AAP, just as the ACM as an institution gets to decide if it (we!) want to be a member (and thereby a supporter) of the definitely-lobbying AAP despite having a no-lobbying policy.

Andrew Appel, at the Freedom to Tinker blog, has a series of three blog posts presenting four ways that academics can think about their relationship to copyright. I had fun reading them, and they provide an interesting alternative answer to why trusting the ACM to act in our interests is important. The model that Appel suggested that he subscribes to sees signing away one's valuable research papers as a significant charitable contribution. He's comfortable giving that charitable contribution both to ACM and IEEE, presumably because he broadly supports them and their roles, and because the cost to the world as a whole of the donation, relative to just making the donation truly free, is maybe not all that large. (I could note that reviewing papers also fits well within this model of charitable contribution.)

The idea that adopting this view is a reasonable compromise, as Appel suggests, seems pretty reasonable to me, while remaining sympathetic (and potentially responsive) to concerns "that ACM and IEEE have fallen behind the curve." I certainly try to donate to organizations I believe in and trust! I would therefore certainly pause before making a in-kind charitable contribution to a highly profitable Dutch multinational that spearheads legislation like the RWA and lends credibility to homeopathy, but that reflects my values and doesn't need to reflect yours.

Tuesday, April 3, 2012

Differently-higher-order-focusing

I was toying with an idea last weekend, but it passed the "this is taking too much of your time, go back to your thesis" threshold, so I'm going to jot down what I figured out here, link to some Agda code, and maybe return to it in the future.

I'm going to use intuitionistic multiplicative, additive linear logic (without the exponential \({!}A^-\)) in this note. Doing it this way is the easiest way to explain everything, though it does mean I'm assuming some familiarity with intuitionistic linear logic. I'm also not going to use shifts; the Agda formalizations do use shifts, but the difference isn't important.

In polarized linear logic without shifts, there's one big syntactic class \(A\) with all the propositions from linear logic, but they are sorted into positive and negative propositions based on their topmost connectives. Atomic propositions can be given arbitrary polarity \(p^+\) or \(p^-\) as long as that polarity is consistent.

  • \(A^+ ::= p^+ \mid {\bf 0} \mid A \oplus B \mid {\bf 1} \mid A \otimes B\)
  • \(A^- ::= p^- \mid \top \mid A ~\&~ B \mid A \multimap B\)

Quick review of higher-order focusing

Higher-order focusing as originally defined by Noam takes as the definition of connectives as given by a pattern judgment. Pattern judgments for positive types track the construction of terms; a reasonable way of reading \(\Delta \Vdash A^+\) is that it says "given all the stuff in \(\Delta\), I can construct a proof of \(A^+\)."

\[ \infer {p^+ \Vdash p^+} {} \qquad \infer {A^- \Vdash A^-} {} \]\[ \infer {\Delta \Vdash A \oplus B} {\Delta \Vdash A} \qquad \infer {\Delta \Vdash A \oplus B} {\Delta \Vdash B} \]\[ \infer {\cdot \Vdash {\bf 1}} {} \qquad \infer {\Delta_A, \Delta_B \Vdash A \otimes B} {\Delta_A \Vdash A & \Delta_B \Vdash B} \]

Similarly, the negative pattern judgment tracks how negative types are used. A reasonable reading of the pattern judgment \(\Delta \Vdash A^- > C\) is that it says "given all the stuff in \(\Delta\), I can use a proof of \(A^-\) to prove \(C\)."

\[ \infer {\cdot \Vdash p^- > p^-} {} \qquad \infer {\cdot \Vdash A^+ > A^+} {} \]\[ \infer {\Delta \Vdash A ~\&~ B > C} {\Delta \Vdash A > C} \qquad \infer {\Delta \Vdash A ~\&~ B > C} {\Delta \Vdash B > C} \qquad \infer {\Delta, \Delta' \Vdash A \multimap B > C } {\Delta \Vdash A & \Delta' \Vdash B > C} \]

One of the nicest things about higher-order focusing, in my opinion, is that it removes the question of what order the invertible left rules get applied in altogether: when we come to a point where invertible rules must be applied, the invertible rules happen all at once by logical reflection over the pattern judgment.1

\[ \infer {\Delta, [A^+] \vdash C} {\forall(\Delta' \Vdash A^+) \longrightarrow \Delta, \Delta' \vdash C} \qquad \infer {\Delta \vdash [A^-]} {\forall(\Delta' \Vdash A^- > C) \longrightarrow \Delta, \Delta' \vdash C} \]

In particular, this means that a derivation of \(\Delta, [p^+ \oplus (A^- \otimes q^+) \oplus {\bf 1}] \vdash C\) has three immediate premises corresponding to the three one of which is a derivation of \(\Delta, p^+ \vdash C\), one of which is a derivation of \(\Delta, A^-, q^+ \vdash C\), and one of which is a derivation of \(\Delta \vdash C\).

This "do all the inversion all at once" idea is a strengthening of the idea that appears in Andreoli's original presentation of focusing, in the definition of CLF, in Kaustuv Chaudhuri's focused linear logic, and in my work on structural focalization. All of these aforementioned presentations have a separate, ordered inversion context that forces you to decompose positive propositions into contexts and negative propositions in one arbitrary-but-fixed order, which kind of lets you pretend they all happen at once, since you can't make any choices at all until you're done. This isn't the only way to look at it, though. In Miller and Liang's LJF, in Frank Pfenning's course notes on Linear Logic, and in probably a bunch of other places, the inversion steps can happen in any order, though for a fully focused system you have to do get through all the possible inversion steps before you focus again. In order to get the strength of a Andreoli/Chaudhuri/me-focused systems in an LJF-like setting, one must prove a Church-Rosser-esque property that all ways of doing inversion are the same.

The reason I think the use of higher-order formulations is particularly nice is that it blows away the distinction between systems that fix the inversion order (like Structural Focalization and its many predecessores does) and systems that allow nondeterminism in the inversion order (like LJF and its kin). There's no question about the presence or absence of inessential nondeterminism when all the nondeterminism seriously happens in one rule that has as many premises as necessary to get the job done.

One of the less-awesome things about higher-order focusing is that it requires this extra substitution judgment, a complication that I won't go into here but that gets increasingly annoying and fiddly as we go to more and more interesting substructural logics, which are of particular interest to me. (This is not to say that it doesn't work, I wrote a note on higher-order focusing for ordered logic way back two years ago when Request For Logic was a series of Unicodey-notes instead of a blog.) To try to get the best of both worlds, Taus and Carsten's Focused Natural Deduction is the only system I know of that tries to use the pattern judgments just for negative introduction/positive elimination while using the familiar negative eliminations and positive introduction structures.

I want to play the same game Taus and Carsten play, but without using the pattern judgments at all. I think I have a better idea, but it ended up being more work than I have time for to actually prove that it's better (or to conclude that I'm wrong and it's not).

Alternate formulation

If \(P(\Delta)\) is a judgment on linear contexts \(\Delta\), then \(I^+_{A^+}(\Delta. P(\Delta))\) - which we can mathematically \(\eta\)-contract and write \(I^+_{A^+}(P)\) where convenient - is also a judgment, defined inductively on the structure of the positive proposition \(A^+\) as follows:

  • \(I^+_{p^+}(\Delta. ~ P(\Delta)) = P(p^+) \)
  • \(I^+_{A^-}(\Delta. ~ P(\Delta)) = P(A^-) \)
  • \(I^+_{\bf 0}(\Delta. ~ P(\Delta)) = \top \)
  • \(I^+_{A \oplus B}(\Delta. ~ P(\Delta)) = I^+_{A}(P) \times I^+_{B}(P) \)
  • \(I^+_{\bf 1}(\Delta. P(\Delta)) = P(\cdot)\)
  • \(I^+_{A \otimes B}(\Delta. ~ P(\Delta)) = I^+_{A}(\Delta_A. ~ I^+_{B}(\Delta_B. ~ P(\Delta_A, \Delta_B)) \)
This definition has a certain obvious correspondance to the pattern judgment from "Noam-style" higher order focusing, but it's very direct: given a property \(P\) that involves a linear context, \(I_{A^+}(P)\) proves that property on every context which can prove \(A^+\).

Given this judgment, we can use it pretty straightforwardly to define the inversion judgment. \[ \infer {\Delta, [A^+] \vdash C} {I^+_{A^+}(\Delta_{A}. ~~\Delta, \Delta_{A} \vdash C)} \] The result is exactly same as the formulation that used the higher-order pattern judgment, but this formulation is, in my mind at least, rather more direct while just as appropriately higher-order.

The story for negative types is the same. If \(P(\Delta,C)\) is a judgment on linear contexts \(\Delta\) and conclusions (conclusions are positive propositions \(A^+\) and negative atomic propositions \(p^-\)), then \(I^+_{A^-}(P)\) is defined inductively on the structure of types as follows:

  • \(I^-_{p^-}(\Delta,C. ~ P(\Delta,C)) = P(\cdot,p^-)\)
  • \(I^-_{A^+}(\Delta,C. ~ P(\Delta,C)) = P(\cdot,A^+)\)
  • \(I^-_{A ~\&~ B}(\Delta,C. ~ P(\Delta,C)) = I^-_{A}(P) \times I^-_{B}(P)\)
  • \(I^-_{A \multimap B}(\Delta. ~ P(\Delta,C)) = I^+_{A}(\Delta_A.~I^-_{B}(\Delta,C. ~ P((\Delta_A,\Delta), C)\)
The right-inversion rule follows the same pattern as the left-inversion rule: \[ \infer {\Delta \vdash [ A^- ]} {I^-_{A^-}(\Delta',C. ~~ \Delta, \Delta' \vdash C)}\]

Where's the Beef?

I think this crazy new formulation of focusing is somewhat more elegant than the pattern-judgment-using formulation Taus and Carsten gave, even though it is otherwise rule-for-rule the same. However, that's not an argument that will convince anyone besides myself; there needs to be some advantage to using this formulation for it to have value. In particular, this formulation should make it easier and/or more beautiful to prove cut, identity expansion, and the completeness of focusing - it should at least have similar grace when formalized in Agda to the Agda proof of structural focalization. It turns out that the identity expansion theorem can be done with this new formulation in Agda, and it's beautiful. (When I showed it to Dan Licata he started muttering something something Yoneda Lemma, so it's at least likely the structure of identity expansion has some interesting and common categorial structure.) It's certainly obvious that there's some interesting deep structure at work.

Furthermore, it seems like, given this formulation, it should be possible to get the cut admissibility theorem working as an obviously turned-around version of identity expansion, which is the kind of interesting structure I've had my eye out for ever since I observed it in the process interpretation of linear logic. I haven't made that work yet, though, so I'm going to stick with inversion contexts for now and maybe return to poking at this post-thesis, if nobody else has figured it out in the meantime - or, indeed, if nobody else has already figured it out in the past.

Here are some Agda fragments for dealing with this idea rigid logic (a non-associative, non-unit-law-having version of ordered logic that is useless except as a way of thinking about substructural logics in Agda, as far as I can tell), and some other Agda fragments for persistent logic. This whole blog post should serve as a disclaimer that these are all quite half-baked.


1An aside on "higher-order"

This is the part of higher-order focusing that gives it the name "higher-order," in the sense that the derivation constructors take functions as arguments. I can't quite figure out at this moment how this usage fits into Chris's discussion of what should and should not be called higher order.

Noam called the paper introducing pattern judgments to the PL world "Focusing and higher-order abstract syntax" (POPL 2008). Noam now sees that naming as unfortunate, and I agree; it blurs the definition of "higher-order abstract syntax" (HOAS) in a different direction that it gets blurred elsewhere in the literature. In my idiolect, which of course I can't force on everyone else, I reserve HOAS for the use of substitution functions to encode term structure - in other words, I use it as a rough synonym of what I and others also call "abstract binding trees." (I'm not sure who coined "abstract binding tree," I associate it with Bob Harper.)

The whole point of higher-order focusing is that this function - the one from the pattern judgment \(\Delta' \Vdash A\) to the provability judgment \(\Delta, \Delta' \vdash C\) - is a function that case analyzes its arguments (to figure out which pattern judgment was given as an argument to the function). Putting our set-theorist hats on, it's a map, a potentially infinite list of (pattern derivation, provability derivation) tuples such that every pattern derivation is associated with exactly one provability derivation. Then we take off our set-theorist hats 'cause we dislike those hats; it's a real (pure, terminating, effect-free) function, like in the sense of functional programming. Or just programming. I personally try to use the phrase logical reflection to emphasize this pattern-matching higher-order-ness; Noam now prefers abstract higher-order syntax, which is fine too.