The Three Valued Halting Problem is Undecidable
                         Kaz Kylheku <kaz@kylheku.com>
                                January 15 2015

Abstract:

  Some researchers do not accept the conventional halting problem proof, namely
  that the halting function is not computable. A three-valued version of
  the problem has been proposed to address the alleged problems. However,
  this succumbs to the same techniques. It is shown via an informal proof
  sketch that adding a third value to the range of a halting decider
  computation, such that computations can use this third value to complain
  that they are being tricked, does not work. The proof sketch shows that such
  a third value is not total computable, which means that three-valued halting
  computations cannot reliably detect the "self-referential trickery" of the
  conventional halting problem proof.

Background:

  The halting function is a boolean-valued, two-parameter mathematical function.
  Its parameters are a program and an input. The halting function is true for a
  <program, input> pair which terminates: that is to say, if that program
  terminates when operating on that input. The halting function is false for a
  <program, input> pair when the program does not terminate when processing
  that input. The halting function is total: it provides a boolean value for
  all possible programs and inputs.

  It is not possible to create a computational function (or procedure) which
  calculates the halting function: a procedure which is total.
  That is to say, a procedure which can examine any <program, input> pair,
  perform a calculation, and then terminate, providing an output value which
  corresponds to the halting function's value. This was first shown by Alan
  Turing[1].

  There is a proof of this which involves a trick. For any procedure which is
  proposed as a halting decider (a computational implementation of the halting
  function), it is possible to create a wrapper program which incorporates that
  halting decider procedure. The program passes a representation of itself, and
  to its input down to the halting decider, takes the return value of the
  decider, and then behaves in a way which contradicts that return value: if
  the return value is true ("the program halts on that input") then the program
  deliberately fails to halt, by entering an infinite loop. If the return value
  is false ("the program does not halt"), the program immediately halts.  By
  itself, this does not create a counterexample which proves that the halting
  decider is wrong. One more trick is needed: the program is run, with that
  program itself as its own input. At that point, the decider's return value is
  a comment on the very program which is running, and is logically contradicted
  by the behavior of that program. This contradiction shows that the halting
  decider is computing, for that test case, the wrong value, opposite to the
  halting function. If the halting function says that the program, given itself
  as input halts, the halting decider says that it does not halt, and
  vice versa.

  The proof can be sketched like this:

    program program(input) {
      procedure halting_decider(program, input) {
        # performs some computation and returns true or false
      }

      procedure main(input) {
        if (halting_decider(this_program, input))
          loop_forever
        else
          exit(0)
      }

      main(input)
    }

  The program is then invoked with itself as input so that input is this_program,
  and so halting_decider is called as halting_decider(this_program,
  this_program).   Thus the program which is executing is the program invoked on
  itself, and the decider is asked to compute whether that exact test case halts
  or not. Of course, the test case is contrived to contradict the decider.

  Every conceivable implementation of halting_decider succumbs to this trick,
  which shows that for every every attempt to make a halting decider,
  test cases can be found which break it, which shows that no procedure
  can compute the halting for all possible values.

  For detailed formal proofs related to the halting problem and decidability in
  general, the reader can consult [2].

Motivation:

  Some researchers take issue with the proof of the halting problem, claiming
  that even though it is true, it is a contrived trick and that the test cases
  which are generated by that trick are somehow invalid, the implication being
  that deciders may be legitimately excused for not being able to to determine
  whether or not these cases halt, and that these cases do not represent
  a true, palpable limit on computability.

  It has been proposed that a three-valued version of the halting problem
  can provide the necessary framework which allows halting deciders to
  effectively diagnose situations when they are subject to "unfair"
  test cases.  Under this notion, the halting function is augmented to have a
  third possible return value, so that for each <program, input> pair, it
  assigns the value true, false or error.  The new error value has some meaning
  such as that <program, input> test case constitutes a test case which is
  trying to break the decider through some self-referential trick.

  This paper presents a proof sketch which suggests that the three-valued
  halting function is also not total computable. Using a more refined trick,
  three-valued halting deciders can be shown to fail. That is to say, for any
  halting decider, a test case can be found such that it incorrectly returns
  error in a situation when no trick is going on: a situation when that decider
  could be altered to return true or false without being contradicted.
  Or else, a test case can be found when that decider incorrectly returns
  true or false which are contradicted, failing to identify that as
  an error situation.

The Proof:

  We begin by forming a model of the three-valued decider, and argue
  that this model is completely general: any three-valued decider
  which totally computes the three-valued halting function can conform
  to this model.

  The three-valued decider has to perform some computation which tells it
  whether to return error, or whether to determine the boolean halting value.
  In other words, it has this structure:

    procedure halting_decider(program, input) {
      procedure should_return_error(d, p, i) {
        # ...
      }

      procedure decide_halting(p, i) {
        # ...
      }

      if (should_return_error(source_code(decide_halting), program, input)) {
        return error
      } else {
        return decide_halting(program, input) # strictly true or false
      }
    }

  The procedures should_return_error and decide_halting are subroutines of
  halting_decider: they are effectively part of that procedure, and their
  implementation may vary among various proposed halting_deciders.

  In the above, source_code operator denotes some sort of quoting: a way to
  obtain a representation of a given procedure as code which can be analyzed
  (as opposed to just a reference which can be invoked).

  The attack on the three-valued halting_decider looks like this:

    program program(input) {
      procedure halting_decider(program, input) {
        # performs some computation and returns error, true or false
      }

      procedure error_decider(decider, program, input) {
        # performs some computation and returns strictly true or false
      }

      procedure main(input) {
        if (error_decider(source_code(halting_decider), input, input)) {
          switch (halting_decider(input, input)) {
          case true:
            exit(0)
          case false:
            loop_forever
          case error:
            print("incorrect error case: no contradiction is going on!")
            abort()
          }
        } else {
          switch (halting_decider(input, input)) {
          case true:
            loop_forever
          case false:
            exit(0)
          case error:
            print("correct error case: contradiction is being perpetrated!")
            abort()
          }
        }
      }

      main(input)
    }

  As before, the program is invoked on itself as input.

  Here is how this setup differs from the two-valued halting problem proof.

  There is a now a new procedure error_decider. Whereas the two-valued proof
  uses a boiler-plate wrapper program that easily produces invalid test
  cases upon the mere insertion of the halting decider, the three-valued proof
  requires the attacker to not only insert the halting decider under test, but
  to provide suitable error_decider procedure. When a suitable error_decider is
  provided, then when the resulting program is invoked on itself, the
  halting_decider is shown to be erroneous on that input case.

  What is a suitable error_decider? The suitable error_decider is a procedure
  which, for at least the relevant input arguments required in the test case,
  produces the same result as the halting_decider's should_return_error
  subroutine.

  In the case when error_decider(source_code(halting_decider), program, input)
  happens to produce the same value as should_return_error(program, input), the
  halting_decider calculates the incorrect value.

  In this situation, should_return_error might return true, indicating
  to halting_decider that it should return the error code, because a trick is
  supposedly being instigated.  But if that is so, then the wrapper program's
  error_decider also returns true, causing the main procedure to execute the
  consequent branch of the if statement, in which no trickery is going on: the
  true value maps to termination, and false to non-termination.  The decider's
  error case ends up here and is reported as being incorrect.  On the other
  hand, if should_return_error indicates false, then the halting_decider will
  return true or false instead of error. But error_decider also returns false,
  and so the halting_decider's true or false value is routed to the second
  switch statement, in the antecedent "else" branch of the if inside main.
  In that statement, the true or false is met with contradicting behavior,
  showing it to be erroneous, similarly to the situation in the two-valued
  halting proof.

  All that is left is question: does an error_decider exist which
  attacks any possible halting_decider?  This is true because halting_decider
  has to compute somehow whether or not to return error. That computation
  corresponds to a procedure (which we modeled as should_return_error: a
  subroutine of the halting_decider).

  A procedure is a finite entity chosen from a countably infinite enumerable
  set (the set of all possible procedures).

  In the set of all possible error_deciders, there are procedures which
  match the computation of should_return_error.  The should_return_error
  procedure itself necessarily occurs in this set! Not only that, but
  other procedures which match the return value of should_return_error
  on just the relevant arguments: those which occur in the test case
  of interest.

Remarks:

  The concept of a three-valued halting decider procedure with only two
  parameters is troublesome, because we would like to maintain the property that
  we are exploring the relationship between a mathematical function, and a
  procedure which computes it.  Yet, the error value appears to be a property
  of the decider procedure, not of the function: it refers to the inability of the
  decider to compute something.  How this is resolved is that in the
  three-valued decider situation, the underlying mathematical function must in
  fact be regarded as in fact having three parameters: halt3(decider, program,
  input). The function either decides whether the <program, input> pair halts.
  Or else it declares error.

  Here is the definition of what it means when the value of halt3(d, p, i)
  is error:

    1. The boolean decider d(p, i) returns true or false, and it is incorrect.
    2. Another decider d' is also incorrect on d'(p', i').
    3. These two deciders are related together in that d' is just the
       logical inverse of d; that is d'(x,y) = not d(x, y).
    4. The program p contains a procedure equivalent to d, and the only
       difference between p and p' is that inside p', the equivalent of d is
       replaced with an equivalent of d'.
    5. If i = p, then i' = p'. Otherwise i' = i.

  If 4 and 5 are the only differences between the input cases, <p, i>
  and <p', i'>, and if d(p, i) is wrong, such that d'(p',i') is also wrong
  where d' = not d, it must be that these test cases are using a
  self-referential trick to make d and d' wrong.

  Thus if a three-valued decider d3 correctly computes the three-valued
  halting function, it must determine that its two-valued decider d
  is incorrect according to the above definition.

  d3 only takes two parameters, because the third parameter is implicit.
  The decider only computes the halt3 function partially, for a particular
  value of the decider:

    procedure d3(p, i)
    {
       procedure e(d, p, i) { calculates the error function }
       procedure d(p, i) { boolean halting decider }

       if (e(source(d), p, i))
         return error;
       else
         return d(p, i)
    }

  A given procedure d3 tries calculate halt3(d, p, i) for only a fixed d,
  where d represents d3's built-in decider d.

Objection 1:

  I hereby anticipate the following objection to the proof. Surely,
  if error_decider is found which is similar to should_return_error,
  then in fact since should_return_error is a subroutine of the
  halting_decider, then in fact the decider is the target of foul play.

  However, this is not the case. The error_decider function isn't actually
  similar to should_return_error, let alone the same. It only has to have the
  same value as should_return_error for one single combination of argument
  values: that which occurs in the test case. We have to consider this in the
  light of the argument space of these functions being infinite! Two
  functions over infinite domains which agree in just one value are hardly similar at
  all, by any sane definition of similar.

  The error_decider is independently discovered; there is no requirement that
  it be reverse-engineered out of any piece of the halting decider.

  By coincidence, error_decider could perform exactly the same computation as
  should_return_error, but this is not a requirement of the proof.
  Furthermore, more than one error_decider exists that can produce the
  necessary test case.

Objection 2:

  The following objection may be made, which stems from a misunderstanding.
  When halting_decider returns error, and that error is erroneous,
  it is in fact not erroneous. After all, the halting decider is being
  gamed by a trick. The wrapper might print the diagnostic that error
  is incorrect, but that doesn't make it incorrect!

  The problem with this objection is that it proceeds from a misunderstanding
  about what error means. The three-valued halting decision procedure
  contains a boolean procedure for deciding the error case, and it contains
  a boolean halting decider. The error return value is a comment upon the
  boolean halting decider; it is not a self-referential comment about the
  three-valued halting decider. The error value means that the boolean
  halting decider is not able to answer true or false (due to trickery).

  In the situation when error is the incorrect value, it can be shown that the
  two-valued decider in fact has the correct answer. Only, that answer was
  suppressed by the three-valued procedure, due to the should_return_error
  subroutine being incorrect. That the correct answer was suppressed by an
  incorrect error decision is objectively true, and the program's diagnostic
  message is only a remark on that fact.

  How we know that error is an incorrect value is that we can produce
  a new version of halting_decider in which we make a correction to the
  should_return_error function, but leave the boolean halting decider
  exactly the same. When this new halting_decider is plugged into the
  wrapper program, with the original error_decider, the new program
  will show that the boolean halting decider in fact is correct --- when
  the new program is run with the old program's code as its input.

  That is to say, if "program program_code" is reporting that the halting
  decider's error return is wrong, we can edit the should_return_error function
  inside halting_decider by inverting its boolean value.  We then obtain a new
  program called "program*". When we run "program* program_code" (new program,
  original program's code as argument), this will now confirm that the decider
  has returned a correct true or false value. In other words, the underlying
  two-valued decider (which does not change between the two programs) correctly
  decides whether "program program_code" halts! It is specifically the error
  decision in the three-valued decider that was shown wrong by the "program
  program_code" test case.

  The goal of this proof is to show that this error decision is not computable,
  and that cases exist in which it errs both ways.  Since it is not computable,
  then a three-valued decider cannot reliably "protect" a two-valued decider
  from returning a true or false value.  Sometimes the error decision will
  incorrectly say "no error", and expose a halting decision which is wrong.
  Sometimes it will incorrectly say "error", wrong suppressing a correct
  return value from the two-valued decider.

References:

[1] _On computable numbers, with an application to the Entscheidungsproblem_,
    Alan Turing, Proceedings of the London Mathematical Society,
    Series 2, Volume 42 (1937), pp 230–265.

[2] _Mathematical Logic_, Stephen Cole Kleene, 1966,
    Chapter V: Computability and Decidability.