Saturday, December 11, 2010

Epistemological problems of purity

It would be nice if purity (of java functions) was a property that could be proved easily. But this is not so. On the contrary, purity is a concept about what a function may not do, and it is generally and on principle impossible to prove for a java function. Just think of a java function that calls an interface method on some field of this - this amounts to the question whether all possible implementations of this method are pure, including the implementations that will be written next week.

In the general case, we can not prove that a function has no side effects, throws no exceptions or does not return null. That is, we cannot confirm that the native declaration written by the Frege programmer is valid. However, in certain cases we can show that it must be invalid, or at least that it is most probably invalid. In addition, we should exploit the (rare?) cases where a positive proof can indeed confirm what the programmer asserted.

Side effects and Input/Output

We can show for many functions that they are not pure. We will regard the following as proof that a function is not pure, that is, it produces or consumes side-effects:
  • invocation of a native function will most likely perform input/output. In fact, as the JVM itself has no I/O instructions, this is the only way to perform input/output. Certainly, there might be pure functions that are written in C or even Assembler for performance reasons, but I think we can ignore them for the first approximation. In any case, the number of such functions will be sufficiently small.
  • altering fields in objects visible to the caller. An object is visible to the caller if it is this, or another argument of the function, or if it is referenced in a field of a visible object.
  • altering static fields or objects visible through such a field.
  • reading non final static fields or non-final objects visible through final fields
Of course, not only the actions of the function itself but also of called functions and functions called from these functions and so on must be considered. Note that it is allowed to create objects and alter them and even to return them. For example, this one should be okay:
    public int[] initarray(int s, int v) { 
        int[] a = new int[s]; 
        while (--s >=0) a[s] = v; return a; }

    Exceptions are not a big problem. The java compiler does not allow to call functions that throw checked exceptions if the call itself is not guarded by a try-catch statement. Hence, if the Frege programmer untruthfully asserts that a java function does not throw checked exceptions, the Frege compiler will simply produce a java source that will be rejected by the java compiler.
    It is no problem to check if a method throws checked exceptions, so we can as well do that in the Frege compiler.
    Regarding unchecked exceptions, one could argue that there must be a reason that they are unchecked.  If they are not caught, the program will terminate due to a severe error that couldn't be handled anyway. However, it should be quite easy to check whether a function could possibly throw an exception and warn the programmer who might be unaware of this fact. Or there could be a pedantic mode who flags this as error.

    Null references

    The Frege type system cannot protect the programmer from committing errors that are equivalent to dereferencing a null pointer in Java. Yet it can detect such cases and will emit warnings. A warning-free Frege program will thus never encounter a pattern match failure.
    Internally, null references simply do not exist and cannot be created through Frege code. But a native method,  whose declaration is contrafactual in this regard, could return null and hence compromise safety. Correct declaration of native methods that return objects is thus absolutely critical.
    I am in favor of being very conservative in this regard. If it is not absolutely evident that the method may never return null, a declaration with a non Maybe return type will be rejected.

    What if we get it wrong?

    The least impact has a wrong assertion about (unchecked) Exceptions. The program may terminate unexpectedly, but in the last consequence intensionally. Nevertheless, as said before, it would be nice to let the user of such a method know at compilation time.

    Funny things may happen if one manages to declare a function with side effects as pure. The compiler may apply all optimizations that are appropriate for really pure functions as it sees fit. In most cases, it is just not predictable what will happen. Consider the following:
    native print System.out.print :: String -> ()   
    main _ = fst (let ha = print "ho" in (ha,ha))
    The output might be "hohoho", "hoho", "ho" or nothing at all!
    There will also be harmless cases such as:
    native getenv System.getenv :: String -> Maybe String
    If one assumes an operating system where the process environment is read only, this function is actually pure.
    Anyway, it is generally a bad idea to assert purity when there is none. The semantics of the resulting program is undefined.

    The worst thing is the possible smuggling of  null values into the realm of the otherwise null-value-free Frege code. As always, the dereferencing of the null may happen hours after it has been stored in some data structure, and in totally unrelated parts of the program. It is cold comfort, that in Frege one needs to look only at suspicious native functions to find the error, while in a normal Java program the error could be literally everywhere.

    No comments:

    Post a Comment

    Comments will be censored by me as I see fit, most likely if they contain insults or propaganda for ideologies I do not like. Comments that are on topic will not be censored. If I leave a comment uncensored this does not imply that I agree with the opinions expressed therein.