[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

<val-tab>




Andrew Sutherland writes:
>
> - Create a class <val-tab> that uses hash as its hash function and =
>   as its equivalency function. (There's no reason for users to have to
>   create their own table subclasses.)

Here's a strawman implementation of this. It's short, easy, and works
well, except that mutable collections don't seem to work quite ok.
Commentary in the comments:

;; ==== START of val-tab.goo ====
;;
;; Neel Krishnaswami <neelk@alum.mit.edu>
;; This code is in the public domain.
;; 

(use goo)

(dc <val-tab> (<tab>))

;; [val-hash] is the generic you add methods to in order to extend the
;; set of values that <val-tab> can successfully store. It's the only
;; new binding aside from <val-tab>. 

(dg val-hash (x|<any> => <int>))

(dm key-test (x|<val-tab> => <fun>)
  =)

(dm tab-hash (x|<val-tab> => <fun>)
  val-hash)

;; Now, let's add some methods to the val-hash generic. I'm basically
;; just going to go through the list of types in the goo stdlib and
;; add a method.
;;
;; The default method is to just use id-hash, because objects compared
;; with = default to == if it's not defined. This method also catches
;; most of the immutable scalar types: <int>, <log>, <sym>, <type>, and
;; <chr> will all just work, because there's no way to create multiple
;; #t or 17 objects. Objects like ports and functions hash by identity,
;; which means that they will "just work" too.
;;
;; One question that I don't know enough to answer is how types should
;; hash. Do two types <a> and <b>, such that
;;   (and (subtype? <a> <b>) (subtype? <b> <a>)) => #t
;; have it true that (= <a> <b>) => #t. If so, can you have two types
;; that are equal but have differing in-memory representations (say
;; built using the various t-foo constructors)? 

(dm val-hash (x|<any> => <int>)
  (id-hash x))

;; Now we add a method for collections.
;;
;; I've added no method to let you use mutable collections as keys to
;; collections, because if you make any changes to them, then the
;; collection's hash code will change and the <val-tab> will get
;; hopelessly confused. 
;; 
;; This breaks the law (= x y) <==> (= (val-hash x) (val-hash y)).
;; This is deeply annoying, but I see no way to preserve mathematical
;; correctness with safety for users. The language fascist in me says
;; that this just means that mutable <col> types should compare with
;; ==, but that's just cruel to users.

;; Also, if you add another immutable sequence type, then strings and
;; sequences of <chr> will be treated the same. I think this is the
;; correct behavior, but some users may find it surprising. 

(dm val-hash (c|<col.> => <int>)
  (def acc 3821)
  (for (((tup k v) c))
    (def k-hash (val-hash k))
    (def v-hash (val-hash v))
    ;; This code assumes overflow. If goo changes to transparent
    ;; bignums change this too. The fn is from Dan Bernstein; I hope
    ;; it works okay with 30-bit integers rather than 32....
    (set acc (+ (* 33 acc) k-hash))
    (set acc (+ (* 33 acc) v-hash)))
  acc)

;; Are there any other classes that need to get special val-hash
;; methods? I can't think of any, which is cool.

(export <val-tab>)
(export val-hash)

;; ==== END of val-tab.goo ====

-- 
Neel Krishnaswami
neelk@alum.mit.edu