Hindley-Milner algorithm in Java

I'm writing a simple dataflow-based system in Java (think of a LabVIEW-style editor and runtime). Users can wire blocks together in the editor, and I need type inference to ensure that the dataflow graph is correct. However, most type inference examples are written in mathematical notation, ML, Scala, Perl, etc., none of which I am fluent in.

I read about the Hindley-Milner algorithm and found a document with a good example that I could implement. It works on a set of constraints of the form T1 = T2. However, my dataflow graphs translate to constraints of the form T1 >= T2 (or T2 extends T1, or covariance, or T1 <: T2, as I have seen it written in various articles). There are no lambdas, just type variables (used in generic functions such as T merge(T in1, T in2)) and concrete types. To recap the HM algorithm:

Type = {TypeVariable,ConcreteType}
TypeRelation = {LeftType,RightType}
Substitution = {OldType,NewType}
TypeRelations = set of TypeRelation
Substitutions = set of Substitution

1) Initialize TypeRelations to the constraints, initialize Substitutions to empty
2) Take a TypeRelation
3) If LeftType and RightType are both TypeVariables or are concrete 
      types with LeftType <: RightType Then do nothing
4) If only LeftType is a TypeVariable Then
    replace all occurrences of LeftType with RightType in TypeRelations and Substitutions
    put LeftType,RightType into Substitutions
5) If only RightType is a TypeVariable Then
    replace all occurrences of RightType with LeftType in TypeRelations and Substitutions
    put RightType,LeftType into Substitutions
6) Else fail
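
For comparison, the equality-only recap above could be sketched in Java roughly as follows. This is a minimal sketch under stated assumptions, not code from the referenced document: the names EqualityUnifier, Var, Concrete, Relation, unify and resolve are made up for illustration, records and pattern matching need Java 16 or later, and instead of eagerly rewriting every relation it keeps a substitution map and resolves variables through it on demand.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class EqualityUnifier {
    interface Type {}
    record Var(String name) implements Type {}        // type variable
    record Concrete(String name) implements Type {}   // concrete type
    record Relation(Type left, Type right) {}         // constraint left = right

    /** Solves left = right constraints; throws IllegalStateException on a clash. */
    static Map<Var, Type> unify(List<Relation> constraints) {
        Deque<Relation> work = new ArrayDeque<>(constraints);
        Map<Var, Type> subst = new HashMap<>();
        while (!work.isEmpty()) {
            Relation rel = work.pop();
            Type l = resolve(rel.left(), subst);
            Type r = resolve(rel.right(), subst);
            if (l.equals(r)) {
                continue;                              // already equal, nothing to do
            }
            if (l instanceof Var v) {
                subst.put(v, r);                       // bind the unbound left variable
            } else if (r instanceof Var v) {
                subst.put(v, l);                       // bind the unbound right variable
            } else {
                throw new IllegalStateException("Type clash: " + l + " vs " + r);
            }
        }
        // Flatten chains such as U -> T -> String into U -> String.
        Map<Var, Type> result = new HashMap<>();
        for (Var v : subst.keySet()) {
            result.put(v, resolve(v, subst));
        }
        return result;
    }

    /** Follows bindings until an unbound variable or a concrete type is reached. */
    static Type resolve(Type t, Map<Var, Type> subst) {
        Type current = t;
        while (current instanceof Var v && subst.containsKey(v)) {
            current = subst.get(v);
        }
        return current;
    }
}
```

With the constraints T = String and U = T, unify binds both T and U to String; with T = String and T = Integer it throws, because two different concrete types can never be equal.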

How can the original HM algorithm be changed to deal with these relations instead of simple equality relations? A Java-ish example or explanation would be much appreciated.

Solution

I read at least 20 articles and found one I could use (François Pottier: Type inference in the presence of subtyping: from theory to practice):

Input:

Type = { TypeVariable,ConcreteType }
TypeRelation = { Left: Type,Right: Type }
TypeRelations = Deque<TypeRelation>

Helper functions:

ExtendsOrEquals = #(ConcreteType,ConcreteType) => Boolean
Union = #(ConcreteType,ConcreteType) => ConcreteType | fail
Intersection = #(ConcreteType,ConcreteType) => ConcreteType
SubC = #(Type,Type) => List<TypeRelation>

ExtendsOrEquals tells whether the first of two concrete types extends or equals the second; for example, (String, Object) == true and (Object, String) == false.

Union computes the common subtype of two concrete types, if possible; for example, (Object, Serializable) == Object & Serializable and (Integer, String) == null.

Intersection computes the nearest supertype of two concrete types; for example, (List, Set) == Collection and (Integer, String) == Object.

SubC is the structural decomposition function. In this simple case, it just returns a singleton list containing a new TypeRelation of its parameters.
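
To make the three lattice helpers concrete, here is a minimal Java sketch over a toy hierarchy. Everything in it is an assumption for illustration: the class name TypeLattice, the PARENT map, and the choice to model types as plain names with single inheritance only. Because of that restriction it cannot express an intersection type such as Object & Serializable; union simply reports "no common subtype" with an empty Optional where the description above uses null.

```java
import java.util.Map;
import java.util.Optional;

final class TypeLattice {
    // Hypothetical single-inheritance hierarchy: child -> parent. "Object" is the root.
    static final Map<String, String> PARENT = Map.of(
        "String", "Object",
        "Number", "Object",
        "Integer", "Number",
        "Collection", "Object",
        "List", "Collection",
        "Set", "Collection");

    /** True if sub is the same type as sup or one of its descendants. */
    static boolean extendsOrEquals(String sub, String sup) {
        for (String t = sub; t != null; t = PARENT.get(t)) {
            if (t.equals(sup)) {
                return true;
            }
        }
        return false;
    }

    /** Common subtype of a and b: the more specific of the two, if they are related. */
    static Optional<String> union(String a, String b) {
        if (extendsOrEquals(a, b)) {
            return Optional.of(a);
        }
        if (extendsOrEquals(b, a)) {
            return Optional.of(b);
        }
        return Optional.empty();
    }

    /** Nearest common supertype: walk up from a until b fits as well. */
    static String intersection(String a, String b) {
        for (String t = a; t != null; t = PARENT.get(t)) {
            if (extendsOrEquals(b, t)) {
                return t;
            }
        }
        return "Object";
    }
}
```

On this toy hierarchy, intersection("List", "Set") gives Collection and intersection("Integer", "String") gives Object, matching the examples above. A version built on java.lang.Class would also have to decide how to rank interfaces, which this sketch deliberately avoids.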

Tracking structures:

UpperBounds = Map<TypeVariable,Set<Type>>
LowerBounds = Map<TypeVariable,Set<Type>>
Reflexives = List<TypeRelation>

UpperBounds tracks the types that may be supertypes of a type variable, and LowerBounds tracks the types that may be subtypes of a type variable. Reflexives tracks the relations between pairs of type variables to help the algorithm's bound rewriting.

The algorithm is as follows:

While TypeRelations is not empty,take a relation rel

  [Case 1] If rel is (left: TypeVariable,right: TypeVariable) and 
           Reflexives does not have an entry with (left,right) {

    found1 = false;
    found2 = false
    for each ab in Reflexives
      // apply a >= b,b >= c then a >= c rule
      if (ab.right == rel.left)
        found1 = true
        add (ab.left,rel.right) to Reflexives
        union and set upper bounds of ab.left 
          with upper bounds of rel.right

      if (ab.left == rel.right)
        found2 = true
        add (rel.left,ab.right) to Reflexives
        intersect and set lower bounds of ab.right 
          with lower bounds of rel.left

    if !found1
        union and set upper bounds of rel.left
          with upper bounds of rel.right
    if !found2
        intersect and set lower bounds of rel.right
          with lower bounds of rel.left

    add TypeRelation(rel.left,rel.right) to Reflexives

    for each lb in LowerBounds of rel.left
      for each ub in UpperBounds of rel.right
        add all SubC(lb,ub) to TypeRelations
  }
  [Case 2] If rel is (left: TypeVariable,right: ConcreteType) and 
      UpperBound of rel.left does not contain rel.right {
    found = false
    for each ab in Reflexives
      if (ab.right == rel.left)
        found = true
        union and set upper bounds of ab.left with rel.right
    if !found 
        union the upper bounds of rel.left with rel.right
    for each lb in LowerBounds of rel.left
      add all SubC(lb,rel.right) to TypeRelations
  }
  [Case 3] If rel is (left: ConcreteType,right: TypeVariable) and
      LowerBound of rel.right does not contain rel.left {
    found = false;
    for each ab in Reflexives
      if (ab.left == rel.right)
         found = true;
         intersect and set lower bounds of ab.right with rel.left
    if !found
       intersect and set lower bounds of rel.right with rel.left
    for each ub in UpperBounds of rel.right
       add each SubC(rel.left,ub) to TypeRelations
  }
  [Case 4] If rel is (left: ConcreteType,right: ConcreteType) and 
      !ExtendsOrEquals(rel.left,rel.right) {
    fail
  }
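
The following Java sketch shows the same idea in a simplified form. It is not a line-by-line transcription of the cases above: instead of the Reflexives rewriting, it re-scans the relations and propagates bounds until nothing changes, and it keeps a single lower and upper bound per variable. The class name BoundSolver, the Rel record, and the toy PARENT hierarchy are assumptions made for illustration, and a relation (left, right) is read as "left must be assignable to right", as in Case 4.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class BoundSolver {
    // Hypothetical toy hierarchy (child -> parent); "Object" is the root.
    static final Map<String, String> PARENT =
        Map.of("String", "Object", "Number", "Object", "Integer", "Number");

    /** One constraint: 'left' must be assignable to 'right'. */
    record Rel(String left, String right) {}

    final Set<String> vars;                             // names treated as type variables
    final Map<String, String> lower = new HashMap<>();  // var -> nearest supertype of what flows in
    final Map<String, String> upper = new HashMap<>();  // var -> most specific type it must fit

    BoundSolver(Set<String> vars) { this.vars = vars; }

    static boolean extendsOrEquals(String sub, String sup) {
        for (String t = sub; t != null; t = PARENT.get(t)) if (t.equals(sup)) return true;
        return false;
    }

    static String intersection(String a, String b) {    // nearest common supertype
        for (String t = a; t != null; t = PARENT.get(t)) if (extendsOrEquals(b, t)) return t;
        return "Object";
    }

    static String union(String a, String b) {           // common subtype, or null if none
        if (extendsOrEquals(a, b)) return a;
        return extendsOrEquals(b, a) ? b : null;
    }

    /** Returns true if all relations can hold, false on a type conflict. */
    boolean solve(List<Rel> rels) {
        boolean changed = true;
        while (changed) {                                // iterate to a fixpoint
            changed = false;
            for (Rel r : rels) {
                boolean lv = vars.contains(r.left()), rv = vars.contains(r.right());
                String lo = lv ? lower.get(r.left()) : r.left();    // what flows out of left
                String hi = rv ? upper.get(r.right()) : r.right();  // what right demands
                if (rv && lo != null) {                  // widen the lower bound of right
                    String old = lower.get(r.right());
                    String now = old == null ? lo : intersection(old, lo);
                    if (!now.equals(old)) { lower.put(r.right(), now); changed = true; }
                }
                if (lv && hi != null) {                  // narrow the upper bound of left
                    String old = upper.get(r.left());
                    String now = old == null ? hi : union(old, hi);
                    if (now == null) return false;       // no common subtype exists
                    if (!now.equals(old)) { upper.put(r.left(), now); changed = true; }
                }
                if (!lv && !rv && !extendsOrEquals(lo, hi)) return false;  // Case 4
            }
        }
        for (String v : vars) {                          // lower bound must fit under upper bound
            String lo = lower.get(v), hi = upper.get(v);
            if (lo != null && hi != null && !extendsOrEquals(lo, hi)) return false;
        }
        return true;
    }
}
```

Calling new BoundSolver(Set.of("T", "U")).solve(...) with the relations (String, T), (Integer, T), (T, U) succeeds and leaves Object as the lower bound of both T and U. Replacing the last relation with (T, Integer) makes solve return false, because Object does not fit under Integer.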

A basic example:

Merge = (T,T) => T
Sink = U => Void

Sink(Merge("String",1))

The relations of this expression:

String >= T
Integer >= T
T >= U

1.) rel is (String, T); Case 3 is activated. Because Reflexives is empty, the LowerBounds of T is set to String. There are no UpperBounds for T, so TypeRelations remains unchanged.

2.) rel is (Integer, T); Case 3 is activated again. Reflexives is still empty. The lower bound of T is set to the intersection of String and Integer, which yields Object. T still has no upper bounds, so TypeRelations does not change.

3.) rel is T >= U; Case 1 is activated. Because Reflexives is empty, the upper bounds of T are unioned with the upper bounds of U, which remain empty. Then the lower bounds of U are set to the lower bounds of T, yielding Object >= U. TypeRelation(T, U) is added to Reflexives.

4.) The algorithm terminates with the bounds Object >= T and Object >= U.

Another example, this time showing a type conflict:

Merge = (T,T) => T
Sink = Integer => Void

Sink(Merge("String",1))

Relations:

String >= T
Integer >= T
T >= Integer

Steps 1.) and 2.) are the same as above.

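3.) rel is T >= Integer; Case 2 is activated. Integer is added to the upper bounds of T, and because the lower bound of T is Object at this point, SubC(Object, Integer) is added to TypeRelations. Object does not extend Integer, so Case 4 applies and the algorithm fails.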

Extending the type system

Adding generic types to the type system requires extensions in the main cases and in the SubC function.

Type = { TypeVariable,ConcreteType,ParametricType<Type,...> }

Some ideas:

- If a ConcreteType and a ParametricType meet, that is an error.
- If a TypeVariable and a ParametricType meet, e.g. T = C(U1, ..., Un), create new type variables and relations such as T1 >= U1, ..., Tn >= Un, and work with them.
- If two ParametricTypes meet (D and C), check that D >= C and that the number of type arguments is the same, then extract each pair as a relation.
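
As a rough illustration of the first and third idea, SubC could be extended along these lines. This is only a sketch under assumptions: the names ParametricSubC, Parametric, ctor and ctorExtends are hypothetical, type arguments are decomposed pairwise in the same direction (which treats them covariantly, unlike Java's own invariant generics), and the case where a type variable meets a parametric type is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

final class ParametricSubC {
    interface Type {}
    record Var(String name) implements Type {}
    record Concrete(String name) implements Type {}
    record Parametric(String ctor, List<Type> args) implements Type {}
    record Relation(Type left, Type right) {}

    /**
     * Decomposes one relation into simpler ones.
     * ctorExtends(d, c) is a caller-supplied check that constructor d extends or equals c.
     */
    static List<Relation> subC(Type left, Type right, BiPredicate<String, String> ctorExtends) {
        if (left instanceof Parametric l && right instanceof Parametric r) {
            boolean compatible = ctorExtends.test(l.ctor(), r.ctor())
                && l.args().size() == r.args().size();
            if (!compatible) {
                throw new IllegalStateException("Incompatible generic types: " + left + " vs " + right);
            }
            // Extract each pair of type arguments as a new relation.
            List<Relation> out = new ArrayList<>();
            for (int i = 0; i < l.args().size(); i++) {
                out.add(new Relation(l.args().get(i), r.args().get(i)));
            }
            return out;
        }
        boolean mixed = (left instanceof Concrete && right instanceof Parametric)
            || (left instanceof Parametric && right instanceof Concrete);
        if (mixed) {
            throw new IllegalStateException("Concrete type meets parametric type: " + left + " vs " + right);
        }
        // Simple case: return the relation unchanged as a singleton list.
        return List.of(new Relation(left, right));
    }
}
```

For example, decomposing List<String> against Collection<T> with a predicate that accepts (List, Collection) yields the single relation (String, T), which then goes back onto TypeRelations like any other relation.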
