- Type. A set of objects (values) and operations.
Objects in a general sense; they need not be instances of a class
declaration.
- Static and Dynamic.
- Static types: Variables are declared with a type. Assignment converts
a value to the variable's declared type. C, C++, Java, Pascal, Ada;
most compiled languages.
- Dynamic types: Types stay with values, and assignment copies both value
and type. Lisp, Smalltalk, Perl, Python, PHP, Ruby; most interpreted
languages.
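A minimal Python sketch of the dynamic case, where the type travels with the value rather than with the variable. The variable names are illustrative only.

```python
# The name `x` has no declared type; each value carries its own type tag.
x = 42
type_before = type(x).__name__      # the value bound to x is an int

x = "forty-two"                     # rebinding: x now refers to a str value
type_after = type(x).__name__       # the type came along with the new value

y = x                               # assignment copies the reference, type included
same_type = type(y) is type(x)
```

In a statically typed language the second assignment to `x` would be rejected or would force a conversion, because the type belongs to the variable.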
- Type Propagation
- The type of each expression is determined by the types of its
sub-expressions.
- In a statically typed language, these rules can be evaluated at
compile time.
- Polymorphism with class inheritance is something of a hybrid.
- A Java reference to a base class may refer to a derived-class object.
(Likewise C++ pointers and references.)
- Assignment copies the type, but only types derived from the base class
are allowed.
- Static types make execution more efficient
- Must perform operations according to types.
- Declarations allow the choice of one type (and so one operation) at translation time.
- Type Errors.
- Type error: any operation which is not defined for the type of data
it is applied to.
- Type system: precise definition of the type bindings, the values of each
type, and the legal operations on them.
- Strong typing: All type errors are detected by the system.
- Java is strongly-typed.
- C is not.
- C++ attempts to strengthen C's type system.
- For static types, many type errors may be detected at compile time.
- Compile time is cheaper.
- Compile time is more reliable.
- Checking and Conversion
- Figure out what + in a + b means.
- Check that uses of variables agree with operations.
- Dynamic typing: Checking must be done at run time.
- Static typing: Checking may be done at compile time.
- Static can be more efficient by choosing operations once.
- Conversions and Coercions
- A type mis-match
- May just be illegal.
- May require a coercion: int + float.
- Conversions
- Explicit conversions: casts.
- Implicit conversions
- Should be limited to widening conversions.
- Many older languages (such as C) violate this rule.
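The three outcomes of a type mismatch listed above can be sketched in Python. The helper `add_checked` is a hypothetical name introduced for this illustration; it only turns the run-time type error into a printable value.

```python
def add_checked(a, b):
    """Add two values, reporting a type error instead of raising it."""
    try:
        return a + b
    except TypeError as err:
        return f"type error: {err.__class__.__name__}"

widened = 2 + 0.5              # implicit coercion: the int operand is widened to float
explicit = int("42") + 1       # explicit conversion from str to int, like a cast
narrowed = int(3.9)            # explicit narrowing: truncates toward zero
illegal = add_checked("a", 1)  # no coercion is defined, so this is simply illegal
```

Python only narrows when asked to, which matches the rule that implicit conversions should be limited to widening.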
- Basic Types.
- Usual hardware integers and floats.
- C and most languages leave sizes to implementer; Java specifies.
- Character sets. Historically ASCII (7 bit) or an 8-bit extension. Now
Unicode, usually encoded as UTF-16 (16-bit units) or UTF-8 (variable length).
- Complex Types
- Some interpreted languages (Ruby, Python, Lisp) provide unbounded
integers, which are not hardware types.
- Enumerations.
- Pointers.
- Most interesting is linked structures, where pointers point to
objects which contain pointers.
- Creation.
- Pointers to allocated objects.
- Some languages (C) allow creation of pointers to normal
variables.
- Garbage collection.
- Many traditional compiled languages (C, C++, Pascal) don't have it; Java does.
- Most interpreted languages do.
- Compound Types.
- Compound types are built of objects of other types.
- Arrays.
- Members are laid out contiguously in memory.
- Access to a particular member is by offset calculation.
- Generalizes to however many dimensions are desired.
- Subscripts (selectors) are numbers.
- Minimum subscript.
- C, C++, Java, many others: always 0.
- Smalltalk: always 1. FORTRAN: 1 by default (user-defined since FORTRAN 77).
- Pascal, Ada, others: user-defined.
- Subscript values may be variables computed at run time.
- Typical case allocates the array as a block. Component addresses
are computed from subscripts.
- Java makes multi-dimensional arrays by allocating an array of array
references.
- Needs only the one-dimensional location formula.
- Repeat for each dimension.
- Dope vector (or array descriptor).
- A list of the data needed to describe the array.
- Might be the number of dimensions and constant terms from the
location computation formula.
- Might be α and a bounds pair for each dimension (allows for
checking).
- Might be both: the constants (for finding the lvalue) and the bounds
(for checking).
- It is also possible to create “fake” dope vectors to represent
portions of larger arrays.
- The dope vector may be allocated with the vector and stored at
its front, or it may be allocated separately.
- Your book includes the content type, which would only be needed for
dynamic types, where it would probably not be useful, since those
arrays are not usually homogeneous.
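A minimal sketch of a dope vector holding both the bounds (for checking) and the information needed for the location formula. It assumes row-major layout over a flat Python list; the class `DopeVector` and its field names are hypothetical, not taken from the book.

```python
from dataclasses import dataclass

@dataclass
class DopeVector:
    """Hypothetical descriptor for a row-major array stored in a flat list."""
    bounds: list      # one (low, high) pair per dimension
    offset: int = 0   # index of the first element in the flat storage

    def strides(self):
        # Stride of a dimension = number of elements spanned by all later dimensions.
        result, step = [], 1
        for low, high in reversed(self.bounds):
            result.append(step)
            step *= high - low + 1
        return list(reversed(result))

    def location(self, *subscripts):
        # Check each subscript against its bounds, then apply the location formula.
        if len(subscripts) != len(self.bounds):
            raise IndexError("wrong number of subscripts")
        index = self.offset
        for sub, (low, high), stride in zip(subscripts, self.bounds, self.strides()):
            if not low <= sub <= high:
                raise IndexError(f"subscript {sub} outside {low}..{high}")
            index += (sub - low) * stride
        return index

storage = list(range(100, 112))             # 12 slots allocated as one block
desc = DopeVector(bounds=[(1, 3), (0, 3)])  # rows 1..3, columns 0..3
element = storage[desc.location(2, 1)]      # row 2, column 1
```

A real implementation would precompute the strides and fold the lower bounds into a single constant term; recomputing them on each access here just keeps the sketch short.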
- Homogeneity
- Most compiled languages require array members to be the same type
(homogeneous), so all the slots are the same size.
- Most interpreted languages allow various types (heterogeneous).
- Usually implemented by storing references to the values
in the array, rather than the value itself.
- The values may be of different sizes, but the references
will all be the same, so the location formula still works.
- Slicing.
- Allow a portion of an array to be treated as an array.
Python: a[3:8]
Ruby: a[3..7]
- Can often be implemented by crafting a special dope vector
without copying the array.
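A sketch of slicing without copying, using a small descriptor that records only an offset and a length over shared storage. The class `SliceView` is a hypothetical name for this illustration. Python's built-in list slice is shown alongside for contrast, since it does copy.

```python
class SliceView:
    """Hypothetical one-dimensional view that shares storage with its parent."""

    def __init__(self, storage, start, stop):
        self.storage = storage      # the same list object, not a copy
        self.start = start          # offset of the view's element 0
        self.length = stop - start

    def __len__(self):
        return self.length

    def __getitem__(self, i):
        if not 0 <= i < self.length:
            raise IndexError("view index out of range")
        return self.storage[self.start + i]

    def __setitem__(self, i, value):
        if not 0 <= i < self.length:
            raise IndexError("view index out of range")
        self.storage[self.start + i] = value

a = list(range(10))
view = SliceView(a, 3, 8)   # same elements as a[3:8], but no copy is made
view[0] = 99                # writes through to a[3]
copy = a[3:8]               # Python's own list slice copies the elements
copy[1] = -1                # so this leaves a[4] unchanged
```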
- Strings.
- Early languages had little string support; some in COBOL.
- Plain C supports arrays-as-strings. Brain damaged.
- Pascal/Ada family have fixed-length strings.
- Java has nice semi-builtin variable-length strings.
- C++ has a nice variable-length string class.
- It's interesting that it took so long to get here.
- Structs or records.
- Selectors are field names; not computable at run time.
- Like classes without methods or inheritance.
- Heterogeneous.
- Functional Types.
- Functions passed as parameters.
- Some languages allow functions to be passed as parameters. FORTRAN,
Pascal, C.
- Example uses.
- Plotting an arbitrary function.
- Finding roots of an arbitrary function.
- Comparisons for sorting or other data structures.
- qsorter.c
- In C/C++, this is a special case of a pointer-to-function type.
- Java uses interfaces for this purpose.
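Two of the example uses above, sketched in Python: root finding on an arbitrary function and a caller-supplied ordering for a sort. The names `find_root`, `cube_minus_eight`, and `by_length` are invented for this illustration, and bisection is just one simple choice of root-finding method.

```python
def find_root(f, lo, hi, tolerance=1e-9):
    """Bisection: f is any function with a sign change on [lo, hi]."""
    if f(lo) * f(hi) > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    while hi - lo > tolerance:
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid        # the sign change is in the left half
        else:
            lo = mid        # the sign change is in the right half
    return (lo + hi) / 2

def cube_minus_eight(x):
    return x ** 3 - 8

# The function itself is the argument; find_root knows nothing about it.
root = find_root(cube_minus_eight, 0.0, 5.0)

def by_length(word):
    return len(word)

# sorted() takes a function that decides how elements are ordered.
words = sorted(["pear", "fig", "banana"], key=by_length)
```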
- Anonymous functions.
- Lisp lambda expressions.
- Many scripting languages: Perl sub operator, Python lambda,
Ruby proc object.
- Java lambda expression (Java 8)
- (String z) -> { int y = x + 1; System.out.println(z + y); }   (x is captured from the enclosing scope)
- (int x) -> x * x
- Is a closure.
- Return type generally inferred. Parameter types may be as well.
- Type is technically an instance of a functional interface (Runnable, Function, and so on).
- C++ anonymous function (C++ 11)
- [](int x) { return x * x; }
- [z](int x) -> int { return z + x * x; }
- Return type can be inferred.
- The square brackets list the variables captured from the
creation context. These are the only enclosing-scope locals it may use
(true globals and statics need no capture).
- Is not technically a closure, since it only brings what you
capture, and anything captured by reference can dangle.
- Frighteningly, the 2020 standard allows template
anonymous functions. My head hurts.
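For comparison with the Java and C++ forms above, a Python sketch of capture. Here the captured variable stays alive as long as the function does, so it cannot dangle. The names `make_counter` and `step` are invented for this illustration.

```python
def make_counter(start):
    count = start                  # local variable captured by the inner function

    def step(amount=1):
        nonlocal count             # the closure keeps `count` alive and mutable
        count += amount
        return count

    return step                    # `count` outlives the call to make_counter

counter = make_counter(10)
first = counter()                  # uses the captured count
second = counter(5)                # same captured variable, updated again
other = make_counter(0)()          # a separate closure with its own count

square = lambda x: x * x           # anonymous function with nothing captured
squares = list(map(square, [1, 2, 3]))
```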
- Dynamically-created functions.
- Lisp lambda expressions.
- Many scripting languages, usually through an eval function.
- Type Equivalence
- Structural equivalence.
- Name equivalence
Name equivalence is the most common. C uses structural equivalence for
arrays, but this matters only for parameter passing, since you can't assign arrays.
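The two notions can be sketched in Python with a pair of record types that have the same layout but different names. `Meters`, `Feet`, and both comparison functions are hypothetical; real compilers apply these rules internally rather than through calls like these.

```python
from dataclasses import dataclass, fields

@dataclass
class Meters:
    value: float

@dataclass
class Feet:
    value: float

def name_equivalent(t1, t2):
    # Name equivalence: two types match only if they are the same declaration.
    return t1 is t2

def structurally_equivalent(t1, t2):
    # Structural equivalence: compare field names and field types, in order.
    def shape(t):
        return [(f.name, f.type) for f in fields(t)]
    return shape(t1) == shape(t2)
```

Under name equivalence, `Meters` and `Feet` are different types, so mixing them up is a type error. Under structural equivalence they are interchangeable.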
- Subtypes
- A type with constraints on its values.
subtype Degrees_Arc is Integer range 0..360;
- Inheritance can be viewed as a form of this.
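A rough Python analogue of the Ada subtype above, written as a class derived from `int` that checks the range when a value is created. The class name `DegreesArc` is an assumption for this sketch. Unlike Ada, nothing rechecks the constraint after arithmetic.

```python
class DegreesArc(int):
    """Hypothetical range-constrained subtype of int, checked at construction."""
    LOW, HIGH = 0, 360

    def __new__(cls, value):
        if not cls.LOW <= value <= cls.HIGH:
            raise ValueError(f"{value} not in {cls.LOW}..{cls.HIGH}")
        return super().__new__(cls, value)

angle = DegreesArc(90)      # accepted: within 0..360
total = angle + 45          # arithmetic yields a plain int, with no range check
```

This also shows the inheritance view: every `DegreesArc` is usable wherever an `int` is expected.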
- Generics.
- Array-based Stack Examples.
- Ada: Generic package, Implementation, User.
- C++: Template class, User.
- Java: Generic class, User.
- Ada and C++ allow constant integers as template parameters.
- Using generic types.
- Ada and Java require type parameters to be constrained. For instance,
- In Ada, you have to say it's an array of something if you
want to subscript in the package.
- In Java, you have to say it extends Comparable if you want to
run compareTo on it.
- The compiler checks uses of the type inside the package or class
against the constraints.
- It also checks that types sent on use comply with the constraints.
- C++
- Allows some limited forms of constraint, but mostly it waits
to find what concrete type you send, and complains if it fails.
- Compiles the generic class assuming the parameter types can do
whatever you do to them.
- Complains when you send a type that can't do that.
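A sketch of an array-based generic stack with a constrained type parameter, using Python's `typing` module. The names `Ordered`, `Stack`, and `smallest` are invented here. The constraint is enforced only by an external static checker, not by Python itself; at run time Python behaves more like the C++ description, failing when an unsupported operation is actually attempted.

```python
from typing import Generic, List, Protocol, TypeVar

class Ordered(Protocol):
    def __lt__(self, other) -> bool: ...

T = TypeVar("T", bound=Ordered)   # constraint: T must support <

class Stack(Generic[T]):
    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.items: List[T] = []

    def push(self, item: T) -> None:
        if len(self.items) >= self.capacity:
            raise OverflowError("stack is full")
        self.items.append(item)

    def pop(self) -> T:
        if not self.items:
            raise IndexError("stack is empty")
        return self.items.pop()

    def smallest(self) -> T:
        return min(self.items)    # relies on T being bound to Ordered

numbers: Stack[int] = Stack(3)
for n in (5, 2, 9):
    numbers.push(n)
least = numbers.smallest()
top = numbers.pop()
```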
- Private types.
- See above Ada examples.
- Don't need it when you have classes.
- Type Inference.
- Some languages will assign static types based on use.
- auto
- A recent and simple form is the auto keyword in C++11
and later.
- auto x = 20;
- auto i = some_map.find("nimrod");
- auto f(int a, int b) { return 3*a + 2*b; }   (function return-type deduction requires C++14)
- The type is just the type of the expression.
- Most useful when the type is complicated or difficult to find.
- Technique is older and was developed in static functional
languages.
- Functions in ML usually don't declare a return type (an annotation is
optional); it is inferred from the body expression.