Static Linkers

Upshot: modern languages should not bother trying to interface with the operating system's static linker (the dynamic linker is a different story).

Most operating systems define the format for dynamically linked libraries (.dll, .so, .dylib, etc.). This makes sense: one frequently needs to dynamically link code written in one language into a program written in another, so the operating system (rather than the language or compiler) is the natural choice for setting the rules.

I'm not convinced that the same argument works for static linkage (.a, .o). Notably, Java completely eschews support for statically linked libraries, and I think it is much better off as a result.

Static linkage exists mainly because of separate compilation, which in turn exists for two reasons:

  1. Programmers don't want to recompile their whole program when they change only part of it.

  2. Many compiler algorithms are superlinear in the size of the unit being compiled.
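The second reason can be made concrete with a toy cost model. The sketch below assumes, purely for illustration, that compiling a unit of n lines costs n^2 abstract steps; the class name, the method names, and the quadratic exponent are all assumptions of this sketch, not measurements of any real compiler.

```java
public class SeparateCompilationCost {
    // Hypothetical cost model: compiling a unit of `lines` lines
    // takes lines^2 abstract steps (an assumed superlinear cost).
    static long cost(long lines) {
        return lines * lines;
    }

    // Total cost when the program is split into `units` equally sized
    // pieces that are compiled separately. Assumes `units` divides
    // `totalLines` evenly.
    static long splitCost(long totalLines, long units) {
        long perUnit = totalLines / units;
        return units * cost(perUnit);
    }

    public static void main(String[] args) {
        long total = 100_000;
        System.out.println("1 unit:    " + splitCost(total, 1));
        System.out.println("100 units: " + splitCost(total, 100));
    }
}
```

Under this model, splitting the program into k equal units divides the total compile cost by k, before counting whatever the linker has to do afterwards.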

The results of the separate compilations are combined using a static linker. Most languages use the operating system's static linker, and therefore must use the operating system's output-of-separate-compilation format. In my opinion this has caused massive headaches for sophisticated languages like, say, Haskell. Haskell has so much information it needs to store in the output-of-separate-compilation result (types, etc.) that it produces an entire extra file (the .hi file) alongside the .o file. This litters the filesystem with junk and leads to a mess of problems when these files get out of sync.

Java dodged this completely by simply defining its own output-of-separate-compilation format (.class files) and declaring that any linking of Java programs to non-Java programs must be done via dynamic linkage. This was a great decision and freed Java from the numerous and obnoxious limitations of .o-file formats designed in the 1970s for languages like C.
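One small way to see that .class is a format of Java's own, rather than an operating-system object format, is to read the header of a class file. The sketch below looks up the compiled form of a class as a resource and returns its first four bytes; the class and method names are made up for this example, and it assumes the class file is reachable through the class's resource lookup (which is the case for java.lang.Object on a standard JDK).

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ClassFileHeader {
    // Returns the first four bytes of the .class file that defines `type`,
    // read as a big-endian int.
    static int readMagic(Class<?> type) {
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            return new DataInputStream(in).readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.printf("magic = 0x%08X%n", readMagic(Object.class));
    }
}
```

The JVM specification fixes this magic number as 0xCAFEBABE on every platform, whereas the header of a .o file depends on the operating system's object format.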

Side note: I suppose that technically Java does its “static linking” at runtime because it is a JITted platform, but this is a largely orthogonal point. GCJ, for example, can “statically link” a bunch of .class files into an executable.

An Alternative

There is a “third way” that I haven't seen implemented yet: have the language define its own output-of-separate-compilation format, and provide tools that translate both to and from the operating system's output-of-separate-compilation format. The tool which translates back will almost certainly need some sort of handwritten “interface file” to explain how assembler-level symbols should be exposed in the higher level language, so this pair of tools won't “round-trip” without additional effort. But it would satisfy those few people who need static-linkage speed (or simplicity) in multi-language projects.
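To make the “interface file” idea a little more concrete, here is a minimal sketch of what the importing tool might read. Everything in it is hypothetical: the `symbol :: type` line format, the `--` comment syntax, and the class and method names are invented for illustration and do not correspond to any existing tool. The point is only that a .o file supplies bare symbol names, and the handwritten file supplies the types that the higher-level language needs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class InterfaceFileSketch {
    // Parses lines of the hypothetical form "symbol :: type".
    // Blank lines and lines starting with "--" are skipped.
    static Map<String, String> parse(String text) {
        Map<String, String> bindings = new LinkedHashMap<>();
        for (String rawLine : text.split("\n")) {
            String line = rawLine.trim();
            if (line.isEmpty() || line.startsWith("--")) {
                continue;
            }
            int sep = line.indexOf("::");
            if (sep < 0) {
                throw new IllegalArgumentException("missing '::' in: " + line);
            }
            String symbol = line.substring(0, sep).trim();
            String type = line.substring(sep + 2).trim();
            bindings.put(symbol, type);
        }
        return bindings;
    }

    public static void main(String[] args) {
        // Hypothetical interface file describing two C library symbols
        // with Haskell-style types.
        String iface = "-- symbols exported by a C object file\n"
                     + "strlen :: CString -> IO CSize\n"
                     + "abs    :: CInt -> CInt\n";
        parse(iface).forEach((sym, type) ->
            System.out.println(sym + " has type " + type));
    }
}
```

A real tool would also have to check that each listed symbol actually exists in the object file and that the declared type matches the calling convention, which is exactly the information the object file cannot supply by itself.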

It would probably make sense for this tool to work on both statically and dynamically linked libraries. For Haskell, for example, it would export a compiled Haskell library as either a .o or a .so and (with the help of an interface file) import a non-Haskell library from either a .o or a .so.