[FoRK] Programming lang etc. (details for Stephen, comment for JAR)
jbone at place.org
Fri Nov 13 12:44:11 PST 2009
Re: JAR: exactly. Understood, agreed.
> Could it be our best course would be to build the things that build
> the code, or even build the things that will then assemble
> themselves to build the things to build the code?
And that's a big part of the motivation for better data languages,
too. W/o them, it's hard to get the recursion going and sustained ---
any given technology that makes some tradeoffs (usually false, cf.
below; most recent example being e.g. XML) gets so baked into things
that humans need this giant exoskeleton of tools and crap to deal with
the mess --- which is so gorpy anyway that even the machines have
various problems w/ it as well. Everybody loses.
Basically there are three general use cases for such things: human-to-
human (either different humans or same-human, over either space or
time), human-machine (config files, output files for human
consumption, etc.) and machine-to-machine (most markup scenarios,
realistically speaking; OTW protocols and serialization formats,
etc.) I contend that a big part of the problem is the baked-in
assumption that you have to optimize on one or at most two of these.
OGDL, YAML, various wiki markups, UNIX cookie jars and record jars,
and other examples abound to the contrary. And the biggest problem
faced in any of these scenarios today, IMHO, is the lack of type-
safety in representation coupled with tenability in the reading and
writing dimensions. Common wisdom would have it that you can't have
your lunch and eat it too, particularly w/ tradeoffs in parser
complexity (as in, inherent computational complexity) --- but I think
we've got far better potential state-of-the-art at present than we're
seeing used anywhere...
Regarding your "Multiarity(tm)" etc... loved it! Thanks, Tom. :-)
You're spot on.
Dr. Ernie writes:
> where everything is a string
Just to be clear, that's the *opposite* of what I'm talking about.
I'd prefer an environment where *nothing* is a string (except *actual*
strings.) Everything's a well-typed value and *very* few if any
interesting data types have to be "tunneled" inside strings. But
those well-typed values can be explicitly constructed and
unambiguously inferred from the lexical syntax involved.
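To make that concrete (a sketch of my own, not from the original thread --- the reader function and Symbol type are hypothetical names): a reader can infer a well-typed value from lexical syntax alone, so that only *actual* strings end up as strings and nothing else gets tunneled through them.

```python
# Sketch: lexical type inference -- each token's surface syntax
# unambiguously determines a typed value; quoted text is the only
# thing that becomes a string.
import re
from datetime import date

class Symbol(str):
    """A first-class symbol type, distinct from ordinary strings."""
    def __repr__(self):
        return f'Symbol({str.__str__(self)})'

def read_value(token):
    """Map a lexical token to a typed value."""
    if token.startswith('"') and token.endswith('"'):
        return token[1:-1]                    # an *actual* string
    if re.fullmatch(r'-?\d+', token):
        return int(token)
    if re.fullmatch(r'-?\d+\.\d+', token):
        return float(token)
    if token in ('true', 'false'):
        return token == 'true'
    if re.fullmatch(r'\d{4}-\d{2}-\d{2}', token):
        y, m, d = map(int, token.split('-'))
        return date(y, m, d)                  # dates need no string tunnel
    return Symbol(token)                      # bare word: a symbol, not a string
```

The point is that the type is recoverable from the syntax with no schema and no quoting gymnastics.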
> which implies using sigils for variables
In general this isn't really a problem just with shell languages; it's
a problem with any language that admits symbols-as-value-types. In
such languages you appear to have a strict choice (with a few
exceptions, to be discussed below) --- either symbols are unquoted and
unevaluated by default, and must be explicitly dereferenced somehow to
get the value (if any) they might be bound to in some context, OR you
have to quote them in order to use them as values in themselves. (Or,
you can punt and just have strings, which is what all but a very few
languages do.) For the most part you can't have it both ways.
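Here's the Lisp side of that choice in miniature (my sketch, hypothetical names): a bare symbol is implicitly dereferenced to its binding, and you must quote it to get the symbol itself as a value.

```python
# Sketch of the "unquoted symbols evaluate by default" convention:
# deref is implicit, quoting is explicit (the Lisp/Scheme choice).
# The shell makes the opposite choice: bare words are values, and
# deref requires an explicit $.
class Symbol(str): pass

def evaluate(expr, env):
    if isinstance(expr, Symbol):
        return env[expr]              # unquoted: implicit dereference
    if isinstance(expr, tuple) and expr[0] == 'quote':
        return expr[1]                # quoted: the symbol itself
    return expr                       # self-evaluating literal

env = {Symbol('x'): 42}
evaluate(Symbol('x'), env)            # -> 42 (the binding)
evaluate(('quote', Symbol('x')), env) # -> Symbol 'x' (the value itself)
```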
You can get away from that in some limited context by having some
special evaluation rules. Schemes and all UNIX shells have a useful
convention: the first symbol or subexpression in an expression is
taken to be a variable referencing (function returning, etc.) a
function, and is implicitly dereferenced and applied to the arguments
(which in Scheme are just expressions that are eagerly evaluated,
while in the shell they are only lightly parsed and flat and
dereferencing is explicit.) But such evaluation semantics along with
the syntactic ability to distinguish expressions / commands / whatever
the bigger-than-word language unit is, gives you a tool to use as a
language designer. (N.b. Scheme achieves something
interesting by allowing either a symbol or a functor in first
position; this leads, with a little thought, to a really interesting
gestalt: the semantics of programming language-style variables (named
slots, as opposed to e.g. mathematical or logical variables, etc.) in
general can be understood in terms of functors.)
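The first-position convention can be sketched in a few lines (again mine, not from the post --- Scheme-flavored, with eager argument evaluation):

```python
# Sketch of the special evaluation rule for first position: the head
# of a compound expression is implicitly dereferenced to a function
# and applied; arguments are expressions, eagerly evaluated.
class Symbol(str): pass

def evaluate(expr, env):
    if isinstance(expr, Symbol):
        return env[expr]                       # symbol: implicit deref
    if isinstance(expr, list):
        fn = evaluate(expr[0], env)            # head: deref'd and applied
        args = [evaluate(a, env) for a in expr[1:]]
        return fn(*args)
    return expr                                # literal: self-evaluating

env = {Symbol('add'): lambda a, b: a + b, Symbol('n'): 41}
evaluate([Symbol('add'), Symbol('n'), 1], env)   # -> 42
```

Note that the head can itself be a compound expression (a functor) rather than a symbol --- which is exactly the Scheme generalization mentioned above.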
The generalization of symbols to hierarchical constructs that are
still simultaneously both names and first-class objects --- let's call
them "path expressions" --- is a pretty interesting thing. Consider
the following in a typical object language of some kind:

    a.b.c
This should generally be understood as "look up the value of the name
c in the namespace obtained by looking up the value of the name b in
the namespace obtained by looking up the value of the name a in the
(global, local, depending on context) namespace." Dereference,
typically, is implicit. Consider the similarity to the familiar

    /a/b/c
What's the difference? Well, for one thing, in shell-like languages
we can construct the latter, pass it around, etc. w/o assuming that
it's going to be dereferenced at any given point and /
or yield anything particular. To be fair it's because the shell only
treats it as an opaque string (modulo things like dirname and its path-
munging shell shortcut friends) but there's no reason why we can't
think about such things as objects in their own right.
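Here's one way to sketch such a first-class path expression (my construction, hypothetical names): composing names builds structure without dereferencing anything, and deref is a separate, explicit step --- so the path can be passed around like an opaque shell path, yet still be a real object.

```python
# Sketch: a "path expression" as a first-class object.  Attribute
# access only composes names; nothing is looked up until deref is
# explicitly called against some namespace.
class Path:
    def __init__(self, *names):
        self.names = names

    def __getattr__(self, name):
        return Path(*self.names, name)     # a.b.c just builds the path

    def deref(self, namespace):
        obj = namespace
        for name in self.names:
            obj = obj[name]                # look up each segment in turn
        return obj

    def __repr__(self):
        return '.'.join(self.names)

a = Path('a')
p = a.b.c                                  # first-class: no lookup yet
ns = {'a': {'b': {'c': 42}}}
p.deref(ns)                                # -> 42, deref is explicit
```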
This leads to a really interesting set of potential evaluation rules
that minimize (but don't entirely eliminate) the kind of dollar-itis
that you find in most shell languages. And FWIW, the first characters
of each of these:

    $foo    ~foo    %foo
Can all be understood as special dereference operators that name a
unique context in which the symbol foo is to be dereferenced.
So to be clear: I'm *not* a fan of the sigils and crap syntactic line
noise that you find all over the place in e.g. most shells and in Perl
etc. That's actually *exactly* what I'd like to minimize! But in an
interactive context, and with first-class symbols and other value
types, it's unlikely that you can eliminate (at least) the use of e.g.
"$" as a prefix dereference operator when you want to get the value
that's "bound to" or implied by certain value-holding types that are
values in themselves.
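A toy dispatcher makes the "sigil names a context" reading concrete (my sketch; the contexts and bindings are invented for illustration):

```python
# Sketch: each sigil is a dereference operator naming the unique
# context in which the symbol that follows it gets looked up --
# roughly what $ (variables), ~ (users), and % (jobs) do in shells.
contexts = {
    '$': {'foo': '/usr/local'},        # variable namespace
    '~': {'foo': '/home/foo'},         # user -> home-directory namespace
    '%': {'foo': 'job 3: sleeping'},   # job-control namespace
}

def deref(token):
    sigil, name = token[0], token[1:]
    if sigil in contexts:
        return contexts[sigil][name]   # sigil selects the context
    return token                       # no sigil: the symbol is the value

deref('$foo')   # -> '/usr/local'
deref('~foo')   # -> '/home/foo'
deref('foo')    # -> 'foo' (bare symbol, a value in itself)
```

The attraction is that only the deref step needs line noise; bare symbols stay clean.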