THE TAO OF OPTION PARSING
=========================

Optik is an implementation of what I have always considered the most
obvious, straightforward, and user-friendly way to design a user
interface for command-line programs.  In short, I have fairly firm ideas
of the Right Way (and the many Wrong Ways) to do argument parsing, and
Optik reflects many of those ideas.  This document is meant to explain
this philosophy, which in turn is heavily influenced by the Unix and GNU
toolkits.


Terminology
-----------

First, we need to establish some terminology.

  argument
    a chunk of text that a user enters on the command line, and that the
    shell passes to execl() or execv().  In Python, arguments are
    elements of sys.argv[1:].  (sys.argv[0] is the name of the program
    being executed; in the context of parsing arguments, it's not very
    important.)  Unix shells also use the term "word".

    It's occasionally desirable to substitute an argument list other
    than sys.argv[1:], so you should read "argument" as "an element of
    sys.argv[1:], or some other list provided as a substitute for
    sys.argv[1:]".

  option
    an argument used to supply extra information to guide or customize
    the execution of a program.  There are many different syntaxes for
    options; the traditional Unix syntax is "-" followed by a single
    letter, e.g. "-x" or "-F".  The GNU project introduced "--" followed
    by a series of hyphen-separated words, e.g. "--file" or "--dry-run".
    These are the only two option syntaxes provided by Optik.

    Some other option syntaxes that the world has seen include:
      * a hyphen followed by a few letters, e.g. "-pf" (read as one
        option, not as "-p" and "-f" merged into a single argument)
      * a hyphen followed by a whole word, e.g. "-file"
      * a plus sign followed by a single letter, or a few letters,
        or a word, e.g. "+f", "+rgb"
      * a slash followed by a letter, or a few letters, or a word, e.g.
        "/f", "/file"
    These are not supported by Optik, and they never will be.

  option argument
    an argument that follows an option, is closely associated with that
    option, and is consumed from the argument list when the option is.
    Often, option arguments may also be included in the same argument as
    the option, e.g.
      ["-f", "foo"]
    may be equivalent to
      ["-ffoo"]
    (Optik supports this syntax.)

    Some options never take an argument.  Some options always take an
    argument.  Many people want an "optional option argument" feature,
    meaning that an option takes an argument if one follows it and does
    without otherwise.  This is somewhat controversial, because it
    makes parsing ambiguous: if "-a" takes an optional argument and "-b"
    is another option entirely, is "-ab" the option "-a" with argument
    "b", or "-a" followed by "-b"?  Optik does not currently support
    this.

  positional argument
    whatever is left over in the argument list once options and their
    option arguments have been parsed and removed from it.

  required option
    an option that must be supplied on the command line.  The phrase
    "required option" is an oxymoron, and I personally consider such
    options poor UI design.  Optik doesn't prevent you from implementing
    required options, but doesn't give you much help with it either.

For example, consider this hypothetical command line:

  prog -v --report /tmp/report.txt foo bar

"-v" and "--report" are both options.  Assuming the --report option
requires an argument, "/tmp/report.txt" is an option argument.  "foo"
and "bar" are positional arguments.
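
As a rough illustration of these terms, the command line above can be
fed to optparse, the standard-library module that descends from Optik
and is used here as a stand-in for it.  The parser setup, the
destination names "verbose", "report" and "filename", and the "-f"
option are assumptions made for this sketch; nothing here is taken
from a real program called "prog".

```python
from optparse import OptionParser

# Hypothetical parser for the "prog" command line shown above.
parser = OptionParser(prog="prog")
parser.add_option("-v", action="store_true", dest="verbose", default=False)
parser.add_option("--report", dest="report")  # requires one option argument

# An explicit list stands in for sys.argv[1:].
argv = ["-v", "--report", "/tmp/report.txt", "foo", "bar"]
options, args = parser.parse_args(argv)

print(options.verbose)  # True
print(options.report)   # /tmp/report.txt
print(args)             # ['foo', 'bar']

# An option argument may also be attached to its option.
parser.add_option("-f", dest="filename")
separate, _ = parser.parse_args(["-f", "foo"])
attached, _ = parser.parse_args(["-ffoo"])
print(separate.filename == attached.filename)  # True
```

The options and the option argument end up as attributes of the
returned values object, while the positional arguments come back as a
plain list.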


What are options for?
---------------------

Options are used to provide extra information to tune or customize the
execution of a program.  In case it wasn't clear, options are usually
*optional*.  A program should be able to run just fine with no options
whatsoever.  (Pick a random program from the Unix or GNU toolsets.  Can
it run without any options at all and still make sense?  The only
exceptions I can think of are find, tar, and dd -- all of which are
mutant oddballs that have been rightly criticized for their non-standard
syntax and confusing interfaces.)

Lots of people want their programs to have "required options".  Think
about it.  If it's required, then it's *not optional*!  If there is a
piece of information that your program absolutely requires in order to
run successfully, that's what positional arguments are for.  (However,
if you insist on adding "required options" to your programs, look in the
examples/ directory of the source distribution for two ways of
implementing them with Optik.)
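
One possible approach, sketched here with the standard-library
optparse module (Optik's descendant) and not copied from the examples/
directory, is to parse normally and then complain if the value is
still missing.  The "--file" option and the helper
parse_with_required() are hypothetical names chosen for illustration.

```python
from optparse import OptionParser

def parse_with_required(argv):
    """Parse argv and fail with a usage error if --file is missing."""
    parser = OptionParser(prog="prog", usage="%prog --file FILE [args]")
    parser.add_option("-f", "--file", dest="filename", metavar="FILE",
                      help="file to process (required in this sketch)")
    options, args = parser.parse_args(argv)
    if options.filename is None:
        # error() prints the usage message to stderr and exits with status 2.
        parser.error("option --file is required")
    return options, args

options, args = parse_with_required(["--file", "data.txt", "extra"])
print(options.filename)  # data.txt
print(args)              # ['extra']
```

The check lives entirely in the calling code; the parser itself still
treats "--file" as an ordinary option.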

Consider the humble "cp" utility, for copying files.  It doesn't make
much sense to try to copy files without supplying a destination and at
least one source.  Hence, "cp" fails if you run it with no arguments.
However, it has a flexible, useful syntax that does not rely on options
at all:

    cp SOURCE DEST
    cp SOURCE ... DEST-DIR

You can get pretty far with just that.  Most "cp" implementations
provide a bunch of options to tweak exactly how the files are copied:
you can preserve mode and modification time, avoid following symlinks,
ask before clobbering existing files, etc.  But none of this distracts
from the core mission of "cp", which is to copy one file to another, or
N files to another directory.
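
A cp-like interface of this shape might be sketched as follows, again
using the standard-library optparse module as a stand-in for Optik.
The function parse_cp_args() and the two options shown are
assumptions for illustration only; this is not how any real "cp" is
implemented.

```python
from optparse import OptionParser

def parse_cp_args(argv):
    """Parse a cp-like command line: SOURCE DEST or SOURCE ... DEST-DIR."""
    parser = OptionParser(
        prog="cp",
        usage="%prog [options] SOURCE DEST\n"
              "       %prog [options] SOURCE ... DEST-DIR")
    # Options only tweak *how* files are copied; none of them is needed.
    parser.add_option("-p", "--preserve", action="store_true", default=False,
                      help="preserve mode and modification time")
    parser.add_option("-i", "--interactive", action="store_true",
                      default=False,
                      help="ask before clobbering existing files")
    options, args = parser.parse_args(argv)
    # The required information arrives as positional arguments.
    if len(args) < 2:
        parser.error("need at least one SOURCE and a destination")
    sources, dest = args[:-1], args[-1]
    return options, sources, dest

options, sources, dest = parse_cp_args(["a.txt", "b.txt", "backup/"])
print(sources, dest)  # ['a.txt', 'b.txt'] backup/
```

The only hard requirement is checked against the positional arguments;
every option has a default and can be left out.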


What are positional arguments for?
----------------------------------

In case it wasn't clear from the above example: positional arguments are
for those pieces of information that your program absolutely, positively
requires to run.

A good user interface should have as few absolute requirements as
possible.  If your program requires 17 distinct pieces of information in
order to run successfully, it doesn't much matter *how* you get that
information from the user -- most people will give up and walk away
before they successfully run the program.  This applies whether the user
interface is a command line, a configuration file, a GUI, or whatever:
if you make that many demands on your users, most of them will just give
up.

In short, try to minimize the amount of information that users are
absolutely required to supply -- use sensible defaults whenever
possible.  Of course, you also want to make your programs reasonably
flexible.  That's what options are for.  Again, it doesn't matter if
they are entries in a config file, checkboxes in the "Preferences"
dialog of a GUI, or command-line options -- the more options you
implement, the more flexible your program is, and the more complicated
its implementation becomes.  It's quite easy to overwhelm users with too
much flexibility, so be careful there.
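
As a small sketch of "sensible defaults", assuming the standard-library
optparse module in place of Optik and three made-up options, a parser
can give every option a default so that the program is fully
configured even when the user supplies none of them:

```python
from optparse import OptionParser

# Hypothetical options; each one carries a default value.
parser = OptionParser(prog="prog")
parser.add_option("-o", "--output", dest="output", default="out.txt")
parser.add_option("-n", "--count", dest="count", type="int", default=10)
parser.add_option("-q", "--quiet", action="store_true", dest="quiet",
                  default=False)

# No options supplied: every setting falls back to its default.
defaults, _ = parser.parse_args([])
print(defaults.output, defaults.count, defaults.quiet)  # out.txt 10 False

# One option supplied: only that setting changes.
tuned, _ = parser.parse_args(["--count", "3"])
print(tuned.output, tuned.count, tuned.quiet)  # out.txt 3 False
```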
