Retrievers
----------

Retrievers are modules that can download or otherwise retrieve files[1], 
including other modules and Packages files. It is expected that at least the
following types will eventually be implemented:

- cd retriever
	Ensures an install cd is mounted, and finds files on it. No real
	retrieving done.
- media retriever
	Handles all sorts of external media, including floppies.
	May or may not need to deal with files split across multiple
	floppies.
- hard disk retriever
	Gets a file from a local hard disk (dos, linux, etc).
- http retriever
	Downloads a file from a remote http mirror.
- ftp retriever (needed?)
	Downloads a file from a remote ftp mirror.
- nfs retriever
	Gets a file via nfs.

A retriever must be able to do the following:

- Provides: retriever
- Set itself up so it can access the cdrom, mirror, or whatever.
- Get a list of the modules that are available. This will be done by
  retrieving a standard Packages file.
- Retrieve a file.

A separate module will take care of parsing the Packages files, and
checking md5sums of downloaded modules, and a separate program
(udpkg) will handle actually installing modules.

Since retrievers are debian packages, they can have a postinst that is
run to set them up. During setup, they may need to prompt the user for
information (using debconf). For example, a nfs retriever will probably
need to prompt the user for a nfs server name.

Retrievers may also need to depend on additional modules. For example the
http/ftp/nfs retrievers will need to depend on a module to set up networking,
while the cdrom retriever may need to depend on module(s) that include kernel
modules for the iso9660 filesystem and cd drivers.

Retrievers may also need to interact with the user when they are retrieving
a file. For example, the media retriever will need to prompt for disks.
This interaction will also be accomplished via debconf.

Retriever interface
-------------------

All retrievers should support the following interface.

[ This interface is preliminary and depends on the filesystem layout and
ramdisk allocation scheme used by the installer, and so the locations may
change. ]

/usr/lib/debian-installer/retriever/<retriever name> is an executable (the
name of the executable must be the same as the name of the module that
contains it). It is run with the following parameters:

retriever [command] [args]

Currently, there are four commands:

retriever config
        Perform retriever-specific configuration. This may include creating
        a /var/cache/anna/include file that contains a list of udebs to be
        installed automatically.

retriever retrieve [source-filename] [dest-filename]
	Download the specified file to the specified destination.
	The source-filename is the Filename field of a Packages file.

retriever packages [dest-filename]
	Download a Packages file, or a Packages-like file. The fields used by
	anna are:
	- Package
	- Version
	- Filename
	- Priority
	Whether other fields are needed is currently undecided. A retriever
        now must support this command, because it is supposed to download and
        parse a Release file if appropriate. This means that the result of
        this command can (and often will) be a concatenation of several
        Packages files from the install medium.

	In the step the retriever can also retreive udeb_includes and
	udeb_excludes files.

retriever error failing_command
	This command can be called if one of the other commands seems to
	have failed, perhaps by returning bad data.

        The failing_command field tells the retriever command that
	previously failed. For example, it might be "packages" is package
	retrieval failed, or the Packages file cannot be parsed.

	The retriever can take any necessary actions to recover (or ask the
	user if they want to retry, if the medium is unreliable or absent).
	If the recovery succeeds, it should exit 0. If recovery failed, it
	should display an informative error message if necessary, and exit 2.

	If a retreiver does not implement this command, it can exit 1,
	but all retreivers should support it now.

retriever cleanup
	Clean up after a retriever round. This is used by e.g. the media
	retriever to unmount the floppy.

If the destination file already exists, the retriever may optionally attempt
to continue a download.

Once run, retrievers are free to prompt the user, or do whatever else is
necessary to retrieve the file. They should normally display some indication
of progress, using the PROGRESS extension to debconf.

Note that this simple interface requires that retrievers know the location
of the base directory of the archive. This will not be a problem for many 
-- the cdrom retriever simply uses whatever is available, in a hard-coded
location. Other retrievers may require more configuration. For example, the
http retriever will need to know what version of Trisquel is being installed,
what site to install from, and probably what version of the debian-installer
is being used as well. The user will likely have to be prompted for some of
this data during retriever setup.

Retriever controller (anna)
---------------------------

This is another module, that makes use of the retrievers. It doesn't
actually depend on any though. What it does is, when it is picked from the
menu, it builds a list of all available retrievers, and has the user pick
one from the list. That retriever is then configured, and is used to 
retrieve whatever seems appropriate. Thus, the menu item can be picked
multiple times, using a different retriever every time. And if a retriever
downloads a second retriever, it will show up in future displays of the
retrievers menu.

The default retriever (if more than one is available, and it's the first
time the question is asked) will be, in turn and if they exist:
 1. net-retriever (if you've got connectivity you probably want to use it)
 2. cdrom-retriever (using a CD is also very common)
 3. media-retriever (we usually don't have to use the floppy)
 (4. file-retriever? not currently used, but if resurrected, should
                     probably have lower priority than the floppy
                     retriever)

As that is sometimes not flexible enough, it's also possible to create
udebs that provide menu items that only use a specific retriever. They do
do by calling anna, passing it the name of the retriever to use.

Which modules are needed is an interesting question. Our current
implementation takes all modules, except those of extra priority.
Dependencies are taken into account, and there is a special hack to avoid
loading mismatched kernel packages too. If the retriever provides them in
/var/cache/anna/, files udeb_include and udeb_exclude can be used to fine
tune what packages are installed by default. All packages listed in the
udebs_include will be installed, but none from udebs_exclude.

The controller also needs to verify md5sums of downloaded files.

Footnotes
---------

[1] Conceptually, rather similar to apt's methods, although the interface
    is a lot more simplistic and we don't need to support things like
    pipelining multiple downloads.
