ncftpspooler(1) User Commands ncftpspooler(1)

NAME

ncftpspooler - Global batch FTP job processor daemon

SYNOPSIS

ncftpspooler -d [options]

ncftpspooler -l [options]

OPTIONS

Command line flags:
-d Begin background processing of FTP jobs in the designated FTP
job queue directory.

-q XX Use this option to specify a directory to use as the FTP job
queue instead of the default directory, /var/spool/ncftp.

-o XX Use this option to specify a filename to use as the log file.
By default (and rather inappropriately), the program simply
uses a file called log in the job queue directory. If you
don't want a log, use this option to specify /dev/null.

-l Lists the contents of the job queue directory.

-s XX When the job queue is empty, the program sleeps 120 seconds
and then checks again to see if a new job has been submitted.
Use this option to change the number of seconds used for this
delay.

DESCRIPTION

The ncftpspooler program evolved from the ncftpbatch program. The
ncftpbatch program was originally designed as a ``personal FTP
spooler'' which would process a single background job for a particular
user and exit when it finished; the ncftpspooler program is a
``global FTP spooler'' which stays running and processes background
jobs as they are submitted.

The job queue directory is monitored for specially-named and
formatted text files. Each file serves as a single FTP job. The
name of the job file contains the type of FTP job (get or put), a
timestamp indicating the earliest time the job should be processed, and
optionally some additional information to make it easier to create
unique job files (e.g. a sequence number). The contents of the job
files have information such as the remote server machine to FTP to,
username, password, remote pathname, etc.

Your job queue directory must be readable and writable by the user
that you plan to run ncftpspooler as, so that jobs can be removed or
renamed within the queue.

More importantly, the user that is running the program will need
adequate privileges to access the local files that are involved in
the transfers. For example, if your spooler is going to be processing
jobs which upload files to remote servers, then the user will need
read permission on the local files that will be uploaded (and directory
access permission on the parent directories). Likewise, if your spooler
is going to be processing jobs which download files, then the user
will need to be able to write to the local directories.

Once you have created your spool directory with appropriate
permissions and ownerships, you can run ncftpspooler -d to launch the
spooler daemon. You can run additional spoolers if you want to
process more than one FTP job from the same job queue directory
simultaneously. You can then monitor the log file (e.g., using
tail -f) to track the progress of the spooler. Most of the time it
won't be doing anything, unless job files have appeared in the job
queue directory.

JOB FILE NAMES

When the ncftpspooler program monitors the job queue directory, it
ignores any files that do not follow the naming convention for job
files. The job files must be prefixed in the format of
X-YYYYMMDD-hhmmss where X denotes a job type, YYYY is the four-digit
year, MM is the two-digit month number, DD is the two-digit day of
the month, hh is the two-digit hour of the day (00-23), mm is the
two-digit minute, and ss is the two-digit second. The date and time
represent the earliest time you want the job to be run.

The job type can be g for a get (download from remote host), or p for
a put (upload to remote host).
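
The naming convention above can be checked mechanically. The following is a minimal Python sketch, not code taken from ncftpspooler; the regular expression and the function names parse_job_name and is_due are assumptions made for illustration, and the spooler's own matching rules may differ in detail.

```python
import re
from datetime import datetime

# Assumed pattern for the documented prefix X-YYYYMMDD-hhmmss, optionally
# followed by a dash and arbitrary extra data. This is an illustration of
# the convention, not the expression ncftpspooler itself uses.
JOB_NAME_RE = re.compile(r"([gp])-(\d{8})-(\d{6})(?:-(.*))?")


def parse_job_name(name):
    """Return (op, earliest_run_time, extra), or None if name is not a job file name."""
    match = JOB_NAME_RE.fullmatch(name)
    if match is None:
        return None
    kind, date_part, time_part, extra = match.groups()
    try:
        when = datetime.strptime(date_part + time_part, "%Y%m%d%H%M%S")
    except ValueError:
        return None  # the digits do not form a real date or time
    op = "get" if kind == "g" else "put"
    return op, when, extra


def is_due(name, now):
    """True if name is a job file whose earliest run time has been reached."""
    parsed = parse_job_name(name)
    return parsed is not None and parsed[1] <= now
```

A file such as log, which does not follow the convention, yields None and would be skipped by a scanner built on this sketch.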

As an example, if you wanted to schedule an upload to occur at 11:45
PM on December 7, 2001, a job file could be named

p-20011207-234500

In practice, the job files include additional information such as a
sequence number or process ID. This makes it easier to create unique
job file names. Here is the same example, with a process ID and a
sequence number:

p-20011207-234500-1234-2

When submitting job files to the queue directory, be sure to use a
dash character after the hhmmss field if you choose to append any
additional data to the job file name.
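
A program that submits jobs can build such names itself. The sketch below is one possible approach in Python; make_job_name and the per-process counter are hypothetical helpers, and appending a process ID and sequence number simply mirrors the example shown above.

```python
import os
from datetime import datetime
from itertools import count

# Per-process counter used as the sequence number (a hypothetical
# uniqueness scheme; any suffix after the dash would do).
_sequence = count(1)


def make_job_name(op, when, pid=None, seq=None):
    """Build a job file name of the form X-YYYYMMDD-hhmmss-PID-SEQ."""
    if op not in ("get", "put"):
        raise ValueError("op must be 'get' or 'put'")
    pid = os.getpid() if pid is None else pid
    seq = next(_sequence) if seq is None else seq
    # The job type letter is the first letter of the operation (g or p).
    return "{}-{}-{}-{}".format(op[0], when.strftime("%Y%m%d-%H%M%S"), pid, seq)
```

For instance, make_job_name("put", datetime(2001, 12, 7, 23, 45, 0), pid=1234, seq=2) returns p-20011207-234500-1234-2.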

JOB FILE CONTENTS

Job files are ordinary text files, so that they can be created by
hand. Each line of the file is a key-value pair in the format
variable=value, or is a comment line beginning with an octothorpe
character (#), or is a blank line. Here is an example job file:

# This is a NcFTP spool file entry.
job-name=g-20011016-100656-008299-1
op=get
hostname=ftp.freebsd.org
xtype=I
passive=1
remote-dir=pub/FreeBSD
local-dir=/tmp
remote-file=README.TXT
local-file=readme.txt
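
The format is simple enough to read with a few lines of code. The following Python sketch shows one way to load a job file into a dictionary; parse_job_text is a hypothetical helper, and ignoring lines that lack an equals sign is an assumption rather than documented spooler behavior.

```python
def parse_job_text(text):
    """Parse job file text into a dict of variable -> value."""
    params = {}
    for raw_line in text.splitlines():
        stripped = raw_line.strip()
        # Skip blank lines and comment lines beginning with "#".
        if not stripped or stripped.startswith("#"):
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = raw_line.partition("=")
        if sep:  # assumption: lines without "=" are ignored
            params[key.strip()] = value
    return params
```

Applied to the example above, the result maps op to get, hostname to ftp.freebsd.org, and so on for the remaining variables.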

Job files are flexible since they follow an easy-to-use format and do
not have many requirements, but there are a few mandatory parameters
that must appear for the spooler to be able to process the job.

op The operation (job type) to perform. Valid values are get
and put.

hostname
The remote host to FTP to. This may be an IP address or a
DNS name (e.g. ftp.example.com).

For a regular get job, these parameters are required:

remote-file
The pathname of the file to download from the remote server.

local-file
The pathname to use on the local server for the downloaded
file.

For a regular put job, these parameters are required:

local-file
The pathname of the file to upload to the remote server.

remote-file
The pathname to use on the remote server for the uploaded
file.

For a recursive get job, these parameters are required:

remote-file
The pathname of the file or directory to download from the
remote server.

local-dir
The directory pathname to use on the local server to contain
the downloaded items.

For a recursive put job, these parameters are required:

local-file
The pathname of the file or directory to upload to the remote
server.

remote-dir
The directory pathname to use on the remote server to contain
the uploaded items.
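
The requirements above can be checked before a job is submitted. This Python sketch is an illustration only; missing_parameters is a hypothetical helper, and it also relies on two details from the optional parameters of this page: recursive jobs are marked with recursive=yes, and host-ip may stand in for hostname.

```python
def missing_parameters(params):
    """Return a list of problems; an empty list means the job looks complete."""
    problems = []
    op = params.get("op")
    if op not in ("get", "put"):
        problems.append("op must be get or put")
    # Either hostname or host-ip must identify the remote server.
    if not (params.get("hostname") or params.get("host-ip")):
        problems.append("hostname (or host-ip) is required")
    recursive = params.get("recursive", "no") == "yes"
    if op == "get":
        needed = ["remote-file", "local-dir" if recursive else "local-file"]
    elif op == "put":
        needed = ["local-file", "remote-dir" if recursive else "remote-file"]
    else:
        needed = []
    problems.extend(name + " is required" for name in needed if not params.get(name))
    return problems
```

A submitting program could refuse to write the job file whenever the returned list is not empty.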

The rest of the parameters are optional. The spooler will attempt to
use reasonable defaults for these parameters if necessary.

user The username to use to login to the remote server. Defaults
to ``anonymous'' for guest access.

pass The password to use in conjunction with the username to login
to the remote server.

acct The account to use in conjunction with the username to login
to the remote server. The need to specify this parameter is
extremely rare.

port The port number to use in conjunction with the remote
hostname to connect to the remote server. Defaults to the
standard FTP port number, 21.

host-ip The IP address to use in conjunction with the remote hostname
to connect to the remote server. This parameter can be used
in place of the hostname parameter, but one or the other must
be used. This parameter is commonly included along with the
hostname parameter as supplemental information.

xtype The transfer type to use. Defaults to binary transfer type
(TYPE I). Valid values are I for binary, A for ASCII text.

passive Whether to use FTP passive data connections (PASV) or FTP
active data connections (PORT). Valid values are 0 for
active, 1 for passive, or 2 to try passive, then fall back to
active. The default is 2.

recursive
This can be used to transfer entire directory trees. By
default, only a single file is transferred. Valid values are
yes or no.

delete This can be used to delete the source file on the source
machine after successfully transferring the file to the
destination machine. By default, source files are not
deleted. Valid values are yes or no.

job-name
This isn't used by the program, but can be used by an entity
which is automatically generating job files. As an example,
when using the -bbb flag with ncftpput, it creates a job file
on stdout with a job-name parameter so you can easily copy
the file to the job queue directory with the suggested job
name as the job file name.

pre-ftp-command

post-ftp-command
These parameters correspond to the -W and -Y options of
ncftpget and ncftpput. It is important to note that these
refer to RFC959 File Transfer Protocol commands and not shell
commands, nor commands used from within /usr/bin/ftp or
ncftp.

pre-shell-command

post-shell-command
These parameters provide hooks so you can run a custom
program when an item is processed by the spooler. Valid
values are pathnames to scripts or executable programs. Note
that the value must not contain any command-line arguments --
if you want to do that, create a shell script and have it run
your program with the command-line arguments it requires.

Generally speaking, post-shell-command is much more useful than
pre-shell-command since if you need to use these options you're more
likely to want to do something after the FTP transfer has completed
rather than before. For example, you might want to run a shell
script which pages an administrator to notify her that her 37-
gigabyte file download has completed.

When your custom program is run, it receives on standard input the
contents of the job file (i.e. several lines of variable=value
pairs), as well as additional data the spooler may provide, such as a
result pair with a textual description of the job's completion
status.

The following example script could be used as a post-shell-command to
update a log file named /var/log/ncftp_spooler.log after each
successful transfer.

#!/usr/bin/perl -w
use strict;

# Collect the variable=value pairs that the spooler writes to stdin.
my %params = ();
while (defined(my $line = <STDIN>)) {
    $params{$1} = $2
        if ($line =~ /^([^=\#\s]+)=(.*)/);
}

# Log only jobs whose result begins with "Succeeded".
if ((defined($params{"result"})) &&
    ($params{"result"} =~ /^Succeeded/))
{
    open(my $log, ">>", "/var/log/ncftp_spooler.log")
        or exit(1);
    print $log "DOWNLOAD" if ($params{"op"} eq "get");
    print $log "UPLOAD" if ($params{"op"} eq "put");
    print $log " ", $params{"local-file"}, "\n";
    close($log);
}
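
The same hook can be written in other languages. Below is a rough Python equivalent of the Perl script, offered as a sketch; log_line_for and append_log are hypothetical names, and the only thing assumed about the spooler's output is what the Perl version already relies on, namely a result value beginning with Succeeded.

```python
import re

# Same pattern as the Perl script: a variable name without "=", "#", or
# whitespace, followed by "=" and the value.
PAIR_RE = re.compile(r"^([^=#\s]+)=(.*)")


def log_line_for(job_lines):
    """Return the log entry for a successful job, or None otherwise."""
    params = {}
    for line in job_lines:
        match = PAIR_RE.match(line.rstrip("\n"))
        if match:
            params[match.group(1)] = match.group(2)
    if not params.get("result", "").startswith("Succeeded"):
        return None
    label = {"get": "DOWNLOAD", "put": "UPLOAD"}.get(params.get("op"), "")
    return "{} {}\n".format(label, params.get("local-file", ""))


def append_log(job_lines, log_path):
    """Append the entry for a successful job to log_path; return the entry."""
    entry = log_line_for(job_lines)
    if entry is not None:
        with open(log_path, "a") as log:
            log.write(entry)
    return entry
```

A script installed as the post-shell-command would end by calling append_log(sys.stdin, "/var/log/ncftp_spooler.log") after importing sys.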

DIAGNOSTICS

The log file should be examined to determine if any ncftpspooler
processes are actively working on jobs. The log contains copious
amounts of useful information, including the entire FTP control
connection conversation between the FTP client and server.

BUGS

The recursive option may not be reliable since ncftpspooler depends
on functionality which may or may not be present in the remote server
software. Additionally, even if the functionality is available,
ncftpspooler may need to use heuristics which cannot be considered
100% accurate. Therefore it is best to create individual jobs for
each file in the directory tree, rather than a single recursive
directory job.
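
The advice above can be automated by walking the local tree and emitting one put job per file. The Python sketch below only builds the parameter sets; put_jobs_for_tree is a hypothetical helper, and it assumes that the remote directories already exist and that the remote server uses forward slashes as path separators.

```python
import os
import posixpath


def put_jobs_for_tree(local_root, remote_root, hostname):
    """Return one dict of put-job parameters per file under local_root."""
    jobs = []
    for dirpath, dirnames, filenames in os.walk(local_root):
        dirnames.sort()  # visit subdirectories in a stable order
        rel_dir = os.path.relpath(dirpath, local_root)
        parts = [] if rel_dir == os.curdir else rel_dir.split(os.sep)
        for filename in sorted(filenames):
            jobs.append({
                "op": "put",
                "hostname": hostname,
                "local-file": os.path.join(dirpath, filename),
                # Assumption: the remote side uses "/" separators.
                "remote-file": posixpath.join(remote_root, *(parts + [filename])),
            })
    return jobs
```

Each returned dictionary can then be written out as its own job file, so a failure affects only the one file concerned.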

For resumption of downloads to work, the remote server must support
the FTP SIZE and MDTM primitives. Most modern FTP server software
can do this, but there are still a number of bare-bones ftpd
implementations which do not. In these cases, ncftpspooler will
re-download the file in its entirety each time until the download
succeeds.

The program needs to be improved to detect jobs that have no chance
of ever completing successfully. There are still a number of cases
where jobs can get spooled but get retried over and over again until
a vigilant sysadmin manually removes the jobs.

The spool files may contain usernames and passwords stored in
cleartext. These files should not be readable by any user except the
user running the program!
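
One way to follow this advice when creating job files from a program is to create them with owner-only permissions from the start. The Python sketch below is an illustration under stated assumptions; write_job_file is a hypothetical helper, and writing to a temporary name first is a design choice of this example. It relies on the documented behavior that the spooler ignores files whose names do not follow the job naming convention.

```python
import os


def write_job_file(queue_dir, job_name, params):
    """Write params as variable=value lines to queue_dir/job_name, mode 0600."""
    body = "".join("{}={}\n".format(key, value) for key, value in params.items())
    # The temporary name does not match X-YYYYMMDD-hhmmss, so a running
    # spooler should not pick up a half-written file.
    tmp_path = os.path.join(queue_dir, ".tmp-" + job_name)
    final_path = os.path.join(queue_dir, job_name)
    # O_EXCL makes creation fail if the file exists; 0o600 restricts
    # access to the owner from the moment the file is created.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(body)
    os.rename(tmp_path, final_path)
    return final_path
```

The program calling this helper must run as the same user as the spooler, or the spooler will not be able to read the resulting file.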

AUTHOR

Mike Gleason, NcFTP Software (http://www.ncftp.com).