HardenedBSD/gnu/usr.bin/cvs/doc/cvsclient.texi

\input texinfo

@setfilename cvsclient.info

@node Top
@top CVS Client/Server

This manual describes the client/server protocol used by CVS.  It does
not describe how to use or administer client/server CVS; see the
regular CVS manual for that.

@menu
* Goals::             Basic design decisions, requirements, scope, etc.
* Notes::             Notes on the current implementation
* How To::            How to remote your favorite CVS command
* Protocol Notes::    Possible enhancements, limitations, etc. of the protocol
* Protocol::          Complete description of the protocol
@end menu

@node Goals
@chapter Goals

@itemize @bullet
@item
Do not assume any access to the repository other than via this protocol.
It does not depend on NFS, rdist, etc.

@item
Providing a reliable transport is outside this protocol.  It is expected
that it runs over TCP, UUCP, etc.

@item
Security and authentication are handled outside this protocol (but see
below about @samp{cvs kserver}).

@item
This might be a first step towards adding transactions to CVS (i.e. a
set of operations is either executed atomically or none of them is
executed), improving the locking, or other features.  The current server
implementation is a long way from being able to do any of these
things.  The protocol, however, is not known to contain any defects
which would preclude them.

@item
The server never has to have any CVS locks in place while it is waiting
for communication with the client.  This makes things robust in the face
of flaky networks.

@item
Data is transferred in large chunks, which is necessary for good
performance.  In fact, currently the client uploads all the data
(without waiting for server responses), and then waits for one server
response (which consists of a massive download of all the data).  There
may be cases in which it is better to have a richer interraction, but
the need for the server to release all locks whenever it waits for the
client makes it complicated.
@end itemize

@node Notes
@chapter Notes on the Current Implementation

The client is built in to the normal @code{cvs} program, triggered by a
@code{CVSROOT} variable containing a colon, for example
@code{cygnus.com:/rel/cvsfiles}. 

The client stores what is stored in checked-out directories (including
@file{CVS}).  The way these are stored is totally compatible with
standard CVS.  The server requires no storage other than the repository,
which also is totally compatible with standard CVS.

The server is started by @code{cvs server}.  There is no particularly
compelling reason for this rather than making it a separate program
which shares a lot of sources with cvs.

The server can also be started by @code{cvs kserver}, in which case it
does an initial Kerberos authentication on stdin.  If the authentication
succeeds, it subsequently runs identically to @code{cvs server}.

The current server implementation can use up huge amounts of memory
when transmitting a lot of data over a slow link (i.e. the network is
slower than the server can generate the data).  Avoiding this is
tricky because of the goal of not having the server block on the
network when it has locks open (this could lock the repository for
hours if things are running smoothly or longer if not).  Several
solutions are possible.  The two-pass design would involve first
noting what versions of everything we need (with locks in place) and
then sending the data, blocking on the network, with no locks needed.
The lather-rinse-repeat design would involve doing things as it does
now until a certain amount of server memory is being used (10M?), then
releasing locks, and trying the whole update again (some of it is
presumably already done).  One problem with this is getting merges to
work right.  The two-pass design appears to be the more elegant of the
two (it actually reduces the amount of time that locks need to be in
place), but people have expressed concerns about whether it would be
slower (because it traverses the repository twice).  It is not clear
whether this is a real problem (looking for whether a file needs to be
updated and actually checking it out are done separately already), but
I don't think anyone has investigated carefully.  One hybrid approach
which avoids the problem with merges would be to start out in one-pass
mode and switch to two-pass mode if data is backing up--but this
complicates the code and should be undertaken only if the pure
two-pass design is shown to be flawed.

@node How To
@chapter How to add more remote commands

It's the usual simple twelve step process.  Let's say you're making
the existing @code{cvs fix} command work remotely.

@itemize @bullet
@item
Add a declaration for the @code{fix} function, which already implements
the @code{cvs fix} command, to @file{server.c}.
@item
Now, the client side.
Add a function @code{client_fix} to @file{client.c}, which calls 
@code{parse_cvsroot} and then calls the usual @code{fix} function.
@item
Add a declaration for @code{client_fix} to @file{client.h}.
@item
Add @code{client_fix} to the "fix" entry in the table of commands in
@file{main.c}.
@item
Now for the server side.
Add the @code{serve_fix} routine to @file{server.c}; make it do:
@example @code
static void
serve_fix (arg)
    char *arg;
@{
    do_cvs_command (fix);
@}
@end example
@item
Add the server command @code{"fix"} to the table of requests in @file{server.c}.
@item
The @code{fix} function can now be entered in three different situations:
local (the old situation), client, and server.  On the server side it probably
will not need any changes to cope.
Modify the @code{fix} function so that if it is run when the variable
@code{client_active} is set, it starts the server, sends over parsed 
arguments and possibly files, sends a "fix" command to the server,
and handles responses from the server.  Sample code:
@example @code
    if (!client_active) @{
        /* Do whatever you used to do */
    @} else @{
        /* We're the local client.  Fire up the remote server.  */
        start_server ();

        if (local)
            if (fprintf (to_server, "Argument -l\n") == EOF)
                error (1, errno, "writing to server");
        send_option_string (options);

        send_files (argc, argv, local);

        if (fprintf (to_server, "fix\n") == EOF)
            error (1, errno, "writing to server");
        err = get_responses_and_close ();
    @}
@end example
@item
Build it locally.  Copy the new version into somewhere on the 
remote system, in your path so that @code{rsh host cvs} finds it.
Now you can test it.
@item
You may want to set the environment variable @code{CVS_CLIENT_PORT} to
-1 to prevent the client from contacting the server via a direct TCP
link.  That will force the client to fall back to using @code{rsh},
which will run your new binary.
@item
Set the environment variable @code{CVS_CLIENT_LOG} to a filename prefix
such as @file{/tmp/cvslog}.  Whenever you run a remote CVS command,
the commands and responses sent across the client/server connection
will be logged in @file{/tmp/cvslog.in} and @file{/tmp/cvslog.out}.
Examine them for problems while you're testing.
@end itemize

This should produce a good first cut at a working remote @code{cvs fix}
command.  You may have to change exactly how arguments are passed,
whether files or just their names are sent, and how some of the deeper
infrastructure of your command copes with remoteness.

@node Protocol Notes
@chapter Notes on the Protocol

A number of enhancements are possible:

@itemize @bullet
@item
The @code{Modified} request could be speeded up by sending diffs rather
than entire files.  The client would need some way to keep the version
of the file which was originally checked out, which would double client
disk space requirements or require coordination with editors (e.g. maybe
it could use emacs numbered backups).  This would also allow local
operation of @code{cvs diff} without arguments.

@item
Have the client keep a copy of some part of the repository.  This allows
all of @code{cvs diff} and large parts of @code{cvs update} and
@code{cvs ci} to be local.  The local copy could be made consistent with
the master copy at night (but if the master copy has been updated since
the latest nightly re-sync, then it would read what it needs to from the
master).

@item
Provide encryption using kerberos.

@item
The current procedure for @code{cvs update} is highly sub-optimal if
there are many modified files.  One possible alternative would be to
have the client send a first request without the contents of every
modified file, then have the server tell it what files it needs.  Note
the server needs to do the what-needs-to-be-updated check twice (or
more, if changes in the repository mean it has to ask the client for
more files), because it can't keep locks open while waiting for the
network.  Perhaps this whole thing is irrelevant if client-side
repositories are implemented, and the rcsmerge is done by the client.
@end itemize

@node Protocol
@chapter The CVS client/server protocol

@menu
* Entries Lines::               
* Modes::                       
* Requests::                    
* Responses::                   
* Example::                     
@end menu

@node Entries Lines
@section Entries Lines

Entries lines are transmitted as:

@example
/ @var{name} / @var{version} / @var{conflict} / @var{options} / @var{tag_or_date}
@end example

@var{tag_or_date} is either @samp{T} @var{tag} or @samp{D} @var{date}
or empty.  If it is followed by a slash, anything after the slash
shall be silently ignored.

@var{version} can be empty, or start with @samp{0} or @samp{-}, for no
user file, new user file, or user file to be removed, respectively.

@var{conflict}, if it starts with @samp{+}, indicates that the file had
conflicts in it.  The rest of @var{conflict} is @samp{=} if the
timestamp matches the file, or anything else if it doesn't.  If
@var{conflict} does not start with a @samp{+}, it is silently ignored.

@node Modes
@section Modes

A mode is any number of repetitions of

@example
@var{mode-type} = @var{data}
@end example

separated by @samp{,}.

@var{mode-type} is an identifier composed of alphanumeric characters.
Currently specified: @samp{u} for user, @samp{g} for group, @samp{o} for
other, as specified in POSIX.  If at all possible, give these their
POSIX meaning and use other mode-types for other behaviors.  For
example, on VMS it shouldn't be hard to make the groups behave like
POSIX, but you would need to use ACLs for some cases.

@var{data} consists of any data not containing @samp{,}, @samp{\0} or
@samp{\n}.  For @samp{u}, @samp{g}, and @samp{o} mode types, data
consists of alphanumeric characters, where @samp{r} means read, @samp{w}
means write, @samp{x} means execute, and unrecognized letters are
silently ignored.

@node Requests
@section Requests

File contents (noted below as @var{file transmission}) can be sent in
one of two forms.  The simpler form is a number of bytes, followed by a
newline, followed by the specified number of bytes of file contents.
These are the entire contents of the specified file.  Second, if both
client and server support @samp{gzip-file-contents}, a @samp{z} may
precede the length, and the `file contents' sent are actually compressed
with @samp{gzip}.  The length specified is that of the compressed
version of the file.

In neither case are the file content followed by any additional data.
The transmission of a file will end with a newline iff that file (or its
compressed form) ends with a newline.

@table @code
@item Root @var{pathname} \n
Response expected: no.
Tell the server which @code{CVSROOT} to use.

@item Valid-responses @var{request-list} \n
Response expected: no.
Tell the server what responses the client will accept.
request-list is a space separated list of tokens.

@item valid-requests \n
Response expected: yes.
Ask the server to send back a @code{Valid-requests} response.

@item Repository @var{repository} \n
Response expected: no.  Tell the server what repository to use.  This
should be a directory name from a previous server response.  Note that
this both gives a default for @code{Entry } and @code{Modified } and
also for @code{ci} and the other commands; normal usage is to send a
@code{Repository } for each directory in which there will be an
@code{Entry } or @code{Modified }, and then a final @code{Repository }
for the original directory, then the command.

@item Directory @var{local-directory} \n
Additional data: @var{repository} \n.  This is like @code{Repository},
but the local name of the directory may differ from the repository name.
If the client uses this request, it affects the way the server returns
pathnames; see @ref{Responses}.  @var{local-directory} is relative to
the top level at which the command is occurring (i.e. the last
@code{Directory} or @code{Repository} which is sent before the command).

@item Max-dotdot @var{level} \n
Tell the server that @var{level} levels of directories above the
directory which @code{Directory} requests are relative to will be
needed.  For example, if the client is planning to use a
@code{Directory} request for @file{../../foo}, it must send a
@code{Max-dotdot} request with a @var{level} of at least 2.
@code{Max-dotdot} must be sent before the first @code{Directory}
request.

@item Static-directory \n
Response expected: no.  Tell the server that the directory most recently
specified with @code{Repository} or @code{Directory} should not have
additional files checked out unless explicitly requested.  The client
sends this if the @code{Entries.Static} flag is set, which is controlled
by the @code{Set-static-directory} and @code{Clear-static-directory}
responses.

@item Sticky @var{tagspec} \n
Response expected: no.  Tell the server that the directory most recently
specified with @code{Repository} has a sticky tag or date @var{tagspec}.
The first character of @var{tagspec} is @samp{T} for a tag, or @samp{D}
for a date.  The remainder of @var{tagspec} contains the actual tag or
date.

@item Checkin-prog @var{program} \n
Response expected: no.  Tell the server that the directory most recently
specified with @code{Directory} has a checkin program @var{program}.
Such a program would have been previously set with the
@code{Set-checkin-prog} response.

@item Update-prog @var{program} \n
Response expected: no.  Tell the server that the directory most recently
specified with @code{Directory} has an update program @var{program}.
Such a program would have been previously set with the
@code{Set-update-prog} response.

@item Entry @var{entry-line} \n
Response expected: no.  Tell the server what version of a file is on the
local machine.  The name in @var{entry-line} is a name relative to the
directory most recently specified with @code{Repository}.  If the user
is operating on only some files in a directory, @code{Entry} requests
for only those files need be included.  If an @code{Entry} request is
sent without @code{Modified}, @code{Unchanged}, or @code{Lost} for that
file the meaning depends on whether @code{UseUnchanged} has been sent;
if it has been it means the file is lost, if not it means the file is
unchanged.

@item Modified @var{filename} \n
Response expected: no.  Additional data: mode, \n, file transmission.
Send the server a copy of one locally modified file.  @var{filename} is
relative to the most recent repository sent with @code{Repository}.  If
the user is operating on only some files in a directory, only those
files need to be included.  This can also be sent without @code{Entry},
if there is no entry for the file.

@item Lost @var{filename} \n
Response expected: no.  Tell the server that @var{filename} no longer
exists.  The name is relative to the most recent repository sent with
@code{Repository}.  This is used for any case in which @code{Entry} is
being sent but the file no longer exists.  If the client has issued the
@code{UseUnchanged} request, then this request is not used.

@item Unchanged @var{filename} \n
Response expected: no.  Tell the server that @var{filename} has not been
modified in the checked out directory.  The name is relative to the most
recent repository sent with @code{Repository}.  This request can only be
issued if @code{UseUnchanged} has been sent.

@item UseUnchanged \n
Response expected: no.  Tell the server that the client will be
indicating unmodified files with @code{Unchanged}, and that files for
which no information is sent are nonexistent on the client side, not
unchanged.  This is necessary for correct behavior since only the server
knows what possible files may exist, and thus what files are
nonexistent.

@item Argument @var{text} \n
Response expected: no.
Save argument for use in a subsequent command.  Arguments
accumulate until an argument-using command is given, at which point
they are forgotten.

@item Argumentx @var{text} \n
Response expected: no.  Append \n followed by text to the current
argument being saved.

@item Global_option @var{option} \n
Transmit one of the global options @samp{-q}, @samp{-Q}, @samp{-l},
@samp{-t}, @samp{-r}, or @samp{-n}.  @var{option} must be one of those
strings, no variations (such as combining of options) are allowed.  For
graceful handling of @code{valid-requests}, it is probably better to
make new global options separate requests, rather than trying to add
them to this request.

@item expand-modules \n
Response expected: yes.  Expand the modules which are specified in the
arguments.  Returns the data in @code{Module-expansion} responses.  Note
that the server can assume that this is checkout or export, not rtag or
rdiff; the latter do not access the working directory and thus have no
need to expand modules on the client side.

@item co \n
@itemx update \n
@itemx ci \n
@itemx diff \n
@itemx tag \n
@itemx status \n
@itemx log \n
@itemx add \n
@itemx remove \n
@itemx rdiff \n
@itemx rtag \n
@itemx import \n
@itemx admin \n
@itemx export \n
@itemx history \n
@itemx release \n
Response expected: yes.  Actually do a cvs command.  This uses any
previous @code{Argument}, @code{Repository}, @code{Entry},
@code{Modified}, or @code{Lost} requests, if they have been sent.  The
last @code{Repository} sent specifies the working directory at the time
of the operation.  No provision is made for any input from the user.
This means that @code{ci} must use a @code{-m} argument if it wants to
specify a log message.

@item update-patches \n
This request does not actually do anything.  It is used as a signal that
the server is able to generate patches when given an @code{update}
request.  The client must issue the @code{-u} argument to @code{update}
in order to receive patches.

@item gzip-file-contents @var{level} \n
This request asks the server to filter files it sends to the client
through the @samp{gzip} program, using the specified level of
compression.  If this request is not made, the server must not do any
compression.

This is only a hint to the server.  It may still decide (for example, in
the case of very small files, or files that already appear to be
compressed) not to do the compression.  Compression is indicated by a
@samp{z} preceding the file length.

Availability of this request in the server indicates to the client that
it may compress files sent to the server, regardless of whether the
client actually uses this request.

@item @var{other-request} @var{text} \n
Response expected: yes.
Any unrecognized request expects a response, and does not
contain any additional data.  The response will normally be something like
@samp{error  unrecognized request}, but it could be a different error if
a previous command which doesn't expect a response produced an error.
@end table

When the client is done, it drops the connection.

@node Responses
@section Responses

After a command which expects a response, the server sends however many
of the following responses are appropriate.  Pathnames are of the actual
files operated on (i.e. they do not contain @samp{,v} endings), and are
suitable for use in a subsequent @code{Repository} request.  However, if
the client has used the @code{Directory} request, then it is instead a
local directory name relative to the directory in which the command was
given (i.e. the last @code{Directory} before the command).  Then a
newline and a repository name (the pathname which is sent if
@code{Directory} is not used).  Then the slash and the filename.  For
example, for a file @file{i386.mh} which is in the local directory
@file{gas.clean/config} and for which the repository is
@file{/rel/cvsfiles/devo/gas/config}:

@example
gas.clean/config/
/rel/cvsfiles/devo/gas/config/i386.mh
@end example

Any response always ends with @samp{error} or @samp{ok}.  This indicates
that the response is over.

@table @code
@item Valid-requests @var{request-list} \n
Indicate what requests the server will accept.  @var{request-list}
is a space separated list of tokens.  If the server supports sending
patches, it will include @samp{update-patches} in this list.  The
@samp{update-patches} request does not actually do anything.

@item Checked-in @var{pathname} \n
Additional data: New Entries line, \n.  This means a file @var{pathname}
has been successfully operated on (checked in, added, etc.).  name in
the Entries line is the same as the last component of @var{pathname}.

@item New-entry @var{pathname} \n
Additional data: New Entries line, \n.  Like @code{Checked-in}, but the
file is not up to date.

@item Updated @var{pathname} \n
Additional data: New Entries line, \n, mode, \n, file transmission.  A
new copy of the file is enclosed.  This is used for a new revision of an
existing file, or for a new file, or for any other case in which the
local (client-side) copy of the file needs to be updated, and after
being updated it will be up to date.  If any directory in pathname does
not exist, create it.

@item Merged @var{pathname} \n
This is just like @code{Updated} and takes the same additional data,
with the one difference that after the new copy of the file is enclosed,
it will still not be up to date.  Used for the results of a merge, with
or without conflicts.

@item Patched @var{pathname} \n
This is just like @code{Updated} and takes the same additional data,
with the one difference that instead of sending a new copy of the file,
the server sends a patch produced by @samp{diff -u}.  This client must
apply this patch, using the @samp{patch} program, to the existing file.
This will only be used when the client has an exact copy of an earlier
revision of a file.  This response is only used if the @code{update}
command is given the @samp{-u} argument.

@item Checksum @var{checksum}\n
The @var{checksum} applies to the next file sent over via
@code{Updated}, @code{Merged}, or @code{Patched}.  In the case of
@code{Patched}, the checksum applies to the file after being patched,
not to the patch itself.  The client should compute the checksum itself,
after receiving the file or patch, and signal an error if the checksums
do not match.  The checksum is the 128 bit MD5 checksum represented as
32 hex digits.  This response is optional, and is only used if the
client supports it (as judged by the @code{Valid-responses} request).

@item Copy-file @var{pathname} \n
Additional data: @var{newname} \n.  Copy file @var{pathname} to
@var{newname} in the same directory where it already is.  This does not
affect @code{CVS/Entries}.

@item Removed @var{pathname} \n
The file has been removed from the repository (this is the case where
cvs prints @samp{file foobar.c is no longer pertinent}).

@item Remove-entry @var{pathname} \n
The file needs its entry removed from @code{CVS/Entries}, but the file
itself is already gone (this happens in response to a @code{ci} request
which involves committing the removal of a file).

@item Set-static-directory @var{pathname} \n
This instructs the client to set the @code{Entries.Static} flag, which
it should then send back to the server in a @code{Static-directory}
request whenever the directory is operated on.  @var{pathname} ends in a
slash; its purpose is to specify a directory, not a file within a
directory.

@item Clear-static-directory @var{pathname} \n
Like @code{Set-static-directory}, but clear, not set, the flag.

@item Set-sticky @var{pathname} \n
Additional data: @var{tagspec} \n.  Tell the client to set a sticky tag
or date, which should be supplied with the @code{Sticky} request for
future operations.  @var{pathname} ends in a slash; its purpose is to
specify a directory, not a file within a directory.  The first character
of @var{tagspec} is @samp{T} for a tag, or @samp{D} for a date.  The
remainder of @var{tagspec} contains the actual tag or date.

@item Clear-sticky @var{pathname} \n
Clear any sticky tag or date set by @code{Set-sticky}.

@item Set-checkin-prog @var{dir} \n
Additional data: @var{prog} \n.  Tell the client to set a checkin
program, which should be supplied with the @code{Checkin-prog} request
for future operations.

@item Set-update-prog @var{dir} \n
Additional data: @var{prog} \n.  Tell the client to set an update
program, which should be supplied with the @code{Update-prog} request
for future operations.

@item Module-expansion @var{pathname} \n
Return a file or directory which is included in a particular module.
@var{pathname} is relative to cvsroot, unlike most pathnames in
responses.

@item M @var{text} \n
A one-line message for the user.

@item E @var{text} \n
Same as @code{M} but send to stderr not stdout.

@item error @var{errno-code} @samp{ } @var{text} \n
The command completed with an error.  @var{errno-code} is a symbolic
error code (e.g. @code{ENOENT}); if the server doesn't support this
feature, or if it's not appropriate for this particular message, it just
omits the errno-code (in that case there are two spaces after
@samp{error}).  Text is an error message such as that provided by
strerror(), or any other message the server wants to use.

@item ok \n
The command completed successfully.
@end table

@node Example
@section Example

Lines beginning with @samp{c>} are sent by the client; lines beginning
with @samp{s>} are sent by the server; lines beginning with @samp{#} are
not part of the actual exchange.

@example
c> Root /rel/cvsfiles
# In actual practice the lists of valid responses and requests would
# be longer
c> Valid-responses Updated Checked-in M ok error
c> valid-requests
s> Valid-requests Root co Modified Entry Repository ci Argument Argumentx
s> ok
# cvs co devo/foo
c> Argument devo/foo
c> co
s> Updated /rel/cvsfiles/devo/foo/foo.c
s> /foo.c/1.4/Mon Apr 19 15:36:47 1993 Mon Apr 19 15:36:47 1993//
s> 26
s> int mein () @{ abort (); @}
s> Updated /rel/cvsfiles/devo/foo/Makefile
s> /Makefile/1.2/Mon Apr 19 15:36:47 1993 Mon Apr 19 15:36:47 1993//
s> 28
s> foo: foo.c
s>         $(CC) -o foo $<
s> ok
# In actual practice the next part would be a separate connection.
# Here it is shown as part of the same one.
c> Repository /rel/cvsfiles/devo/foo
# foo.c relative to devo/foo just set as Repository.
c> Entry /foo.c/1.4/Mon Apr 19 15:36:47 1993 Mon Apr 19 15:36:47 1993//
c> Entry /Makefile/1.2/Mon Apr 19 15:36:47 1993 Mon Apr 19 15:36:47 1993//
c> Modified foo.c
c> 26
c> int main () @{ abort (); @}
# cvs ci -m <log message> foo.c
c> Argument -m
c> Argument Well, you see, it took me hours and hours to find this typo and I
c> Argumentx searched and searched and eventually had to ask John for help.
c> Argument foo.c
c> ci
s> Checked-in /rel/cvsfiles/devo/foo/foo.c
s> /foo.c/1.5/ Mon Apr 19 15:54:22 CDT 1993//
s> M Checking in foo.c;
s> M /cygint/rel/cvsfiles/devo/foo/foo.c,v  <--  foo.c
s> M new revision: 1.5; previous revision: 1.4
s> M done
s> ok
@end example
@bye