This is a list of projects for CVS. In general, unlike the things in
the TODO file, these need more analysis to determine if and how
worthwhile each task is.

I haven't gone through TODO, but it's likely that it has entries that
are actually more appropriate for this list.

0. Improved Efficiency

* CVS uses a single doubly linked list/hash table data structure for
all of its lists. Since the back links are only used for deleting
list nodes, it might be beneficial to use singly linked lists or a
tree structure. Most likely, a single list implementation will not
be appropriate for all uses.

One easy change would be to remove the "type" field from the list
and node structures. I have found it to be of very little use when
debugging, and each instance eats up a word of memory. This can add
up and be a problem on memory-starved machines.

Profiles have shown that on fast machines like the Alpha, fsortcmp()
is one of the hot spots.
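
As a rough illustration of the singly linked alternative, the sketch
below deletes a node by walking a pointer to the link that refers to
it, so no back link is required. The names snode, slist_push,
slist_remove and slist_length are hypothetical and are not taken from
the CVS sources; the node also carries no "type" field.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node layout: no back link and no "type" field. */
struct snode {
    struct snode *next;
    char *key;
};

/* Push a private copy of key onto the front; returns 0 on success. */
int slist_push(struct snode **head, const char *key)
{
    struct snode *n;

    assert(head != NULL && key != NULL);
    n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->key = malloc(strlen(key) + 1);
    if (n->key == NULL) {
        free(n);
        return -1;
    }
    strcpy(n->key, key);
    n->next = *head;
    *head = n;
    return 0;
}

/* Unlink and free the first node matching key; returns 1 if found. */
int slist_remove(struct snode **head, const char *key)
{
    struct snode **link;

    assert(head != NULL && key != NULL);
    /* link always points at the pointer that refers to the current node */
    for (link = head; *link != NULL; link = &(*link)->next) {
        if (strcmp((*link)->key, key) == 0) {
            struct snode *dead = *link;
            *link = dead->next;
            free(dead->key);
            free(dead);
            return 1;
        }
    }
    return 0;
}

/* Count the nodes in the list. */
size_t slist_length(const struct snode *head)
{
    size_t n = 0;

    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

The trade-off in this sketch is that deleting by key walks the list
from the head, so whether it beats the doubly linked version would
depend on how often nodes are deleted and would need to be profiled.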

* Dynamically allocated character strings are created, copied, and
destroyed throughout CVS. The overhead of malloc()/strcpy()/free()
needs to be measured. If significant, it could be minimized by using a
reference counted string "class".
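
A minimal sketch of what such a reference counted string might look
like follows. The type rcstr and the functions rcstr_new, rcstr_ref
and rcstr_unref are hypothetical names invented for this example, and
the flexible array member assumes a C99 compiler.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted string; not an existing CVS type. */
struct rcstr {
    size_t refs;    /* number of live references */
    char text[];    /* NUL-terminated contents (C99 flexible array) */
};

/* Allocate a new string holding a copy of src, with one reference. */
struct rcstr *rcstr_new(const char *src)
{
    size_t len;
    struct rcstr *s;

    assert(src != NULL);
    len = strlen(src);
    s = malloc(sizeof *s + len + 1);
    if (s == NULL)
        return NULL;
    s->refs = 1;
    memcpy(s->text, src, len + 1);
    return s;
}

/* "Copy" the string by taking another reference; no bytes are copied. */
struct rcstr *rcstr_ref(struct rcstr *s)
{
    assert(s != NULL && s->refs > 0);
    s->refs++;
    return s;
}

/* Drop one reference; returns 1 if the storage was freed. */
int rcstr_unref(struct rcstr *s)
{
    assert(s != NULL && s->refs > 0);
    if (--s->refs > 0)
        return 0;
    free(s);
    return 1;
}
```

In this sketch a copy costs one increment instead of a
malloc()/strcpy() pair, but every holder must treat the text as
read-only, since all references share the same bytes.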

* File modification time is stored as a character string. It might be
worthwhile to use a time_t internally if the time to convert a time_t
(from struct stat) to a string is greater than the time to convert a
ctime style string (from the entries file) to a time_t. time_t is
a machine-dependent type (although it's pretty standard on UN*X
systems), so we would have to have different conversion routines.
Profiles show that both operations are called about the same number
of times.
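
To give a feel for the string to time_t direction, here is a sketch
of a conversion routine. It assumes the timestamp uses the asctime()
layout ("Www Mmm dd hh:mm:ss yyyy"), that it is expressed in UTC, and
that time_t counts seconds since 1970-01-01 as on POSIX systems; all
three are assumptions of this example, and the function names are
hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Days from 1970-01-01 to the given Gregorian date (month 1..12). */
long long days_from_civil(long long y, int m, int d)
{
    long long era, yoe, doy, doe;

    y -= (m <= 2);                      /* treat Jan/Feb as end of prior year */
    era = (y >= 0 ? y : y - 399) / 400; /* 400-year cycle number */
    yoe = y - era * 400;                /* year within the cycle */
    doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;
    doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

/* Convert an asctime()-style UTC string; returns (time_t)-1 on error.
   Field ranges (day, hour, ...) are not validated in this sketch. */
time_t timestamp_to_time(const char *stamp)
{
    static const char *const months[12] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };
    char wday[4], mon[4];
    int day, hour, min, sec, year, m;

    assert(stamp != NULL);
    if (sscanf(stamp, "%3s %3s %d %d:%d:%d %d",
               wday, mon, &day, &hour, &min, &sec, &year) != 7)
        return (time_t)-1;
    for (m = 0; m < 12; m++)
        if (strcmp(mon, months[m]) == 0)
            break;
    if (m == 12)
        return (time_t)-1;
    return (time_t)(days_from_civil(year, m + 1, day) * 86400LL
                    + hour * 3600LL + min * 60LL + sec);
}
```

Timing a routine like this against the time_t to string conversion
on real entries files would be one way to decide whether the change
pays off.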

* stat() is one of the largest performance bottlenecks on systems
without the 4.4BSD filesystem. By splitting information out of
the filesystem (perhaps the "rename database") we should be
able to improve performance.

* Parsing RCS files is very expensive. This might be unnecessary if
RCS files were used only as containers for revisions, and tag,
revision, and date information were available in easy-to-read
(and modify) indexes. The parsing cost becomes very apparent with
files that have several hundred revisions.

* An RCS "library", so CVS could operate on RCS files directly.

CVS parses RCS files in order to determine if work needs to be done,
and then RCS parses the files again when it is performing the work.
This would be much faster if CVS could do whatever is necessary
by itself.

1. Improved testsuite/sanity check script

* Need to use a code coverage tool to determine how much the sanity
script tests, and fill in the holes.