ACEDB - in the Long Term

Introduction

Most of the items on this page will probably either be major coding efforts or reflect some likely change in the genome mapping community or the computing industry (e.g. the continued rise of Windows NT).

Windows support

In the longer term, serious consideration needs to be given to supporting Windows platforms. This could be done in a number of phases:

  1. Support Windows as a client only, with the database still on Unix platforms. Initially, support could be via a Java- or Perl-based application to avoid the need to learn Windows-specific calls.
  2. Port the server to Windows separately from the clients.

Threading

Currently no part of the acedb code is threaded. The only place where "overlapping" execution takes place is in the acedb server when a client makes a very large request. In this case the server chops up the request, and executes and returns only a portion at a time to the client.
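The chunked behaviour described above can be sketched roughly as follows. This is an illustration only, not the actual acedb server code: `Request`, `serve_next_chunk` and `CHUNK_SIZE` are hypothetical names, and the "result" is just a run of integers standing in for real query rows.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: none of these names come from the acedb sources. */
enum { CHUNK_SIZE = 4 };   /* assumed maximum number of rows per call */

typedef struct {
    size_t total;   /* rows the full request would produce */
    size_t next;    /* index of the first row not yet returned */
} Request;

/* Fill `out` with the next portion of the result (at most CHUNK_SIZE rows)
 * and return how many rows were written; 0 means the request is finished.
 * Between calls the server is free to service other clients. */
size_t serve_next_chunk(Request *req, int *out)
{
    size_t n = 0;

    assert(req != NULL && out != NULL);
    while (n < CHUNK_SIZE && req->next < req->total) {
        out[n++] = (int)req->next;   /* stand-in for a real result row */
        req->next++;
    }
    return n;
}
```

A caller would invoke `serve_next_chunk` repeatedly on the same `Request` until it returns 0. All progress is kept in the `Request` itself, so several large requests can be interleaved without any threads.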

There are three major reasons for threading an application:

  1. To increase throughput
  2. To increase responsiveness of GUI applications (e.g. Netscape)
  3. To simplify code where several tasks may need to be handled 'simultaneously' (perhaps the overlapping mentioned above would be an example of this).

There are two major parts to threading acedb code: the kernel and the GUI applications.

The major reasons for threading the kernel are to increase throughput and perhaps to make some of the handling of multiple clients conceptually easier. Note that increased throughput may mean servicing one large request for a particular client (e.g. running a large query search in parallel) or many smaller requests for multiple clients. For the acedb GUI applications the major reason would be to allow the GUI to remain responsive while, for example, waiting for data from a network connection.

Threading will require a number of preliminary steps:

Introduction of tighter layering
Although a large part of acedb is now in self-contained libraries, there is still a large body of code that is not layered. Much of the kernel code needs to be more strictly layered to allow different parts of it to be threaded. This is an important preliminary step towards threading because it will reduce the linkage between different areas of code to well-defined interfaces. It will also reveal which layers can be reentrant and which by their nature cannot.
Making the base libraries reentrant
Substantial recoding of the base-level libraries needs to take place because of their extensive use of static data and the lack of context passing in the various packages. This is fundamental because the base libraries are used throughout the acedb code: they must be fully reentrant or, failing that, at least serially reusable, with mutex locks around vulnerable code.
The acedb kernel needs to be analysed for places where threading can improve performance, perhaps by overlapping disk I/O, by overlapping tasks that are waiting for network/client input, or by using multiple threads for tasks that are inherently parallel.
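To make the reentrancy step above concrete, here is a minimal sketch of the three styles involved: a routine using static data, the same routine with context passing, and a mutex-guarded wrapper that is only serially reusable. The functions `name_static`, `name_reentrant` and `name_locked` and the type `NameContext` are hypothetical; they are not taken from the acedb libraries, and POSIX threads are assumed for the mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical example; these are not real acedb library functions. */

/* Non-reentrant style: the result lives in static storage, so a second
 * call (from any thread) overwrites the first caller's result. */
const char *name_static(int id)
{
    static char buf[32];

    snprintf(buf, sizeof buf, "obj%d", id);
    return buf;
}

/* Reentrant style: all state lives in a context supplied by the caller. */
typedef struct {
    char buf[32];
} NameContext;

const char *name_reentrant(NameContext *ctx, int id)
{
    assert(ctx != NULL);
    snprintf(ctx->buf, sizeof ctx->buf, "obj%d", id);
    return ctx->buf;
}

/* Serially reusable fallback: keep the static data but guard it with a
 * mutex, copying the result out before the lock is released. */
static pthread_mutex_t name_lock = PTHREAD_MUTEX_INITIALIZER;

void name_locked(int id, char *out, size_t outlen)
{
    pthread_mutex_lock(&name_lock);
    snprintf(out, outlen, "%s", name_static(id));
    pthread_mutex_unlock(&name_lock);
}
```

The context-passing version needs no locking as long as each thread uses its own `NameContext`. The locked version is easier to retrofit onto existing code, but it serialises every caller and is only safe if nothing calls `name_static` directly.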

Conversion from C to C++

Currently acedb is written in C because work on acedb started before robust implementations of C++ were generally available on a wide variety of platforms. Work has continued in C, and there is some sense in this: the standard ANSI C library is available on all Unix platforms, Windows and some versions of the Mac, providing a constant programming environment. C++ libraries are not usually available unless the C++ compiler has been installed. Perhaps even more importantly, the C++ Standard Template Library (STL), which would provide a major incentive for swapping to C++, is even less generally available.

Nevertheless, C++ has much to offer that acedb could make use of. Large parts of the acedb code use object-like concepts and deal with object-like data that could more usefully become fully fledged objects. Currently this is all a pipe dream: there aren't the resources or experience to do much about it, which is a pity.

Client-server only

Currently there are two models for applications accessing databases:

  1. The application accesses the database directly and makes whatever calls it likes to access the database.
  2. The application is a client and can only make certain client requests to a database server.

It may simplify coding, maintenance etc. to have only one model, the client/server one. Performance issues could be solved by providing different methods of communication between the client and server: if both are on the same machine, shared memory could be used for speed; if they are on different machines, sockets or some similar mechanism could be used.
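One way to keep a single client/server model while varying the communication method is to write the client against a small transport interface. The sketch below assumes such an interface: `Transport`, `LocalBuffer`, `local_transport` and `client_request` are hypothetical names, not part of acedb, and the same-machine transport is a plain in-process buffer that echoes the query, standing in for real shared memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: these types are not part of acedb. */

/* A transport is a pair of operations plus private state. A shared-memory
 * transport and a socket transport would each supply their own functions. */
typedef struct Transport {
    int (*send)(struct Transport *t, const char *msg, size_t len);
    int (*recv)(struct Transport *t, char *buf, size_t cap);
    void *state;
} Transport;

/* Stand-in for a same-machine transport: a single in-process buffer.
 * A real version might use shared memory instead. */
typedef struct {
    char data[256];
    size_t len;
} LocalBuffer;

static int local_send(Transport *t, const char *msg, size_t len)
{
    LocalBuffer *b = t->state;

    if (len > sizeof b->data)
        return -1;               /* message too large for this buffer */
    memcpy(b->data, msg, len);
    b->len = len;
    return (int)len;
}

static int local_recv(Transport *t, char *buf, size_t cap)
{
    LocalBuffer *b = t->state;
    size_t n = b->len < cap ? b->len : cap;

    memcpy(buf, b->data, n);
    b->len = 0;                  /* message consumed */
    return (int)n;
}

Transport local_transport(LocalBuffer *b)
{
    Transport t;

    assert(b != NULL);
    b->len = 0;
    t.send = local_send;
    t.recv = local_recv;
    t.state = b;
    return t;
}

/* Client code is written once against Transport and does not care
 * which implementation sits underneath. */
int client_request(Transport *t, const char *query, char *reply, size_t cap)
{
    if (t->send(t, query, strlen(query)) < 0)
        return -1;
    /* In a real system the server would read the query and write a reply;
     * here the loopback buffer simply hands the query back. */
    return t->recv(t, reply, cap);
}
```

A socket-based transport would fill in the same two function pointers with different functions, leaving `client_request` and everything above it unchanged.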

A different database

Should acedb be replaced with Oracle? Alternatively, should users be able to keep their data in an Oracle database, with a transaction supplied which extracts acedb-like data? That way Oracle would deal with concurrency, backups, disasters etc., but the front end could still work on acedb-like data.