From: phk Date: Thu, 20 Jul 2006 10:55:18 +0000 (+0000) Subject: Rewrite the "components" part to match reality. X-Git-Url: https://err.no/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=e4541dbc72d5b1395c38968fb3e6a4ab9996c9e3;p=varnish Rewrite the "components" part to match reality. git-svn-id: svn+ssh://projects.linpro.no/svn/varnish/trunk@517 d4fa192b-c00b-0410-8231-f00ffab90ce4 --- diff --git a/varnish-doc/en/varnish-architecture/article.xml b/varnish-doc/en/varnish-architecture/article.xml index dfa81fc8..3e103b3e 100644 --- a/varnish-doc/en/varnish-architecture/article.xml +++ b/varnish-doc/en/varnish-architecture/article.xml @@ -13,88 +13,198 @@ Application structure
- Components + Overview - This section lists the major components in Varnish. + + The Varnish binary contains code for two co-operating + processes: the manager and the cache engine. + -
- Listener + + The manager process is what takes control when the binary + is executed, and after parsing command line arguments it + will compile the VCL code and fork(2) a child process which + executes the cache-engine code. + - The Listener monitors the listening socket and accepts - incoming client connections. Once the connection is - established, it is passed to the Accepter. + + A pipe connects the two processes and allows the manager + to relay and inject CLI commands to the cache process. + - The Listener should take advantage of accept filters or - similar technologies on systems where they are - available. -
+
+ +
+      Manager Process Components
+
+      The manager process is a basic multiplexing process of
+      relatively low complexity.  The only major component, apart
+      from the CLI stream multiplexer, is the VCL compiler.
+
+      Cache Process Components
+
+      The cache process is where all the fun happens, and its
+      components have been constructed for maximum efficiency at
+      the cost of some structural simplicity.
+
-      Accepter
-
-      The Accepter reads an HTTP request from a client
-      connection.  It parses the request line and header only to the
-      extent necessary to establish well-formedness and determine
-      the requested URL.
-
-      The Accepter then queries the Keeper about the status of
-      the requested document (identified by its full URL).  If the
-      document is present and valid in the cache, the request is
-      passed directly to a Sender.  Otherwise, it is passed to a
-      Retriever queue.
+      Acceptor
+
+      The Acceptor monitors the listening sockets and accepts
+      incoming client connections.  For each connection a session
+      is created, and once enough bytes have been received to
+      indicate a valid HTTP request header, the session is passed
+      to the Worker Pool for processing.
+
+      If supported by the platform, the Acceptor will use the
+      accept filters facility.
-      Keeper
+      Worker Pool
+
+      The Worker Pool maintains a pool of worker threads which
+      process requests through the State Engine.  Threads are
+      created on demand, as resources permit, and if they have seen
+      no work for a preconfigured amount of time, they will
+      self-destruct to reduce resource usage.
+
+      Threads are used in most-recently-used order to improve
+      cache efficiency and minimize the working set.
- The Keeper manages the document cache. XXX +
+      State Engine
+
+      The State Engine is responsible for taking each request
+      through the processing steps.  This is done with a simple
+      finite state machine which can give up the worker thread
+      whenever a session is waiting for an event that does not
+      require a worker thread to service it.
+
+      XXX: either list the major steps from cache_central.c here
+      or have a major section on the flow after the components.
+      (phk prefers the latter.)
-      Sender
-
-      The Sender transfers the contents of the requested
-      document to the client.  It examines the HTTP request header
-      to determine the correct way in which to do this – Range,
-      If-Modified-Since, Content-Encoding and other options may
-      affect the type and amount of data transferred.
+      Hash and Hash methods
+
+      Cached objects are hashed using a pluggable algorithm.
+      A central hash management layer does the high-level work
+      while the actual lookup is done by the pluggable method.
- There may be multiple concurrent Sender threads. +
+      Storage and Storage methods
+
+      Like hashing, storage is split into a high-level layer
+      which calls into pluggable methods.
-      Retriever
-
-      The Retriever is responsible for retrieving documents
-      from the content servers.  It is triggered either by an
-      Accepter trying to satisfy a request for a document which is
-      not in the cache, or by the Janitor when a “hot” document is
-      nearing expiry.  Either way, there may be a queue of requests
-      waiting for the document to arrive; when it does, the
-      Retriever passes those requests to a Sender.
-
-      There may be multiple concurrent Retriever
-      threads.
+      Pass and Pipe modes
+
+      Requests which cannot or should not be handled by Varnish
+      can be either passed through or piped through to the
+      backend.
+
+      Passing acts on a per-request basis and tries to keep the
+      connections to both the client and the backend reusable.
+
+      Piping acts as a transparent tunnel: whatever happens for
+      the rest of the lifetime of the client and backend
+      connections is not interpreted by Varnish.
- Janitor + Backend sessions + + + Connections to the backend are managed in a pool by the + backend session module. + - The Janitor keeps track of the expiry time of cached - documents and attempts to retrieve fresh copies of documents - which are soon to expire.
-      Logger
+      Logging and Statistics
+
+      Logging and statistics are done through a shared memory
+      data segment which other processes can attach to in order
+      to subscribe to the data.  A library provides the documented
+      interface for this.
+
+      Logging is done in a round-robin fashion and is therefore
+      unaffected by disk I/O or other expensive log handling.
- The Logger keeps logs of various types of events in - circular shared-memory buffers. See for details. +
+      Purge/Ban processing
+
+      When a purge is requested via the CLI interface, the regular
+      expression is added to the purge list, and all requests are
+      checked against this list before they are served from cache.
+      The most recently checked purge is cached in the objects to
+      avoid repeated checks against the same expression.
+
-      It is the responsibility of each module to feed relevant
-      log data to the Logger.
+      VCL calls and VCL runtime
+
+      The state engine uses calls to VCL functions to determine
+      the desired processing of each request.  The compiled VCL
+      code is loaded as a dynamic object and executes at the
+      speed of compiled code.
+
+      The VCL and VRT code is responsible for managing the loaded
+      VCL programs and for providing the proper runtime
+      environment for them.
+ +
+      Expiry (and prefetch)
+
+      Objects in the cache are sorted in "earliest expiry" order
+      in a binary heap which is monitored.  When an object is a
+      configurable number of seconds from expiring, the VCL code
+      is asked to determine whether the object should be
+      discarded or prefetched.  (Prefetch is not yet implemented.)
+