phk [Wed, 11 Jun 2008 21:12:26 +0000 (21:12 +0000)]
Add an optional shortcut:
The parameter session_linger determines how many milliseconds the
worker thread waits to see if another request has arrived, before
handing the session over to the session herder.
If we manage to catch the next request this way, we save a number
of semi-expensive steps; if we hang around too long, the worker thread
gets to goof off.
A relatively small sample of data from a live server indicates
that 20% of all requests arrive within 50 msec of the previous
request and 50% within 100 msec.
It is not clear at present how these time intervals relate to client
RTT, or if they are systematically too high due to the duration
of the detour over the herder.
There is a new line in varnishstat keeping track of how many times
this gamble succeeds.
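The linger step described above can be sketched with poll(2). This is only an illustration, not the code from the commit: the helper name linger_for_request() and its return convention are hypothetical, and it assumes the session is represented by a plain file descriptor.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the commit): wait up to linger_ms
 * milliseconds for more data on fd.
 * Returns 1 if the fd became readable (the gamble paid off),
 * 0 on timeout (hand the session to the herder), -1 on error.
 */
static int
linger_for_request(int fd, int linger_ms)
{
	struct pollfd pfd;
	int n;

	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;
	n = poll(&pfd, 1, linger_ms);
	if (n < 0)
		return (-1);
	return (n > 0 ? 1 : 0);
}
```

A caller would count the 1 results to get the success statistic, and fall back to the herder on 0.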
phk [Mon, 9 Jun 2008 13:35:35 +0000 (13:35 +0000)]
Slight backtracking: Store the specified socket name in the listen socket
structure, instead of whatever we actually bound to. It is more informative
and intuitive.
Report the listen socket spec in the SessionOpen shm record.
Close listen sockets as soon as we have forked the child; this eliminates
an fd leak when a socket was removed from listen_address while the child
ran.
Fix various potential problems related to not being able to bind to one
or more listen addresses.
phk [Sat, 7 Jun 2008 21:19:58 +0000 (21:19 +0000)]
Redo the way we manage the thread pool(s).
This is necessary to generalize the thread pools to do other tasks
for us in the future.
Please read the descriptions of the new and changed thread_pool*
parameters carefully before you tweak them, some of them have
slightly different meanings now.
The high-level view of this is that we now have a thread dedicated to
adding threads to the pools, in addition to the thread we already
had that killed idle threads from the pools.
The difference is that this new thread is quite a bit more reluctant
to add threads than the previous code, which would add a thread any
time it could get away with it.
Hopefully that reduces the tendency for thread-pile-ups.
This commit also reduces the cross-pool locking contention by making
the overflow queue a per pool item.
The down side of that is that more of the stats counters have become
unlocked and thus can get out of sync if two threads race when updating
them. This is an XXX item.
phk [Mon, 2 Jun 2008 19:45:05 +0000 (19:45 +0000)]
Add a "timeout counter" and clean the timed-out sockets every 60
seconds instead of cleaning them on every new connection received.
We are using it here, and the performance is much better now.
Submitted by: Rafael Umann <rafael.umann@terra.com.br>
phk [Sat, 31 May 2008 21:26:20 +0000 (21:26 +0000)]
Overhaul the regexp purge/ban code, with a detour around the CLI code.
CLI code:
In CLI help, don't list commands with no syntax description.
Add a CLI_HIDDEN macro to construct such entries. They are
useful as backwards compatibility entries which we do not want
to show in the help.
CLI interface to BAN (purge) code:
Get the CLI names right for purging so they are purge.FOO instead
of FOO.purge.
This means that you should use "purge.url" and "purge.hash"
instead of "url.purge" and "hash.purge".
Add compat entries for the old names, and keep them through the 2.x
release series.
Add purge.list command to list purges currently in effect.
NB: This is not 100% locking safe, so don't abuse it.
BAN (purge) code:
Add reference counting and GC to bans.
Since we now have full reference counting, drop the sequence
number based soft references and go to "hard" pointer
references from the object to the purges.
Give the "ban" structure the miniobj treatment while we are
at it.
The cost of this is a lock operation to update refcounts
once all applicable bans have been checked on an object.
There is no locking cost if there are no bans to check.
Add explicit call to new BAN_DestroyObj() when objects are
destroyed to release the refcount.
When we release an object refcount in BAN_DestroyObj(),
check if we can destroy the last purge in the list also.
We only destroy one ban per BAN_DestroyObj() call, to avoid
getting stuck for too long (tacitly assuming that we will
destroy more objects than bans).
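A minimal sketch of the reference counting scheme described above, under stated assumptions: the struct layouts, the function names ban_add(), obj_ref_newest() and obj_destroy(), and the rule that the newest ban is never freed are all hypothetical, and locking is omitted entirely. It only illustrates "drop one reference, then reap at most one unreferenced ban from the tail".

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical ban list entry: newest at the head, oldest at the tail. */
struct ban {
	struct ban	*newer, *older;
	unsigned	refcount;
};

/* Hypothetical object: hard pointer to the last ban it was checked against. */
struct obj {
	struct ban	*ban;
};

static struct ban *ban_head, *ban_tail;
static unsigned ban_count;

/* Put a new ban at the head of the list. */
static struct ban *
ban_add(void)
{
	struct ban *b = calloc(1, sizeof *b);

	if (b == NULL)
		return (NULL);
	b->older = ban_head;
	if (ban_head != NULL)
		ban_head->newer = b;
	else
		ban_tail = b;
	ban_head = b;
	ban_count++;
	return (b);
}

/* After checking all applicable bans, move the reference to the newest. */
static void
obj_ref_newest(struct obj *o)
{
	if (o->ban != NULL)
		o->ban->refcount--;
	o->ban = ban_head;
	if (o->ban != NULL)
		o->ban->refcount++;
}

/* Release the object's reference, then reap at most one ban from the tail. */
static void
obj_destroy(struct obj *o)
{
	struct ban *t;

	if (o->ban != NULL) {
		o->ban->refcount--;
		o->ban = NULL;
	}
	t = ban_tail;
	if (t != NULL && t != ban_head && t->refcount == 0) {
		ban_tail = t->newer;
		ban_tail->older = NULL;
		free(t);
		ban_count--;
	}
}
```

Reaping a single ban per call bounds the work done on each object destruction, at the price of unreferenced bans lingering until later calls catch up.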
phk [Fri, 30 May 2008 21:39:56 +0000 (21:39 +0000)]
Back in the mists of time, the SocketWizards were so happy to get
any connections at all, that they didn't even consider that maybe
connect(2) should have a timeout argument.
Add TCP_connect() which addresses this shortcoming, using what I
believe is a widely supported workaround.
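The commit does not say which workaround TCP_connect() uses. One common approach, shown here purely as an assumption, is a non-blocking connect(2) followed by poll(2) and a SO_ERROR check. The function name connect_with_timeout() and its signature are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical sketch: connect socket s to sa, giving up after msec
 * milliseconds. Returns 0 on success, -1 on failure with errno set.
 * On failure the socket may be left non-blocking; the caller is
 * expected to close it.
 */
static int
connect_with_timeout(int s, const struct sockaddr *sa, socklen_t len, int msec)
{
	struct pollfd pfd;
	socklen_t l;
	int flags, err, n;

	/* Switch to non-blocking so connect(2) returns immediately. */
	flags = fcntl(s, F_GETFL, 0);
	if (flags < 0 || fcntl(s, F_SETFL, flags | O_NONBLOCK) < 0)
		return (-1);

	if (connect(s, sa, len) == 0) {
		(void)fcntl(s, F_SETFL, flags);
		return (0);
	}
	if (errno != EINPROGRESS)
		return (-1);

	/* The socket becomes writable when the connect attempt finishes. */
	pfd.fd = s;
	pfd.events = POLLOUT;
	pfd.revents = 0;
	n = poll(&pfd, 1, msec);
	if (n == 0)
		errno = ETIMEDOUT;
	if (n <= 0)
		return (-1);

	/* Writable does not mean connected: fetch the real outcome. */
	l = sizeof err;
	if (getsockopt(s, SOL_SOCKET, SO_ERROR, &err, &l) < 0)
		return (-1);
	if (err != 0) {
		errno = err;
		return (-1);
	}
	(void)fcntl(s, F_SETFL, flags);	/* restore the original mode */
	return (0);
}
```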
phk [Mon, 26 May 2008 10:13:52 +0000 (10:13 +0000)]
Be more consistent about sockets and blocking/non-blocking mode:
Accept sockets are non-blocking, to avoid races where the client closes
before we get to accept it. (Spotted by: "chen xiaoyong")
Unfortunately, that means that session sockets inherit non-blocking mode,
which is the opposite of what we want in the worker thread but correct
for the acceptor thread.
We prefer to have the extra syscalls in the worker thread, even though
that complicates things a little bit.
Use ioctl(FIONBIO) instead of fcntl(2), which is surprisingly expensive.
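For illustration, toggling blocking mode with ioctl(FIONBIO) takes a single call, whereas the fcntl(2) route needs an F_GETFL followed by an F_SETFL. The helper name set_nonblocking() is hypothetical, and on some systems FIONBIO may live in a header other than <sys/ioctl.h>.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: on != 0 puts fd in non-blocking mode,
 * on == 0 puts it back in blocking mode.
 * Returns 0 on success, -1 on error (as ioctl(2) does).
 */
static int
set_nonblocking(int fd, int on)
{
	int i = on ? 1 : 0;

	return (ioctl(fd, FIONBIO, &i));
}
```

A worker thread that inherited a non-blocking session socket would call set_nonblocking(fd, 0) before doing blocking reads.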
des [Sun, 16 Mar 2008 14:11:26 +0000 (14:11 +0000)]
The value of HTTP_HDR_MAX is not visible to the preprocessor, so
(IOV_MAX < (HTTP_HDR_MAX * 2)) is equivalent to (IOV_MAX < (0 * 2)),
which obviously is never true. Fixes #222.
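The pitfall can be reproduced in a few lines. In an #if expression, any identifier that is not a macro is replaced by 0, so a constant defined as an enum member silently disappears from the comparison. The names HDR_MAX_ENUM, LIMIT and PP_CHECK_FIRED below are hypothetical stand-ins, not the real Varnish definitions.

```c
#include <assert.h>

/* Hypothetical stand-in for a constant the preprocessor cannot see. */
enum { HDR_MAX_ENUM = 32 };

/* Hypothetical stand-in for a macro limit such as IOV_MAX. */
#define LIMIT 16

/* cpp evaluates this as (16 < (0 * 2)), which is false. */
#if LIMIT < (HDR_MAX_ENUM * 2)
#define PP_CHECK_FIRED 1
#else
#define PP_CHECK_FIRED 0
#endif

/* The compiler proper sees the enum and evaluates (16 < 64). */
static int
runtime_check(void)
{
	return (LIMIT < (HDR_MAX_ENUM * 2));
}
```

Here PP_CHECK_FIRED ends up 0 while runtime_check() returns 1, which is the disagreement the commit describes.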
phk [Wed, 12 Mar 2008 14:07:08 +0000 (14:07 +0000)]
Further revamp the CLI handling in the cacher process, making it
possible for various modules to add cli functions so they can
be manipulated on the fly.
CLI_AddFuncs() registers a set of CLI functions. We operate
with three lists: the ones not shown in "help" because the
manager already showed them, the normal ones and the debug
commands which are also not shown in a plain "help".
Move the registration of cli functions out to the code they
belong in: VCL, BAN and VCA.
Give VCA a real Init function, and have the cli function ("start")
initiate the acceptor thread which listens for incoming connections.
des [Tue, 11 Mar 2008 09:48:27 +0000 (09:48 +0000)]
VSL_H_Print() prints a 'b' in the third column for backend-related log
entries, a 'c' for client-related log entries, and a ' ' for everything
else (CLI Ping for instance). This makes it hard to process logs with
awk or similar tools, since the number of columns varies. Therefore,
change the character used for non-backend-or-client log entries to '-'.
phk [Mon, 10 Mar 2008 21:21:15 +0000 (21:21 +0000)]
This is slightly experimental:
Reduce SHM mutex contention further, by only holding the lock over
reservation of space, and doing the copying from the worker thread
buffer to the shm buffer after we let go of the mutex.
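The reserve-then-copy idea can be sketched as follows. This is a simplified illustration under stated assumptions: the names log_buf, log_next, log_mtx and log_flush() are hypothetical, and the sketch ignores wrap-around and the question of how readers learn that a record is complete.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define LOG_SIZE 4096			/* hypothetical buffer size */

static unsigned char log_buf[LOG_SIZE];	/* stands in for the shm buffer */
static size_t log_next;			/* next free offset */
static pthread_mutex_t log_mtx = PTHREAD_MUTEX_INITIALIZER;

/*
 * Copy len bytes from a thread-private buffer into the shared log.
 * Returns the offset the data landed at, or -1 if it does not fit.
 */
static int
log_flush(const void *src, size_t len)
{
	size_t off;

	/* Hold the mutex only long enough to reserve the space. */
	pthread_mutex_lock(&log_mtx);
	if (len > LOG_SIZE - log_next) {
		pthread_mutex_unlock(&log_mtx);
		return (-1);
	}
	off = log_next;
	log_next += len;
	pthread_mutex_unlock(&log_mtx);

	/* The reserved range is ours alone, so copy without the lock. */
	memcpy(log_buf + off, src, len);
	return ((int)off);
}
```

Because each thread writes only into the range it reserved, the copies of different threads can overlap in time without corrupting each other.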
des [Sun, 9 Mar 2008 15:14:04 +0000 (15:14 +0000)]
Add -k option which specifies the number of log entries to keep. Along with
-s, this allows varnishlog to be used to extract part of a log file, or
partition a log file into smaller sections.
des [Sat, 8 Mar 2008 15:42:23 +0000 (15:42 +0000)]
If it looks like a new request starts before a previous request on the same
fd has finished, flush the previous request with an additional line to note
that the request was interrupted.
This is usually a symptom of the child dying midway through the first
request.
phk [Sat, 23 Feb 2008 20:36:33 +0000 (20:36 +0000)]
The expiry module keeps all cached objects on two data structures:
the LRU list and the binary heap. In both of these cases, operations
on one object will result in certain fields in neighbor objects in
these data structures being updated.
Unlike cache_hash.c, which examines objects related by hash match,
where the existence of the hash lookup in the first place is a
predictor for their likely use, in cache_expire the neighbor
objects are totally unrelated and the fact that we update their
list pointers or binheap index in no way indicates that they will
get a cache hit any time soon.
Paging in one page for a number of objects, just to move another
object up or down the binheap or LRU list is thus not only slow,
but also increases Varnish's VM footprint for no real benefit.
This commit moves the relevant housekeeping fields into an "objexp"
structure, which gets hung off the objects when they enter the cache.
The objexp structure is small (40 bytes on i386) so statistically it
is more than an order of magnitude more likely to already be in core
when we need it, compared to the object itself.
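The split can be pictured roughly like this. The field names and sizes are hypothetical, chosen only to show the shape of the idea: the small structure carries everything the LRU list and binheap touch, so shuffling neighbors never dereferences the large object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct object;

/* Hypothetical small housekeeping record, touched by LRU/binheap updates. */
struct objexp {
	struct object	*obj;		/* back pointer to the big object */
	double		timer_when;	/* when the object expires */
	unsigned	heap_idx;	/* position in the binary heap */
	struct objexp	*lru_prev;	/* LRU list neighbors */
	struct objexp	*lru_next;
};

/* Hypothetical large object, only touched on an actual cache hit. */
struct object {
	struct objexp	*exp;
	char		payload[4096];
};

/* Hang a fresh objexp off the object as it enters the cache. */
static struct objexp *
objexp_attach(struct object *o, double when)
{
	struct objexp *oe = calloc(1, sizeof *oe);

	if (oe == NULL)
		return (NULL);
	oe->obj = o;
	oe->timer_when = when;
	o->exp = oe;
	return (oe);
}
```

Many such records fit in one page, which is where the claimed improvement in the odds of already being in core comes from.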