phk [Wed, 21 Jan 2009 23:00:43 +0000 (23:00 +0000)]
Enforce a minimum ttl for "hit for pass" objects to prevent a value
of zero from serializing access to an object with very draconian
backend cache-control headers.
We could get far even with a one second TTL, but following our general
"there is a reason people put Varnish there in the first place" logic
we use the default_ttl parameter (default: 120 s) for this value.
If another value is desired, this can be set in vcl_fetch, even if it
looks somewhat counter-intuitive:
    sub vcl_fetch {
        if (obj.http.set-cookie) {
            set obj.ttl = 10s;
            pass;
        }
    }
phk [Mon, 19 Jan 2009 11:45:03 +0000 (11:45 +0000)]
Add a new parameter "purge_hash" which defaults to "off".
Only save the hash-string in the session workspace and objects when
this parameter is set to "on".
For sites with many small objects, this will save significant VM.
When this parameter is set to "off", the "purge.hash" facility will
not work, but this should not be a problem, because the new purging
facility allows much more expressive purging, the typical case
being:
Now, why would you want to purge on request headers and not object headers?
Simple: some information is not stored in the object, the Host: header
being a good example.
Assuming that the Host: header is part of the hash we use to lookup
an object (as is the default), we can avoid copying that field into
the object (saving memory: O(nObjects)) by using the request value
to purge against.
phk [Wed, 14 Jan 2009 20:40:54 +0000 (20:40 +0000)]
After HSH_Lookup() returns NULL indicating a busy object, we diddled
the session a bit to transfer the per-request stats to the session
counters with SES_Charge().
Not only was it inconsistent to charge accounting data in the middle
of a request, it was also illegal because after the hash lock was
released we no longer owned the session.
Once a system is under sufficient load that there is a queue for the
CPU, a race could happen where upon hitting a busy object, the hash lock
was released, another thread would schedule, finish the busy object,
start the sessions on the waiting list, finish off the request we had
and then, when we got the CPU again and accessed the session, it was gone.
The previous commit (r3512) eliminated the need to call SES_Charge,
this commit removes the (optional) shmlog message inside the hash lock
thus, hopefully, eliminating the race that caused #418.
phk [Wed, 14 Jan 2009 20:28:27 +0000 (20:28 +0000)]
Originally we shaved 64 bytes off the session by keeping the current
request's accounting stats in the worker thread.
For reasons which will be explained in the next commit, this is no
longer a good idea, and this commit moves these counters from
the worker thread to the session at a slight but all in all
trivial cost in memory footprint.
Remove the call to SES_Charge() when we hit a busy object, it is
not necessary to clean the worker thread counters here now.
Move these counters from the worker thread to the see
phk [Sat, 10 Jan 2009 22:11:26 +0000 (22:11 +0000)]
If we get more HTTP headers than we have room for (default: 28) we
used to ignore the rest.
This is not a bright solution if crucial HTTP headers like
"Content-Length" or "Transfer-Encoding" are last and get ignored.
In general, it is highly suspect to randomly ignore HTTP headers,
as opposed to deliberately ignoring them, either by having first
looked at them and found them uninteresting, or by having looked
for the headers we care about, and having not matched some others.
Make too many headers a firm error condition: 400 if they come from the
client, and 503 (like every other trouble) if they come from the backend.
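The policy above can be sketched in a few lines of C. This is only an illustration, not the Varnish code: the names hdr_table, hdr_dissect and hdr_overflow_status are hypothetical, and the slot count is fixed at the default of 28 mentioned in this entry. The point is that running out of slots is reported to the caller instead of silently dropping the remaining headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_HDRS 28     /* the default mentioned above; hypothetical constant */

enum hdr_source { FROM_CLIENT, FROM_BACKEND };

/* Hypothetical fixed-size header table. */
struct hdr_table {
    const char *start[MAX_HDRS];
    size_t      len[MAX_HDRS];
    unsigned    n;
};

/*
 * Split a block of "\r\n"-terminated header lines into the table.
 * Returns 0 on success and -1 if there are more lines than slots,
 * instead of quietly ignoring the surplus headers.
 */
int
hdr_dissect(struct hdr_table *t, const char *buf)
{
    const char *eol;

    assert(t != NULL && buf != NULL);
    t->n = 0;
    while (*buf != '\0') {
        eol = strstr(buf, "\r\n");
        if (eol == NULL)
            eol = buf + strlen(buf);
        if (eol == buf)
            break;              /* empty line ends the header block */
        if (t->n == MAX_HDRS)
            return (-1);        /* too many headers: firm error */
        t->start[t->n] = buf;
        t->len[t->n] = (size_t)(eol - buf);
        t->n++;
        buf = (*eol == '\0') ? eol : eol + 2;
    }
    return (0);
}

/* Map the overflow to a status code, depending on who sent the headers. */
int
hdr_overflow_status(enum hdr_source src)
{
    return (src == FROM_CLIENT ? 400 : 503);
}
```

A caller would check the return value of hdr_dissect() and, on -1, answer with hdr_overflow_status() for the side it was reading from.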
phk [Thu, 18 Dec 2008 11:30:02 +0000 (11:30 +0000)]
Change the logic that decides when we attempt EOF fetches from the
backend.
The new logic is:
    if (HEAD) /* happens only on pass */
        do not fetch body.
    else if (Content-Length)
        fetch body according to length
    else if (chunked)
        fetch body as chunked
    else if (other transfer-encoding)
        fail
    else if (Connection: keep-alive)
        fetch no body, set Length = 0
    else if (Connection: close)
        fetch body until EOF
    else if (HTTP < 1.1)
        fetch body until EOF
    else
        fetch no body, set Length = 0
Let me know if this breaks anything that should work.
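The decision chain above translates almost line for line into a C function. The sketch below is illustrative only: struct resp_info, enum body_mode and pick_body_mode are hypothetical names, and the boolean fields stand in for whatever header inspection the real code performs. The order of the tests is the part that matters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum body_mode {
    BODY_NONE,      /* fetch no body, Length = 0 */
    BODY_LENGTH,    /* fetch according to Content-Length */
    BODY_CHUNKED,   /* fetch as chunked */
    BODY_EOF,       /* fetch until the backend closes */
    BODY_FAIL       /* unsupported transfer-encoding */
};

/* Hypothetical summary of what was seen in the request and response. */
struct resp_info {
    bool is_head;               /* request was HEAD (only on pass) */
    bool has_content_length;
    bool te_chunked;
    bool te_other;              /* some other Transfer-Encoding */
    bool conn_keepalive;
    bool conn_close;
    bool http_lt_11;            /* backend spoke HTTP < 1.1 */
};

/* Apply the tests in the same order as the list above. */
enum body_mode
pick_body_mode(const struct resp_info *r)
{
    assert(r != NULL);
    if (r->is_head)
        return (BODY_NONE);
    if (r->has_content_length)
        return (BODY_LENGTH);
    if (r->te_chunked)
        return (BODY_CHUNKED);
    if (r->te_other)
        return (BODY_FAIL);
    if (r->conn_keepalive)
        return (BODY_NONE);
    if (r->conn_close)
        return (BODY_EOF);
    if (r->http_lt_11)
        return (BODY_EOF);
    return (BODY_NONE);
}
```

Note that Content-Length wins over "Connection: close", and that an HTTP/1.1 response with none of these headers is treated as having no body.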
phk [Wed, 3 Dec 2008 10:49:34 +0000 (10:49 +0000)]
Add preliminary version of lock-less tree based lookup (see below)
Enable SHA256 digests by default, and put it in the objhead. This
increases the size of the objhead by 32 bytes, but may drop
a bit again later, when other now unnecessary fields go away.
Test SHA256 for correct operation on startup.
About the "critbit" lookup:
To enable this, use the "-hcritbit" argument.
"Crit Bit" trees are also known under various other names; the original
version of the idea is probably the PATRICIA tree.
The basic concept is a tree structure which has nodes only where necessary
to tell the indices apart.
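To make the basic concept concrete, here is a minimal, single-threaded crit-bit tree over fixed-length keys. It is a textbook-style sketch and not the Varnish implementation: it takes no account of locking, it keeps branch nodes separate from the leaves, and all names (cb_node, cb_insert, cb_contains, cb_branches, KEYLEN) are made up for the example. Each branch node records only the index of the first bit at which the keys below it differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define KEYLEN 4    /* bytes per key; a SHA256 digest would need 32 */

struct cb_node {
    int             is_leaf;
    unsigned        bit;          /* branch: index of the bit it tests */
    struct cb_node  *child[2];    /* branch: subtree for bit value 0 / 1 */
    unsigned char   key[KEYLEN];  /* leaf: the stored key */
};

/* Value (0 or 1) of bit number `bit` in key, counting from the MSB. */
static int
cb_bit(const unsigned char *key, unsigned bit)
{
    return ((key[bit / 8] >> (7 - bit % 8)) & 1);
}

/* Follow branch nodes down to the only leaf that could hold key. */
static struct cb_node *
cb_walk(struct cb_node *n, const unsigned char *key)
{
    while (!n->is_leaf)
        n = n->child[cb_bit(key, n->bit)];
    return (n);
}

int
cb_contains(struct cb_node *root, const unsigned char *key)
{
    if (root == NULL)
        return (0);
    return (memcmp(cb_walk(root, key)->key, key, KEYLEN) == 0);
}

static struct cb_node *
cb_new_leaf(const unsigned char *key)
{
    struct cb_node *n = calloc(1, sizeof *n);

    assert(n != NULL);
    n->is_leaf = 1;
    memcpy(n->key, key, KEYLEN);
    return (n);
}

/* Returns 1 if key was inserted, 0 if it was already present. */
int
cb_insert(struct cb_node **rootp, const unsigned char *key)
{
    struct cb_node *best, *br, **pp;
    unsigned bit;

    if (*rootp == NULL) {
        *rootp = cb_new_leaf(key);
        return (1);
    }
    /* Find the first bit where key differs from its closest match. */
    best = cb_walk(*rootp, key);
    for (bit = 0; bit < KEYLEN * 8; bit++)
        if (cb_bit(best->key, bit) != cb_bit(key, bit))
            break;
    if (bit == KEYLEN * 8)
        return (0);
    /* Descend past branches that test earlier bits, then splice in. */
    pp = rootp;
    while (!(*pp)->is_leaf && (*pp)->bit < bit)
        pp = &(*pp)->child[cb_bit(key, (*pp)->bit)];
    br = calloc(1, sizeof *br);
    assert(br != NULL);
    br->bit = bit;
    br->child[cb_bit(key, bit)] = cb_new_leaf(key);
    br->child[!cb_bit(key, bit)] = *pp;
    *pp = br;
    return (1);
}

/* Number of branch nodes; always one less than the number of keys. */
unsigned
cb_branches(const struct cb_node *n)
{
    if (n == NULL || n->is_leaf)
        return (0);
    return (1 + cb_branches(n->child[0]) + cb_branches(n->child[1]));
}
```

With n keys stored there are exactly n - 1 branch nodes, which is the sense in which the tree has nodes "only where necessary". The sketch never frees nodes and is not safe for concurrent use.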
Our version of it has some additional bells and whistles.
First, lookups do not require any locks until we reach the objhead
we were looking for, or until we need to insert one which wasn't
there.
Second, the branch nodes are part of the objhead. As all but the
very first will need one, this saves malloc operations big time.
Now the down-sides:
There are still missing bits, amongst these the "cooling off" list
for objheads that have been dereferenced but whose branch node
has not. Currently we just leak that memory.
There is a race relating to node deref and unlocked lookup that is
not closed, weird things may happen until I fix it.
I'd be interested to hear how long it survives before it croaks,
but apart from that, would not advocate that you use it, until
I fix those remaining issues.
phk [Tue, 2 Dec 2008 20:48:11 +0000 (20:48 +0000)]
Fix an embarrassing bug in my Flexlinting of this code yesterday, and
add a couple of test-vectors to avoid it happening again.
And now for the funny and educational story:
In July of 1994, I added the "libmd" to FreeBSD, containing the
MD2, MD4 and MD5 functions from RFC 1319, RFC 1186 and RFC 1321.
I meticulously replicated the test-vectors from the RFCs, so that
"make test" would validate the result.
During the intervening 14 years, various slight shifts and adjustments
to things like the make(1) program's defaults, the shared library
resolution algorithm and other totally unrelated things, meant that
"make test" now tests the installed version of the library, rather
than the version you just built with "make all".
Needless to say, when I tested my patch yesterday, I didn't install
the built version, wanting first to hear what Colin Percival, FreeBSD
Security Wiz, generally swell fella and the guy who wrote this
SHA256 implementation, thought of these "stylistic" patches.
phk [Tue, 25 Nov 2008 12:02:10 +0000 (12:02 +0000)]
Try to get the endianness optimization working, by including an assortment
of possibly relevant headers and only go with the fast path if we have
credible information that this is a big-endian platform.
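The idea of taking the fast path only on credible evidence can be sketched as follows. This is an assumption-laden illustration, not the code from this commit: it relies on the __BYTE_ORDER__ and __ORDER_BIG_ENDIAN__ macros predefined by GCC and Clang rather than on system headers, and the names SKETCH_BIG_ENDIAN and sketch_be32dec_vect are hypothetical. If the macros are missing, the portable byte-by-byte path is used, which is correct on any platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Only claim big-endian when the compiler says so explicitly. */
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define SKETCH_BIG_ENDIAN 1
#else
#define SKETCH_BIG_ENDIAN 0
#endif

/*
 * Decode len bytes of big-endian data into 32-bit words.
 * len must be a multiple of 4.
 */
void
sketch_be32dec_vect(uint32_t *dst, const unsigned char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    assert(len % 4 == 0);
#if SKETCH_BIG_ENDIAN
    /* Fast path: memory layout already matches, copy verbatim. */
    memcpy(dst, src, len);
#else
    /* Portable path: assemble each word from its four bytes. */
    size_t i;
    const unsigned char *p;

    for (i = 0; i < len / 4; i++) {
        p = src + 4 * i;
        dst[i] = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
            ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    }
#endif
}
```

Defaulting to the slow path when nothing is known costs a little speed on an unrecognized big-endian platform, but never produces a wrong digest.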
phk [Tue, 25 Nov 2008 08:37:34 +0000 (08:37 +0000)]
When we receive an If-Modified-Since on an ESI object, do not process the conditional
for the child objects and pretend to send a 304 reply for them, if we have decided to
deliver the main object.