Released: Stormpot 2.2
I have released version 2.2 of Stormpot, my Java object pooling library, to the Maven Central repositories. You can add version 2.2 of Stormpot as a dependency in Maven like this:
<dependency>
  <groupId>com.github.chrisvest</groupId>
  <artifactId>stormpot</artifactId>
  <version>2.2</version>
</dependency>
This version is fully backwards compatible with 2.1 and 2.0. It brings some incremental performance improvements, and also improves the behaviour of the pool when it works with allocators that might periodically fail. I'd like to thank Simon Gibbs for helping out with the latter.
Stormpot 2.2 has seen some incremental performance improvements, guided by a new object pool benchmark suite I've written. The results are plotted on the chart below. Hover the mouse over the individual data-points to see their value:
These results were generated by the "semisuite.sh" benchmark on a late 2013 15" Retina MacBook Pro with a 4 core (8 hyper-threaded) 2.3 GHz i7 Haswell CPU, using a snapshot release of JMH (rev. 516:e43d8ad0152c) and Java version 1.7.0_51. Note that the thread count steps by 2 beyond the 4 thread mark. It appears that each of the hyper-threaded virtual cores adds about half a real core's worth of performance. Pretty good if you ask me. Another thing to note about this benchmark is that the costs of allocation, deallocation and checking the expiration of objects have all been set to zero - no cost - so it is exclusively measuring the performance of claiming and releasing pooled objects. In the benchmarks I had done previously, I also included the time spent by the various pools' respective default expiration mechanisms. This turned out to heavily penalise my BlazePool, because the default expiration mechanism in Stormpot is time based, and reading out the time ended up dominating the performance measured by the benchmark.
The pools are implemented with BlockingQueues as a backing data structure. Previously, the specific BlockingQueue implementation they used was always the LinkedBlockingQueue, but now the pools will pick the LinkedTransferQueue instead, if it is available on the classpath, which is the case on Java 7 and greater. It is the use of the LinkedTransferQueue that is the source of the large relative performance boost that QueuePool received in Stormpot 2.2. BlazePool also uses the LinkedTransferQueue, if available, but the effect is not nearly as pronounced because of the ThreadLocal caching that it does. Going to the queue is still the slow-path in BlazePool.
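The classpath check can be sketched like this. Note that this is a hypothetical helper illustrating the idea, not Stormpot's actual code:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class QueueFactory {
    // Prefer LinkedTransferQueue when the runtime has it (Java 7 and
    // greater), otherwise fall back to LinkedBlockingQueue.
    @SuppressWarnings("unchecked")
    static <T> BlockingQueue<T> createQueue() {
        try {
            Class<?> cls = Class.forName("java.util.concurrent.LinkedTransferQueue");
            return (BlockingQueue<T>) cls.getConstructor().newInstance();
        } catch (Exception notAvailable) {
            return new LinkedBlockingQueue<T>();
        }
    }
}
```

Doing the lookup reflectively keeps the library compatible with Java 6 class files while still benefiting from the faster queue on newer runtimes.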
Speaking of the BlazePool slow-path: another minor performance boost was had by extracting the slow-path into its own method. This both increases the likelihood of the fast-path being inlined, and reduces the number of instructions it compiles to, in turn reducing instruction cache pressure.
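The shape of that refactoring looks roughly like this. This is a simplified sketch of the technique, not BlazePool's actual claim code:

```java
class Pool {
    private Object cached = new Object();

    // The fast-path is kept small so the JIT is more likely to inline
    // it at call sites; the rarely-taken branch is delegated away.
    Object claim() {
        Object obj = cached;
        if (obj != null) {
            cached = null;
            return obj;
        }
        return claimSlowPath();
    }

    // Extracted slow-path: its bytecode no longer counts against the
    // inlining budget of claim() itself.
    private Object claimSlowPath() {
        return new Object(); // stand-in for going to the live-queue
    }
}
```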
The most significant performance increase in BlazePool, however, has come from the work done towards reducing false sharing of the claim-state of the individual objects. It used to be the case that each individual slot in the pool had a reference to an AtomicInteger that held the state of that particular slot: dead, live, claimed, etc. The volatile int field of the AtomicInteger has now been embedded within the slot itself, and surrounded by padding. This way, no two slot states will (presumably) ever end up on the same cache line, thus reducing the potential contention when multiple threads change the state of different slots. It also means that each slot now takes up more memory: around 184 bytes, though the exact number depends on implementation details of the particular JVM the code runs on. A side effect of the reduced false sharing is that it reduces the run-to-run variance in the benchmark results.
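The padding technique can be sketched like so. The state constants and field layout here are illustrative, not Stormpot's actual slot implementation:

```java
class PaddedSlot {
    // Illustrative state values, not Stormpot's actual constants.
    static final int DEAD = 0, LIVE = 1, CLAIMED = 2;

    // Padding fields before and after the state, so that the states of
    // two adjacent slots are unlikely to land on the same 64-byte
    // cache line, even though the JVM controls the exact field layout.
    long p0, p1, p2, p3, p4, p5, p6;
    volatile int state = DEAD;
    long q0, q1, q2, q3, q4, q5, q6;
}
```

The padding fields are never read or written; they exist purely to put distance between the hot volatile fields of neighbouring slots, which is why each slot ends up costing considerably more memory.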
Eager retrying of failed allocations
When an allocation fails, Stormpot keeps the exception around so that it can bubble it out through a claim call. This is generally a good idea, because it means failures won't go unnoticed.
However, if an allocator experienced a failure that would last for some amount of time - like a network outage, for instance - then the pool could end up full of these exceptions. In other words, the exceptions occupy space in the pool that would otherwise have been used for the poolable objects. Then, when the transient failure heals, clients would have to go through the entire pool, popping out all the exceptions, and only then would the pool begin reallocating objects.
This behaviour, while sort of correct with regards to the API, is unreasonable. So in Stormpot 2.2, the pool now keeps track of failed allocations, and will keep trying to allocate, until it succeeds. That way, transient failures will only leave the pool full of exceptions while they persist, and the pool will heal itself when the transient failure heals - even if the pool isn't being used.
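The idea behind the eager retry can be sketched as a loop in the background allocator thread. This is a hypothetical illustration of the behaviour, not Stormpot's actual code, and a real implementation would back off between attempts:

```java
class AllocatorLoop {
    interface Allocator<T> {
        T allocate() throws Exception;
    }

    // Keep retrying a failed allocation until it succeeds, so the pool
    // heals itself once a transient failure (e.g. a network outage)
    // passes - even if nobody is claiming objects from the pool.
    static <T> T allocateWithRetry(Allocator<T> allocator) {
        for (;;) {
            try {
                return allocator.allocate();
            } catch (Exception transientFailure) {
                // Failure noted; try again. A real implementation
                // would back off here instead of spinning.
            }
        }
    }
}
```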
Interruption no longer hinders shutdown
Pools allocate objects in a dedicated background thread. When the pool was shut down, this thread would get signaled with an interrupt, and that's how the shut down process begins. However, it was possible for the allocator to catch the interruption signal, and eat it so that the allocator thread never got to observe it. If this happened, the shut down process would never start, yet someone somewhere had gotten hold of a Completion object, and was now waiting for the shut down process to complete.
This bug has been fixed, such that the allocator thread now also checks the shut-down flag in every iteration of its work. This way, it will never miss the shutdown signal, even though it might still miss the interrupt itself.
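The fix amounts to pairing the interrupt with a flag that is checked on every loop iteration. A minimal sketch of the pattern, with hypothetical names:

```java
class AllocatorTask implements Runnable {
    private volatile boolean shutDownRequested;

    void shutDown(Thread allocatorThread) {
        shutDownRequested = true;    // set the flag first...
        allocatorThread.interrupt(); // ...then nudge the thread awake
    }

    @Override
    public void run() {
        // Checking the flag every iteration means a swallowed
        // interrupt can no longer make the thread miss the signal.
        while (!shutDownRequested) {
            doOneUnitOfWork();
        }
    }

    private void doOneUnitOfWork() {
        // allocate or deallocate objects; an allocator implementation
        // might catch and eat the InterruptedException in here
    }
}
```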
Exceptions now bubble out through claim at most once
BlazePool uses a ThreadLocal variable to cache the last successfully claimed object. This is a tremendous performance boon, but it has some tricky and subtle implications.
When an object is claimed from BlazePool by a thread for the first time, it is removed from the queue of live objects (the live-queue) and saved in the thread local cache. When the object is released back into the pool, it is added back into the live-queue. The next time the thread wants to claim an object, it first checks the thread local cache, and if there's an available object in it, claims it.
However, when a thread attempts to claim its thread local object, another thread might have pulled it off the queue, concluded that it had expired, and sent it back for reallocation - and that reallocation might have failed. If this happens, the object gets replaced with the exception that failed the allocation. The idea is that the exception should bubble out through claim, so that user code is notified that the allocator isn't quite well at the moment. If a thread is doing a normal, non-thread local claim, then the exception bubbles out and the object is sent back to the allocator thread for reallocation. But if the exception bubbled out through a claim from the thread local cache, then we can't send it back to the allocator, because we haven't removed it from the live-queue.
Thus far this is fine, but during the thread local claim, the exception was bubbled out like usual without clearing the thread local cache. This meant that a thread could call claim over and over again, and keep having the same exact exception pop out every time.
This has been fixed by no longer letting exceptions bubble out of the thread local claims. Instead, if there is an exception in the thread local cache, the pool just ignores it, clears the thread local cache, and falls back to the slow path of claiming from the live-queue.
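The fixed claim logic can be sketched like this. The field visibility and the stand-in slow path are for illustration only, not BlazePool's real structure:

```java
class CachingPool {
    // A cache entry is either a usable pooled object, or the exception
    // that failed its reallocation.
    final ThreadLocal<Object> cache = new ThreadLocal<>();

    Object claim() {
        Object cached = cache.get();
        if (cached instanceof Exception) {
            cache.remove(); // clear the poisoned entry, don't rethrow
            cached = null;  // force the slow path
        }
        if (cached != null) {
            cache.remove();
            return cached;
        }
        return claimFromLiveQueue();
    }

    Object claimFromLiveQueue() {
        return "object-from-live-queue"; // stand-in for the real slow path
    }
}
```

Because the poisoned entry is cleared, repeated claims can no longer pop out the same exact exception over and over.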
Expirations are now allowed to throw exceptions
Previously, the documentation for the Expiration interface said that hasExpired was not allowed to throw any exceptions, and that bad things could happen if it did. There was no point to this restriction, however, and it has been lifted. When an Expiration throws an exception, it is taken to mean that the object has expired.
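The new rule amounts to treating a throw the same as a `true` return. A sketch with a simplified, hypothetical version of the interface:

```java
class ExpirationCheck {
    // Simplified stand-in for Stormpot's Expiration interface.
    interface Expiration<T> {
        boolean hasExpired(T info) throws Exception;
    }

    // As of 2.2, an exception thrown by hasExpired is taken to mean
    // the object has expired, rather than being forbidden behaviour.
    static <T> boolean isExpired(Expiration<T> expiration, T info) {
        try {
            return expiration.hasExpired(info);
        } catch (Exception e) {
            return true; // a throwing expiration means "expired"
        }
    }
}
```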
A new Reallocator API
Object pools artificially extend the lifetime of objects. This makes the objects significantly more likely to be promoted to the tenured, or old, garbage collector generation. Assuming the objects expire at all, this promotion will cause a slow accretion of garbage in the old generation - garbage that can only be cleaned up with a costly full GC cycle.
To help mitigate this, where possible, pools can now take Reallocators instead of Allocators. A Reallocator allows object instances to be reused, when doing reallocation of expired objects. Reusing means that less garbage is being produced, in turn helping postpone the costly full GC cycle - or perhaps even eliminating it, if the program is restarted at regular intervals anyway.
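The idea behind reallocation can be sketched like this. The class and method names are illustrative and do not necessarily match Stormpot's exact Reallocator signatures:

```java
class PooledBuffer {
    final byte[] bytes = new byte[8192];
    boolean dirty;
}

class BufferReallocator {
    PooledBuffer allocate() {
        return new PooledBuffer();
    }

    // Reset and hand back the expired instance instead of allocating
    // a fresh one: no new object, no new garbage in the old generation.
    PooledBuffer reallocate(PooledBuffer expired) {
        expired.dirty = false;
        return expired;
    }
}
```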
A new TimeSpreadExpiration
The default expiration in previous versions was a TimeExpiration, where objects expire 10 minutes after they have been created. This is checked on every call to claim. When objects are quick to allocate and the pool is being used quite actively, this expiration policy could mean that all the objects in the pool expired at pretty much the same time. And then suddenly the pool is empty.
Stormpot 2.2 introduces a new default TimeSpreadExpiration, where the expiration of objects is spread uniformly between 8 to 10 minutes after they have been created. This significantly reduces the risk of mass expirations, at the cost of a slightly increased average expiration rate in the long run.
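The spreading itself is simple: pick each object's deadline uniformly at random within the window. A sketch of the arithmetic, not Stormpot's actual implementation:

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

class SpreadDeadline {
    // Pick an expiration deadline uniformly between 8 and 10 minutes
    // after creation, so objects allocated together don't all expire
    // at the same moment and empty the pool in one go.
    static long deadlineMillis(long createdAtMillis) {
        long lower = TimeUnit.MINUTES.toMillis(8);
        long upper = TimeUnit.MINUTES.toMillis(10);
        return createdAtMillis + ThreadLocalRandom.current().nextLong(lower, upper);
    }
}
```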
Stormpot 2.2 is fully backwards compatible with 2.1 and 2.0. I have moved the pool implementations from their respective packages into the common stormpot package, but left stubs in the old packages to preserve compatibility.
I'm quite pleased with this release. It can be summarised as better performance and better behaviour - particularly when the allocator is misbehaving.
As for the future, I think it's probably about time I looked at some kind of JMX integration. I also have ideas for making an even faster pool, however, at this point I'm not sure it's worth the effort to pursue that further. Stormpot is already faster and more scalable than any other general purpose open source object pool that I know of, by a huge margin.
One more thing: Stormpot now has a mailing list. Join and discuss anything related to object pooling: bugs, performance, etc.