Kettle and the Last Thread-Local

Antranig Basman antranig.basman at colorado.edu
Wed Aug 28 05:59:30 UTC 2013


Those not interested in these technical issues can breeze on by, but I wanted to summarise today's thinking 
whilst it is still fresh in my mind.

Summary is at the top this time, having learned from my 2011 ramble:

1. In theory in a "pure architecture", we would be able to remove all ThreadLocals from client code (Kettle) 
as well as from the core framework (Infusion)
2. In practice, because of error handling and debuggability concerns, we can't.





Last week I assigned Yura to http://issues.fluidproject.org/browse/KETTLE-16 , a simplification of Kettle's 
infrastructure that seemed possible following improvements in Infusion. In the improvements described in 
FLUID-4330 and others, I was successively able to remove uses of "ThreadLocals" wherever they occurred in 
our base framework, as a result of better record-keeping that allows us to associate components with 
their original instantiators. This simple lookup is now performed by the utility fluid.getInstantiator -

https://github.com/fluid-project/infusion/blob/master/src/framework/core/js/FluidIoC.js#L749

This is a framework internal and it is a top goal that all users of the framework should be able to get 
their work done without reference to it.

Before going any further I should give some notes about the use of the term "ThreadLocal", which in an 
environment without any threads, such as the browser or node.js, seems confusing and paradoxical. In 
environments which do have threads, for example our experimental early version of Kettle from 2008, based on 
Java Servlets and the Rhino JavaScript engine, this term was perfectly standard and accurate. In an 
environment with just one thread, something retaining part of the mission of the original ThreadLocal still 
makes sense - it serves the function of associating a value with a particular stack frame. This is still 
useful with one thread since, as in the new Kettle for example, multiple separate stack frames, linked as 
successive callbacks to sequences of asynchronous I/O, can be morally packaged together as part of the same 
"unit of work" - in this case, ultimately aimed at serving the same HTTP request.
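
As a rough sketch of this idea (this is not the implementation in Infusion or Kettle - the names "ambient", 
"withEnvironment" and "wrapCallback" are invented purely for illustration), a single-threaded "ThreadLocal" 
can be modelled as a global slot which is filled for the duration of one stack frame, and refilled whenever a 
wrapped callback belonging to the same unit of work is later invoked:

```javascript
// Hypothetical single-threaded "ThreadLocal": one global slot holding the
// environment of whichever unit of work is currently on the stack.
var ambient = { current: null };

// Run fn with env installed as the current environment, then put back
// whatever was there before. This sketch does not guard against fn throwing.
function withEnvironment(env, fn) {
    var previous = ambient.current;
    ambient.current = env;
    var result = fn();
    ambient.current = previous;
    return result;
}

// Capture the environment current at wrap time, and reinstate it whenever
// the callback eventually runs in some later, separate stack frame.
function wrapCallback(callback) {
    var env = ambient.current;
    return function () {
        var that = this, args = arguments;
        return withEnvironment(env, function () {
            return callback.apply(that, args);
        });
    };
}

// Two "requests", each registering a callback to be run later
var seen = [];
var pending = [];
["request-1", "request-2"].forEach(function (name) {
    withEnvironment({ request: name }, function () {
        pending.push(wrapCallback(function () {
            seen.push(ambient.current.request);
        }));
    });
});

// Simulate the I/O completing in the opposite order, outside any environment.
// Each callback still sees the environment of its own unit of work.
pending.reverse().forEach(function (callback) {
    callback();
});
```

The cost of this approach is that every point which hands a callback to asynchronous I/O has to remember to 
wrap it, which is the kind of "special activity" one would like to be rid of.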

In node.js, a native feature aimed at this same function is "Domains", described here - 
http://nodejs.org/api/domain.html . It may be useful in future to harmonise our system with this 
implementation, should it become the focus of useful infrastructure developed by others.
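
For comparison, here is a minimal sketch of how the same association might look using node's "domain" module. 
The calls domain.create, run and the process.domain property are part of that API; hanging a "request" 
property on the domain object, and the helpers "currentRequest" and "serve", are just conventions assumed for 
this example, not anything Kettle does today:

```javascript
var domain = require("domain");

// Look up the request belonging to whichever domain is currently active.
// "request" is an ad hoc property that this sketch attaches itself.
function currentRequest() {
    return process.domain ? process.domain.request : undefined;
}

// Run "work" inside a fresh domain associated with the given request
function serve(request, work) {
    var d = domain.create();
    d.request = request;
    d.on("error", function (err) {
        // Failures raised within this domain arrive here, with the
        // originating request still within reach
        request.failure = err.message;
    });
    d.run(work);
}

var observed = [];
serve({ url: "/demo" }, function () {
    // Synchronously inside the domain
    observed.push(currentRequest().url);
    // Timers created here are implicitly bound to the same domain, so the
    // lookup also works in the later, separate stack frame
    setTimeout(function () {
        observed.push(currentRequest().url);
    }, 0);
});
```

Outside of any domain, currentRequest() simply yields undefined.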

In the meantime, the framework's main use of ThreadLocal state is in maintaining what the IoC framework 
terms the "dynamic environment". The dynamic environment is pictured in the following diagram from our 
docs: http://wiki.fluidproject.org/display/docs/Demand+Resolution - it holds a number of IoC components 
which are expected to be "in scope for resolution", but only from the point of view of the current stack 
frame.


This is awkward, and from the point of view of architectural clarity it would be useful to do away with the 
dynamic environment entirely, which I have indeed been trying to do for a number of years now. It seemed 
that allowing the framework to take care of its own bookkeeping without using it would in theory pave the 
way for all client frameworks such as Kettle to act similarly. Unfortunately, now that all [*] uses in the 
core framework are gone, it seems that there are some special reasons why Kettle needs to stay the way it 
is, and why we can't in general withdraw the framework facility of the "dynamic environment".


"In theory", we could adopt the viewpoint that "anyone who needs to perform a request-related activity will 
have access to the request component". This would mean that no special activity would be required when 
propagating I/O callbacks - since the ultimate callback would hold a direct object reference to the 
original request component, which could then do its work (service and close the request). This was the 
kind of thinking that was vaguely behind the issuing of KETTLE-16 - the improvements in Infusion that it 
described would then be sufficient to let the IoC system recontextualise itself upon reentering code 
(invokers and listeners) attached to the request component.

Unfortunately, for a few reasons this doesn't seem to be practical - the most urgent of which is our 
approach to error handling. Some similarly rambling postings from 2011

http://lists.idrc.ocad.ca/pipermail/fluid-work/2011-June/007987.html and
http://lists.idrc.ocad.ca/pipermail/fluid-work/2011-May/007916.html

explain why exceptions - or in particular, try-catch blocks - should be considered unusable in the JavaScript 
language. A vital part of the infrastructure for stability of a Kettle application, then, is the following 
"uncaughtException" handler registered directly with node:

https://github.com/fluid-project/kettle/blob/master/lib/utils.js#L29-L42

The short version of this discussion is - this handler is vital, there is no way to support it without 
ThreadLocals or similar infrastructure - therefore, we can't get rid of them. The Fluid framework failure 
handler below it is covered by just the same reasoning.
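
To make the dependency concrete, here is a sketch of the shape of such a handler. It is not the code at the 
link above - "ambient", "withRequest", "onUncaughtException" and the "respond" method on the fake request are 
all invented for illustration. The point is only that by the time the handler runs, the stack has unwound, 
and some ambient record is the only route back to the request that was being serviced:

```javascript
// Hypothetical ambient slot naming the request currently being serviced
var ambient = { request: null };

// Mark "request" as current whilst fn runs. If fn throws, the slot is
// deliberately left set, so that a top-level handler can still find it.
function withRequest(request, fn) {
    ambient.request = request;
    fn();
    ambient.request = null;
}

// Top-level failure handler: no arguments other than the error are
// available, so the request has to be recovered from the ambient slot.
function onUncaughtException(err) {
    var request = ambient.request;
    ambient.request = null;
    if (request) {
        request.respond(500, err.message);
    } else {
        console.error("Failure outside any request: " + err.message);
    }
}

// In a real server this would be registered once at startup:
// process.on("uncaughtException", onUncaughtException);

// A stand-in for a request component, recording what was sent to the user
var fakeRequest = {
    response: null,
    respond: function (status, message) {
        this.response = { status: status, message: message };
    }
};

// Simulate a failure deep inside request processing. The try-catch here
// only stands in for node's own top-level dispatch to the handler.
try {
    withRequest(fakeRequest, function () {
        throw new Error("boom");
    });
} catch (e) {
    onUncaughtException(e);
}
```

After this runs, fakeRequest.response holds the 500 status and the error message, whereas code that had 
relied solely on passing the request by reference would have had nothing to respond to.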



The slightly longer version of this discussion, for anyone who dreamed of wanting such a thing, involves 
speculating about reasons why the server-side environment is so fundamentally different to the client-side 
in this way - since it still seems perfectly possible to abolish ThreadLocals from all client-side code. The 
reason I think relates to "user ergonomics" - the equivalent of the "HTTP request" on the server is simply a 
GUI event on the client - and in fact, "the same user" is behind every client-side event. However, on the 
server, the original request object is our only lifeline to the original user - should we lose our thread to 
it, we lose any ability to signal errors or other status to them. By contrast, losing a DOM "event" object 
on the client merely means losing some mostly irrelevant bookkeeping about the coordinates of the mouse 
pointer at a particular time, and what else the user was doing then. Should we need to catch up with the 
user later, they are behind 100% of our markup interface and will most likely be generating a further event 
in a few more ms.

This vital nature of the request object also carries through to thinking about debugging - this kind of 
issue led the node.js team, typically rather hard-nosed and not prone to take debuggability concerns very 
seriously, to invent the idea of their "Domains" in the first place. Time and again we will most likely find 
ourselves in some unfathomable callback nest in node.js, and without the ability to lay our hands on the 
original (node/express/kettle) "request component" ultimately responsible for the current stack frame, we 
will be quite in the dark about what the problem is about. Someday, no doubt, we will integrate this 
facility into whatever debugging tools we start applying for use with the Kettle architecture. On which 
issue, I'm happy to report that the previously apparently moribund "node-inspector" project appears to be 
springing to life again, under some active new management from the Czech Republic:

https://github.com/node-inspector/node-inspector



Summary is at top, having learned from Colin's response to my 2011 mail :)



[*] This is not quite true - one last use remains as part of the "source tracking" system in the 
ChangeApplier, described in JIRAs http://issues.fluidproject.org/browse/FLUID-4679 and 
http://issues.fluidproject.org/browse/FLUID-4633 - this should be removed, which would actually resolve 
FLUID-4679, which describes a case in which the current ThreadLocal system gives the wrong answer.


