Algorithms often require buffers for communication, need to store intermediate results, or need to pre-compute and store auxiliary data. These algorithms are often called in loops with similar or identical parameters but different data. Creating the resources an algorithm needs on every invocation is inefficient and can become a performance bottleneck. I would like to give users a means of reusing objects across multiple calls to algorithms.
The basic idea is to have a generic class that either creates the required objects for you or returns an existing object, if you already created it. Inefficient code like this:
    algorithm(T1 input, T2 output, parameters)
    {
        buffer mybuff(buffer_parameters);
        calculation(input, output, mybuff);
        // mybuff goes out of scope and is destroyed
    }

    for (...)
        algorithm(input, output, parameters);
creates a buffer object in each invocation of algorithm. This is oftentimes unnecessary. The buffer object could be reused in subsequent calls which would result in code such as:
    algorithm(T1 input, T2 output, parameters, workspace & ws)
    {
        buffer & mybuff = ws.get(name, buffer_parameters);
        calculation(input, output, mybuff);
        // mybuff is not destroyed
    }

    {
        workspace ws;
        for (...)
            algorithm(input, output, parameters, ws);
    } // workspace goes out of scope; all objects are destroyed
The name in the above code acts as an object identifier. If the same object should be used a second time it has to be accessed using the same object identifier.
More Features
Specifying only the name of an object as its identifier might not be enough. The following code could be ambiguous and might result in an error:
    buffer & mybuff = ws.get("mybuff", size(1024));
    // later call
    buffer & mybuff = ws.get("mybuff", size(2048));
When using mybuff after the later call, it might be expected that the buffer has a size of 2048. But the workspace object returns the object created before which is smaller. A simple workaround would be to always resize the buffer before it is used.
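The resize workaround can be sketched as follows. This is only an illustration under assumptions: a plain std::map of std::vector<char> stands in for the workspace, and get_resized is a hypothetical helper name, not part of the workspace class.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: returns the buffer stored under 'name', creating it
// on first use, and always resizes it to the requested size before use.
std::vector<char>& get_resized(std::map<std::string, std::vector<char>>& cache,
                               const std::string& name, std::size_t n)
{
    std::vector<char>& buf = cache[name]; // default-constructed (empty) if new
    buf.resize(n);                        // no-op when the size already matches
    return buf;
}
```

The same object is returned for both calls, but the caller can rely on it having the size it asked for. The price is a possible reallocation whenever the requested size grows.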
As an alternative, the workspace object can be instructed to consider not only the name but also selected constructor parameters when checking whether an object already exists. Wrapping parameters with the “arg” method marks them to be considered when resolving existing objects.
    buffer & mybuff = ws.get("mybuff", workspace::arg(size(1024)));
    // later call
    buffer & mybuff = ws.get("mybuff", workspace::arg(size(2048)));
This code will create two different buffer objects.
Implementation
The workspace holds a map of pointers to the objects it manages. The map’s key is an arbitrary name for the resource. In addition to the pointer, a functor that properly destroys the object referenced by the pointer is stored.
A call to get<T> will check whether an object with ‘name’ already exists. If not, the object is created from the supplied parameters, together with a function object that destroys it. In either case a reference to the object is returned.
When the workspace goes out of scope, its destructor iterates over all objects the workspace holds and destroys them in reverse order of creation.
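The description above can be condensed into a minimal sketch. It is not the code from the repository: the member names (index_, entries_, holder), the size() helper and the buffer stand-in are assumptions made for illustration, and a plain function pointer is used as the destroy functor where the real class may use std::function.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the buffer type used in the examples.
struct buffer {
    explicit buffer(std::size_t n) : data(n) {}
    std::vector<char> data;
};

class workspace {
public:
    workspace() = default;
    workspace(const workspace&) = delete;
    workspace& operator=(const workspace&) = delete;

    // Destroy the managed objects in reverse order of creation.
    ~workspace() {
        while (!entries_.empty()) entries_.pop_back();
    }

    // Return the object stored under 'name', creating it from 'args' if needed.
    // Assumption: a name is always requested with the same T (no type check).
    template <typename T, typename... Args>
    T& get(const std::string& name, Args&&... args) {
        auto it = index_.find(name);
        if (it == index_.end()) {
            // Type-erased destroy function, stored next to the pointer.
            void (*destroy)(void*) = [](void* p) { delete static_cast<T*>(p); };
            entries_.push_back(holder(new T(std::forward<Args>(args)...), destroy));
            it = index_.emplace(name, entries_.size() - 1).first;
        }
        return *static_cast<T*>(entries_[it->second].get());
    }

    std::size_t size() const { return entries_.size(); }

private:
    using holder = std::unique_ptr<void, void (*)(void*)>;
    std::map<std::string, std::size_t> index_; // name -> position in entries_
    std::vector<holder> entries_;              // kept in creation order
};
```

Because each object lives on the heap, references handed out by get stay valid while the workspace grows. With this sketch, `ws.get<buffer>("mybuff", 1024)` followed by `ws.get<buffer>("mybuff", 2048)` returns the same 1024-byte buffer twice, which is exactly the ambiguity discussed under More Features.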
I used some of the nifty new C++0x language features (variadic templates, std::function, and lambda functions) to implement the workspace class. I thought this was a good idea since the Final Draft International Standard was approved recently. I found the features to be very powerful as well as incredibly useful.
The argument-dependent lookup of objects (not to be confused with C++'s argument-dependent name lookup) is implemented in a crude way: the wrapped arguments are appended to the name string as characters. This should be sufficient for basic use and works well for all built-in types.
Arguments that are supposed to be added to the object identifier are detected because they are wrapped in a special_arg class. Partial template specialization and the unpack operator (parameter pack expansion) are used to append these arguments to the object identifier.
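One way this detection could look is sketched below. Only special_arg and arg are names from the text; key_part, make_key and the '/' separator are assumptions for illustration and may differ from the repository code. The separator is added so that a name ending in digits cannot collide with an appended number.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Wrapper that marks an argument as part of the object identifier.
template <typename T>
struct special_arg {
    T value;
};

template <typename T>
special_arg<T> arg(T value) { return special_arg<T>{value}; }

// Primary template: plain arguments do not contribute to the identifier.
template <typename T>
struct key_part {
    static void append(std::ostream&, const T&) {}
};

// Partial specialization: wrapped arguments are streamed into the identifier.
template <typename T>
struct key_part<special_arg<T>> {
    static void append(std::ostream& os, const special_arg<T>& a) {
        os << '/' << a.value;
    }
};

// Hypothetical helper that builds the identifier from the name and arguments.
template <typename... Args>
std::string make_key(const std::string& name, const Args&... args) {
    std::ostringstream os;
    os << name;
    // Pack expansion inside a braced initializer list runs left to right.
    int expand[] = {0, (key_part<Args>::append(os, args), 0)...};
    (void)expand;
    return os.str();
}
```

With this sketch, `make_key("mybuff", arg(1024))` and `make_key("mybuff", arg(2048))` yield different identifiers, while unwrapped arguments leave the name unchanged. It works for any type that can be streamed with operator<<, which covers the built-in types.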
The current implementation can be found in my workspace git repository.
Discussion
It might very well be that something like this already exists in a more elegant form that I am simply not aware of, or that I failed to recognize as a solution to this problem. For example, I’m aware of pool allocators, but I believe that what I propose is a more abstract concept. Buffers that are part of a workspace could still use a pool allocator.
Also, the code in the repository was written rather quickly, and I’m quite sure there are some rough edges where it is not as efficient as it could be (passing by reference comes to mind).