Currently, if the system is interrupted after logging a user in (and before logging them back out), we lose the original settings. We need to track all settings changes made to the system so that any given set can be restored, even if the machine is powered off and back on.
We need to investigate the tradeoff between "ultimate durability" and "technical sophistication" (that is, whether or not to build on a mature persistence technology). In theory, raw filesystem calls plus atomic file renames should give us the best durability, measured by the length of the window during which the system is vulnerable because it has started to enact settings that it has not yet committed to storage. That is, the sequence
i) invoke settings handler
ii) take the settings handler's "old" values and write them to disk, then wait for the write to complete
iii) once the disk write has completed, perform two file renames, each individually atomic (old file to "backup" name, then new file to the old file's name)
would seem to offer the shortest lag we can achieve between steps i) and iii), which is our "window of vulnerability". Beyond this, all we can do is concentrate on the settings handlers themselves, ensuring that they have few failure paths that can be taken after they have started to issue settings updates.
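As a rough illustration of steps ii) and iii), here is a minimal Python sketch. The function names (commit_settings, load_settings), the ".bak" backup suffix and the JSON encoding are all assumptions made for the example, not part of any existing design. Note that the two renames are atomic individually but not as a pair, so the reader has to fall back to the backup if a crash lands between them.

```python
import json
import os
import tempfile


def commit_settings(path, old_values):
    """Record old_values at path, keeping the previous file as a backup.

    Hypothetical helper: the name, the ".bak" suffix and the JSON format
    are illustrative assumptions.
    """
    directory = os.path.dirname(os.path.abspath(path))
    backup_path = path + ".bak"

    # Step ii: write to a temporary file in the same directory, so that the
    # later rename never crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".settings-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(old_values, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # the "wait": block until the data is on disk
    except BaseException:
        os.unlink(tmp_path)
        raise

    # Step iii: two renames, each atomic on its own.
    if os.path.exists(path):
        os.replace(path, backup_path)  # old file -> "backup" name
    os.replace(tmp_path, path)         # new file -> old file's name


def load_settings(path):
    """Return the committed record, or the backup if the main file is missing
    (for example after a crash between the two renames). None if neither exists."""
    for candidate in (path, path + ".bak"):
        try:
            with open(candidate) as f:
                return json.load(f)
        except FileNotFoundError:
            continue
    return None
```

The sketch only syncs the file contents. On POSIX systems, making the renames themselves survive a power loss may also require an fsync on the containing directory, which is left out here for brevity.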
This functionality overlaps a bit with some of our settings snapshot projects (