Confuse: Painless Configuration¶
Confuse is a straightforward, full-featured configuration system for Python.
Using Confuse¶
Set up your Configuration object, which provides unified access to all of your application’s config settings:
config = confuse.Configuration('MyGreatApp', __name__)
The first parameter is required; it’s the name of your application that
will be used to search the system for config files. The second parameter
is optional: it’s the name of a module that will guide the search for a
defaults file. Use this if you want to include a
config_default.yaml
file inside your package. (The included
example
package does exactly this.)
Now, you can access your configuration data as if it were a simple
structure consisting of nested dicts and lists—except that you need to
call the method .get()
on the leaf of this tree to get the result as
a value:
value = config['foo'][2]['bar'].get()
Under the hood, accessing items in your configuration tree builds up a
view into your app’s configuration. Then, get()
flattens this view
into a file, performing a search through each configuration data source
to find an answer. More on view later.
If you know that a configuration value should have a specific type, just
pass that type to get()
:
int_value = config['number_of_goats'].get(int)
This way, Confuse will either give you an integer or raise a
ConfigTypeError
if the user has messed up the configuration. You’re
safe to assume after this call that int_value
has the right type. If
the key doesn’t exist in any configuration file, Confuse will raise a
NotFoundError
. Together, catching these exceptions (both subclasses
of confuse.ConfigError
) lets you painlessly validate the user’s
configuration as you go.
View Theory¶
The Confuse API is based on the concept of views. You can think of a
view as a place to look in a config file: for example, one view might
say “get the value for key number_of_goats
”. Another might say “get
the value at index 8 inside the sequence for key animal_counts
”. To
get the value for a given view, you resolve it by calling the
get()
method.
This concept separates the specification of a location from the mechanism for retrieving data from a location. (In this sense, it’s a little like XPath: you specify a path to data you want and then you retrieve it.)
Using views, you can write config['animal_counts'][8]
and know that
no exceptions will be raised until you call get()
, even if the
animal_counts
key does not exist. More importantly, it lets you
write a single expression to search many different data sources without
preemtively merging all sources together into a single data structure.
Views also solve an important problem with overriding collections.
Imagine, for example, that you have a dictionary called
deliciousness
in your config file that maps food names to tastiness
ratings. If the default configuration gives carrots a rating of 8 and
the user’s config rates them a 10, then clearly
config['deliciousness']['carrots'].get()
should return 10. But what
if the two data sources have different sets of vegetables? If the user
provides a value for broccoli and zucchini but not carrots, should
carrots have a default deliciousness value of 8 or should Confuse just
throw an exception? With Confuse’s views, the application gets to decide.
The above expression, config['deliciousness']['carrots'].get()
,
returns 10 (falling back on the default). However, you can also write
config['deliciousness'].get()
. This expression will cause the
entire user-specified mapping to override the default one, providing a
dict object like {'broccoli': 7, 'zucchini': 9}
. As a rule, then,
resolve a view at the same granularity you want config files to override
each other.
Validation¶
We saw above that you can easily assert that a configuration value has a
certain type by passing that type to get()
. But sometimes you need
to do more than just type checking. For this reason, Confuse provides a
few methods on views that perform fancier validation or even
conversion:
as_filename()
: Normalize a filename, substituting tildes and absolute-ifying relative paths. The filename is relative to the source that provided it. That is, a relative path in a config file refers to the directory containing the config file. A relative path in the defaults refers to the application’s config directory (config.config_dir()
, as described below). A relative path from any other source (e.g., command-line options) is relative to the working directory.as_choice(choices)
: Check that a value is one of the provided choices. The argument should be a sequence of possible values. If the sequence is adict
, then this method returns the associated value instead of the key.as_number()
: Raise an exception unless the value is of a numeric type.as_pairs()
: Get a collection as a list of pairs. The collection should be a list of elements that are either pairs (i.e., two-element lists) already or single-entry dicts. This can be helpful because, in YAML, lists of single-element mappings have a simple syntax (- key: value
) and, unlike real mappings, preserve order.as_str_seq()
: Given either a string or a list of strings, return a list of strings. A single string is split on whitespace.
For example, config['path'].as_filename()
ensures that you get a
reasonable filename string from the configuration. And calling
config['direction'].as_choice(['up', 'down'])
will raise a
ConfigValueError
unless the direction
value is either “up” or
“down”.
Command-Line Options¶
Arguments to command-line programs can be seen as just another source for configuration options. Just as options in a user-specific configuration file should override those from a system-wide config, command-line options should take priority over all configuration files.
You can use the argparse and optparse modules from the standard
library with Confuse to accomplish this. Just call the set_args
method on any view and pass in the object returned by the command-line
parsing library. Values from the command-line option namespace object
will be added to the overlay for the view in question. For example, with
argparse:
args = parser.parse_args()
config.set_args(args)
Correspondingly, with optparse:
options, args = parser.parse_args()
config.set_args(options)
This call will turn all of the command-line options into a top-level source in your configuration. The key associated with each option in the parser will become a key available in your configuration. For example, consider this argparse script:
config = confuse.Configuration('myapp')
parser = argparse.ArgumentParser()
parser.add_argument('--foo', help='a parameter')
args = parser.parse_args()
config.set_args(args)
print(config['foo'].get())
This will allow the user to override the configured value for key
foo
by passing --foo <something>
on the command line.
Overriding nested values can be accomplished by passing dots=True and have dot-delimited properties on the incoming object.:
parser.add_argument('--bar', help='nested parameter', dest='foo.bar')
args = parser.parse_args() # args looks like: {'foo.bar': 'value'}
config.set_args(args, dots=True)
print(config['foo']['bar'].get())
parse_args works with generic dictionaries too.:
args = {
'foo': {
'bar': 1
}
}
config.set_args(args, dots=True)
print(config['foo']['bar'].get())
Note that, while you can use the full power of your favorite
command-line parsing library, you’ll probably want to avoid specifying
defaults in your argparse or optparse setup. This way, Confuse can use
other configuration sources—possibly your
config_default.yaml
—to fill in values for unspecified
command-line switches. Otherwise, the argparse/optparse default value
will hide options configured elsewhere.
Search Paths¶
Confuse looks in a number of locations for your application’s
configurations. The locations are determined by the platform. For each
platform, Confuse has a list of directories in which it looks for a
directory named after the application. For example, the first search
location on Unix-y systems is $XDG_CONFIG_HOME/AppName
for an
application called AppName
.
Here are the default search paths for each platform:
- OS X:
~/.config/app
and~/Library/Application Support/app
- Other Unix:
$XDG_CONFIG_HOME/app
and~/.config/app
- Windows:
%APPDATA%\app
where the APPDATA environment variable falls back to%HOME%\AppData\Roaming
if undefined
Users can also add an override configuration directory with an
environment variable. The variable name is the application name in
capitals with “DIR” appended: for an application named AppName
, the
environment variable is APPNAMEDIR
.
Your Application Directory¶
Confuse provides a simple helper, Configuration.config_dir()
, that
gives you a directory used to store your application’s configuration. If
a configuration file exists in any of the searched locations, then the
highest-priority directory containing a config file is used. Otherwise,
a directory is created for you and returned. So you can always expect
this method to give you a directory that actually exists.
As an example, you may want to migrate a user’s settings to Confuse from an older configuration system such as ConfigParser. Just do something like this:
config_filename = os.path.join(config.config_dir(),
confuse.CONFIG_FILENAME)
with open(config_filename, 'w') as f:
yaml.dump(migrated_config, f)
Dynamic Updates¶
Occasionally, a program will need to modify its configuration while it’s running. For example, an interactive prompt from the user might cause the program to change a setting for the current execution only. Or the program might need to add a derived configuration value that the user doesn’t specify.
To facilitate this, Confuse lets you assign to view objects using
ordinary Python assignment. Assignment will add an overlay source that
precedes all other configuration sources in priority. Here’s an example
of programmatically setting a configuration value based on a DEBUG
constant:
if DEBUG:
config['verbosity'] = 100
...
my_logger.setLevel(config['verbosity'].get(int))
This example allows the constant to override the default verbosity level, which would otherwise come from a configuration file.
Assignment works be creating a new “source” for configuration data at
the top of the stack. This new source takes priority over all other,
previously-loaded sources. You can cause this explicitly by calling the
set()
method on any view. A related method, add()
, works
similarly but instead adds a new lowest-priority source to the bottom
of the stack. This can be used to provide defaults for options that may
be overridden by previously-loaded configuration files.
YAML Tweaks¶
Confuse uses the PyYAML module to parse YAML configuration files. However, it deviates very slightly from the official YAML specification to provide a few niceties suited to human-written configuration files. Those tweaks are:
- All strings are returned as Python Unicode objects.
- YAML maps are parsed as Python OrderedDict objects. This means that you can recover the order that the user wrote down a dictionary.
- Bare strings can begin with the % character. In stock PyYAML, this will throw a parse error.
To produce a YAML file reflecting a configuration, just call
config.dump()
. If you supply a filename, the YAML will be written to the
file; otherwise, a string is returned. This does not cleanly round-trip YAML,
but it does play some tricks to preserve comments and spacing in the original
file.
Configuring Large Programs¶
One problem that must be solved by a configuration system is the issue of global configuration for complex applications. In a large program with many components and many config options, it can be unwieldy to explicitly pass configuration values from component to component. You quickly end up with monstrous function signatures with dozens of keyword arguments, decreasing code legibility and testability.
In such systems, one option is to pass a single Configuration object through to each component. To avoid even this, however, it’s sometimes appropriate to use a little bit of shared global state. As evil as shared global state usually is, configuration is (in my opinion) one valid use: since configuration is mostly read-only, it’s relatively unlikely to cause the sorts of problems that global values sometimes can. And having a global repository for configuration option can vastly reduce the amount of boilerplate threading-through needed to explicitly pass configuration from call to call.
To use global configuration, consider creating a configuration object in a well-known module (say, the root of a package). But since this object will be initialized at module load time, Confuse provides a LazyConfig object that loads your configuration files on demand instead of when the object is constructed. (Doing complicated stuff like parsing YAML at module load time is generally considered a Bad Idea.)
Global state can cause problems for unit testing. To alleviate this, consider adding code to your test fixtures (e.g., setUp in the unittest module) that clears out the global configuration before each test is run. Something like this:
config.clear()
config.read(user=False)
These lines will empty out the current configuration and then re-load the defaults (but not the user’s configuration files). Your tests can then modify the global configuration values without affecting other tests since these modifications will be cleared out before the next test runs.
Redaction¶
You can also mark certain configuration values as “sensitive” and avoid including them in output. Just set the redact flag:
config['key'].redact = True
Then flatten or dump the configuration like so:
config.dump(redact=True)
The resulting YAML will contain “key: REDACTED” instead of the original data.