API Reference
Configuration
- class yaml_provenance.ProvenanceConfig(category_hierarchy=None, on_conflict='raise', track_history=False, custom_type_handlers=None, conflict_resolver=None)[source]
Configuration for provenance tracking behavior.
- Parameters:
category_hierarchy (list or None) – Ordered list of category names from lowest to highest priority. Default: [None] (single level, no hierarchy enforcement).
on_conflict (str) – What to do when two values at the same hierarchy level conflict. One of "raise", "warn", or "ignore". Default: "raise".
track_history (bool) – Whether to keep the full provenance history. When False (default), provenance lists have at most 1 element for minimal overhead.
custom_type_handlers (dict or None) – Mapping of {type: callable(value, provenance) -> wrapped} for types that cannot be dynamically subclassed (e.g. custom Date classes).
conflict_resolver (callable or None) – A callback (key, old_val, new_val, old_prov, new_prov) -> action for custom conflict resolution. Return "raise", "keep_old", "keep_new", or "ignore". If None, uses the default behavior based on on_conflict.
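As an illustration of the conflict_resolver callback shape described above, here is a minimal sketch. The function name resolve_conflict and the "debug_" key rule are hypothetical; only the argument order and the four return strings come from the parameter description.

```python
# Hypothetical resolver following the documented callback shape:
# (key, old_val, new_val, old_prov, new_prov) -> action
ALLOWED_ACTIONS = {"raise", "keep_old", "keep_new", "ignore"}


def resolve_conflict(key, old_val, new_val, old_prov, new_prov):
    """Return one of the four documented action strings."""
    if old_val == new_val:
        # Identical values are not a real conflict.
        return "ignore"
    if key.startswith("debug_"):
        # Assumed project rule: debug settings may be overridden freely.
        return "keep_new"
    # Everything else falls back to the strict behavior.
    return "raise"


# The provenance arguments are unused in this sketch, so None is passed.
debug_action = resolve_conflict("debug_level", 1, 2, None, None)
strict_action = resolve_conflict("nproc", 4, 8, None, None)
same_action = resolve_conflict("nproc", 4, 4, None, None)
```

Such a function would presumably be passed as ProvenanceConfig(conflict_resolver=resolve_conflict); the sketch itself does not import the library.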
- yaml_provenance.configure(config=None)[source]
Set the module-level default ProvenanceConfig.
- Parameters:
config (ProvenanceConfig or None) – The configuration to use as default. If None, resets to default.
Core Classes
- class yaml_provenance.Provenance(provenance_data, track_history=True)[source]
A subclass of list where each element represents the provenance of a value at a point in its history. Supports both full history tracking and lightweight mode (at most 1 element).
- Parameters:
provenance_data (list or dict) – List of provenance elements, or a single provenance element.
track_history (bool) – If False, the list keeps at most 1 element (current provenance only). Default: True (full history).
- add_modified_by(provenance_step, func, modified_by='modified_by')[source]
Adds a modified_by entry to the given provenance step.
- Parameters:
provenance_step (dict) – Provenance entry of the current step.
func (str) – Function triggering this method.
modified_by (str) – Name of the key for labelling the type of modification.
- Returns:
provenance_step – The provenance step with the modified_by item added.
- Return type:
dict
- append_last_step_modified_by(func)[source]
Copies the last element in the provenance history and adds the entry modified_by with value func to the copy.
In lightweight mode, updates the single element in place instead of appending a copy.
- Parameters:
func (str) – Function that is modifying the variable.
- extend_and_modified_by(additional_provenance, func)[source]
Extends the current provenance history with additional_provenance.
In lightweight mode, replaces the single element instead of extending.
- Parameters:
additional_provenance (Provenance) – Additional provenance history.
func (str) – Function triggering this method.
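The difference between full-history and lightweight mode can be sketched with a small stand-in class. ProvenanceSketch is not the library's implementation, and the "line"/"col" keys of a provenance step are assumptions made for illustration.

```python
import copy


class ProvenanceSketch(list):
    """Stand-in list subclass mimicking the two documented modes."""

    def __init__(self, steps, track_history=True):
        if isinstance(steps, dict):
            steps = [steps]        # a single provenance element is accepted
        if not track_history:
            steps = steps[-1:]     # lightweight mode: keep the current step only
        super().__init__(steps)
        self.track_history = track_history

    def append_last_step_modified_by(self, func):
        if not self:
            return
        if self.track_history:
            step = copy.deepcopy(self[-1])    # copy the last step, then append
            step["modified_by"] = func
            self.append(step)
        else:
            self[-1]["modified_by"] = func    # update the single element in place


origin = {"line": 3, "col": 5}  # hypothetical step contents
full = ProvenanceSketch(dict(origin), track_history=True)
light = ProvenanceSketch(dict(origin), track_history=False)
full.append_last_step_modified_by("resolve_paths")
light.append_last_step_modified_by("resolve_paths")
```

In this sketch the full-history list grows by one element per modification, while the lightweight list always holds exactly one.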
- class yaml_provenance.DictWithProvenance(dictionary, provenance, config=None)[source]
A dictionary subclass that tracks provenance for all nested values.
Features:
- Recursively transforms leaf values into provenance-aware objects
- Extends __setitem__ to preserve provenance history
- Optionally enforces category hierarchy when configured
- Extends update to preserve provenance history
- Parameters:
dictionary (dict) – The dictionary to wrap with provenance.
provenance (dict) – Provenance data with matching structure to dictionary.
config (ProvenanceConfig or None) – Configuration. If None, uses the module-level default.
- get_provenance(index=-1)[source]
Returns a dictionary of provenance information with matching structure.
- Parameters:
index (int) – Index into the provenance history. Default: -1 (last/current).
- Returns:
Provenance dictionary.
- Return type:
dict
- put_provenance(provenance)[source]
Recursively transforms every value into its WithProvenance object with corresponding provenance from the provenance dict (1-to-1 mapping).
- Parameters:
provenance (dict) – Provenance dict with same keys as self.
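The "matching structure" requirement means the data and the provenance are walked in parallel, key by key and element by element. The sketch below shows that traversal with plain tuples instead of WithProvenance objects; pair_with_provenance and the "line"/"col" keys are hypothetical.

```python
def pair_with_provenance(data, provenance):
    """Walk two same-shaped structures and pair each leaf with its provenance."""
    if isinstance(data, dict):
        return {key: pair_with_provenance(value, provenance[key])
                for key, value in data.items()}
    if isinstance(data, list):
        return [pair_with_provenance(value, prov)
                for value, prov in zip(data, provenance)]
    # Leaf value: the provenance at the same position belongs to it.
    return (data, provenance)


data = {"model": {"name": "ocean", "levels": [10, 20]}}
prov = {
    "model": {
        "name": {"line": 2, "col": 9},
        "levels": [{"line": 4, "col": 7}, {"line": 5, "col": 7}],
    }
}
paired = pair_with_provenance(data, prov)
```

A missing key in the provenance structure raises KeyError here, which makes a shape mismatch visible early.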
- class yaml_provenance.ListWithProvenance(mylist, provenance, config=None)[source]
A list subclass that tracks provenance for all nested values.
- Parameters:
mylist (list) – The list to wrap with provenance.
provenance (list) – Provenance data with matching structure to mylist.
config (ProvenanceConfig or None) – Configuration. If None, uses the module-level default.
- get_provenance(index=-1)[source]
Returns a list of provenance information with matching structure.
- Parameters:
index (int) – Index into the provenance history. Default: -1 (last/current).
- Returns:
Provenance list.
- Return type:
list
- put_provenance(provenance)[source]
Recursively transforms every element into its WithProvenance object with corresponding provenance (1-to-1 mapping).
- Parameters:
provenance (list) – Provenance list with same length as self.
Wrapper Factory
- yaml_provenance.wrapper_with_provenance_factory(value, provenance=None)[source]
Factory function that creates provenance-aware wrappers for any value type.
For subclassable types, dynamically creates a {Type}WithProvenance subclass. For bool and NoneType (which cannot be subclassed), returns special wrapper instances. For types registered in config.custom_type_handlers, delegates to the registered handler.
- Parameters:
value (any) – Value to wrap with provenance.
provenance (any) – The provenance information.
- Returns:
The value wrapped with provenance tracking.
- Return type:
object
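The dynamic-subclass idea can be sketched with the built-in type() call. This is an illustration of the general technique under stated assumptions, not the library's code: wrap_sketch, UnsubclassableWrapper, and the class cache are hypothetical names, and custom type handlers are left out.

```python
_CLASS_CACHE = {}


class UnsubclassableWrapper:
    """Holds bool or None, which Python does not allow as base classes."""

    def __init__(self, value, provenance=None):
        self.value = value
        self.provenance = provenance

    def __bool__(self):
        return bool(self.value)


def wrap_sketch(value, provenance=None):
    if value is None or isinstance(value, bool):
        return UnsubclassableWrapper(value, provenance)
    base = type(value)
    if base not in _CLASS_CACHE:
        # e.g. int -> "IntWithProvenance", created once per base type
        name = f"{base.__name__.capitalize()}WithProvenance"
        _CLASS_CACHE[base] = type(name, (base,), {})
    wrapped = _CLASS_CACHE[base](value)
    wrapped.provenance = provenance
    return wrapped


number = wrap_sketch(42, [{"line": 1, "col": 8}])
text = wrap_sketch("ocean")
flag = wrap_sketch(False)
```

Because the wrapped number is still an int, arithmetic and comparisons keep working; the provenance simply rides along as an attribute.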
YAML Loader
- yaml_provenance.load_yaml(filepath, category_resolver=None, config=None)[source]
Convenience function to load a YAML file with provenance tracking.
- Parameters:
filepath (str or Path) – Path to the YAML file.
category_resolver (callable or None) – Maps file paths to (category, subcategory) tuples.
config (ProvenanceConfig or None) – Configuration for provenance tracking.
- Returns:
The loaded data with provenance.
- Return type:
- class yaml_provenance.ProvenanceLoader(category_resolver=None, config=None)[source]
High-level YAML loader that produces DictWithProvenance objects.
- Parameters:
category_resolver (callable or None) – A function (filepath: str) -> (category, subcategory) that maps file paths to categories. Default: returns (None, None).
config (ProvenanceConfig or None) – Configuration for provenance tracking. If None, uses module default.
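A category_resolver only has to map a path to a (category, subcategory) pair. The sketch below assumes a hypothetical directory layout of the form <root>/<category>/<subcategory>/<file>.yaml; the category names and the function name are invented for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical category names; a real project would define its own.
KNOWN_CATEGORIES = {"defaults", "machines", "components", "setups"}


def resolve_category(filepath):
    """Return (category, subcategory) for a path, or (None, None)."""
    parts = PurePosixPath(filepath).parts
    for i, part in enumerate(parts[:-1]):  # skip the file name itself
        if part in KNOWN_CATEGORIES:
            # A directory directly below the category is the subcategory.
            sub = parts[i + 1] if i + 1 < len(parts) - 1 else None
            return part, sub
    return None, None


component = resolve_category("configs/components/ocean/ocean.yaml")
machine = resolve_category("configs/machines/hpc.yaml")
unknown = resolve_category("misc/notes.yaml")
```

Returning (None, None) for unrecognized paths mirrors the documented default behavior of the loader's resolver.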
- class yaml_provenance.ProvenanceConstructor(*args, **kwargs)[source]
A YAML constructor that captures provenance (line, column) for every node.
Instead of returning plain values, returns (data, (line, col)) tuples. These can then be split into a data dict and a provenance dict for use with DictWithProvenance.
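The split into a data structure and a provenance structure might look like the sketch below. It assumes a simplified shape in which only leaves are (data, (line, col)) tuples and containers are plain dicts and lists; the constructor's real output may nest differently, and split_annotated and the "line"/"col" keys are hypothetical.

```python
def split_annotated(node):
    """Split a tree of (data, (line, col)) leaves into (data, provenance)."""
    if isinstance(node, dict):
        data, prov = {}, {}
        for key, child in node.items():
            data[key], prov[key] = split_annotated(child)
        return data, prov
    if isinstance(node, list):
        pairs = [split_annotated(child) for child in node]
        return [d for d, _ in pairs], [p for _, p in pairs]
    # Leaf: unpack the value and its position.
    value, (line, col) = node
    return value, {"line": line, "col": col}


annotated = {
    "name": ("ocean", (1, 6)),
    "levels": [(10, (3, 4)), (20, (4, 4))],
}
plain, positions = split_annotated(annotated)
```

The two results have the same shape, which is what DictWithProvenance expects from its dictionary and provenance arguments.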
YAML Dumper
- yaml_provenance.dump_yaml(config, filepath=None, stream=None)[source]
Dump a provenance-tracked config to YAML with end-of-line provenance comments.
Each scalar value is annotated with an end-of-line comment showing the source file, line, and column where the value originated. Values added programmatically (without provenance) receive a # no provenance comment.
Output priority: stream > filepath > stdout.
- Parameters:
config (DictWithProvenance or ListWithProvenance) – The provenance-tracked configuration to dump.
filepath (str or Path or None) – Destination file path. Used when stream is not given. If both are None, output goes to stdout.
stream (file-like or None) – An output stream (e.g. StringIO). Takes priority over filepath. Useful for testing or in-memory processing.
Examples
>>> from yaml_provenance import load_yaml, dump_yaml
>>> cfg = load_yaml("config.yaml")
>>> dump_yaml(cfg)  # to stdout
>>> dump_yaml(cfg, filepath="out.yaml")  # to file
>>> from io import StringIO
>>> buf = StringIO()
>>> dump_yaml(cfg, stream=buf)
>>> print(buf.getvalue())
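The output priority stream > filepath > stdout can be sketched independently of the dumper. The helper write_output below is hypothetical and only illustrates the selection order; it writes a ready-made string rather than producing YAML.

```python
import sys
from io import StringIO


def write_output(text, filepath=None, stream=None):
    """Write text following the priority stream > filepath > stdout."""
    if stream is not None:
        stream.write(text)
        return "stream"
    if filepath is not None:
        with open(filepath, "w", encoding="utf-8") as handle:
            handle.write(text)
        return "filepath"
    sys.stdout.write(text)
    return "stdout"


buf = StringIO()
# Both targets are given: the stream wins and the file is never opened.
used = write_output("nproc: 4\n", filepath="unused.yaml", stream=buf)
fallback = write_output("nproc: 4\n")
```

Checking the stream first keeps tests free of temporary files, which matches the note above that the stream option is useful for testing.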
Exceptions
- class yaml_provenance.CategoryConflictError(message, key=None, old_val=None, new_val=None, category=None, old_provenance=None, new_provenance=None)[source]
Raised when two values at the same category hierarchy level conflict.
- key
The conflicting key.
- Type:
str
- old_val
The existing value.
- Type:
any
- new_val
The new value that conflicts.
- Type:
any
- category
The category at which the conflict occurs.
- Type:
str
- old_provenance
Provenance of the existing value.
- Type:
list
- new_provenance
Provenance of the new value.
- Type:
list
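The attributes listed above make it possible to build a readable report when a conflict is caught. The sketch uses a stand-in exception class with the same attribute names so that it runs without the library; ConflictErrorSketch and describe are hypothetical.

```python
class ConflictErrorSketch(Exception):
    """Stand-in exposing the attribute names documented above."""

    def __init__(self, message, key=None, old_val=None, new_val=None,
                 category=None, old_provenance=None, new_provenance=None):
        super().__init__(message)
        self.key = key
        self.old_val = old_val
        self.new_val = new_val
        self.category = category
        self.old_provenance = old_provenance
        self.new_provenance = new_provenance


def describe(err):
    """Summarize a conflict from the exception attributes."""
    return (f"{err.key!r} in category {err.category!r}: "
            f"{err.old_val!r} vs {err.new_val!r}")


try:
    raise ConflictErrorSketch("conflicting values", key="nproc",
                              old_val=4, new_val=8, category="components")
except ConflictErrorSketch as err:
    report = describe(err)
```

With the real exception, the same describe function would presumably work unchanged, since it relies only on the documented attribute names.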
Helpers
- yaml_provenance.clean_provenance(data)[source]
Recursively strips provenance from data, returning plain Python objects.
- Parameters:
data (any) – Mapping or values with provenance.
- Returns:
Values in their original format without provenance.
- Return type:
any
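Recursive stripping can be sketched as converting every subclass instance back to its built-in base type. This is one possible approach under the assumption that wrapped values are subclasses of built-ins; strip_sketch and the two wrapper classes are hypothetical, and the special bool/None wrappers are not handled.

```python
class StrWithProv(str):
    """Hypothetical wrapped string."""


class IntWithProv(int):
    """Hypothetical wrapped integer."""


def strip_sketch(data):
    """Return plain built-in objects for a nested structure."""
    if isinstance(data, dict):
        return {strip_sketch(k): strip_sketch(v) for k, v in data.items()}
    if isinstance(data, list):
        return [strip_sketch(v) for v in data]
    for base in (int, float, str):
        # bool is a subclass of int and must be left untouched.
        if isinstance(data, base) and type(data) is not bool:
            return base(data)
    return data


wrapped = {StrWithProv("levels"): [IntWithProv(10), IntWithProv(20)], "on": True}
plain = strip_sketch(wrapped)
```

The result compares equal to the input, but every value is an ordinary dict, list, int, or str again.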
- yaml_provenance.keep_provenance_in_recursive_function(func)[source]
Decorator for recursive functions to preserve provenance through value transformations.
The decorated function should accept (tree, rhs, *args, **kwargs) where rhs is the value being processed. The decorator:
- Temporarily disables custom_setitem on rhs if applicable
- Runs the function
- Preserves/extends provenance from rhs to the output
- Parameters:
func (callable) – The function to decorate.
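The decorator's last two steps can be sketched as follows. The custom_setitem handling is omitted, only string outputs are re-wrapped, and every name here (keep_provenance_sketch, StrWithProv, expand, the "line"/"col" keys) is hypothetical rather than taken from the library.

```python
import functools


class StrWithProv(str):
    """Hypothetical wrapped string carrying a provenance list."""
    provenance = None


def keep_provenance_sketch(func):
    @functools.wraps(func)
    def wrapper(tree, rhs, *args, **kwargs):
        output = func(tree, rhs, *args, **kwargs)  # run the function
        prov = getattr(rhs, "provenance", None)
        if prov and isinstance(output, str):
            # Copy the last step and record which function modified the value.
            last = dict(prov[-1])
            last["modified_by"] = func.__name__
            output = StrWithProv(output)
            output.provenance = list(prov) + [last]
        return output
    return wrapper


@keep_provenance_sketch
def expand(tree, rhs):
    # str.replace returns a plain str, so provenance would be lost here.
    return rhs.replace("${user}", tree["user"])


rhs = StrWithProv("/home/${user}/run")
rhs.provenance = [{"line": 7, "col": 3}]
result = expand({"user": "ada"}, rhs)
```

Without the decorator, the value returned by expand would be a plain string with no link back to the line it came from.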