Staged Processing

# Motivation

Many privacy-enhancing transformations require multiple stages. Generalizing attributes, for example, requires us to first define a generalization hierarchy; only in a second step can we apply this hierarchy to the data items. We therefore need to process items in stages.

## Examples:

- Generalization hierarchy:
  - Stage 1:
    - Analyze value distribution in items.
  - Stage 2:
    - Generalize items with the given distribution.

- k-Anonymity:
  - Stage 1:
    - Analyze attribute frequencies.
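The generalization example above can be sketched as two passes over the same items. This is only an illustration under simplifying assumptions: the function names `analyze` and `generalize`, the `min_count` threshold, and the `"*"` placeholder are hypothetical and not part of Kodex, and replacing rare values with a placeholder stands in for a real generalization hierarchy.

```python
from collections import Counter


def analyze(items, attribute):
    """Stage 1: count how often each value of `attribute` occurs."""
    return Counter(item[attribute] for item in items)


def generalize(items, attribute, distribution, min_count, placeholder="*"):
    """Stage 2: replace values seen fewer than `min_count` times."""
    generalized = []
    for item in items:
        value = item[attribute]
        if distribution[value] < min_count:
            value = placeholder  # hypothetical stand-in for a hierarchy lookup
        generalized.append({**item, attribute: value})
    return generalized


items = [{"city": "Berlin"}, {"city": "Berlin"}, {"city": "Bonn"}]

# Stage 2 cannot start before stage 1 has seen every item.
distribution = analyze(items, "city")
generalized = generalize(items, "city", distribution, min_count=2)
```

The point of the sketch is the ordering constraint: the second pass needs the complete result of the first, which is why items have to be buffered between stages.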

# Implementation Proposal

To enable such staged processing, we plan to make the following additions to the Kodex stream processing mechanisms:

- Add a numerical `stage` attribute to the `Config` model.
- Add a `Batch` model that stores information about the processing of a given stage for a number of items. 
- Add an internal buffering mechanism (using internal channels) that enables us to buffer items for multi-stage processing.
- Make the `group store` functionality currently implemented in the anonymization/aggregation action available to all actions as a means to perform distributed, parallel computation on data items.
- Change the scheduler to enable staged processing of data items.
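To make the interplay of these additions more concrete, here is a rough sketch of how a staged run could look. It is not the Kodex implementation: the fields of `Batch`, the `run_staged` function, and the stage signature `(item, batch, previous_batches)` are assumptions made for illustration, and a plain list stands in for the proposed channel-based buffer.

```python
from dataclasses import dataclass, field


@dataclass
class Batch:
    """Hypothetical record of what one stage learned about a set of items."""
    stage: int
    item_count: int = 0
    state: dict = field(default_factory=dict)


def run_staged(items, stages):
    """Run each stage over the whole buffer before starting the next one."""
    buffer = list(items)  # stand-in for the internal buffering mechanism
    batches = []
    for number, stage in enumerate(stages, start=1):
        batch = Batch(stage=number)
        next_buffer = []
        for item in buffer:
            # A stage may read the batches of all earlier stages.
            next_buffer.append(stage(item, batch, batches))
            batch.item_count += 1
        batches.append(batch)
        buffer = next_buffer
    return buffer, batches


def count_values(item, batch, previous):
    """Stage 1: record the value distribution, pass the item through."""
    batch.state[item["age"]] = batch.state.get(item["age"], 0) + 1
    return item


def suppress_rare(item, batch, previous):
    """Stage 2: drop values that stage 1 saw only once."""
    counts = previous[0].state
    age = item["age"] if counts[item["age"]] > 1 else None
    return {**item, "age": age}


result, batches = run_staged(
    [{"age": 30}, {"age": 30}, {"age": 41}],
    [count_values, suppress_rare],
)
```

In this sketch the scheduler only advances to the next stage once the current one has consumed the entire buffer, and each `Batch` carries the per-stage results that later stages depend on.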
