Review usage of the executor_id #23977

@hzxa21


executor_id is constructed as actor_id << 32 | operator_id. This has two implications:

  1. executor_id is unique within a single RW cluster, but it is not necessarily unique across different RW clusters.
  2. After refactor: use in-memory actor info for meta operations #23528, the same actor id may be assigned to different jobs after a cluster recovery/restart. This means executor_id is not unique across cluster recovery/restart either.
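To make the two points above concrete, here is a minimal sketch of the id layout. The function names and the u32 parameter types are assumptions for illustration and are not taken from the RisingWave code base; only the actor_id << 32 | operator_id layout comes from the description above.

```rust
/// Packs an actor id and an operator id into one 64-bit executor id,
/// following the `actor_id << 32 | operator_id` layout.
/// Hypothetical helper: the name and the `u32` inputs are assumptions.
fn executor_id(actor_id: u32, operator_id: u32) -> u64 {
    // Widen before shifting so the actor id lands in the upper 32 bits.
    ((actor_id as u64) << 32) | (operator_id as u64)
}

/// Splits an executor id back into `(actor_id, operator_id)`.
fn split_executor_id(id: u64) -> (u32, u32) {
    // The upper half is the actor id; the cast truncates to the lower half.
    ((id >> 32) as u32, id as u32)
}
```

The id is a pure function of the two inputs and carries no cluster or job identity. Two clusters that each have an actor 1 running operator 2 compute the same value, and so does a single cluster that reuses actor id 1 for a different job after recovery.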

We need to review how executor_id is used and make sure each use is correct. For example, the StarRocks sink uses rw-txn-{executor_id}-{ts} as the txn id. If two RW clusters sink data to the same SR cluster, their txn ids can conflict with each other even when they sink to different target tables. This looks risky to me.
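The conflict can be sketched as follows. This is an illustration only, not the StarRocks sink code: the function names, the u64 types, and the meaning of ts are assumptions, and only the rw-txn-{executor_id}-{ts} shape comes from the text above.

```rust
/// Formats a txn id in the `rw-txn-{executor_id}-{ts}` shape.
/// Hypothetical helper: the name and the `u64` types are assumptions.
fn txn_id(executor_id: u64, ts: u64) -> String {
    format!("rw-txn-{}-{}", executor_id, ts)
}

/// Returns true when two independent clusters, each described as
/// `(executor_id, ts)`, would submit the same txn id to a shared SR cluster.
fn txn_ids_conflict(cluster_a: (u64, u64), cluster_b: (u64, u64)) -> bool {
    txn_id(cluster_a.0, cluster_a.1) == txn_id(cluster_b.0, cluster_b.1)
}
```

Nothing in the formatted string identifies the source cluster or the target table, so a conflict only requires that the two clusters share an executor_id and produce the same ts value.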

We also use executor_id in many sinks, including the file, MongoDB, Redshift, Snowflake, StarRocks, and MQTT sinks. We should review its usage outside of sinks, in other components, as well.
