API Design

Designed around 2024-07-13

Requirements

  1. Be able to render a diagram, whether or not the item exists.
  2. Framework should be able to determine if an ItemLocation from B is:
    1. The same as an ItemLocation from A.
    2. Nested within an ItemLocation from A.
    3. Completely different.
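
As a rough illustration of requirement 2, the comparison could be sketched as below. This is not the framework's API: the `ItemLocation` enum here is a simplified stand-in, `LocationRelation` and `relate` are hypothetical names, and each location is assumed to be expressed as its chain of ancestors from outermost to innermost.

```rust
/// Simplified stand-in for the framework's `ItemLocation` (assumption).
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum ItemLocation {
    Group(String),
    Host(String),
    Path(String),
}

/// How a location from item B relates to a location from item A.
/// Hypothetical type, named here only for illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum LocationRelation {
    /// B's location is the same as A's.
    Same,
    /// B's location is nested within A's.
    NestedWithin,
    /// Neither of the above.
    Different,
}

/// Compares two locations, each given as a chain of ancestors ordered
/// from outermost to innermost.
///
/// Simplification: if A is nested within B (the reverse direction),
/// this sketch reports `Different`.
pub fn relate(a: &[ItemLocation], b: &[ItemLocation]) -> LocationRelation {
    if a == b {
        LocationRelation::Same
    } else if b.len() > a.len() && b.starts_with(a) {
        // B shares all of A's ancestors and goes at least one level deeper.
        LocationRelation::NestedWithin
    } else {
        LocationRelation::Different
    }
}
```

Comparing whole ancestor chains, rather than only the innermost location, is what lets two identical paths on different hosts be told apart.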

Information Sources

Known information before / during deployment:

  1. Flow edges -- Edge::Contains / Edge::Logic. Though maybe we want to reduce this to just Edge::Logic.
  2. Flow item params that are specified.

Missing information before / during deployment:

  1. State generated / produced / discovered through execution.
  2. Parameters calculated from state.

Desired Outcome

What it should look like.

  1. Item returns a Vec<ItemInteraction>, where an ItemInteraction denotes a push/pull/within for a given source/dest.
  2. We need Item implementors to render a diagram with ParamsPartial, or with some generated values.
  3. However, partial params means Item implementors may not have the information to instantiate the ItemLocation::{group, host, path}s for the Item.
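
To make the tension between points 1 and 3 concrete, here is a minimal sketch. All types are simplified stand-ins and not the framework's definitions: the `source` / `dest` field names, `FileDownloadParamsPartial`, and `file_download_interactions` are assumptions made for illustration.

```rust
/// Simplified stand-in for the framework's `ItemLocation` (assumption).
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum ItemLocation {
    Group(String),
    Host(String),
    Path(String),
}

/// A push / pull / within for a given source / dest (hypothetical shape).
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum ItemInteraction {
    /// Data is pushed from `source` to `dest`.
    Push { source: Vec<ItemLocation>, dest: Vec<ItemLocation> },
    /// Data is pulled from `source` into `dest`.
    Pull { source: Vec<ItemLocation>, dest: Vec<ItemLocation> },
    /// Work happens within a single location.
    Within { location: Vec<ItemLocation> },
}

/// Hypothetical partial params: any field may be unknown.
pub struct FileDownloadParamsPartial {
    pub src: Option<String>,
    pub dest: Option<String>,
}

/// Returns the interactions for a file download, or `None` when the
/// partial params lack the information needed to build the locations.
pub fn file_download_interactions(
    params: &FileDownloadParamsPartial,
) -> Option<Vec<ItemInteraction>> {
    let src = params.src.as_ref()?;
    let dest = params.dest.as_ref()?;

    Some(vec![ItemInteraction::Pull {
        // Simplification: the whole URL stands in for the server location.
        source: vec![ItemLocation::Host(src.clone())],
        dest: vec![
            ItemLocation::Host("localhost".to_string()),
            ItemLocation::Path(dest.clone()),
        ],
    }])
}
```

With both `src` and `dest` present this yields one `Pull`; with either missing it yields `None`, which is exactly the case where no diagram can be rendered without example values.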

Single Item Example

With values

From this item:

items:
  file_download: !FileDownloadParams
    src: "https://example.com/file.zip"
    dest: "./target/abc/file.zip"

One of these diagrams is generated:

Which one is generated depends on how easy/difficult it is to figure out which node ports to use -- we don't necessarily know the node positions.

Also, we may not use dot at all, but a div_diag instead.

Without values

items:
  file_download: { src: "??", dest: "??" }

Multiple Items

With values

From these item params:

items:
  app_file_download: !FileDownloadParams
    src: "https://github.com/my_work_org/app/app_v1.zip"
    dest: "./target/customer/app_v1/app_v1.zip"

  config_file_download: !FileDownloadParams
    src: "https://my_work_org.internal/customer/app_v1/config.yaml"
    dest: "./target/customer/app_v1/config.yaml"

  app_file_upload: !S3ObjectParams
    file_path: "./target/customer/app_v1/app_v1.zip"
    bucket_name: "my-work-org-12345-ap-southeast-2"
    object_key: "/customer/solution/app/app_v1.zip"

  config_file_upload: !S3ObjectParams
    file_path: "./target/customer/app_v1/config.yaml"
    bucket_name: "my-work-org-12345-ap-southeast-2"
    object_key: "/customer/solution/app/config.yaml"


Without values

TODO

Learnings

State (Values) For Non-Existent Items

Options:

  1. 🟡 A: Always get item implementors to return a value, but tagged as unknown or generated.

    Pros:

    1. 🟢 Can always generate a diagram and show what might be, even if we don't actually have values to do so.

    2. 🟢 Whether using dot or FlexDiag, the diagram layout will be more consistent from the clean state to the final state if nodes are styled to be invisible, rather than not present.

    Cons:

    3. 🔴 What is generated can depend on input values, and if we have fake input values, we may generate an inaccurate diagram.

    4. 🟡 Choosing a default example value of "application version X" may cause a subsequent item to fail, because the application version doesn't exist.

      Item implementors would be constrained to not make any external calls.

      1. If every parameter value were tagged with !Example or !Real, we could avoid this problem, but that is high maintenance for the item implementor.
      2. Maybe we pass in an additional parameter in Item::apply_dry so it isn't missed.
      3. Any !Example value used in a calculation can only produce !Example values.
      4. 🔵 If a dry run is intended to detect issues that would happen from an actual run, then we do want external systems to be called with the parameters passed in. e.g. detect credentials that are incorrect.
      5. 🔵 If a dry run is NOT intended to detect issues, then we will have simpler code implementation -- never make any external system calls (network, file IO).
    5. 🟡 Choosing a default cloud provider in one item could make a nonsensical diagram if another item uses a different cloud provider.

      Note that an Item may still generate sensible example state based on example parameter values.

    6. 🔴 Code has complexity of either another derive macro generated type with RealOrExample<T> for each state field, or a wrapper for RealOrExample<Item::State>.

  2. 🟡 B: Add Item::state_example, which provides fake state.

    Pros:

    1. 🟢 Can always generate a diagram and show what might be, even if we don't actually have values to do so.

    2. 🟢 Whether using dot or FlexDiag, the diagram layout will be more consistent from the clean state to the final state if nodes are styled to be invisible, rather than not present.

    Cons:

    3. 🔴 Even more functions in the Item trait, creating more burden on item implementors.

    4. 🟡 Choosing a default cloud provider in one item could make a nonsensical diagram if another item uses a different cloud provider.

      Note that an Item may still generate sensible example state based on example parameter values.

  3. 🟡 C: Return Option<T> for each value that is unknown.

    Pros:

    1. 🟢 Never have false information in diagrams.

    2. 🟢 Code can put None for unknown values.

    Cons:

    3. 🔴 Unable to generate useful diagram when starting from a clean state. i.e. cannot visualize the fully deployed state before deploying anything.

    4. 🔴 Code has complexity of another derive macro generated type with Option<T> for each state field.
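
For a sense of the complexity that option A's `RealOrExample<T>` wrapper brings, the rule that an `!Example` value used in a calculation can only produce `!Example` values might look like the sketch below. Only the name `RealOrExample` comes from the notes above; the variants and the `is_example`, `value`, and `zip_with` methods are assumptions.

```rust
/// A value tagged as real or as an example (sketch of option A's wrapper).
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum RealOrExample<T> {
    Real(T),
    Example(T),
}

impl<T> RealOrExample<T> {
    /// Returns whether this value is an example value.
    pub fn is_example(&self) -> bool {
        matches!(self, RealOrExample::Example(_))
    }

    /// Returns the inner value regardless of its tag.
    pub fn value(&self) -> &T {
        match self {
            RealOrExample::Real(value) | RealOrExample::Example(value) => value,
        }
    }

    /// Combines two tagged values.
    ///
    /// The result is `Real` only if both inputs are `Real`; any `Example`
    /// input taints the output.
    pub fn zip_with<U, V>(
        self,
        other: RealOrExample<U>,
        f: impl FnOnce(T, U) -> V,
    ) -> RealOrExample<V> {
        match (self, other) {
            (RealOrExample::Real(a), RealOrExample::Real(b)) => RealOrExample::Real(f(a, b)),
            (
                RealOrExample::Real(a) | RealOrExample::Example(a),
                RealOrExample::Real(b) | RealOrExample::Example(b),
            ) => RealOrExample::Example(f(a, b)),
        }
    }
}
```

Every state field, and every calculation over state, would need to go through a wrapper like this, which is the maintenance cost listed under option A's cons.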

Let's go with B.
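
A minimal sketch of what option B could look like follows. The trait is heavily reduced and is not the real `Item` trait: the signature of `state_example`, the associated types, and the `FileDownload*` types are all assumptions.

```rust
/// Heavily simplified stand-in for the `Item` trait (assumption).
pub trait Item {
    type Params;
    type State;

    /// Returns fake state, used to render a diagram before anything exists.
    ///
    /// Implementations should not call external systems here.
    fn state_example(params: &Self::Params) -> Self::State;
}

/// Hypothetical params for a file download item.
pub struct FileDownloadParams {
    pub src: String,
    pub dest: String,
}

/// Hypothetical state for a file download item.
#[derive(Debug, PartialEq, Eq)]
pub struct FileDownloadState {
    pub path: String,
    pub exists: bool,
}

pub struct FileDownload;

impl Item for FileDownload {
    type Params = FileDownloadParams;
    type State = FileDownloadState;

    fn state_example(params: &Self::Params) -> Self::State {
        // Describe the state as it would look once the file is downloaded.
        FileDownloadState {
            path: params.dest.clone(),
            exists: true,
        }
    }
}
```

The example state is derived purely from params, so the item can still produce sensible example state when the params themselves are example values.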

Notes From Designing Diagrams

  1. The node ID must be fully derivable from the ItemLocation / parameter / state, i.e. we cannot randomly insert a middle group's name in the middle of the ID.
  2. The node ID must be namespaced as much as possible, so that two identical paths on different hosts / groups don't collide.
  3. The node ID must NOT use the flow's graph edges (sequence / nesting) in its construction. This is because flows may evolve -- more items inserted before/after, and the node ID should be unaffected by those changes.
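
The three notes above can be sketched as a pure function of the location chain. This is an illustration under assumptions, not the framework's implementation: `ItemLocation` is a simplified stand-in, and `node_id`, the `kind_name` segment format, and the `___` separator are hypothetical.

```rust
/// Simplified stand-in for the framework's `ItemLocation` (assumption).
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum ItemLocation {
    Group(String),
    Host(String),
    Path(String),
}

/// Derives a node ID from a location's ancestor chain, ordered from
/// outermost to innermost.
///
/// The ID depends only on the locations themselves, never on the flow's
/// graph edges, so inserting items before or after does not change it.
pub fn node_id(ancestors: &[ItemLocation]) -> String {
    ancestors
        .iter()
        .map(|location| {
            let (kind, name) = match location {
                ItemLocation::Group(name) => ("group", name),
                ItemLocation::Host(name) => ("host", name),
                ItemLocation::Path(name) => ("path", name),
            };
            // Replace characters that are unlikely to be valid in an ID.
            // Caveat: distinct names such as "a/b" and "a_b" map to the
            // same text, so a real implementation may need escaping.
            let sanitized: String = name
                .chars()
                .map(|c| if c.is_ascii_alphanumeric() { c.to_ascii_lowercase() } else { '_' })
                .collect();
            format!("{}_{}", kind, sanitized)
        })
        .collect::<Vec<_>>()
        .join("___")
}
```

Because every ancestor contributes a segment, the same path on two different hosts produces two different IDs.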