Peace: Zero Stress Automation - The Peace Framework

Peace

Zero Stress Automation

Azriel Hoh

February 2025

Building `envman` in Nushell

# in $nu.config-path
# usually C:\Users\$env.USER\AppData\Roaming\nushell\config.nu
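def envman_demo_prepare_release [] {
    # `envman_build_release` is assumed to be a cargo alias set up in the project
    cargo envman_build_release
    let envman_demo_dir = [$env.TEMP demo envman] | path join
    if not ($envman_demo_dir | path exists) {
        mkdir $envman_demo_dir
    }

    # Copy the release binary into the demo directory
    let envman_exe_path = [target release envman.exe] | path join
    cp -f $envman_exe_path $envman_demo_dir
    # `print` rather than `echo`: `echo` only returns its value, which is
    # discarded here because it is not the last expression in the block
    print $"Copied ($envman_exe_path) to ($envman_demo_dir)"

    # Copy the generated web package directory alongside the binary
    let envman_pkg_dir = [target web envman pkg] | path join
    cp -f --recursive $envman_pkg_dir $envman_demo_dir
}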

# then
envman_demo_prepare_release
cd ([$env.TEMP demo envman] | path join)

Notes

Heya everyone, my name is Azriel, and today I'll be showing you my automation side project, called Peace.
Peace is a framework to create zero stress automation.
As with every side project, there is an origin story.
In my first job, a team of us were given the task of fully automating our solution deployment.
So this was the deployment process; it was all manual.
And, obviously, this is the solution. Genius.
We wanted this process: click, wait, done.
Being good engineers, we aimed for: (linking: Dimensions)
End-to-end automation.
Repeatable correctness.
and Performance.
and we delivered!
If you measure success using these metrics, it was undeniable.
We reduced the deployment duration down from 2 weeks, to 30 minutes.
However, our users said, "We hate this! Azriel, this doesn't enhance our lives."
What we really did when we introduced automation, was this.
When switching from manual steps to automation, the work changes from doing each step at your own pace, to setting up the parameters for all of the steps, pressing go, and waiting.
If it didn't work, then you had to clean up and start again.
And for people who were unfamiliar with the process, it was especially painful:
We're telling them to fill in parameters that they don't understand,
to feed into a process that they cannot see,
to create an environment, that they cannot visualize.
So they may not have understood what they were doing, but it was certainly our fault.
We created pain.
We had engineering eyes, but not human eyes:
We took away understandability and control.
And when you take those away, you inadvertently also take away morale. What little they had.
Ideally we should have built something that provides the value of automation,
while retaining the value of manual execution.
This is what the Peace framework aims to do.
Normally when we write automation, we spend just enough effort to get things working.
I wanted a framework where, by spending that same amount of effort, I would get a much nicer tool.
So today I'd like to show you envman, a tool built using the Peace framework.
envman downloads a web application from GitHub, creates some resources in Amazon, and uploads the web application.
Notably there's a missing step to launch a server that runs the web application, but that's not free.
The first thing we took away was understandability, so let's put that back.
There are two ways we tend to write automation:
Either we produce too little information, and we can't tell what's going on,
or, we produce too much information, and we still can't tell what's going on.
For understandability, we need to have something in between.
This is what it looks like when you have too little information:
clear, ./envman deploy --format none, clean.
This is what it looks like when you have too much information:
./envman clean --format json, clean.
The right balance is somewhere in between.
clear, ./envman deploy.
How many steps are there in this process? gesture
Did it work?
"Green means good", so of course!
What resources were created? gesture
And if we clean up the environment, you'll see a similar interface, so you can tell that each resource is deleted: ./envman clean.
That's all good when things go well, but what happens in a failure? Can we understand it?

First we'll limit the connection speed of the tool to about 40 kilobits per second (the rate below is given in bits, so `40KB` means 40960 bits per second):

New-NetQosPolicy `
  -Name "envman" `
  -AppPathNameMatchCondition "envman.exe" `
  -PolicyStore ActiveStore `
  -ThrottleRateActionBitsPerSecond 40KB

and run the deployment again: clear, ./envman deploy.
You can see that our download from GitHub has slowed,
and in a little while we should see an error happen.
Here we go.
Can you see which step went wrong?
Red means bad, so that one.
In detail, what went wrong, why it went wrong, and how to recover, are all shown.
We failed to upload the object. Why? The upload timed out. How do we recover? Make sure you are connected to the internet, and try again.
We're also shown which resources exist and which don't, so we don't have to guess.
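
As a rough illustration of an error that always carries what went wrong, why, and how to recover, here is a minimal Rust sketch. The `DeployError` type, its variant, and the rendered wording are hypothetical names made up for this example; they are not taken from envman or from Peace's API.

```rust
use std::fmt;

/// Hypothetical error type: each variant can answer three questions.
#[derive(Debug)]
pub enum DeployError {
    UploadTimedOut { object: String, timeout_secs: u64 },
}

impl DeployError {
    /// What went wrong.
    pub fn what(&self) -> String {
        match self {
            Self::UploadTimedOut { object, .. } => {
                format!("Failed to upload `{object}`.")
            }
        }
    }

    /// Why it went wrong.
    pub fn why(&self) -> String {
        match self {
            Self::UploadTimedOut { timeout_secs, .. } => {
                format!("The upload timed out after {timeout_secs} seconds.")
            }
        }
    }

    /// How to recover.
    pub fn recovery(&self) -> &'static str {
        match self {
            Self::UploadTimedOut { .. } => {
                "Make sure you are connected to the internet, and try again."
            }
        }
    }
}

impl fmt::Display for DeployError {
    /// Renders the three answers on three lines instead of a stack trace.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        writeln!(f, "Error:    {}", self.what())?;
        writeln!(f, "Reason:   {}", self.why())?;
        write!(f, "Recovery: {}", self.recovery())
    }
}

impl std::error::Error for DeployError {}
```

Because the compiler forces every variant to be handled in `what`, `why`, and `recovery`, adding a new kind of failure without a recovery hint would not compile in this sketch.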

If we fix our connection, and re-run the automation:

Remove-NetQosPolicy `
  -Name "envman" `
  -PolicyStore ActiveStore `
  -Confirm:$false

You'll see that it picks up where it left off, and completes the process.
What you think it should do, it does. No surprises.
So in summary, with information, the Goldilocks principle applies:
Too much information is overwhelming, too little is not useful, and there's some middle ground which is just right.
The Peace framework generally tries to fit the most relevant information on one screen.
The second thing we took away was control.
Most automation tools give you one button -- start -- and that's it.
While pressing start is not difficult, knowing whether the automation will do what we think it will is difficult.
What we should understand before starting anything is:
Where we are -- our current state,
Where we want to go -- our goal state, and
The distance between the two.
Then we can press start.
But when we press start, and change our mind, can we stop the process?
Without automation, we can.
With automation, while pressing Ctrl C on a command line tool is one form of interruption,
what we really care about is safe interruption.
If we can interrupt the process, adjust the parameters, press go, and have the automation pick up where it left off, that would be great.
And it shouldn't undo all of the work it did to get to this point.
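
One way to picture safe interruption is a loop that checks an interrupt flag only between steps and remembers which steps are done, so a later run skips them. The Rust sketch below is a simplification under stated assumptions: steps run one at a time (the demo runs some steps concurrently), step status is kept in memory rather than stored on disk, and `StepStatus` and `run_until_interrupted` are hypothetical names, not Peace's API. In a real tool the flag would be set from a Ctrl C handler.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum StepStatus {
    NotStarted,
    Done,
}

/// Runs the steps that are not yet done, in order.
///
/// The interrupt flag is only checked between steps, so a step is never
/// abandoned halfway. Returns how many steps were run in this call.
pub fn run_until_interrupted(
    statuses: &mut [StepStatus],
    interrupted: &AtomicBool,
    mut do_step: impl FnMut(usize),
) -> usize {
    let mut ran = 0;
    for (index, status) in statuses.iter_mut().enumerate() {
        if *status == StepStatus::Done {
            // Completed in an earlier run: pick up where we left off.
            continue;
        }
        if interrupted.load(Ordering::SeqCst) {
            // Stop before starting the next step; finished work is kept.
            break;
        }
        do_step(index);
        *status = StepStatus::Done;
        ran += 1;
    }
    ran
}
```

Calling the function a second time with the same `statuses`, after clearing the flag, runs only the remaining steps.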
Let's see all of this control, in action.
Before we run our deployment, what is the current state? ./envman status.
Target state? ./envman goal
What's the difference? ./envman diff
And for interruptibility, when we deploy, we'll stop the process halfway.
./envman deploy, ctrl c.
Here you can see that steps 1 through 3 and step 5 were completed,
and steps 4 and 6 were not started due to the interruption.
If we look at the diff: ./envman diff,
you can see that steps 1, 2, 3, and 5 are done, and steps 4 and 6 haven't been executed.
If we change our parameters to use version 0.1.2 instead of 0.1.1 of our web application,
the diff will now show that step 1 will change.
And if we run deploy again, that is exactly what happens.
When cleaning up, we can also interrupt the process.
Steps 1, 4, 5, and 6 were cleaned, and 2 and 3 were not.
And we can choose to either deploy the environment again, or clean up fully.
Let's deploy it to completion. deploy, clean.
Morale.
Not everyone who uses automation tools has a software background, and not everyone uses the command line all the time.
So why not create something that caters for these situations as well?
Back to understandability, normally when explaining what automation does,
we tend to draw a diagram on the whiteboard,
or create a diagram in an internal documentation site.
However, it's never really accurate, and it's usually a tangle of overlapping boxes and lines,
so it is hard to understand, because the information isn't clear.
So here's a web interface. ./envman web
Based on the code written for your automation, two diagrams are generated:
The one on the left is called the Progress diagram, which shows the steps in your process,
and the one on the right is the Outcome diagram, which shows what the deployed environment looks like, before you deploy it.
By clicking on these steps on the right, we get to see what is happening in that step.
All of this is generated from your automation code. Magic.
This is what you can use to teach someone, or to self-learn, what the automation process is, and what the environment looks like.
And you don't have to keep erasing and redrawing lines on the whiteboard.
Which step was unclear? This one? Let's go through that again.
Now, this is great, but I like this one.
The diagram you saw is the example environment, but what does the actual environment look like?
We can discover it.
The diagram on the right has faded boxes for each resource, indicating that it doesn't exist.
When I click deploy, you can watch the progress diagram on the left, which will show you which steps are being executed,
or you can watch the outcome diagram on the right, which will show you the interactions between hosts, that are happening in real time.
All of the resources have been created, so they are now visible.
When we clean up, the boxes in the diagram become faded again.
And if we were to have an error, as we did before, we should see it clearly.
slow down internet, click deploy
Hey look it's gone red.
So very quickly, from the user interface, you can tell which step the error came from,
as well as which resources it involves.
And we can surface the timeout message on the web interface, I just haven't coded that part yet.
Cool.
So for morale, a lot of effort has been put into aesthetics.
For seeing the state of the system, showing one line for each resource, with a link to the full detail, is deliberate.
If you've ever been on-call and gotten a call out in the middle of the night, it's very annoying to have to go and find each resource that is part of the system you are investigating.
If I can think it, take me there.
For progress, we present the information at a level of detail that is digestible,
and for errors, instead of panicking, which is the visual equivalent of printing a stack trace,
we take that error, refine it, and make it beautiful.
Always include what went wrong, the reason, and how to recover,
because when you help people recover from a bad situation,
you recover their morale.
With all of these aesthetic refinements, that box, is no longer opaque.
It is completely, clear.
You can see inside it, you can understand it, and you can control it.
How does all of this work?
Magic.
Architecture, how does it fit together?
The Peace framework is categorised into two main parts.
The item definition, which is the common shape of logic and data, for anything that is managed by automation, and
Common functionality, which works with those items to provide command execution and a user interface.
Item crates contain the logic and data to automate one thing, and
a tool is the thing that connects items together and passes them to the Peace framework.
These groupings are deliberate, so that you can share and reuse common automation logic,
while keeping proprietary values and workflows within your tool.
Let's go deeper.
If you think of one step in a process, normally we would write code to do the step.
But instead of only writing code that does work,
we also have functions that fetch information about the step.
What is the current state of the thing I'm managing?
What will it be, after the automation logic is executed?
What's the difference between these states?
What does it look like if it's not there?
Essentially, functions to show me what it is and what it will be, without changing anything.
A collection of these functions is called an Item.
And a collection of items is called a Flow.
And a flow also contains the dependency ordering between items.
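
To make the item and flow idea concrete, here is a minimal Rust sketch. It is an illustration only: the `Item` trait, the `State` enum, `InMemoryItem`, and `Flow` are hypothetical shapes invented for this example and are much simpler than Peace's real types. Dependency ordering is resolved with Kahn's algorithm, which is one common choice and not necessarily what Peace does.

```rust
use std::collections::VecDeque;

/// State of one managed thing; `Absent` is what it looks like when it is not there.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum State {
    Absent,
    Present(String),
}

/// Hypothetical item shape (not Peace's real trait): read-only questions
/// about the managed thing, plus one function that does the work.
pub trait Item {
    fn name(&self) -> &str;
    /// What is the current state of the thing I'm managing?
    fn state_current(&self) -> State;
    /// What will it be after the automation logic is executed?
    fn state_goal(&self) -> State;
    /// What does it look like if it's not there?
    fn state_clean(&self) -> State {
        State::Absent
    }
    /// What's the difference between two states? `None` means no change.
    fn state_diff(&self, from: &State, to: &State) -> Option<String> {
        (from != to).then(|| format!("{}: {:?} -> {:?}", self.name(), from, to))
    }
    /// The only function that changes anything.
    fn apply(&mut self, target: &State);
}

/// Toy item whose "resource" is just a field in memory.
pub struct InMemoryItem {
    pub name: String,
    pub current: State,
    pub goal: State,
}

impl Item for InMemoryItem {
    fn name(&self) -> &str {
        &self.name
    }
    fn state_current(&self) -> State {
        self.current.clone()
    }
    fn state_goal(&self) -> State {
        self.goal.clone()
    }
    fn apply(&mut self, target: &State) {
        self.current = target.clone();
    }
}

/// A flow: items plus `(before, after)` ordering edges, as indices into `items`.
pub struct Flow {
    pub items: Vec<Box<dyn Item>>,
    pub edges: Vec<(usize, usize)>,
}

impl Flow {
    /// Orders items so each comes after the items it depends on (Kahn's
    /// algorithm). Returns `None` if the edges contain a cycle.
    pub fn execution_order(&self) -> Option<Vec<usize>> {
        let count = self.items.len();
        let mut unmet = vec![0usize; count];
        for &(_, after) in &self.edges {
            unmet[after] += 1;
        }
        let mut ready: VecDeque<usize> = (0..count).filter(|&i| unmet[i] == 0).collect();
        let mut order = Vec::with_capacity(count);
        while let Some(index) = ready.pop_front() {
            order.push(index);
            for &(before, after) in &self.edges {
                if before == index {
                    unmet[after] -= 1;
                    if unmet[after] == 0 {
                        ready.push_back(after);
                    }
                }
            }
        }
        (order.len() == count).then_some(order)
    }
}
```

Everything except `apply` is read-only, which is what lets a tool show the current state, the goal state, and the diff without changing anything. The sketch assumes every edge index is within `items`; an out-of-range index would panic.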
Then this flow is passed to Peace's commands, to execute or display information.
Commands. Commands are one part of the common functionality that Peace provides.
Given a flow and parameters, it invokes different functions within each item.
For example, the Discover command will run these functions, store the state, and display it to the user.
The Diff command will compute and show the difference between the current and goal states of each item.
The Ensure command will turn the current state of each item, into its goal state, through the apply function.
The Clean command is similar, where it turns the current state into the clean state, also through the apply function.
So Peace provides common logic to iterate through the items and call the appropriate functions,
and it will also pass the appropriate values between each item.
That, is magic.
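
The four commands described above can be sketched as plain functions over a list of items. This is a simplified Rust illustration, not Peace's implementation: `ManagedItem` is a hypothetical toy type where `None` stands for the clean state, items are assumed to already be in dependency order, cleaning in reverse order is an assumption of this sketch, and passing values between items is left out.

```rust
/// Toy item: `None` is the clean state (the thing does not exist).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ManagedItem {
    pub name: &'static str,
    pub current: Option<String>,
    pub goal: Option<String>,
}

impl ManagedItem {
    /// The single function that changes anything: moves `current` to `target`.
    fn apply(&mut self, target: Option<String>) {
        self.current = target;
    }
}

/// Discover: read and report the current state of every item.
pub fn discover(items: &[ManagedItem]) -> Vec<String> {
    items
        .iter()
        .map(|item| format!("{}: {:?}", item.name, item.current))
        .collect()
}

/// Diff: report only the items whose current state differs from the goal.
pub fn diff(items: &[ManagedItem]) -> Vec<String> {
    items
        .iter()
        .filter(|item| item.current != item.goal)
        .map(|item| format!("{}: {:?} -> {:?}", item.name, item.current, item.goal))
        .collect()
}

/// Ensure: turn each item's current state into its goal state through `apply`.
/// Items already at their goal are skipped. Returns the names that changed.
pub fn ensure(items: &mut [ManagedItem]) -> Vec<&'static str> {
    let mut changed = Vec::new();
    for item in items.iter_mut() {
        if item.current != item.goal {
            let goal = item.goal.clone();
            item.apply(goal);
            changed.push(item.name);
        }
    }
    changed
}

/// Clean: turn each item's current state into the clean state, also through
/// `apply`, walking the items in reverse. Returns the names that changed.
pub fn clean(items: &mut [ManagedItem]) -> Vec<&'static str> {
    let mut changed = Vec::new();
    for item in items.iter_mut().rev() {
        if item.current.is_some() {
            item.apply(None);
            changed.push(item.name);
        }
    }
    changed
}
```

Because `ensure` compares current and goal states before applying, running it twice in a row does nothing the second time, which is the same property that lets an interrupted deployment pick up where it left off.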
Putting it all together:
We combine the items into a flow,
We specify the parameters for each item,
Pick an output -- the command line, or web, or both,
and call the right command.
Surface the commands to the user with appropriate names,
and this is your tool.
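
The last step, surfacing commands under names that suit the tool, might look like the small Rust sketch below. The subcommand names come from the envman demo, but the mapping itself is a guess made for illustration (in particular, treating both `status` and `goal` as uses of Discover), and `PeaceCommand` and `command_for` are hypothetical names rather than Peace's API. The `web` subcommand is left out because it selects an output rather than a command.

```rust
/// The framework-level commands named in the talk.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PeaceCommand {
    Discover,
    Diff,
    Ensure,
    Clean,
}

/// Maps a user-facing subcommand name to the framework command behind it.
/// Returns `None` for names the tool does not know.
pub fn command_for(subcommand: &str) -> Option<PeaceCommand> {
    match subcommand {
        // Both show discovered state: the current state, or the goal state.
        "status" | "goal" => Some(PeaceCommand::Discover),
        "diff" => Some(PeaceCommand::Diff),
        // "deploy" is the tool's own name for ensuring the goal state.
        "deploy" => Some(PeaceCommand::Ensure),
        "clean" => Some(PeaceCommand::Clean),
        _ => None,
    }
}
```

A different tool built on the same commands could expose Ensure as `install` or `provision`; only the names in the `match` would change.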
Now rounding off, what's the status of Peace? Is it ready to be used?
For development workflows, or short lived environments, where the environment does not live longer than one version of a tool,
it's usable.
But for production workflows, or environments that need to be stable, Peace is not ready.
Don't use it, you will not have Peace.
Links to the project:
peace.mk for the project website
Slides are on peace.mk/book.
github.com/azriel91/peace for the repository.
To wrap up, I'd like to end with this note:
To engineer with empathy,
whether it is verbal, visual, or vocal,
refine your voice, connect,
and communicate with clarity.
Thank you for listening, and I'm happy to take questions.
