Attune Slack integration #4

@meson800

This consists of two parts:

  • A server-side state machine that interacts with the actual Slack API, updates the home tab with the current (live) status, and sends updates as requested.
  • A client-side state machine that reads the logs generated by the Attune software and talks to the server.

I would start with the client-side state machine. Example logs are already in instruments\data\attune\logs; parsing those would be a good first step.

Client-side

Proposed route

  1. Open a single log file and build regexes to parse the relevant data out of it. Test that your parser is robust to the types of lines that appear.
  2. Map out the "state machine". By state machine, we mean tracking the current status of the system: at any point in time, the internal state would be something like "idle, on", "running, 50% done", "shutting down", or "performance test". The state machine then describes the transitions between these states and how we handle them as new lines appear in the log files. For example, when a user-login line appears in the log file, we should transition from a "logged out" state to a "logged in" state. Note that user login/logout status is orthogonal to the Attune status: you can be logged out and running something, logged in and idle, or logged out and idle. A simple state machine here could look like:
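import re

# Hypothetical pattern: derive the real one from the example logs.
PROCEDURE_START = re.compile(r"Starting procedure: (?P<name>.+)")

state = "idle"  # assumed initial state
for line in logfile:  # logfile: any iterable of log lines
    old_state = state
    match = PROCEDURE_START.search(line)
    if match:
        state = f"running: {match['name']}"
    # ... more `if` checks, one per transition ...
    if state != old_state:
        update_slack_state(state)  # placeholder: report the change to the server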
  3. Write some tests! Introducing pytest is probably best done in person rather than in long form here, but in short: how do we know that the state machine is working properly? And once it works, how can we be sure that future edits aren't introducing incorrect behavior? The answer to both is testing (unit testing). The idea is that we write short test cases and hook them into a framework that automatically tests your code. As long as your test cases pass, you can be reasonably confident that later changes didn't break the behavior they cover.
  4. Make sure that the Attune's "log rotation" is handled properly. Normally, when software writes log files, there is a cap on the file size (so you don't get log files that are hundreds of megabytes). Typically the log "rotates": e.g. blah.log becomes blah.log.1 and the software starts writing to a blank blah.log. In this case, it doesn't look like the logs explicitly rotate for space; instead, a new log is created every time the software is opened (the files are timestamped with date/time). Make sure we track these changes!
  5. Integrate into the server-side info API.
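
To make the route above more concrete, here is a minimal sketch of a parser and state machine that track login status and instrument status independently, plus one pytest-style test. Everything specific in it is an assumption: the log line formats in the regexes are made up (the real ones have to come from the example logs), and `ClientState` and `apply_line` are hypothetical names.

```python
import re
from dataclasses import dataclass, replace

# Hypothetical line formats; the real patterns must come from the example logs.
LOGIN = re.compile(r"User (?P<user>\S+) logged in")
LOGOUT = re.compile(r"User \S+ logged out")
PROCEDURE_START = re.compile(r"Starting procedure: (?P<name>.+)$")
PROCEDURE_END = re.compile(r"Procedure finished")


@dataclass(frozen=True)
class ClientState:
    """Login status and instrument status are tracked independently."""
    logged_in: bool = False
    instrument: str = "idle"


def apply_line(state: ClientState, line: str) -> ClientState:
    """Return the state after one log line; unmatched lines leave it unchanged."""
    line = line.strip()
    # Login/logout transitions.
    if LOGIN.search(line):
        state = replace(state, logged_in=True)
    elif LOGOUT.search(line):
        state = replace(state, logged_in=False)
    # Instrument transitions, checked separately because they are orthogonal.
    match = PROCEDURE_START.search(line)
    if match:
        state = replace(state, instrument=f"running: {match['name']}")
    elif PROCEDURE_END.search(line):
        state = replace(state, instrument="idle")
    return state


def test_login_is_independent_of_instrument_status():
    state = ClientState()
    lines = [
        "User alice logged in",
        "Starting procedure: Startup",
        "User alice logged out",
    ]
    for line in lines:
        state = apply_line(state, line)
    # Logging out must not reset the running procedure.
    assert state == ClientState(logged_in=False, instrument="running: Startup")
```

Keeping `apply_line` a pure function (old state and line in, new state out) means tests can feed it a handful of strings without touching real files or Slack.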

Things to think about/use

  1. We can't just open the file and read everything in it; we have to wait and act on each line as it gets written. This could be done manually (read until you reach end of file (EOF), store that offset, then check again in N seconds, starting at the offset), but there are also Python packages to do this; one that I found quickly is pygtail. You can call it from an infinite while loop to pick up log lines as they get written. There are other solutions as well (the search term is "tailing" a file, because on Linux/macOS you can use the command tail -f to follow a file as it is being written to).
  2. How do we handle interruptions?
  3. Don't make it fully blocking (e.g. the entire program is just waiting on log lines). It will be helpful to have the client send "heartbeats" to the server every few seconds, so we can distinguish between three cases: a problem in the client, a working client with Attune software that isn't writing log files, and a computer that is simply off. A recent heartbeat means that the status is not stale.
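
As a rough illustration of the manual approach (store an offset, check again every N seconds) combined with heartbeats, here is a sketch using only the standard library. `LogFollower`, `follow`, and their parameters are hypothetical names, and UTF-8 log encoding is an assumption. It does not yet switch to a newly created log file; that would sit on top of this.

```python
import time
from pathlib import Path


class LogFollower:
    """Poll a log file and return only the lines added since the last poll."""

    def __init__(self, path):
        self.path = Path(path)
        self.offset = 0  # byte position just past the last complete line

    def read_new_lines(self):
        lines = []
        with self.path.open("rb") as handle:
            handle.seek(self.offset)
            for raw in handle:
                if not raw.endswith(b"\n"):
                    break  # partially written line; pick it up on the next poll
                self.offset += len(raw)
                # Assumption: the logs are UTF-8 (bad bytes are replaced).
                lines.append(raw.decode("utf-8", errors="replace").rstrip("\r\n"))
        return lines


def follow(path, on_line, send_heartbeat, poll_seconds=2.0, keep_running=lambda: True):
    """Poll the log and send a heartbeat on every cycle, even if nothing was written."""
    follower = LogFollower(path)
    while keep_running():
        for line in follower.read_new_lines():
            on_line(line)
        send_heartbeat()
        time.sleep(poll_seconds)
```

Because the loop only sleeps between polls, the heartbeat keeps going out while the Attune software is silent, which is what lets the server tell "client alive, no new log lines" apart from "client gone".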

Server-side

  1. (Depends on @meson800's subproject): decide on an interface to show people. This is mostly an entry in the home tab, plus alerts if the Attune is in a bad state.
  2. Add the relevant Slack interactions into a Bolt interface that is hooked into the rest of labbot.
  3. Decide on a simple (FastAPI driven) API of our own, to receive information from the client.
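
Whatever the API ends up looking like, the server needs to remember the last reported state and when it last heard from the client. Below is a sketch of that bookkeeping in plain Python, independent of FastAPI and Bolt; an API handler could call `record_heartbeat`, and the home tab could display `summary()`. The `StatusStore` name, its methods, and the 30-second staleness threshold are all assumptions.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class StatusStore:
    """Last state reported by the client, plus the time of its last heartbeat."""
    stale_after_seconds: float = 30.0  # assumed threshold; tune to the heartbeat rate
    state: Optional[str] = None
    last_heartbeat: Optional[float] = None

    def record_heartbeat(self, state=None, now=None):
        """Note that the client is alive; optionally update the reported state."""
        self.last_heartbeat = time.time() if now is None else now
        if state is not None:
            self.state = state

    def summary(self, now=None):
        """Text for the home tab, flagging a status that may be out of date."""
        now = time.time() if now is None else now
        if self.last_heartbeat is None or self.state is None:
            return "unknown (no state reported yet)"
        age = now - self.last_heartbeat
        if age > self.stale_after_seconds:
            return f"stale (last heartbeat {age:.0f} s ago, last state: {self.state})"
        return self.state
```

The `now` parameters exist so that tests can pass fixed timestamps instead of waiting for real time to elapse.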

Extensions

If we store the state machine transition times, we can actually start giving estimated times on procedures/shutdowns. Having a nice countdown in the Slack interface and also in the Attune software could be cool.
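
One simple way to get such an estimate, sketched here under the assumption that we record a start and end time (in seconds) for each procedure: average the past durations and subtract the time already elapsed. `DurationEstimator` and its methods are hypothetical names, and a plain mean is just the simplest choice.

```python
from collections import defaultdict
from statistics import mean


class DurationEstimator:
    """Estimate remaining time for a procedure from its past durations."""

    def __init__(self):
        self._durations = defaultdict(list)  # procedure name -> durations in seconds

    def record(self, procedure, started_at, finished_at):
        """Store one completed run, taken from the state machine's transition times."""
        self._durations[procedure].append(finished_at - started_at)

    def remaining(self, procedure, started_at, now):
        """Seconds left based on the mean past duration, or None with no history."""
        history = self._durations.get(procedure)
        if not history:
            return None
        return max(0.0, mean(history) - (now - started_at))
```

Returning `None` when there is no history lets the Slack interface hide the countdown instead of showing a made-up number.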
