80 changes: 80 additions & 0 deletions clients/attune/attune_log_parser.py
@@ -0,0 +1,80 @@
import os
import csv
import re

os.getcwd()
os.chdir('test_logs/')

log_file = 'System_2020-11-10_18-52-23.log'
#Define regexes

User_dictionary = {
Member

PEP8: use snake_case, i.e. user_dictionary with no uppercase U. This is true for everything except type definitions (e.g. if this were a class) and constants, which PEP8 writes in CapWords and UPPER_CASE respectively.

    'test' : 'test1'
}
Comment on lines +12 to +13
Member

On how to load this user dictionary: the usual recommendation is to load 'secret' or external data through either environment variables or external files. I tend to prefer external files for something like this, and I would use a JSON file here.

JSON is very similar to Python's dict syntax, with some specific differences (for example, strings must be double-quoted). After formatting your user dictionary as a separate JSON file (you can use websites like https://www.jsonformatter.io/ to check the formatting), for example

{
   "some_long_uid": "cjohnsto",
   "some_other_uid": "abeitz"
}

then you can load this dictionary with:

import json

with open('test.json') as json_file:
    user_dictionary = json.load(json_file)

This also cleanly separates the code from the data. We can also simply never commit the JSON file to source control (for example by listing it in .gitignore), so it never appears publicly. This approach is currently used on the server side of labbot.
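A minimal sketch of the loading step described above, assuming the JSON file holds a flat object mapping UIDs to usernames. The helper name load_user_dictionary and the throwaway demo file are hypothetical; lower-casing the keys mirrors the User_ID.lower() lookup in the commented-out login code.

```python
import json
import os
import tempfile


def load_user_dictionary(path):
    """Load a {uid: username} mapping from a JSON file (hypothetical helper)."""
    with open(path, encoding='utf-8') as json_file:
        raw = json.load(json_file)
    if not isinstance(raw, dict):
        raise ValueError(f'{path} must contain a JSON object at the top level')
    # Lower-case the keys so lookups can use user_id.lower().
    return {str(uid).lower(): name for uid, name in raw.items()}


# Demo with a throwaway file; the real data would live in an uncommitted JSON file.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'users.json')
    with open(demo_path, 'w', encoding='utf-8') as demo_file:
        json.dump({'Some_Long_UID': 'cjohnsto', 'some_other_uid': 'abeitz'}, demo_file)
    user_dictionary = load_user_dictionary(demo_path)

print(user_dictionary['some_long_uid'])
```

Raising on a non-object top level is a design choice for this sketch: it fails early with a clear message instead of failing later at lookup time.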


event_done = re.compile('^Event:(?P<event>[^_]+)_Done')
function_in_process = re.compile('^Instrument function (?P<state>.+) executing line (?P<line>\d+)$')
event_start = re.compile('^Event:Starting_(?P<event>.+)$')
Comment on lines +15 to +17
Member

It doesn't change the behavior of any of your regexes here, but prefer 'raw strings', i.e. prefix the string with an r (r'^Event blah blah' instead of '^Event blah blah'). Making it a raw string ensures that Python doesn't modify your regex before the re module sees it. The \d in function_in_process only works today because Python leaves unrecognized escapes like \d untouched, and recent Python versions emit a warning for them.

If you're using Regex101 for your regex debugging, it actually hints at this; if you have it in Python regex mode, the top-string will have a little r denoting a raw string.

Normal strings expand escape sequences by default, e.g. they expand \n into a newline, \t into a tab character, and so on. Regexes are full of backslashes (\d, \b, \s), and some of them collide with string escapes: '\b' is a backspace character in a normal string but a word boundary in a regex. Making the string raw disables this expansion, so the regex stays as written. (Curly braces are a separate matter: they are only special in f-strings and str.format, and the r prefix does not change that.)
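To make the difference concrete, here is a small sketch comparing a normal string and a raw string that contain the same characters between the quotes. The log messages are hypothetical, modeled on the patterns in this file.

```python
import re

plain = '\bwell\b'   # each '\b' is expanded to a backspace character (0x08)
raw = r'\bwell\b'    # backslashes are kept, so the regex engine sees word boundaries

print(len(plain), len(raw))

message = 'Acquisition initiated on well A1 Preload'  # hypothetical log message
print(re.search(plain, message))        # no match: the message contains no backspaces
print(re.search(raw, message).group())  # matches the word 'well'

# The same pattern as in the diff, written as a raw string.
function_in_process = re.compile(
    r'^Instrument function (?P<state>.+) executing line (?P<line>\d+)$'
)
result = function_in_process.match('Instrument function Acquire executing line 12')
print(result.group('state'), result.group('line'))
```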

user_login = re.compile('login')
bubble_error = re.compile('^Data:New Bubble Detected')
big_bubble = re.compile('BUBBLE SIZE GREATER THAN THRESHOLD!!!')
Member

Unrelated to your code, I really love that this is the actual error message. Definitely feels like a frustrated hardware engineer at some point. The triple exclamation point really sells it.

aquisition_well = re.compile('^Acquisition initiated on well (?P<well>.+) Preload')
aquisition_tube= re.compile('^Acquisition initiated$')


with open(log_file) as log:
    csv_log = csv.DictReader(log, delimiter=',')
    csv_log.fieldnames = [
        'TimeStamp',
        "LogType",
        "User",
        "Category",
        "Message",
        "unclearNumberString" ]
Comment on lines +29 to +33
Member

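PEP8 doesn't actually recommend one quote style over the other; it only asks that you pick one and stick to it. Most of this file uses single-quote strings ('hello world'), so use those here instead of double-quote strings ("hello world"). Really, just make sure you are being consistent instead of switching between the two.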


    for line in csv_log:
Comment on lines +26 to +35
Member

We're going to change how this parsing works later, but this is good code!



        result=event_start.match(line["Message"])
        if result != None:
Member

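Prefer if result is not None instead of if result != None, and if result is None instead of if result == None. None is a singleton, so an identity check is both the idiomatic form and the one PEP8 asks for; == and != go through __eq__, which a class can override to give surprising answers.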

See https://stackoverflow.com/a/2209781 for details.
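A short sketch of why the identity check is safer. AlwaysEqual is a hypothetical class written only to show how an overridden __eq__ can fool an equality comparison with None; the log message is likewise made up.

```python
import re


class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()
equal_to_none = (obj == None)      # True, although obj is clearly not None
differs_from_none = (obj != None)  # False, because != is derived from __eq__
is_none = obj is None              # False: identity checks cannot be overridden

# With regex matches, the idiomatic check looks like this.
event_start = re.compile(r'^Event:Starting_(?P<event>.+)$')
result = event_start.match('Event:Starting_Wash')  # hypothetical log message
if result is not None:
    print(result.group('event'))
```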

            function = result.group("event")
            State = "Active"
Member

PEP8: lowercase S in state.

Also, this is a very MATLAB-y way to initialize variables. Creating State inside the for loop does work in Python, but it's not generally good practice. What happens if none of the if statements ever match while parsing the entire file? Then the final print line will fail with a NameError because State is undefined (the same goes for function and lines). It's good practice when doing something like this to explicitly declare state = None (or some other sentinel) before the loop, then check whether it is still None at the end to handle this case.
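A minimal sketch of the sentinel pattern, assuming the messages have already been pulled out of the CSV rows. The function name last_event_state and the sample messages are hypothetical; the two regexes are the ones from this file, written as raw strings.

```python
import re

event_start = re.compile(r'^Event:Starting_(?P<event>.+)$')
event_done = re.compile(r'^Event:(?P<event>[^_]+)_Done')


def last_event_state(messages):
    """Return (function, state) for the last event seen, or (None, None)."""
    # Declare the sentinels up front so they exist even if nothing matches.
    function = None
    state = None
    for message in messages:
        result = event_start.match(message)
        if result is not None:
            function = result.group('event')
            state = 'active'
        result = event_done.match(message)
        if result is not None:
            function = result.group('event')
            state = 'done'
    return function, state


function, state = last_event_state(['Event:Starting_Wash', 'Event:Wash_Done'])
if state is None:
    print('No events found in log')
else:
    print(function, state)
```

With an empty or unrelated log, the function returns (None, None) and the caller can report that explicitly instead of crashing on an undefined name.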


        result=function_in_process.match(line["Message"])
        if result != None:
            function = result.group("state")
            lines = result.group("line")

        result=event_done.match(line["Message"])
        if result != None:
            function = result.group("event")
            State = "done"

        #result=user_login.match(line["Message"])
        #if result != None:
        # User_ID = line["User"]
        # User = User_Dictionary[User_ID.lower()]

        result_well = aquisition_well.match(line["Message"])
        if result_well != None:
            last_well = result_well.group("well")
            print(last_well)




        result_tube = aquisition_tube.match(csv_log.__next__()["Message"])
        if result_tube != None and result_well == None:
            last_well = "tube"
            print(last_well)





        #result=bubble_error.match(line["Message"])
        #if result != None:
        #print('Bubble Error on well', last_well, '!')


print(function, lines, State)