fix: power brownout and dirty write flush on restart or shutdown #2627
Open
NickDunklee wants to merge 2 commits into
This is probably another "needs soaking" fix as it touches power.
Backstory on this one:
I noticed the sensor firmware build was constantly sending
"Battery is low" messages when a RAK19007 was below 50%.
(These messages only show up in third-party clients, to all
node admins, as the stock MeshCore mobile client doesn't let
one see messages from a sensor node. Sending that message looks
like another power draw, but that is not part of this PR.)
Then people on the local mesh have been talking on and off
about certain node types randomly losing their contact lists,
and others were talking about Heltec V4 brownouts. I also
observed a Heltec V4 die prematurely around 50%, and I started
thinking these were all related.
Started digging into the code and found a few potential leads:
- MeshCore does a "lazy" contacts write keyed on `dirty_contacts_expiry`,
within a 5 second window.
- The shutdown/restart paths do not flush this pending write.
- The low battery check is a poll every 8 seconds with
no awareness of other things going on in the node.
**On the power piece:**
Heltec V4 and other higher-powered nodes can hit the battery
harder when transmitting. Below 50%, lithium batteries sag
more dramatically under load than they do at higher charge states.
If the power check happens at the same time as a transmit,
the shutdown code gets called prematurely and shuts down
the node.
**On the file write piece:**
If the shutdown or restart paths are called, the code just calls
`shutdown()` or `reboot()` without checking and calling
`saveContacts()`. There do not appear to be any other file writes
that act this way.
**The Fix**
The change keeps using `AUTO_SHUTDOWN_MILLIVOLTS`, so it respects
previous power threshold decisions across all node types.
With this change, all restart or shutdown paths will make
sure to call `saveContacts()` before shutting down to stop
the list from becoming corrupted.
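A minimal sketch of the flush-before-shutdown idea, not the actual diff. Only `saveContacts()` and `dirty_contacts_expiry` come from the description above; `ContactStore`, `markDirty()`, `flushIfDirty()`, and `safeShutdown()` are hypothetical names, and the assumption is that an expiry of 0 means no write is pending.

```cpp
#include <cstdint>

// 5 second lazy-write window, as described above.
constexpr uint32_t LAZY_WRITE_MILLIS = 5000;

// Hypothetical stand-in for the node's contact storage.
struct ContactStore {
    uint32_t dirty_contacts_expiry = 0;  // assumption: 0 means nothing pending
    int saves = 0;                       // counts writes, for illustration only

    void saveContacts() { ++saves; }     // real firmware would write the file here

    // Called whenever the contact list changes: schedule a deferred write.
    void markDirty(uint32_t now_ms) {
        dirty_contacts_expiry = now_ms + LAZY_WRITE_MILLIS;
        if (dirty_contacts_expiry == 0) dirty_contacts_expiry = 1;  // keep 0 as "clean"
    }

    // Called from the main loop: write once the window has elapsed.
    void loop(uint32_t now_ms) {
        if (dirty_contacts_expiry != 0 &&
            static_cast<int32_t>(now_ms - dirty_contacts_expiry) >= 0) {
            flushIfDirty();
        }
    }

    // Write immediately if a deferred write is pending. Returns true if it wrote.
    bool flushIfDirty() {
        if (dirty_contacts_expiry == 0) return false;
        saveContacts();
        dirty_contacts_expiry = 0;
        return true;
    }
};

// Hypothetical wrapper for every shutdown/restart path: flush first, then
// hand off to the board call (shutdown() or reboot()).
template <typename PowerOff>
void safeShutdown(ContactStore& store, PowerOff powerOff) {
    store.flushIfDirty();
    powerOff();
}
```

Without the `flushIfDirty()` call, a shutdown that lands inside the 5 second window would drop the pending contact change.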
It also suspends reading battery level for 250ms during transmit
(adjustable) so a power sag doesn't trigger an early shutdown.
On Heltec V4 at least, the MeshCore software power threshold is
much higher than the board's internal brownout/shutdown threshold.
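The transmit holdoff could look roughly like the sketch below. This is an illustration under assumptions, not the PR's code: `BatteryGuard`, `onTransmit()`, and `shouldShutdown()` are hypothetical names, `shutdown_mv` stands in for the board's `AUTO_SHUTDOWN_MILLIVOLTS`, and the 250ms window is assumed to start when the transmit begins.

```cpp
#include <cstdint>

// Hypothetical helper that gates the periodic low-battery poll.
struct BatteryGuard {
    uint16_t shutdown_mv;      // stands in for AUTO_SHUTDOWN_MILLIVOLTS
    uint32_t holdoff_ms;       // adjustable quiet period after a transmit starts
    uint32_t last_tx_ms = 0;
    bool tx_seen = false;

    explicit BatteryGuard(uint16_t mv, uint32_t holdoff = 250)
        : shutdown_mv(mv), holdoff_ms(holdoff) {}

    // Called by the radio path when a transmit begins.
    void onTransmit(uint32_t now_ms) {
        last_tx_ms = now_ms;
        tx_seen = true;
    }

    // True while readings should be ignored because a transmit is recent.
    // Unsigned subtraction keeps this correct across millis() wraparound.
    bool inHoldoff(uint32_t now_ms) const {
        return tx_seen && (now_ms - last_tx_ms) < holdoff_ms;
    }

    // Called by the battery poll. Returns true when the node should shut down.
    bool shouldShutdown(uint32_t now_ms, uint16_t battery_mv) const {
        if (inHoldoff(now_ms)) return false;  // sag during TX is not trusted
        return battery_mv < shutdown_mv;
    }
};
```

A reading taken inside the holdoff is simply skipped; the next poll after the window sees the recovered voltage and applies the unchanged threshold.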
**Tested on**
- Heltec v4
- RAK 19007
- Heltec T096
- RAK 19003
On the Heltec V4, I can now pass 50% and get down to 36% before it shuts
down, although the voltage at 36% should probably actually read 5%
[based on some voltage curve sites like this one](https://voltagebasics.com/lithium-polymer-battery-voltage-chart/).
That is probably an idea for future mobile app improvements: the app
itself could calculate the battery percent from the MCU temp and battery
voltage, which would likely seem a bit more "accurate" on all board
types without having to add math to the node code.