Commit 3f0ec83

[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
1 parent f0e2151 commit 3f0ec83

1 file changed: +9 -6 lines changed

machine_learning/q_learning.py

Lines changed: 9 additions & 6 deletions
@@ -1,8 +1,8 @@
 """
-Q-Learning is a widely-used model-free algorithm in reinforcement learning that
-learns the optimal action-value function Q(s, a), which tells an agent the expected
+Q-Learning is a widely-used model-free algorithm in reinforcement learning that
+learns the optimal action-value function Q(s, a), which tells an agent the expected
 utility of taking action a in state s and then following the optimal policy after.
-It is able to find the best policy for any given finite Markov decision process (MDP)
+It is able to find the best policy for any given finite Markov decision process (MDP)
 without requiring a model of the environment.
 
 See: https://en.wikipedia.org/wiki/Q-learning
@@ -85,8 +85,10 @@ def update(state, action, reward, next_state, next_available_actions, done=False
     0.5
     """
     global LEARNING_RATE, DISCOUNT_FACTOR
-    max_q_next = 0.0 if done or not next_available_actions else max(
-        get_q_value(next_state, a) for a in next_available_actions
+    max_q_next = (
+        0.0
+        if done or not next_available_actions
+        else max(get_q_value(next_state, a) for a in next_available_actions)
     )
     old_q = get_q_value(state, action)
     new_q = (1 - LEARNING_RATE) * old_q + LEARNING_RATE * (
@@ -173,5 +175,6 @@ def run_q_learning():
 
 if __name__ == "__main__":
     import doctest
+
     doctest.testmod()
-    run_q_learning()
+    run_q_learning()
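
For readers who want to see the reformatted `max_q_next` expression in context, here is a minimal, self-contained sketch of a tabular Q-learning update. It is not the file's actual code: the dict-based `q_table`, the constant values, and the tail of the `new_q` expression (which the hunk above cuts off) are assumptions filled in from the standard Q-learning rule described in the module docstring.

```python
# Hypothetical stand-ins for the module's globals; the values are assumptions.
LEARNING_RATE = 0.5
DISCOUNT_FACTOR = 0.9

# Assumed storage: Q-values keyed by (state, action); unseen pairs count as 0.0.
q_table = {}


def get_q_value(state, action):
    """Return the stored Q-value, defaulting to 0.0 for unseen pairs."""
    return q_table.get((state, action), 0.0)


def update(state, action, reward, next_state, next_available_actions, done=False):
    """Apply one Q-learning update and return the new Q(state, action)."""
    # Same expression as in the diff: no bootstrapping from a terminal state
    # or from a state with no available actions.
    max_q_next = (
        0.0
        if done or not next_available_actions
        else max(get_q_value(next_state, a) for a in next_available_actions)
    )
    old_q = get_q_value(state, action)
    # Assumed completion of the truncated line: the standard Q-learning target.
    new_q = (1 - LEARNING_RATE) * old_q + LEARNING_RATE * (
        reward + DISCOUNT_FACTOR * max_q_next
    )
    q_table[(state, action)] = new_q
    return new_q
```

With an empty table, `update("s0", "right", 1.0, "s1", ["left", "right"])` blends the old value 0.0 with the target 1.0 at the assumed learning rate of 0.5. The parenthesized conditional behaves exactly like the one-line form it replaces; only the layout differs.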
