Skip to content

Conversation

@jfsantos
Copy link
Contributor

@jfsantos jfsantos commented Jan 20, 2026

Since some activations now require parameters, I updated the code for get_activation to take in either a string or a JSON object. In the former case it works as previously for backward compatibility, but when passed a JSON object it will now parse the object for activation parameters and initialize them properly. This is supported for all activations that support parameters in the codebase (leaky ReLU, PReLU, leaky hard tanh).

A potentially undesired update is how PReLU handles multiple channels when it is initialized with a single parameter. I updated the code to broadcast the parameter to all channels. Right now it's being done dynamically just for testing, but if the feature is not required and we just want it to fail when the input has more channels than the PReLU activation, we can do that too.

Developed with support and sponsorship from TONE3000

@jfsantos jfsantos marked this pull request as draft January 20, 2026 20:20
@jfsantos jfsantos marked this pull request as ready for review January 20, 2026 20:29
@jfsantos
Copy link
Contributor Author

Build seems to be failing for a totally unrelated reason (the runner cannot find the "build" directory). I marked this as ready for review even though the build is not passing.

Copy link
Owner

@sdatkinson sdatkinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worried about deallocation. Will chat.

return _activations[name];
}

nam::activations::Activation* nam::activations::Activation::get_activation(const nlohmann::json& activation_config)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can live with this for now, but I'd prefer something more typed than a "mystery JSON" being the input object. Something like the model factory/registry pattern feels like it might work here.

NAM/wavenet.cpp Outdated
const int kernel_size = layer_config["kernel_size"];
const auto dilations = layer_config["dilations"];
const std::string activation = layer_config["activation"].get<std::string>();
const nlohmann::json activation_config = layer_config["activation"];
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we parse this up into the "new-style" here? If it's a string, then put it into a dict under "type".

NAM/wavenet.h Outdated
, _input_mixin(condition_size, (gating_mode != GatingMode::NONE) ? 2 * bottleneck : bottleneck, false)
, _1x1(bottleneck, channels, groups_1x1)
, _activation(activations::Activation::get_activation(activation)) // needs to support activations with parameters
, _activation(activations::Activation::get_activation(activation_config)) // now supports activations with parameters
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Can remove the comment :)

nlohmann::json str_activation = "ReLU";
auto act = nam::activations::Activation::get_activation(str_activation);
assert(act != nullptr);
// Don't delete global activation objects
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[C] Is there something risky/tricky here w/ memory leaks?

Do we need to implement non-default destructors for objects that own activations? If we do, then that might be bad enough that I'd want to do something to avoid it.

João Felipe Santos added 3 commits January 20, 2026 16:03
…s do not have to manage memory allocated by it.
… and created a method to convert from string or JSON to ActivationConfig so parsing happens sooner.
@jfsantos
Copy link
Contributor Author

Issues raised regarding memory leaks were fixed by making all activations from get_activation a shared_ptr. Also updated the get_activation function to work with an ActivationConfig object instead of a JSON directly. ActivationConfig::from_json converts a JSON into an ActivationConfig, or fails loudly when the config is not valid.

@jfsantos jfsantos requested a review from sdatkinson January 21, 2026 00:29
class Activation;

// Strongly-typed activation type enum
enum class ActivationType
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This enum is good for now.

In the future, I may think about reworking this so that other libraries can build on & extend it.

ActivationType type;

// Optional parameters (used by specific activation types)
std::optional<float> negative_slope; // LeakyReLU, PReLU (single)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL std::optional. I may rework some of the other spots in the code to use this :)

unsigned long actual_channels = static_cast<unsigned long>(matrix.rows());

// Prepare the slopes for the current matrix size
std::vector<float> slopes_for_channels = negative_slopes;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worried that this might not be real-time safe.


static void test_all_simple_types()
{
// Test that all simple activation types work
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Looks like we'll have to remember to come back here if we add an activation. Would be nice to have something that automatically enumerates what's there.

@sdatkinson sdatkinson merged commit d650833 into sdatkinson:main Jan 21, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants