Benchmark results for Stream-VAD and AED models? #12

@LattifaiHQ

Hi, great work on FireRedVAD! 🎉

The README provides comprehensive benchmark results for the VAD (non-streaming) model on FLEURS-VAD-102, showing impressive SOTA performance (F1: 97.57%, AUC-ROC: 99.60%).

However, I noticed that the benchmark results for the other two models are not included:

  1. Stream-VAD — Are there comparable benchmark results on FLEURS-VAD-102 or other datasets? Specifically, how does the streaming model compare to the non-streaming VAD in terms of F1, AUC-ROC, and latency?

  2. AED — Since it detects three event types (speech, singing, music), are there benchmark results on audio event detection datasets? What metrics were used to evaluate the multi-class detection performance?

It would be very helpful for users to understand the trade-offs between the three models when choosing which one to use.
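
For a like-for-like comparison across the three models, the frame-level metrics could be computed along the lines of the sketch below. This is only an illustration and is not taken from the FireRedVAD repository: the function names `frame_f1` and `auc_roc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions. It also assumes each model emits one speech probability per frame, aligned with a 0/1 reference label.

```python
from typing import Sequence


def frame_f1(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Frame-level F1 for the positive (speech) class at a fixed threshold.

    The 0.5 default is an assumption, not a value from the README.
    """
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the probability that a random speech frame scores higher
    than a random non-speech frame (ties count as half).

    Pairwise and O(n^2), so only suitable for small illustrative inputs.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one frame of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): four frames, two of them speech.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]

print(round(frame_f1(toy_labels, toy_scores), 4))  # 0.6667
print(auc_roc(toy_labels, toy_scores))             # 0.75
```

For AED, the same two functions could be applied once per event type (speech, singing, music) in a one-vs-rest fashion, though whether the authors evaluate it that way is exactly what this issue is asking.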

Thanks!
