Hi there,
First of all, thanks for creating this amazing library - investing time and money to support it, and making it open source. I discovered it at JuliaCon 2021 and was immediately fascinated by it.
Question: Is there interest in supporting text representation features (e.g., using ScikitLearn terminology: CountVectorizer, TfidfVectorizer, NgramVectorizer, etc.) in FeatureTransforms.jl?
Some of them are covered in TextAnalysis, but from the user's perspective they are quite different from what anyone familiar with ScikitLearn would expect. In particular, they are not "pipe-able", nor do they immediately return what the user is interested in (i.e., they do not implement the common "fit_transform" paradigm). In other words, it's quite non-trivial to get from text to input representations that can be fed to a machine learning model (e.g., MLJ, Flux, etc.).
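To make the "fit_transform" paradigm concrete, here is a minimal pure-Python sketch of a count vectorizer. It is only an illustration of the pattern, not ScikitLearn's implementation and not a proposed FeatureTransforms.jl API: the class name `CountVectorizerSketch`, the whitespace tokenizer, and the list-of-lists output are all assumptions made for the example.

```python
from collections import Counter


class CountVectorizerSketch:
    """Hypothetical bag-of-words vectorizer, for illustration only."""

    def fit(self, documents):
        # Learn a sorted vocabulary that maps each token to a column index.
        tokens = sorted({tok for doc in documents for tok in self._tokenize(doc)})
        self.vocabulary_ = {tok: i for i, tok in enumerate(tokens)}
        return self

    def transform(self, documents):
        # Build one row of token counts per document.
        # Tokens that were not seen during fit are ignored.
        rows = []
        for doc in documents:
            counts = Counter(self._tokenize(doc))
            row = [0] * len(self.vocabulary_)
            for tok, n in counts.items():
                idx = self.vocabulary_.get(tok)
                if idx is not None:
                    row[idx] = n
            rows.append(row)
        return rows

    def fit_transform(self, documents):
        # Single call that goes from raw text to a model-ready matrix.
        return self.fit(documents).transform(documents)

    @staticmethod
    def _tokenize(text):
        # Simplifying assumption: lowercase and split on whitespace.
        return text.lower().split()


docs = ["the cat sat", "the cat saw the dog"]
vectorizer = CountVectorizerSketch()
matrix = vectorizer.fit_transform(docs)
print(vectorizer.vocabulary_)
print(matrix)
```

The point of the sketch is the shape of the interface: one object learns the vocabulary, and a single `fit_transform` call returns a numeric matrix that could be passed straight to a model or chained with further transforms.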
FeatureTransforms.jl seems like a good place to support these transforms and, if there is interest, I'd be happy to work on adding support for them.