Right Person

iotec machine learning with user-based classification.

Using a user's auction history, we can build a rich dataset of 'profiles' that we can leverage for better performance. Combining auction data per user provides a richer dataset for training models.
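As a rough illustration of what a profile could look like, the sketch below groups auction records by user and counts the values seen for each field. The function name `build_profiles`, the record layout, and the field names are assumptions made for this example; they are not part of right_person or spark_data_miner.

```python
from collections import Counter, defaultdict


def build_profiles(auctions, id_field, counted_fields):
    """Group auction records by user and count the values seen per field.

    Hypothetical helper for illustration only; not part of right_person.
    """
    profiles = defaultdict(lambda: {field: Counter() for field in counted_fields})
    for auction in auctions:
        user_id = auction[id_field]
        for field in counted_fields:
            value = auction.get(field)
            if value is not None:
                profiles[user_id][field][value] += 1
    return dict(profiles)


# Made-up auction records, one dict per auction.
auctions = [
    {'user_id': 'u1', 'domain': 'news.example', 'device': 'mobile'},
    {'user_id': 'u1', 'domain': 'news.example', 'device': 'desktop'},
    {'user_id': 'u2', 'domain': 'shop.example', 'device': 'mobile'},
]
profiles = build_profiles(auctions, 'user_id', ['domain', 'device'])
```

Each user ends up with one record summarising all of their auctions, which is the kind of input a per-user classifier can train on.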

Notes

Right Person is powered by the spark_data_miner package, which must be run on an EC2 instance.

Components

spark_data_miner

Right Person relies on data mining to build its input datasets/profiles. The data mining is powered by Spark.

Creating a Spark cluster is easy with Right Person:

>>> from spark_data_miner.cluster.manager.context_managers import spark_data_mining_session
>>> from spark_data_miner.cluster.manager.access import ClusterPlan
>>> master_instance_type = 'r5.4xlarge'
>>> node_instance_type = 'r5.8xlarge'
>>> node_count = 100
>>> plan = ClusterPlan(master_instance_type, node_instance_type, node_count)
>>> with spark_data_mining_session(plan=plan) as session:
...     pass  # do work using the session
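The `with` form matters for a cluster of this size, because the nodes should be released even if the work inside the block fails. The sketch below shows the general context-manager pattern using a fake in-memory cluster. `FakeCluster` and `cluster_session` are hypothetical names for this example and say nothing about how spark_data_mining_session is actually implemented.

```python
from contextlib import contextmanager


class FakeCluster:
    """Hypothetical stand-in for a provisioned cluster (no AWS calls)."""

    def __init__(self, node_count):
        self.node_count = node_count
        self.running = False

    def start(self):
        self.running = True

    def terminate(self):
        self.running = False


@contextmanager
def cluster_session(node_count):
    """Start a cluster, hand it to the caller, and always tear it down."""
    cluster = FakeCluster(node_count)
    cluster.start()
    try:
        yield cluster
    finally:
        # Runs on normal exit and when the body raises.
        cluster.terminate()


with cluster_session(node_count=2) as cluster:
    running_inside = cluster.running

running_after = cluster.running
```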

Clusters require an AMI to be built in order to function. To build an AMI compatible with right_person:

$ build_right_person_ami

If an appropriate AMI doesn't exist, an exception will be raised.

Each data miner takes a config that specifies the format of the data in the profiles. An example miner config:

>>> from spark_data_miner.core.config import MinerConfig, MinerField
>>> config = MinerConfig(
...     name='document name',
...     delimiter=',',
...     fields=[MinerField('field_name', [0], 'str', 'counter')],
...     id_field='user id field',
...     files_contain_headers=True,
...     s3_bucket='bucket',
...     s3_prefix='prefix_with_date_%Y-%m-%d',
... )
... )
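The `%Y-%m-%d` placeholders in `s3_prefix` look like standard strftime directives. Assuming the miner expands them per day (this is an assumption, not something confirmed here), the sketch below shows which prefixes a short date range would produce. `expand_prefixes` is a hypothetical helper, not part of spark_data_miner.

```python
from datetime import date, timedelta


def expand_prefixes(s3_prefix, start, days):
    """Return one strftime-expanded prefix per day, starting at `start`.

    Hypothetical helper for illustration only.
    """
    return [
        (start + timedelta(days=offset)).strftime(s3_prefix)
        for offset in range(days)
    ]


prefixes = expand_prefixes('prefix_with_date_%Y-%m-%d', date(2019, 1, 30), 3)
# prefixes == ['prefix_with_date_2019-01-30',
#              'prefix_with_date_2019-01-31',
#              'prefix_with_date_2019-02-01']
```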

Miners are incredibly simple to use:

>>> from spark_data_miner.cluster.manager.context_managers import spark_data_mining_session
>>> from spark_data_miner.core.miner import SparkDatasetMiner
>>> with spark_data_mining_session(plan=plan) as session:  # plan defined above
...     miner = SparkDatasetMiner('name', config, 'output_bucket')  # config defined above
...     miner.create_dataset(session)

Models

The Right Person models are logistic regression models. All models are stored in the iotec labs API (https://api.ioteclabs.com/rest/). Interfacing with the models is easy:

>>> from right_person.models.core import RightPersonModel
>>> model = RightPersonModel('example_model_name', 'account')
>>> model.partial_fit([{'good_example': True}, {'bad_example': True}], [1, 0])
>>> model.predict({'good_example': True})  # returns a number between 0 and 1
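To make the `partial_fit` / `predict` behaviour concrete, here is a minimal pure-Python sketch of online logistic regression over dict-valued features, trained by stochastic gradient descent. It is an illustration of the technique under simple assumptions (no regularisation, fixed learning rate), not the RightPersonModel implementation; the class name `OnlineLogisticRegression` is hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class OnlineLogisticRegression:
    """Hypothetical sketch: logistic regression updated one sample at a time."""

    def __init__(self, learning_rate=0.5):
        self.learning_rate = learning_rate
        self.weights = {}  # feature name -> weight
        self.bias = 0.0

    def predict(self, features):
        """Return the estimated probability of the positive class."""
        score = self.bias + sum(
            self.weights.get(name, 0.0) * float(value)
            for name, value in features.items()
        )
        return sigmoid(score)

    def partial_fit(self, samples, labels):
        """Apply one gradient step of the log loss per (sample, label) pair."""
        for features, label in zip(samples, labels):
            error = self.predict(features) - label
            self.bias -= self.learning_rate * error
            for name, value in features.items():
                current = self.weights.get(name, 0.0)
                self.weights[name] = current - self.learning_rate * error * float(value)


sketch = OnlineLogisticRegression()
for _ in range(50):
    sketch.partial_fit([{'good_example': True}, {'bad_example': True}], [1, 0])

good_score = sketch.predict({'good_example': True})
bad_score = sketch.predict({'bad_example': True})
```

Because each call only nudges the existing weights, the model can keep learning from new batches of profiles without being retrained from scratch.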

Models can be stored in the API like so:

>>> from right_person.models.store import RightPersonStore
>>> from right_person.stores.model_stores import S3RightPersonModelStore
>>> from right_person.data_mining.cluster.context_managers import right_person_cluster_session
>>>
>>> store = RightPersonStore()
>>>
>>> # list models
>>> models = store.list(as_list=True)  # if the as_list kwarg is omitted, the return type is a generator
>>>
>>> # retrieve models
>>> model = store.retrieve(models[0].model_id)
>>>
>>> # create models
>>> del model.model_id
>>> new_model = store.create(model)
>>>
>>> # update models
>>> old_model = models[-1]
>>> old_model.new_name = 'new_name'
>>> store.update(old_model)

Usage

Installation

To pip install right_person, you first need to configure the iotec package repository, which is hosted at pypi.dsp.io.

$ pip install right_person

Alternatively, you can install the package from git. The production versions ship without tests.
