idq.classifiers¶
- class idq.classifiers.ClassifierModel(start, end, segs=None, model_id=None, generate_id=False)[source]¶
A parent class that defines the basic attributes all trained models must have in order to track data provenance. Each classifier will likely extend this class for its own purposes.
- feature_importance_figure(dataset, start, end, t0, **kwargs)[source]¶
generate and return a figure demonstrating the feature importance based on the data within dataset; should return a figure object.
- feature_importance_table(dataset, **kwargs)[source]¶
should return (columns, data) compatible with the DQR’s json.format_table (see use in idq/reports.py)
- property hash¶
the identifier used to locate this model.
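The extension pattern described for ClassifierModel can be sketched without importing idq. Everything in this block is a hypothetical stand-in: the class names `ProvenanceModel` and `ThresholdModel`, the `threshold` attribute, and the default used when `segs` is omitted are assumptions for illustration, not the library's behaviour.

```python
class ProvenanceModel:
    """Hypothetical stand-in for the parent class; NOT the idq implementation."""

    def __init__(self, start, end, segs=None, model_id=None):
        self.start = start
        self.end = end
        # Assumption: with no explicit segments, the whole span is used.
        self.segs = segs if segs is not None else [(start, end)]
        self.model_id = model_id


class ThresholdModel(ProvenanceModel):
    """Hypothetical child that adds its own trained parameter."""

    def __init__(self, start, end, threshold, **kwargs):
        # Provenance attributes are handled by the parent.
        super().__init__(start, end, **kwargs)
        self.threshold = threshold


model = ThresholdModel(1000, 2000, threshold=0.5, model_id="example")
print(model.start, model.end, model.segs, model.model_id, model.threshold)
```

The child only adds what is specific to it, so the provenance fields stay uniform across every model type.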
- class idq.classifiers.IncrementalSupervisedClassifier(nickname, rootdir='.', model_id=None, **kwargs)[source]¶
An extension of SupervisedClassifier that is meant to re-train itself incrementally instead of through a series of batch jobs (each starting from scratch). It should be able to inherit much of its functionality from SupervisedClassifier.
- class idq.classifiers.SupervisedClassifier(nickname, rootdir='.', model_id=None, **kwargs)[source]¶
A parent class for classifiers. Children should override methods as necessary. This classifier supports everything required syntactically for the pipeline to function, but assigns random ranks to all events.
- calibrate(dataset, **kwargs)[source]¶
Calibrate this algorithm based on the dataset of feature vectors. Requires all FeatureVectors in the dataset to have been evaluated. This should update self._calibration_map.
- evaluate(dataset)[source]¶
This classifier assigns random ranks to all events, independent of the training data set. The data should have the shape (Nsamples, Nfeatures). Returns a 1D array of length Nsamples representing the ranks assigned to each sample in the data.
WARNING: this needs to be highly efficient if we’re to use it to build time-series!
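As a rough illustration of the contract described above (one rank per sample, independent of the feature values), here is a stand-alone sketch that does not import idq. The function name `random_ranks`, the `seed` parameter, the rank range [0, 1), and the use of plain nested lists in place of the real dataset object are all assumptions made for illustration.

```python
import random


def random_ranks(data, seed=None):
    """Return one random rank in [0, 1) per row of ``data``.

    ``data`` is assumed to be a sequence of Nsamples rows, each holding
    Nfeatures values. The feature values are never inspected, which is
    what makes the ranks independent of the data.
    """
    rng = random.Random(seed)  # local generator, so global state is untouched
    return [rng.random() for _ in data]


# Hypothetical input: 4 samples with 3 features each.
samples = [[0.1, 2.0, 5.0], [0.4, 1.5, 3.2], [0.9, 0.2, 7.7], [0.3, 3.3, 1.1]]
ranks = random_ranks(samples, seed=0)
print(len(ranks))  # one rank per sample
```

Given the efficiency warning, a real implementation would more plausibly draw all ranks with a single vectorized array call than loop in Python.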
- feature_importance()[source]¶
Return a ranked list of important features within the trained model. Will raise an UntrainedException if we do not have a trained model stored internally.
- feature_importance_figure(*args, **kwargs)[source]¶
generate and return a figure demonstrating the feature importance based on the data within dataset factory; should return a figure object.
- feature_importance_table(*args, **kwargs)[source]¶
should return (columns, data) compatible with the DQR’s json.format_table (see use in idq/reports.py)
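The feature-importance methods above can be illustrated with a self-contained sketch that does not import idq. `ToyClassifier`, its `train` method, the dict-based model, the feature names, and the exact layout of `columns` and `data` are hypothetical. The `UntrainedException` defined here is a local stand-in for the exception named above, and the format that json.format_table actually expects is not specified in this page.

```python
class UntrainedException(Exception):
    """Local stand-in for the exception named in the documentation."""


class ToyClassifier:
    """Hypothetical classifier; NOT the real idq SupervisedClassifier."""

    def __init__(self):
        self._model = None  # assumed convention: None means "not trained yet"

    def train(self, importances):
        # Assumed model format: a dict mapping feature name -> importance.
        self._model = dict(importances)

    def feature_importance(self):
        """Return (feature, importance) pairs, most important first."""
        if self._model is None:
            raise UntrainedException("no trained model stored internally")
        return sorted(self._model.items(), key=lambda kv: kv[1], reverse=True)

    def feature_importance_table(self):
        """Return (columns, data); this exact layout is an assumption."""
        columns = ["rank", "feature", "importance"]
        data = [
            [i, name, value]
            for i, (name, value) in enumerate(self.feature_importance(), start=1)
        ]
        return columns, data


clf = ToyClassifier()
try:
    clf.feature_importance()
except UntrainedException as err:
    print("untrained:", err)

# Hypothetical feature names and importances.
clf.train({"snr": 0.7, "frequency": 0.2, "duration": 0.1})
columns, data = clf.feature_importance_table()
print(columns)
print(data)
```

Building the table from `feature_importance()` means the untrained check lives in one place and the table inherits it.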
- property flavor¶
this is a “private” variable because I don’t ever want a user to muck with this. I also want each child to have to declare this for themselves. This should be considered like a “type”, but a string may be easier to deal with than a Type object.
- property nickname¶
this is a “private” variable because I don’t ever want a user to muck with this once it is set upon instantiation
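One common way to get the behaviour described for flavor and nickname is a read-only property backed by an underscore-prefixed attribute. The sketch below is an assumption about the pattern, not the idq source: `BaseClassifier`, `RandomRankClassifier`, and the flavor string "random" are hypothetical names.

```python
class BaseClassifier:
    """Hypothetical parent; NOT the real idq class."""

    _flavor = None  # each child is expected to declare its own value

    def __init__(self, nickname):
        self._nickname = nickname  # set once, at instantiation

    @property
    def flavor(self):
        if self._flavor is None:
            raise NotImplementedError("child classes must declare _flavor")
        return self._flavor

    @property
    def nickname(self):
        return self._nickname


class RandomRankClassifier(BaseClassifier):
    _flavor = "random"  # hypothetical flavor string


clf = RandomRankClassifier("my_classifier")
print(clf.flavor, clf.nickname)

try:
    clf.nickname = "renamed"  # no setter is defined, so this fails
except AttributeError:
    print("nickname is read-only")
```

Because neither property defines a setter, ordinary assignment raises AttributeError, while the string-valued flavor stays easy to compare and to write into configuration files.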