Module: machine_learning

Machine learning algorithms and related nodes.

The majority of these nodes are one-stop shops that can be inserted after data has been appropriately preprocessed and labeled, and that then output predictions; the remaining nodes fulfill special management roles. The most important specialty nodes are Assign Targets, Accumulate Calibration Data, Measure Loss, and Crossvalidation. The most important classic ML techniques are Logistic Regression, Linear Discriminant Analysis, and Convex Model, along with some of the Classification and Regression nodes for categorical or continuous-valued outputs, respectively. Machine learning methods typically expect their inbound chunks to have an instance axis. Most existing nodes implement supervised machine learning algorithms: these nodes will only output predictions after they have been trained. To train a machine learning node, it needs to receive a packet that has both data and target labels. Once trained, a node will usually output predictions for every chunk that it receives, including for the initial training chunk. Also note that most of these nodes, with few exceptions, will only adapt themselves on non-streaming chunks (i.e., chunks whose is_streaming flag is not set to True). Some nodes in this module provide important related functionality, such as annotation with target labels and management of training/calibration data, as well as cross-validation, which is an essential validation tool for machine learning.
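
For readers new to this train-then-predict contract, here is a minimal sketch in plain Python, using scikit-learn's LogisticRegression as a stand-in for a supervised ML node (the actual nodes operate on Packets inside a pipeline; the synthetic data and the model choice are illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# "calibration chunk": training instances with associated target labels
X_calib = rng.normal(size=(100, 8))          # 100 instances x 8 features
y_calib = (X_calib[:, 0] > 0).astype(int)    # numeric target labels (0/1)

model = LogisticRegression().fit(X_calib, y_calib)

# once trained, the node emits predictions for the training chunk itself...
print(model.predict(X_calib)[:5])
# ...and for every subsequent ("streaming") chunk that it receives
X_stream = rng.normal(size=(10, 8))
print(model.predict(X_stream))
```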

AccumulateCalibrationData

Accumulate calibration data and emit it in one large chunk.

This node is for setups where you have streaming data and some of your processing nodes need to be calibrated, and you intend to collect the calibration data on the fly at the beginning of the live session, instead of using a recording of previously collected calibration data (in the latter case you would use the Inject Calibration Data node instead of this node). If all your processing nodes are capable of incremental calibration on streaming data, you do not need this node -- however, only very few nodes can do that: most adaptive processing nodes instead require a long chunk that holds all the calibration data, on which they then perform a one-step calibration calculation. This node handles the task of buffering up your streaming calibration data into one chunk, and then emits it all in one go, so that subsequent nodes can adapt themselves on it.

For this to work, the node needs to know where the calibration data begin and where they end: you are responsible for telling it by inserting a special marker in the data at the beginning of the calibration period, and another special marker at the end (the marker strings to look for can be customized). This node also supports a few options, only visible in expert mode, that decide what happens to any streaming data prior to the beginning of the calibration section: usually you want to drop such data, since your pipeline would not be able to process it yet anyway, but this can be overridden with a parameter. Another option decides what happens when there is another calibration section some time after the first one: you may choose to either ignore that section, or to trigger another recalibration.

Aside from its sheer length, the calibration chunk that this node emits is distinguished from the regular streaming data (which it also emits) in that it is marked as 'non-streaming' via the corresponding flag on the packet (since most adaptive nodes will only update themselves on non-streaming data). Limitations: while it is possible to use this node to demarcate multiple successive calibration windows, in which case it will emit a calibration chunk for each of them, this node assumes that a single input chunk is intersected by at most one such calibration time window. If that assumption is violated, the node will default to fusing the successive calibration windows into a longer one that covers all calibration data, and warnings will be generated in such cases.
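
The buffering behavior described above can be illustrated with a small, self-contained sketch (this is not the node's implementation: chunks are modeled as plain dicts rather than Packets, and buffering or streaming each chunk as a whole, instead of splitting it at the exact marker time, is a simplifying assumption):

```python
def accumulate(chunks, begin_marker='calib-begin', end_marker='calib-end'):
    buffer, in_calib = [], False
    for chunk in chunks:
        if begin_marker in chunk['markers']:
            in_calib = True
        if in_calib:
            buffer.append(chunk['samples'])
            if end_marker in chunk['markers']:
                # release everything in one go, flagged non-streaming so
                # that downstream adaptive nodes will calibrate on it
                yield {'samples': sum(buffer, []), 'is_streaming': False}
                buffer, in_calib = [], False
        else:
            # note: by default the real node withholds/drops streaming data
            # until the first calibration chunk has been emitted
            yield {'samples': chunk['samples'], 'is_streaming': True}

chunks = [{'samples': [0.1], 'markers': []},
          {'samples': [0.2], 'markers': ['calib-begin']},
          {'samples': [0.3], 'markers': []},
          {'samples': [0.4], 'markers': ['calib-end']},
          {'samples': [0.5], 'markers': []}]
for out in accumulate(chunks):
    print(out)   # one fused calibration chunk, plus the streaming chunks
```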

Version 1.0.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • begin_marker
    Marker string that indicates that calibration data is beginning. As explained in the help text for the node, a marker is necessary to inform the node about when the data that it should accumulate begins (and ends). The recommended way to get this marker into the data is by emitting it from the same program that also generates any other calibration-relevant markers (e.g., those that are picked up in the Assign Targets Markers node). Can also be a time offset, see Marker mode argument.

    • verbose name: Begin Marker
    • default value: calib-begin
    • port type: Port
    • value type: object (can be None)
  • end_marker
    Marker string that indicates that calibration data is ending. As for the begin marker, the best way to get it embedded into the data stream is to emit it from the program that manages the calibration process (that program would usually emit markers that are used e.g., by the Segmentation or the machine learning nodes). Can also be a time offset, see Marker mode argument.

    • verbose name: End Marker
    • default value: calib-end
    • port type: Port
    • value type: object (can be None)
  • marker_mode
    How to interpret the begin_marker and end_marker. In the default setting 'markers', they are matched against event markers that are assumed to be present in the data. In the 'relative-times' mode, they are interpreted as time offsets from the beginning of the data, in seconds.

    • verbose name: Marker Mode
    • default value: markers
    • port type: EnumPort
    • value type: str (can be None)
  • print_markers
    Print markers. This prints markers during the calibration period for debugging/inspection.

    • verbose name: Print Markers
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • verbose
    Verbose output.

    • verbose name: Verbose
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • emit_calib_data
    Emit the calibration data. If set to False, the calibration data portion is dropped.

    • verbose name: Emit Calibration Data
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • emit_predict_data
    Emit the non-calibration ('streaming') data. If set to False, the data outside the calibration markers will be dropped.

    • verbose name: Emit Streaming Data
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • calibration_first
    Do not emit any streaming data before the first calibration chunk has ended. This is needed for many methods that can only predict after they have been calibrated.

    • verbose name: No Output Before Calibration Data
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • can_recalibrate
    Allow re-calibration. If false, only a single calibration period will be allowed and subsequent calibration markers will be ignored.

    • verbose name: Allow Recalibration
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • suppress_empty_packets
    Do not emit packets that contain only empty data. In cases where this node would emit a packet with all-empty data, enabling this option will cause the node to emit None instead.

    • verbose name: Suppress Empty Packets
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

AssignTargets

Select which markers contain event-related signal activity, and optionally assign numeric target values to these markers for use in machine learning.

This node is part of a standard workflow for analysis of event-related signal activity, and optionally for machine learning on that activity. First of all, this node allows you to define which of the (possibly many) markers in the data should be used for event-related analysis. Second, you can assign different numeric values to different events (since each event has an associated string, you can give a mapping that assigns different values to different strings). There is also an option to accept any marker strings that can be converted into numbers, and take these numbers as the target values (this is useful for the case where regression targets are being specified, instead of classification). Once your markers are annotated in this way in the data, subsequent nodes will act on that subset of markers (e.g., the Segmentation node will extract segments around only the target markers), and if you have assigned numeric target values to specific markers, any subsequent machine learning nodes will interpret these values as the desired output values (or "labels") that the machine learning node is supposed to predict whenever it sees data that looks like what it observed around those markers in its training data. Thus, a typical workflow is to have an Assign Target Markers node, followed by a Segmentation node, optionally followed by some segment processing, followed by a machine learning node; usually you also need a way to feed both training and test/live data into this chain of nodes, e.g., using the Inject Calibration Data node or the Accumulate Calibration Data node prior to the Assign Target Markers node. Tip: since this node only distinguishes markers based on exact string matching, you may need to preprocess your marker strings beforehand using other nodes. It can also be helpful to insert new markers in the data based on custom criteria using, e.g., the Insert Markers node, before applying this node.
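
A schematic of the value-assignment step in plain Python (the marker strings and mapping are hypothetical; the real node writes a TargetValue field onto the Packet's instance axis):

```python
import math

def assign_targets(markers, mapping):
    # markers matching a mapping key get its value; all others stay nan
    return [mapping.get(m, math.nan) for m in markers]

markers = ['left', 'right', 'left', 'rest', 'right']
print(assign_targets(markers, {'left': 0, 'right': 1}))
# -> [0, 1, 0, nan, 1]: downstream nodes segment around the labeled
# markers, and ML nodes use the assigned values as training labels
```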

Version 1.0.2

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • mapping
    Mapping of matching criteria to target values. Any instances that match a given criterion (e.g., marker name or name pattern) will be assigned the associated target value. The format of the criteria can be overridden by mapping_format. The mapping can be given either fully explicitly as a dictionary of {crit1: target-value1, crit2: target-value2, ...}, or using the shorthand list notation [crit1, crit2, crit3], which is equivalent to {crit1: 0, crit2: 1, crit3: 2, ...}. An unordered set {crit1, crit2, crit3} can be given to simply set the target value of any matching instance to 1 (see the short example after this list).

    • verbose name: Value Assignment
    • default value: {'M1': 0, 'M2': 1}
    • port type: Port
    • value type: object (can be None)
  • mapping_format
    Format of the criterion strings. If set to 'names', each instance (e.g., marker string) needs to match the provided string exactly. If set to 'wildcards', the criterion is a wildcard expression that may include * or ? characters. If set to 'conditions', the mapping can be a restricted Python expression that may refer to other instance fields (e.g., "Marker == 'left' and Duration > 4.0", provided that the instances have fields named Marker and Duration). See NeuroPype's QueryGrammar for more details on the available functions. Also, in this mode the mapping targets are allowed to be strings, which are then evaluated as formulas (possibly dependent on other instance fields) to calculate the target value. The special format 'passthrough-numbers' ignores the mapping entirely, simply converts the marker strings to numbers, and uses those as target values. The 'compat' format is primarily for backwards compatibility with the settings of some deprecated fields. It is recommended to instead always select the syntax that you are using explicitly.

    • verbose name: Mapping Format
    • default value: compat
    • port type: EnumPort
    • value type: str (can be None)
  • iv_column
    Choose which column of the instance axis data table to use for mapping, if mapping is 'names' or 'wildcards'. This will almost always be 'Marker' (the default).

    • verbose name: Default Condition Field
    • default value: Marker
    • port type: StringPort
    • value type: str (can be None)
  • is_categorical
    If set then the TargetValue column in the IV table will be marked as categorical.

    • verbose name: Is Categorical
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • also_legacy_output
    Also write the target values in the legacy location. The target values will also be written into the data tensor (block.data).

    • verbose name: Also Legacy Output
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • use_numbers
    Alternatively convert number strings to target values. If this is checked, the marker assignment is ignored, and the node will treat any marker string that can be converted to a number as a target marker, and use the corresponding number as the target value. This is useful when regression targets are encoded in marker strings.

    • verbose name: Use Numbers Instead (Regression)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • support_wildcards
    Support wildcard matching.

    • verbose name: Support Wildcards
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • verbose
    Enable verbose output.

    • verbose name: Verbose
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
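
The three notations accepted by the mapping port above can be written as ordinary Python literals (the criterion names here are illustrative):

```python
explicit  = {'left': 0, 'right': 1, 'rest': 2}  # criterion -> target value
shorthand = ['left', 'right', 'rest']  # same as {'left': 0, 'right': 1, 'rest': 2}
as_set    = {'left', 'right', 'rest'}  # every matching instance gets value 1
```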

BalanceClasses

Balance the per-class trial counts in the data.

A class here refers to a numerical group of trials/instances in some data (e.g., class 0 may be one set of experimental conditions, and class 1 may be another). The node ensures that the data contain equal proportions of trials across these classes, which is sometimes necessary to ensure that downstream statistics and/or machine learning are sound (e.g., not unfairly biased towards an over-represented class, or accurately quantifying things like the error rate of ML models when all classes are equally likely to occur in the data). This node should be applied after AssignTargets (or an equivalent node that associates instances with numeric classes), and can be used on either continuous or segmented data. Note that, if you use this node on continuous data, it will drop all event markers that do not belong to one of the designated target classes (as set via, e.g., AssignTargets); typically that is all markers whose TargetValue is set to the special value nan.
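
As an illustration, here is a minimal sketch of the 'duplicate' strategy on plain label arrays (synthetic labels; the real node duplicates whole trials/instances in the Packet):

```python
import numpy as np

def balance_duplicate(y, seed=12345):
    # oversample under-represented classes (with replacement) until every
    # class matches the largest class count; returns the kept trial indices
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    keep = []
    for c, n in zip(classes, counts):
        idx = np.flatnonzero(y == c)
        keep.extend(idx)
        keep.extend(rng.choice(idx, size=target - n, replace=True))
    return np.sort(np.asarray(keep))

y = np.array([0, 0, 0, 0, 1, 1])
print(balance_duplicate(y))   # class-1 trials are duplicated up to count 4
```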

Version 1.2.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • strategy
    Strategy to apply. Duplicate will only duplicate under-represented trials, and drop will only drop excess over-represented trials. Mixed sets the target number of trials to the mean of the class trial counts, and duplicates or drops trials for each class accordingly to reach it. With target, under-represented trials will be duplicated and over-represented trials will be dropped to meet the target trial count specified in the target_count property. Bypass causes the node to do nothing.

    • verbose name: Strategy
    • default value: duplicate
    • port type: EnumPort
    • value type: str (can be None)
  • max_factor
    Maximum factor by which to duplicate trials if the duplicate strategy is selected, or by which to reduce trials if the drop strategy is selected, after which the other class(es) are dropped or duplicated, respectively, until the classes are balanced. Ignored if None or 0, or if the mixed strategy is used.

    • verbose name: Max Factor
    • default value: None
    • port type: IntPort
    • value type: int (can be None)
  • target_count
    Target number of trials per class. This only applies if the strategy is set to target. In this case, all classes will be duplicated or dropped as needed in order to reach the target count per class. Ignored if None or 0.

    • verbose name: Target Count
    • default value: None
    • port type: IntPort
    • value type: int (can be None)
  • field_name
    Name of field containing the classes to rebalance. In the output, values of this field will occur approximately equally in the given instances. Only change this from TargetValue if the named field has been added to the Instance axis upstream.

    • verbose name: Field Name
    • default value: TargetValue
    • port type: StringPort
    • value type: str (can be None)
  • binning_field
    Optional name of field to bin on. This will, for each unique value in this field, perform the balancing within all instances where the field takes on the same value.

    • verbose name: Binning Field
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • randseed
    Optionally the random seed to use to get deterministic results.

    • verbose name: Randseed
    • default value: 12345
    • port type: IntPort
    • value type: int (can be None)
  • verbose
    Print info and warning messages. 0: no output; 1: print results only; 2: print errors/warnings; 3: print all.

    • verbose name: Verbose
    • default value: 3
    • port type: EnumPort
    • value type: str (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

BayesianRidgeRegression

Estimate a continuous output value from features using Bayesian Ridge Regression.

Bayesian Ridge regression is an elegant Bayesian method for learning a linear mapping between input data and desired output values from training data, and is closely related to ridge regression. The main difference is that ridge regression has a tunable parameter that controls how strongly the method is regularized; this parameter effectively controls how flexible or complex the solution may be, to prevent over-fitting to random details of the data and thereby improve generalization to new data. In ridge regression, this parameter is tuned using cross-validation, that is, by empirical testing on held-out data. The Bayesian variant has such a parameter as well, but the optimal degree of regularization is likewise estimated from the data, in a theoretically clean and principled fashion. This method assumes that both inputs and outputs are Gaussian distributed, that is, have no or very few major statistical outliers. If the output follows a radically different distribution, for instance bounded between 0 and 1, nonnegative, or discrete-valued, then different methods may be more appropriate (for instance, classification methods for discrete values). To ameliorate the issue of outliers in the data, the raw data can be cleaned of artifacts with various artifact removal methods. To the extent that the assumptions hold, this method is highly competitive with other linear methods.

Like all machine learning methods, this method needs to be calibrated ("trained") before it can make any predictions on data. For this, the method requires training instances and associated training labels. The typical way to get such labels associated with time-series data is to make sure that a marker stream is included in the data, which is usually imported together with the data using one of the Import nodes, or received over the network alongside the data, e.g., using the LSL Input node (with a non-empty marker query). These markers are then annotated with target labels using the Assign Targets node. To generate training instances for each of the training markers, one usually uses the Segmentation node to extract segments from the continuous time series around each marker. Since this machine learning method cannot be trained incrementally on streaming data, it requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and spliced into the processing pipeline using the Inject Calibration Data node, where it passes through the same nodes as the regular data until it reaches the machine learning node and is used for calibration. Once this node is calibrated, its trainable state can be saved to a model file and later loaded for continued use.
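
The ports above map closely onto the parameters of scikit-learn's BayesianRidge estimator; the following standalone sketch mirrors them (whether this node wraps scikit-learn internally is an assumption, and the data is synthetic):

```python
import numpy as np
from sklearn.linear_model import BayesianRidge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))             # 200 instances x 10 features
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=200)

# alpha_1/alpha_2 and lambda_1/lambda_2 are the Gamma shape/rate hyper-
# priors, mirroring the alpha/lambda shape and rate ports; fit_intercept
# mirrors include_bias. The iteration cap (300) matches sklearn's default
# and is left unset here because its keyword name varies across versions.
model = BayesianRidge(tol=1e-3,
                      alpha_1=1e-6, alpha_2=1e-6,    # noise-precision prior
                      lambda_1=1e-6, lambda_2=1e-6,  # weight-precision prior
                      fit_intercept=True)
model.fit(X, y)
y_hat, y_std = model.predict(X[:3], return_std=True)  # predictive mean/std
print(y_hat, y_std)
```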

More Info...

Version 1.1.1

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • max_iter
    Maximum number of iterations. This is one of the stopping criteria to limit the compute time. The default is usually fine, and gains from increasing the number of iterations will be minimal (it can be worth experimenting with lower iteration numbers if the algorithm must finish in a fixed time budget, at a cost of potentially less accurate solutions).

    • verbose name: Maximum Number Of Iterations
    • default value: 300
    • port type: IntPort
    • value type: int (can be None)
  • tolerance
    Convergence tolerance. This is the desired error tolerance, or acceptable inaccuracy, in the solution. Larger values give less accurate results, but lead to faster compute times. Note that, for biosignal-driven machine learning systems, one often does not need very small tolerances.

    • verbose name: Tolerance
    • default value: 0.001
    • port type: FloatPort
    • value type: float (can be None)
  • verbosity
    Verbosity level. Higher numbers will produce more extensive diagnostic output.

    • verbose name: Verbosity Level
    • default value: 0
    • port type: IntPort
    • value type: int (can be None)
  • include_bias
    Include a bias term. If false, your features need to be centered, or include a dummy feature set to 1.

    • verbose name: Include Bias Term
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • normalize_features
    Normalize features. Should only be disabled if the data comes with a predictable scale (e.g., normalized in some other way).

    • verbose name: Normalize Features
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • alpha_shape
    Alpha shape parameter. This is only included for completeness and usually does not have to be adjusted. This is the shape parameter for the Gamma distribution prior over the alpha parameter. By default, this is an uninformative prior.

    • verbose name: Alpha Shape Parameter
    • default value: 1e-06
    • port type: FloatPort
    • value type: float (can be None)
  • alpha_rate
    Alpha rate parameter. This is only included for completeness and usually does not have to be adjusted. This is the rate parameter for the Gamma distribution prior over the alpha parameter. By default, this is an uninformative prior.

    • verbose name: Alpha Rate Parameter
    • default value: 1e-06
    • port type: FloatPort
    • value type: float (can be None)
  • lambda_shape
    Lambda shape parameter. This is only included for completeness and usually does not have to be adjusted. This is the shape parameter for the Gamma distribution prior over the lambda parameter. By default, this is an uninformative prior.

    • verbose name: Lambda Shape Parameter
    • default value: 1e-06
    • port type: FloatPort
    • value type: float (can be None)
  • lambda_rate
    Lambda rate parameter. This is only included for completeness and usually does not have to be adjusted. This is the rate parameter for the Gamma distribution prior over the lambda parameter. By default, this is an uninformative prior.

    • verbose name: Lambda Rate Parameter
    • default value: 1e-06
    • port type: FloatPort
    • value type: float (can be None)
  • initialize_once
    Calibrate the model only once. If set to False, then this node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • dont_reset_model
    Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are being changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model is incompatible when input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

ClassifierThresholdTuning

Tune the classification threshold of a predictive model so that the resulting class labels maximise a user-defined performance metric.

This can be used to, for example, balance sensitivity and specificity of a binary classifier as needed. The node performs an internal cross-validation, evaluates a series of candidate thresholds, and selects the one that gives the highest score. It can be used with any pipeline that outputs either class probabilities or decision scores. Note that this node does not produce its outputs in the form of probabilities, but rather outputs hard class labels.
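
A minimal sketch of the tuning loop using scikit-learn (a LogisticRegression base model, synthetic imbalanced data, and a 100-point threshold grid are illustrative assumptions; the node itself tunes a wired-in pipeline on Packets):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=300, weights=[0.8, 0.2], random_state=0)

# out-of-fold probabilities avoid train/test leakage when scoring thresholds
proba = cross_val_predict(LogisticRegression(), X, y,
                          cv=5, method='predict_proba')[:, 1]

grid = np.linspace(proba.min(), proba.max(), 100)   # candidate thresholds
scores = [balanced_accuracy_score(y, (proba >= t).astype(int)) for t in grid]
best = grid[int(np.argmax(scores))]
print(f"best threshold {best:.3f}, balanced accuracy {max(scores):.3f}")

# 'refit': retrain on the full data, then emit hard labels via the threshold
final = LogisticRegression().fit(X, y)
labels = (final.predict_proba(X)[:, 1] >= best).astype(int)
```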

More Info...

Version 0.1.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • method
    Base classifier whose decision threshold is to be tuned.

    • verbose name: Method
    • default value: None
    • port type: GraphPort
    • value type: Graph
  • method__signature
    Argument names of the pipeline being tuned. The wired-in graph must start with a Placeholder (slotname usually data) and end with a node producing per-sample decision scores or probabilities. The pipeline is instantiated afresh inside each cross-validation split to avoid train/test leakage.

    • verbose name: Method [Signature]
    • default value: (data)
    • port type: Port
    • value type: object (can be None)
  • preds_group_field
    Method groups predictions by the specified field. This is rarely used, but can be employed if the method emits predictions not per input trial but per member of a group (e.g., a session or subject).

    • verbose name: Grouping Field (Predictions)
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • enabled
    Whether to enable this node. If this is unset, the node will have essentially no effect.

    • verbose name: Enabled
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • scoring
    Performance metric that the node will maximise when choosing the threshold (e.g., balanced_accuracy, f1, roc_auc). The choice of metric determines how the trade-off between different types of classification errors is handled when selecting the optimal decision threshold from the classifier's continuous output scores (or probabilities). Available options:
    - 'accuracy': Standard accuracy. Maximizes the overall percentage of correct predictions. Can be misleading on imbalanced datasets, as it might favor thresholds that simply predict the majority class frequently.
    - 'balanced_accuracy': Accuracy adjusted for class imbalance. Averages the recall (sensitivity) for each class. Favors thresholds that perform equally well across all classes, regardless of their frequency.
    - 'top_k_accuracy': Checks if the true label is among the top k highest-scored predictions. Less common for simple threshold tuning, which usually relies on a single score per instance.
    - 'average_precision': Summarizes the precision-recall curve, focusing on the positive class. Favors thresholds achieving high precision (few false positives) across various recall levels. Good for rare positive classes.
    - 'f1': Harmonic mean of precision and recall for the positive class. Favors thresholds that balance finding positive samples (recall) and the accuracy of positive predictions (precision).
    - 'f1_macro': Unweighted average of the F1 score across all classes. Favors thresholds balancing precision/recall equally for all classes.
    - 'f1_weighted': Average F1 score across classes, weighted by class frequency. Favors thresholds balancing precision/recall, giving more weight to performance on common classes.
    - 'precision': Fraction of predicted positives that are actually positive (TP / (TP + FP)). Favors higher thresholds, making the classifier more conservative about predicting positive to minimize false alarms.
    - 'recall': Fraction of actual positives correctly identified (TP / (TP + FN)). Also known as Sensitivity or True Positive Rate. Favors lower thresholds to minimize missed positive cases (false negatives).
    - 'roc_auc': Area Under the Receiver Operating Characteristic curve (plots Recall vs. False Positive Rate). Measures overall discrimination ability across all thresholds. When used for selection, it favors a threshold corresponding to a good balance between high recall and low false positive rate.
    - 'roc_auc_ovr': For multi-class, computes AUC for each class vs. the rest, then averages (unweighted). Favors thresholds balancing TPR/FPR well on average for each class.
    - 'roc_auc_ovo': For multi-class, computes AUC for each pair of classes, then averages (unweighted). Favors thresholds discriminating well between pairs of classes.
    - 'roc_auc_ovr_weighted': Like 'roc_auc_ovr', but weighted by class frequency.
    - 'roc_auc_ovo_weighted': Like 'roc_auc_ovo', but weighted by class frequency.
    - 'gmean_sens_spec': Geometric mean of Sensitivity (Recall) and Specificity (True Negative Rate). Favors thresholds that balance the correct identification of both positive and negative classes, useful for imbalanced data.

    • verbose name: Scoring
    • default value: balanced_accuracy
    • port type: EnumPort
    • value type: str (can be None)
  • thresholds
    Either the number of candidate thresholds to test or an explicit list of thresholds. If an integer is given, the node generates this many equally spaced thresholds covering the observed score/probability range.

    • verbose name: Threshold Grid
    • default value: 100
    • port type: Port
    • value type: object (can be None)
  • refit
    Re-train the underlying model on the full data after the optimal threshold has been selected.

    • verbose name: Refit
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • num_folds
    Number of cross-validation folds. Cross-validation proceeds by splitting the data into num_folds blocks, iteratively holding out each block as validation data while training on the remaining blocks. The total runtime is therefore proportional to this value. Set to 0 to disable CV and tune on the entire dataset (not recommended – prone to over-fitting). Set to 1 for leave-one-out CV (can be slow).

    • verbose name: Number Of Cv Folds
    • default value: 5
    • port type: IntPort
    • value type: int (can be None)
  • cv_group_field
    Optional grouping field to ensure that samples from the same group do not appear in both train and validation sets (e.g., SubjectID, SessionID).

    • verbose name: Grouping Field (Cv)
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • cv_stratified
    Perform stratified splits so that the class proportions are similar across folds. Recommended when classes are imbalanced.

    • verbose name: Stratified Cv
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • cv_randomized
    Use randomized cross-validation.

    • verbose name: Randomized Cv
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • n_jobs
    Number of processor cores to use during the tuning process (-1 means all available cores).

    • verbose name: Parallel Jobs
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • random_seed
    Random seed controlling the CV shuffling and threshold sampling.

    • verbose name: Random Seed
    • default value: 12345
    • port type: IntPort
    • value type: int (can be None)
  • store_cv_results
    If True the node records every tested threshold and its corresponding score. Useful for diagnostics but increases memory usage.

    • verbose name: Store Cv Results
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • cond_field
    Name of the instance-level data field that contains the class labels. Ignored if a DescribeStatisticalDesign node already processed the packet.

    • verbose name: Cond Field
    • default value: TargetValue
    • port type: StringPort
    • value type: str (can be None)
  • out_format
    Output format of the predictions. If set to 'classes', the node will output a single feature that has the class index of the predicted class for each instance. If set to 'pseudo-probabilities', the node will output two features that contain the pseudo-probabilities of the predicted class and the other class; these will always be 0 or 1, depending on the predicted class. This can be useful for inserting this node into an existing pipeline that expects this format.

    • verbose name: Output Format
    • default value: classes
    • port type: EnumPort
    • value type: str (can be None)
  • initialize_once
    If False the node will re-tune itself whenever a new non-streaming data chunk with targets arrives.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • dont_reset_model
    If True the model persists even if upstream parameters change. Use with caution – the model might become incompatible with the new data format.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

ClearTargets

Clear the target information from the markers in the given data.

This node undoes the operation of the Assign Target Markers node.

Version 1.1.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • subset
    Indices of previously assigned markers to clear target value from. Only non-nan markers are counted in the indexing. This may be a convenient way to eliminate bad trials.

    • verbose name: Subset
    • default value: None
    • port type: ListPort
    • value type: list (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

ConvexModel

A flexible convex optimization based machine learning model.

This node is configured by specifying a smooth training cost function of parameters w (weights), optionally b (a bias or intercept), and the provided input data D, by wiring a graph with these placeholders into the cost port (for non-smooth costs, see below). D may be any numeric data structure (typically a Packet or list thereof), and the cost may use any formula that is differentiable with respect to w and b, and may include smooth regularization terms (e.g., l2 norms). b is purely a convenience argument and can be omitted if not needed. The w parameter can also have any structure (packets/arrays, lists/dictionaries thereof), and in general you can wire in an initial value (e.g., a zero-initialized array via the weights port). If an initial value is not provided, it defaults to a single zero-initialized training instance, which is often sufficient (more precisely, it defaults to the same data structure as your training data, but with any instance axis dropped and only a single zero-initialized instance retained).

For the node to be able to make predictions on new data, a prediction function (a mapping of weights, optional bias, and data) needs to be defined; you can wire a graph with the appropriate placeholders into the "pred" input. Oftentimes, however, this is redundant, because it has the same form as what is wired as predictions into the Loss node in your cost function (e.g., SigmoidBinaryCrossEntropyLoss), with the loss replaced by the appropriate link function (e.g., Sigmoid, Softmax, or Sign for classification, or no link for regression). If this is so, and you are using a Loss node in your cost function, you can leave the pred input unspecified, in which case it will be inferred to be the subgraph of your cost as just described.

For non-smooth optimization, you may provide additional non-smooth cost-function terms in the form of one or more graphs with placeholders w and step_size, which may implement or invoke proximal operators corresponding to the desired penalties (e.g., the Sparse Penalty node to impose a sparsity-promoting l1 norm on the weights). In most cases, this is simply a graph where those two placeholders are wired into the data and step_size inputs of the chosen Penalty node, which in turn is wired into one of the prox1..N ports. Additionally or alternatively, you can specify a graph invoking one or more constraint projections (e.g., using the Non-negative Constraint node) as a function of the weights into the constraint port. If you wish to specify multiple constraints and/or proximal operators, the following consideration applies: this node may not offer a solver that natively supports more than one such term, in which case a "meta-term" is constructed that is simply the successive application of your prox operators followed by the constraint; this is benign if the operators act on separate parts of your weights, and under some conditions this will also still converge if your operators overlap on some weights, but in general you will get a warning. If you know what you are doing, you can also simply chain or otherwise compose multiple proximal operators in a single graph and wire that into the prox1 port, thereby avoiding the warning. By default, the node will use a smooth solver if there are no non-smooth terms, and otherwise falls back to the proximal gradient descent (PGD) algorithm to minimize the cost function; other solvers can be used as well.

Once weights have been optimized, the node can be used to make predictions on new data (where it will generally match the data types of the input data, e.g., Packet or a list thereof). Note also that you can configure which data the node should train on using the train_on property. For how to pass training data into the node, see the documentation of the other machine learning nodes, which are somewhat more beginner-oriented than this node.
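
For intuition about the default solver, here is a minimal proximal gradient descent loop in plain NumPy that minimizes a logistic (sigmoid binary cross-entropy) loss plus an l1 sparsity penalty (the synthetic data, fixed step size, and penalty weight are illustrative assumptions; the actual node works with wired-in graphs and uses a line search by default):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = (X[:, :3].sum(axis=1) > 0).astype(float)    # labels in {0, 1}

def grad_smooth(w):
    # gradient of the mean logistic loss (the smooth cost term)
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return X.T @ (p - y) / len(y)

def prox_l1(w, step, alpha=0.05):
    # soft-thresholding: proximal operator of the l1 penalty alpha*||w||_1
    return np.sign(w) * np.maximum(np.abs(w) - step * alpha, 0.0)

w, step = np.zeros(X.shape[1]), 1.0             # zero-initialized weights
for _ in range(500):                            # max_iter
    w_new = prox_l1(w - step * grad_smooth(w), step)
    if np.linalg.norm(w_new - w) / step < 1e-3:  # abstol-style stopping rule
        w = w_new
        break
    w = w_new

pred = 1.0 / (1.0 + np.exp(-X @ w)) > 0.5       # sigmoid link -> classes
print((pred == y.astype(bool)).mean(), np.count_nonzero(w), "nonzero weights")
```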

More Info...

Version 0.3.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: AnyNumeric (can be None)
    • data direction: INOUT
  • weights
    Initial/final weights. If not set, will be initialized to a packet equivalent to a single all-zeroes training-data instance.

    • verbose name: Weights
    • default value: None
    • port type: DataPort
    • value type: AnyNumeric (can be None)
    • data direction: INOUT
  • cost
    Smooth part of training cost function.

    • verbose name: Smooth Cost
    • default value: None
    • port type: GraphPort
    • value type: Graph
  • cost__signature
    Argument names of the smooth terms of the training cost function. The first argument is a data structure of weights (usually an array), the second is an optional bias or intercept term (usually a scalar), and the third is the input data (basically what is passed in as the data to the ConvexModel node). Your cost function is then built as follows: you start with one Placeholder node for each of the listed names (whose slotname must be set to match the respective name); dependent on these placeholders, you build out whatever expression represents the smooth component of your cost function. In formulating the cost, it is recommended to use one of the predefined Loss nodes (except MeasureLoss, which is not allowed in this context) to evaluate a loss function in terms of the input data (D). This allows ConvexModel to reuse a portion of your graph as the prediction function (namely the exact portion that is wired into the predictions port of the Loss node). This is, however, optional, and you can always specify the prediction function separately yourself. To then pass your cost function to the ConvexModel node, you wire its final node (which outputs the scalar cost given the arguments) into the "cost" input port of ConvexModel. In graphical UIs, the edge connecting your cost function to the ConvexModel node will be drawn in dotted style to indicate that this is not a normal forward data flow, but that a graph (i.e., your cost function) is running under the control of the ConvexModel node. The latter will, among other things, take derivatives of your cost in order to optimize the weights. Your graph may also contain additional placeholders for hyper-parameters with names of your choosing (e.g., alpha or gamma); the values for such parameters must then be specified via the hyper_params dictionary input of the ConvexModel node. Such hyper-parameters can be used to govern the strength of a smooth regularization term, for example the squared l2 norm of the weights or a Tikhonov operator.

    • verbose name: Smooth Cost [Signature]
    • default value: (w,b,D)
    • port type: Port
    • value type: object (can be None)
  • pred
    Optional prediction function. If not set, will be initialized to a GLM-type prediction function.

    • verbose name: Prediction Function
    • default value: None
    • port type: GraphPort
    • value type: Graph
  • pred__signature
    Argument names of an optional prediction function graph. The arguments are the same as for the cost graph, except that the prediction function is only evaluated at prediction time (after optimization is done), and that the expected output is an array (or packet) of predictions for each observation (instance) in the data. If you do not specify a prediction function, the node will attempt to infer one from the cost function by taking the subgraph of the cost function that is wired into a Loss node (e.g., SquaredLoss), if present, and replacing the loss with the appropriate link function (e.g., Sigmoid, Softmax, or Sign for classification, or no link for regression). Note that this will only work if the cost function actually uses such a Loss node; otherwise you will have to wire in a graph that represents the prediction function here, following a recipe analogous to the one described in the previous sentence.

    • verbose name: Prediction Function [Signature]
    • default value: (w,b,D)
    • port type: Port
    • value type: object (can be None)
  • prox1
    Optional proximal operator.

    • verbose name: Prox1
    • default value: None
    • port type: GraphPort
    • value type: Graph
  • prox1__signature
    Arguments to first proximal operator. Similarly to the cost function, this is an optional graph starting with some Placeholder nodes (in this case one with slotname set to w and another with slotname set to step), and followed by one or more nodes that implement the operation of a proximal operator applied to w with step size step. The output of this operation is then wired into the "prox1" port of the ConvexModel node. Proximal operators represent non-smooth penalties applied to the weights, and the easiest way to specify these operators is to use one of the Penalty nodes in NeuroPype, which implement a wide range of (flexibly configurable) proximal operators, and which have a "data" input (into which the "w" Placeholder is wired) and a "step_size" input (into which the "step" Placeholder is to be wired). Your graph may also contain additional placeholders for hyper-parameters with names of your choosing (e.g., alpha or gamma); the values for such parameters must then be specified via the hyper_params dictionary input of the ConvexModel node.

    • verbose name: Prox1 [Signature]
    • default value: (w,step)
    • port type: Port
    • value type: object (can be None)
  • prox2
    Optional second proximal operator.

    • verbose name: Prox2
    • default value: None
    • port type: GraphPort
    • value type: Graph
  • prox2__signature
    Arguments to the second proximal operator, if any. See documentation of prox1 for more details. Note that most of the available solvers do not natively support more than one proximal operator. In such cases, the operators will be applied in a round-robin fashion (i.e., the first operator is applied to the weights, then the second operator is applied to the result, and so on), which is a sensible strategy if the operators are in some sense orthogonal, for example by acting on different parts of the weights or orthogonal projections of the weights; if the operators are in "tension", the result can be suboptimal, and you may need to implement a single proximal operator that combines the effects of the individual operators, for example by alternating between them in a loop.

    • verbose name: Prox2 [Signature]
    • default value: (w,step)
    • port type: Port
    • value type: object (can be None)
  • prox3
    Optional third proximal operator.

    • verbose name: Prox3
    • default value: None
    • port type: GraphPort
    • value type: Graph
  • prox3__signature
    Arguments to the third proximal operator, if any. See documentation of prox1 for the general structure and prox2 for more details on having more than one prox operator.

    • verbose name: Prox3 [Signature]
    • default value: (w,step)
    • port type: Port
    • value type: object (can be None)
  • constraint
    Optional constraint(s).

    • verbose name: Constraint
    • default value: None
    • port type: GraphPort
    • value type: Graph
  • constraint__signature
    Arguments to constraint projection(s). A constraint projection is a simple graph of (typically) a single placeholder, with slotname set by convention to w, followed by some operation that constrains w to a convex set (e.g., the set of non-negative numbers). NeuroPype ships with a number of such nodes, which generally end in Constraint, and which implement a number of customary convex constraints. The output of the constraint projection is then wired into the "constraint" port of the ConvexModel node.

    • verbose name: Constraint [Signature]
    • default value: (w)
    • port type: Port
    • value type: object (can be None)
  • hyper_params
    Hyper-parameters for the custom wired-in graphs. This is a dictionary of arbitrary key-value pairs that can be used to configure the cost function and proximal operators. The respective graphs may then declare and use placeholders named the same as in the dictionary keys.

    • verbose name: Hyper Params
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • max_iter
    Maximum number of iterations.

    • verbose name: Max Iter
    • default value: 500
    • port type: IntPort
    • value type: int (can be None)
  • abstol
    Absolute convergence tolerance. If weights change less than this (after normalization by step size), the optimization terminates. Note that this depends on the data scale.

    • verbose name: Abstol
    • default value: 0.001
    • port type: FloatPort
    • value type: float (can be None)
  • hessian_rank
    Maximum rank of the Hessian approximation, if using a quasi-Newton (L-BFGS) method. This is typically only available if none of the prox and constraint arguments are specified, and is ignored otherwise. The value is a trade-off between accuracy, memory usage, and performance; typical values are 6-10.

    • verbose name: Hessian Rank
    • default value: 10
    • port type: IntPort
    • value type: int (can be None)
  • stepsize
    Optional step size. If unspecified, the step size is adapted automatically using a line search.

    • verbose name: Stepsize
    • default value: None
    • port type: FloatPort
    • value type: float (can be None)
  • max_backtrack
    Maximum number of line search steps per iteration. A typical value is 15, but some solvers, such as L-BFGS, can benefit from as many as 30. Only used if the step size is left to automatic.

    • verbose name: Max Backtrack
    • default value: None
    • port type: IntPort
    • value type: int (can be None)
  • backtrack_factor
    Backtracking line search factor. Only used if stepsize is 0. The default depends on the chosen algorithm, and is 0.5 for PGD and 0.8 for Jaxopt-based LBFGS.

    • verbose name: Backtrack Factor
    • default value: None
    • port type: FloatPort
    • value type: float (can be None)
  • increase_factor
    Line search increase factor. Only used if stepsize is 0. The default depends on the chosen algorithm, and is 1.5 (jaxopt) or 2.0 (optax) depending on the LBFGS variant.

    • verbose name: Increase Factor
    • default value: None
    • port type: FloatPort
    • value type: float (can be None)
  • use_jit
    If enabled, attempt to use JIT compilation for the inner loop. This incurs a one-time compilation cost, but the actual solving will be greatly accelerated if using the GPU.

    • verbose name: Use Jit
    • default value: auto
    • port type: EnumPort
    • value type: str (can be None)
  • unroll
    Whether to unroll the optimization loop. Not supported by all solvers; this could be useful for very rough solves that are set to terminate after just a few iterations but is otherwise usually not recommended.

    • verbose name: Unroll
    • default value: auto
    • port type: EnumPort
    • value type: str (can be None)
  • solver
    Solver to use. PGD is the basic (unaccelerated) proximal gradient descent and APGD is the Nesterov accelerated version. Both are first-order methods and require the cost function to be differentiable (smooth). APGD typically converges in fewer iterations than PGD at a modest per-iteration overhead. LBFGS is an efficient quasi-Newton method that can be used with twice differentiable cost functions, but supports neither proximal operators nor constraints. The jaxopt variants of these solvers are the legacy implementations that use the (now deprecated) jaxopt package; in the long term, these will be phased out.

    • verbose name: Solver To Use
    • default value: auto
    • port type: ComboPort
    • value type: str (can be None)
  • linesearch_type
    Type of line search to use (if stepsize is not given). Note that the set of implemented line-search modes may depend on the solver and is subject to change.

    • verbose name: Linesearch Type
    • default value: auto
    • port type: ComboPort
    • value type: str (can be None)
  • verbosity
    Verbosity level. 0: no output, 1: per-iteration summary. Note that JIT will be disabled if verbosity is used.

    • verbose name: Verbosity
    • default value: 0
    • port type: IntPort
    • value type: int (can be None)
  • canonicalize_output_axes
    Whether to canonicalize the output axes of the model to match the expected output axes of the other machine-learning nodes. This can be turned off if your model emits a handcrafted feature or statistic axis to describe its predictions that you would like to retain. Note though that some downstream nodes, like MeasureLoss, might not work as expected. If your model is highly custom, you may be required to do this step explicitly in your prediction function.

    • verbose name: Canonicalize Output Axes
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • train_on
    Update the model on the specified data. Note that the model will generally output predictions on any data that it receives, whether it is training on it or not. In the "initial offline" mode, and only if the model is not already trained, the model is trained on the first non-streaming packet that it receives, which should thus be the training set, while subsequent data are test data. In the "successive offline" mode, the model is trained on any non-streaming packet that it receives, whether it is already pretrained or not (in that case it is further fine-tuned on the new training data). In both scenarios, any streaming data is treated as test-only data; this is a typical scenario for real-time processing, where the model is first trained or fine-tuned on some pre-recorded data, but it can also be used to simply train a model on multiple successive datasets. The last mode, "offline and streaming", will train on any data that has label information, whether it is offline or streaming. This can be used for real-time training or fine-tuning, i.e., while data is being collected. Note however that this is dangerous if you also intend to test performance on streaming data -- in that case the model will update (i.e., train) on your test data, unless the test labels are withheld from the node.

    • verbose name: Train On
    • default value: initial offline
    • port type: EnumPort
    • value type: str (can be None)
  • dont_reset_model
    Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are being changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model is incompatible when input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • no_compile_in_debug
    Do not compile the model when running in debug mode.

    • verbose name: No Compile In Debug
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
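
The following is a schematic numpy sketch (not this node's actual implementation) contrasting the PGD and APGD solvers described above, here applied to a least-squares loss with an l1 proximal term; the soft-thresholding prox and the fixed step size are illustrative assumptions.

    import numpy as np

    def soft_threshold(x, t):
        # proximal operator of t * ||x||_1
        return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

    def solve(A, b, lam, step, iters=500, accelerate=False):
        # PGD when accelerate=False; Nesterov-accelerated APGD when True.
        # step should be <= 1 / L, with L the largest eigenvalue of A.T @ A.
        w = np.zeros(A.shape[1]); z = w.copy(); t = 1.0
        for _ in range(iters):
            grad = A.T @ (A @ z - b)                             # gradient of the smooth part
            w_new = soft_threshold(z - step * grad, step * lam)  # prox step
            if accelerate:                                       # Nesterov extrapolation
                t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
                z = w_new + ((t - 1.0) / t_new) * (w_new - w)
                t = t_new
            else:
                z = w_new
            w = w_new
        return w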

CovarianceMDM

Classify covariance matrices based on a Riemannian minimum-distance-to-mean criterion.

This method assumes that the input features are not feature vectors but covariance matrices (i.e., as computed using one of the covariance estimation nodes). This node will find the average (centroid) of the trials in each class, and then classify new trials (each a covariance matrix) based on which class mean they are closest to. This distance metric is non-Euclidean and instead follows the Riemannian geometry of covariance matrices. One can optionally use robust estimation if the data is believed to be contaminated with outliers. The node can also be used to emit the raw distances.
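
For intuition, a conceptually similar classifier can be sketched with the pyriemann package (this node is not a wrapper around pyriemann, and the covariance estimator below is an assumption):

    from pyriemann.estimation import Covariances
    from pyriemann.classification import MDM

    # X: raw segments of shape (n_trials, n_channels, n_samples); y: class labels
    covs = Covariances(estimator='oas').fit_transform(X)  # one SPD matrix per trial
    mdm = MDM(metric='riemann').fit(covs, y)  # Riemannian distance to class means
    labels = mdm.predict(covs)                # cf. output = class-labels
    dists = mdm.transform(covs)               # cf. output = distances
    proba = mdm.predict_proba(covs)           # cf. output = probabilities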

Version 0.5.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • output
    Output format to use. If using probabilities, this will attempt to estimate the probability that some given data belongs to each of the classes; these probabilities are very conservative. If using distances, this will output the raw distances to each class mean. If using class-labels, the most likely class labels of the trials will be returned.

    • verbose name: Output
    • default value: probabilities
    • port type: EnumPort
    • value type: str (can be None)
  • robust
    Use robust estimator for class means.

    • verbose name: Robust
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • max_iter
    Max number of iterations for mean estimate. This value serves as an additional stopping criterion for the learning algorithm, and can be used to ensure that the method runs in a fixed time budget. This rarely has to be tuned.

    • verbose name: Max Number Of Iterations
    • default value: 50
    • port type: IntPort
    • value type: int (can be None)
  • tolerance
    Convergence tolerance for mean estimate. A lower tolerance will lead to longer running times and can result in more accurate solutions -- however, note that the actual difference in the outputs will be minimal at best, unless a very coarse tolerance is used.

    • verbose name: Convergence Tolerance
    • default value: 1e-05
    • port type: FloatPort
    • value type: float (can be None)
  • falloff_distribution
    Probability distribution used to model the falloff when outputting probabilities. Norm is the normal (Gaussian) distribution, cauchy is the Cauchy distribution, and gennorm is the generalized normal distribution (version 1). The choice of distribution will not affect the ranking of the emitted class probabilities, only their confidence.

    • verbose name: Falloff Distribution
    • default value: gennorm
    • port type: EnumPort
    • value type: str (can be None)
  • initialize_once
    Calibrate the model only once. If set to False, then this node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • dont_reset_model
    Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model will be incompatible if the input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

Crossvalidation

Perform a cross-validation and return a loss measure that quantifies how well the pipeline performed when generalizing to previously unseen data.

This node receives an evaluation data set (a packet, or a list of packets for multi-session data) and a reference to a processing pipeline. The pipeline is typically a chain of nodes that starts with a Placeholder node (whose slotname is set to "data"), and whose final node's "this" output is wired into the cross-validation's "method" input. This means that the pipeline is a graph with one input (the data), and it is passed to the cross-validation node (you can learn more about this way of passing graphs in NeuroPype's docs on graph-accepting nodes). The pipeline may also accept a second placeholder, which is expected to be a boolean representing whether the pipeline is used for training or testing (i.e., an is_training flag). When used with a cross-validation, the pipeline will "experience" being called more than once -- the first time, it receives training data (corresponding to the training portion of a fold), and from then on it will receive only test data (corresponding to the test portion of a fold). The pipeline is expected to adapt itself only on the first call; this is accomplished by ensuring the initialize_once flag is set in all adaptive nodes, which is the default for all machine-learning nodes, but not necessarily for all otherwise adaptive nodes that may occur in the pipeline, such as statistics (e.g., Standardization) or adaptive feature extraction, so you may have to set this flag there. The reason the pipeline only "experiences" training data once is that the pipeline is discarded after each fold is complete, and a fresh copy of the untrained pipeline is used with the next fold (this ensures that there can be no train-test leakage). The pipeline may generally output results in one of three permitted forms: a) the predictions on the input data it received, in some format (typically a Packet, possibly an array); b) a two-element list where the first entry is the aforementioned predictions, and the second entry is a data structure representing a model (e.g., this can be the "model" output of the ML node, if any, or a combination of model outputs of the ML node and any preceding adaptive nodes such as Standardization); or c) a new graph that represents the trained pipeline for subsequent use with test data. Optionally, you can also specify a custom scoring function to the Crossvalidation node, the simplest of which is a Placeholder named pred followed by MeasureLoss, wired into the "scoring" port. The cross-validation will partition the dataset into non-overlapping training and test subsets, and repeatedly invoke the pipeline first on a training subset, then on a corresponding test subset to obtain out-of-sample predictions. The predictions are then compared to the true labels by the scoring function, one or more specified loss metrics are computed, and the result is returned via the "loss" output. The loss is either a simple float, or a packet with a statistic axis that reports a number of stats besides the mean value, depending on the selected loss_format. The node supports the majority of standard cross-validation schemes, and defaults to a sane 5-fold blockwise (non-randomized) CV, which is appropriate for time-series data. You can also specify a grouping structure (which is implicit if you are passing a list of packets) to perform grouped CV (where whole held-out groups of trials appear in the test set).
If your pipeline includes adaptive signal processing (e.g., whitening, PCA, ICA, etc.), you will need to pass continuous data with event markers into the cross-validation node, and include those adaptive steps in your pipeline. The node can run computations for multiple folds in parallel, if you specify num_procs as either None (defaults to all cores) or some number >1. You may also experiment with the number of threads that each process may use (num_threads_per_proc), since numpy can easily cause excessive thread churn (100% utilization). If you have multiple GPUs and your pipeline uses one or more GPU backends (e.g., jax or torch), then the node can spread the folds out over the GPUs, depending on num_procs_per_gpu and compute_backends. See the "return trained model" flag for how to obtain a model trained on the whole data from the cross-validation.
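
As a rough illustration of the default behavior (5-fold blockwise CV, a fresh untrained pipeline per fold, and MCR scoring), consider the following Python sketch; make_pipeline is a hypothetical factory standing in for your wired-in pipeline graph:

    import numpy as np

    def blockwise_folds(n, k=5):
        # contiguous (non-randomized) splits, appropriate for time-series data
        edges = np.linspace(0, n, k + 1).astype(int)
        for i in range(k):
            test = np.arange(edges[i], edges[i + 1])
            yield np.setdiff1d(np.arange(n), test), test

    losses = []
    for train_idx, test_idx in blockwise_folds(len(y), k=5):
        pipe = make_pipeline()                # fresh, untrained copy per fold
        pipe.fit(X[train_idx], y[train_idx])  # first call: training portion
        pred = pipe.predict(X[test_idx])      # later calls: test portion only
        losses.append(np.mean(pred != y[test_idx]))  # MCR loss
    print(np.mean(losses), np.std(losses))    # cf. the loss and loss_std outputs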

More Info...

Version 2.2.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data for evaluation.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: object (can be None)
    • data direction: IN
  • method
    Pipeline to evaluate.

    • verbose name: Method
    • default value: None
    • port type: GraphPort
    • value type: Graph
  • method__signature
    Argument names of the pipeline being cross-validated. Your pipeline is a subgraph that must contain at least one Placeholder node whose slotname must match the argument name listed here. The placeholder then acts as the entry point for any data that is passed into the pipeline when it is invoked by the cross-validation. Your pipeline's final node (which typically produces the predictions) is then wired to the cross-validation's "method" input port. In graphical UIs, this edge will be displayed in dotted mode to indicate that this is not normal forward data flow, but that a subgraph (your pipeline) is being passed to the cross-validation node as a whole, which will then repeatedly invoke your pipeline graph. In summary, your pipeline starts with a Placeholder that is followed by some processing nodes (in the simplest case just a single machine-learning node, such as Linear Discriminant Analysis). The final node of your pipeline is the one whose outputs are taken to be the pipeline's predictions, and this node is wired into the "method" input of the Cross-validation. Any "loose ends" downstream of your placeholder are also considered to be part of the pipeline but do not contribute to the result (they may be used for other purposes, such as printing progress information). Your pipeline may optionally have a second placeholder, which should by convention have slotname set to is_training, and then is_training must be listed as the second argument here. This second placeholder is used to indicate whether your pipeline is currently being called on training data or test data. Regardless of whether you expose this parameter or not, the way your pipeline is executed by the cross-validation is as follows. For each fold in the cross-validation, your pipeline graph is instantiated from its default (uninitialized) state, and is then called with the training set of that fold. Then, the same graph is called again, but this time with the test set of that fold; it is then up to any adaptive nodes in your pipeline (e.g., machine learning nodes) to adapt themselves on the first call and to make predictions (usually without adapting again) on the second call. The pipeline is discarded after each fold and a new pipeline graph is instantiated (to avoid any unintended train/test leakage). Your pipeline MAY also return as its final output a two-element list whose first element is the predictions and whose second element is a model data structure (by convention usually a dictionary). If the cross-validation node has the return_model option enabled, then this output will be used to populate the "model" output port of the cross-validation node, and the "trained" output port will remain empty (for this your pipeline will be run once on the entire dataset, with no second invocation since there is no more test data).

    • verbose name: Method [Signature]
    • default value: (data)
    • port type: Port
    • value type: object (can be None)
  • scoring
    Optional custom scoring rule.

    • verbose name: Scoring
    • default value: None
    • port type: GraphPort
    • value type: Graph (can be None)
  • scoring__signature
    List of arguments of an optional scoring rule graph. This is an optional graph that implements the scoring (evaluation) rule of the CrossValidation node. If no graph is provided, the default behavior is equivalent to having wired in a graph here that consists of a Placeholder node (with slotname set to pred) connected to a MeasureLoss node (configured in accordance with the loss_metrics option), which is then wired into the "scoring" input of the cross-validation node. To override this default, you instead wire in your own graph, starting with a placeholder and followed by some computation that outputs a performance measure. This graph may also have an additional placeholder named verbose, which the Crossvalidation will use to print a desired subset of results.

    • verbose name: Scoring [Signature]
    • default value: (pred)
    • port type: Port
    • value type: object (can be None)
  • loss
    Aggregate loss measure (or multiple measures).

    • verbose name: Loss
    • default value: None
    • port type: DataPort
    • value type: AnyNumeric (can be None)
    • data direction: OUT
  • loss_std
    Loss standard deviation across CV splits. This is only set if loss_format is 'plain'; otherwise it will be reported in the axes of the loss output.

    • verbose name: Loss Std
    • default value: None
    • port type: DataPort
    • value type: AnyNumeric (can be None)
    • data direction: OUT
  • losses
    Per-fold loss measures across CV folds.

    • verbose name: Losses
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: OUT
  • predictions
    Concatenated per-sample predictions across all CV folds.

    • verbose name: Predictions
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: OUT
  • trained
    Trained pipeline.

    • verbose name: Trained
    • default value: None
    • port type: GraphPort
    • value type: Graph
  • loss_metrics
    Loss metrics. This is used if no custom scoring function was provided. The following measures can be calculated both offline and online: MCR is mis-classification rate (aka error rate), MSE is mean-squared error, MAE is mean absolute error, Sign is the sign error, Bias is the prediction bias. The following measures are currently only available during offline computations: SMAE is the standardized mean absolute error, SMSE is the standardized mean-squared error, max is the maximum error, RMS is the root mean squared error, MedSE is the median squared error, MedAE is the median absolute error, SMedAE is the standardized median absolute error, AUC is negative area under ROC curve, R2 is the R-squared loss, CrossEnt is the cross-entropy loss, ExpVar is the negative explained variance.

    • verbose name: Output Metrics
    • default value: ['MCR']
    • port type: SubsetPort
    • value type: list (can be None)
  • loss_format
    Format of the loss output packet. If set to stats-axis, a statistics axis and a feature axis (one entry per measure) will be used. If set to 2-feature-axes, a second feature axis will be created instead of the stats axis. If set to plain, loss will hold the mean loss and loss_std the standard deviation; in case of legacy, these will be plain floats if only a single metric was selected.

    • verbose name: Output Format
    • default value: legacy
    • port type: EnumPort
    • value type: str (can be None)
  • folds
    Number of cross-validation folds. If >1 this is k-fold CV. If set to 1, this is leave-one-out CV. If folds is between 0 and 1, this is p-holdout CV (p being the percentage to hold out). If set to 0, the training-set error is computed. If a negative integer, this is p-holdout CV (p being the number of samples to hold out, where all permutations are generated; one may optionally set repeats to 0 to hold out exactly the last |p| sessions once).

    • verbose name: Cross-Validation Folds
    • default value: 5
    • port type: FloatPort
    • value type: float (can be None)
  • randomized
    Whether to perform randomized or blockwise cross-validation. Blockwise is preferred for data stemming from a time series.

    • verbose name: Randomized
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • stratified
    Use stratified CV. This means that all the folds have the same relative percentage of trials in each class.

    • verbose name: Stratified
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • repeats
    Number of repetitions. This is only useful in case of randomized CV.

    • verbose name: Number Of Repetitions
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • group_field
    Optionally a field indicating the group from which each trial is sourced. If given, the data will be split such that test sets contain unseen groups. Example groups are SubjectID, SessionID, etc.

    • verbose name: Group Field
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • cond_field
    The name of the instance data field that contains the conditions (classes) to be discriminated. This parameter will be ignored if the input data packet has previously been processed by a BakeDesignMatrix node.

    • verbose name: Condition Field
    • default value: TargetValue
    • port type: StringPort
    • value type: str (can be None)
  • censor_labels
    Whether to censor labels from the model at prediction time as a safeguard against data leakage. Note that this feature requires the input and output data structures of the model to be identical at prediction time, and the input data cannot have multiple sets of labels in different parts of the data structure. Setting this to False does not imply that data leakage will happen (it should not in any case); this simply provides an additional safeguard. One case where you will need to set this to False is if, during development, you want to be able to debug test-time predictions inside the model (i.e., by setting a breakpoint), so that you can compare the predictions with their respective labels. Note that this setting only supports already-segmented data (i.e., trials with labels), not continuous data with segmentation happening inside the cross-validation method.

    • verbose name: Censor Labels
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • return_model
    Whether to also compute a model trained on the entire dataset. This will run one additional fold in which the training data is the whole dataset and there is no test data (this can be done in parallel with the other folds if the number of processes is set high enough). The cross-validation node will then populate the "trained" output port with a graph representing the trained pipeline, which can subsequently be used with, e.g., the "Call" node to invoke it on test data. Alternatively, if the pipeline was configured to return a two-element list whose second element is a model data structure, then the model returned in this fashion on that (training-only) fold will be used to populate the "model" output port of the cross-validation node instead. In this case, the trained output will remain empty.

    • verbose name: Return Trained Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • trial_size
    Assumed size of a trial. Formatted as in the Segmentation node's time_bounds argument, and given in seconds relative to the respective markers. This is used to be able to perform continuous-time selection around markers. Should be large enough to encompass the segment size used in the Segmentation step of the given method, if any. You can also make it longer to retain more continuous data around the ends of your training or test data, e.g., to make available to continuous-time filters etc. If this is left unspecified, it will be inferred from the pipeline's Segmentation node.

    • verbose name: Segment Span
    • default value: []
    • port type: ListPort
    • value type: list (can be None)
  • exclude_margin
    Exclude any training trials that fall within this many seconds of test trials.

    • verbose name: Exclude Margin
    • default value: 5.0
    • port type: FloatPort
    • value type: float (can be None)
  • tight_selection
    If True, the selected stretches of continuous data (if any) will be tightly enclosing. Otherwise, the selection will run to the edges of the dataset where possible. This only applies if continuous (non-segmented) data is passed into the cross-validation node.

    • verbose name: Tight Selection
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • num_procs
    Number of processes to use for parallel computation. If None, the global setting NUM_PROC, which defaults to the number of CPUs on the system, will be used.

    • verbose name: Max Parallel Processes
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • num_threads_per_proc
    Number of threads to use for each process. This can be used to limit the number of threads used by each process to mitigate potential churn.

    • verbose name: Threads Per Process
    • default value: 4
    • port type: IntPort
    • value type: int (can be None)
  • compute_backends
    GPU compute backends that may be used by the pipeline. If you include GPU compute backends here, workloads using those backends will be farmed out across multiple GPUs (if available) when running cross-validation folds in parallel. The 'auto' mode will attempt to auto-detect any backend settings in the given pipeline's nodes, but note that this will only catch nodes where this is explicit in the node's properties, and GPU workloads missed in this fashion will run by default on GPU 0.

    • verbose name: Compute Backends
    • default value: ['auto']
    • port type: SubsetPort
    • value type: list (can be None)
  • num_procs_per_gpu
    Number of processes to use per GPU. This is only relevant if you have GPU compute backends enabled. If your GPU(s) are under-utilized during cross-validation, you can increase this to run this many CV folds on each GPU.

    • verbose name: Processes Per Gpu
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • multiprocess_backend
    Backend to use for farming out computation across multiple (CPU) processes. Multiprocessing is the simple Python default, which is not a bad start. Nestable is a version of multiprocessing that allows your pipeline to itself use parallel computation. Loky is a fast and fairly stable backend, but it does not support nested parallelism and has different limitations than multiprocessing. It can be helpful to try either if you are running into an issue trying to run something in parallel. Serial means to not run things in parallel but instead in series (even if num_procs is >1), which can help with debugging. Threading uses Python threads in the same process, but this is not recommended for most use cases due to what is known as GIL contention.

    • verbose name: Multiprocess Backend
    • default value: loky
    • port type: EnumPort
    • value type: str (can be None)
  • serial_if_debugger
    If True, then if the Python debugger is detected, the node will run in serial mode, even if multiprocess_backend is set to something else. This is useful for debugging, since the debugger does not work well with parallel processes. This can be disabled if certain steps should nevertheless run in parallel (e.g., to reach a breakpoint more quickly).

    • verbose name: Serial If Debugger
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • clear_memory_per_fold
    If enabled, then the memory will be cleared after each fold. This is useful if you are running out of memory during cross-validation, but it will slow down the computation somewhat. The auto mode will attempt to auto-determine a sensible setting depending on the model used. The aggressive mode will force each task to run in a separate subprocess, and will reclaim the subprocess after the fold, which guarantees that memory will be reclaimed; this should be considered a last resort, since it will slow down the computation.

    • verbose name: Clear Memory Per Fold
    • default value: auto
    • port type: EnumPort
    • value type: str (can be None)
  • ignore_bad_packets
    If input is given as a list of packets, ignore packets that are bad (e.g., due to missing labels). This will print a warning; otherwise an exception will be raised.

    • verbose name: Ignore Bad Packets
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • trial_padding
    Padding around the specified trial length. This is to avoid cutting the signal time axis too short near the edge of segments. Note that for very low-sampling rate data, e.g., NIRS, you may need a larger margin.

    • verbose name: Trial Padding
    • default value: 0.3
    • port type: FloatPort
    • value type: float (can be None)
  • random_seed
    Random seed (int or None). Different values will give somewhat different outcomes.

    • verbose name: Random Seed
    • default value: 12345
    • port type: IntPort
    • value type: int (can be None)
  • verbose
    Whether to print verbose output.

    • verbose name: Verbose
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • loss_metric
    Legacy loss metric. This has been superseded by loss_metrics, which can take multiple metrics.

    • verbose name: Loss Metric (Legacy)
    • default value: MCR
    • port type: EnumPort
    • value type: str (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

ElasticNetRegression

Estimate a continuous output value from features using linear regression with Elastic Net regularization.

Elastic net regression is a principled statistical technique to learn a linear mapping between input data and desired output values from training data. Elastic net combines the regularization terms of ridge regression and LASSO regression, thereby gaining some of the benefits of both techniques. Most importantly, elastic net overcomes a shortcoming of LASSO in the presence of multiple highly correlated features, which is that such features will usually not receive similar weights, but instead often rather arbitrary relative weights. As such, elastic net is a preferable technique if sparse feature selection as provided by LASSO is desirable, but features may also be correlated. Elastic net includes a tuning parameter that allows blending between the l1 (LASSO) and l2 (ridge) regularization terms, which is automatically tuned together with the overall regularization strength parameter using cross-validation. For this reason, this method can at the extremes imitate either of the two algorithms if that is beneficial given the data. Due to the need to tune the additional parameter, the running time will however be significantly longer than using either method directly. If there are very few trials, or some extensive stretches of the data exhibit only one class, the procedure used to find the regularization parameters (cross-validation) can fail with an error that there were too few or no trials of a given class present. Elastic net regression assumes that both inputs and outputs are Gaussian-distributed, that is, have no or very few major statistical outliers. If the output follows a radically different distribution, for instance between 0 and 1, or nonnegative, or discrete values, then different methods may be more appropriate (for instance, classification methods for discrete values). To ameliorate the issue of outliers in the data, the raw data can be cleaned of artifacts with various artifact removal methods. To the extent that the assumptions hold true, this method is highly competitive with other (sparse) linear methods. Like all machine learning methods, this method needs to be calibrated ("trained") before it can make any predictions on data. For this, the method requires training instances and associated training labels. The typical way to get such labels associated with time-series data is to make sure that a marker stream is included in the data, which is usually imported together with the data using one of the Import nodes, or received over the network alongside the data, e.g., using the LSL Input node (with a non-empty marker query). These markers are then annotated with target labels using the Assign Targets node. To generate instances of training data for each of the training markers, one usually uses the Segmentation node to extract segments from the continuous time series around each marker. Since this machine learning method is not capable of being trained incrementally on streaming data, the method requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data node, where it passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration.
Once this node is calibrated, the trainable state of this node can be saved to a model file and later loaded for continued use.
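
For reference, the parameter search described above is roughly analogous to scikit-learn's ElasticNetCV, sketched below with values mirroring this node's defaults (this is an illustration, not this node's implementation):

    from sklearn.linear_model import ElasticNetCV

    model = ElasticNetCV(
        l1_ratio=[0.1, 0.5, 0.7, 0.9, 0.95, 0.99, 1],  # cf. l1_ratio
        n_alphas=100,     # cf. num_alphas (size of the strength search grid)
        eps=0.001,        # cf. min_alpha (fraction of the maximum strength)
        cv=5,             # cf. num_folds
        max_iter=1000,    # cf. max_iter
        tol=0.0001,       # cf. tolerance
    )
    model.fit(X_train, y_train)     # X_train: (n_trials, n_features)
    y_pred = model.predict(X_test)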

More Info...

Version 1.1.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • num_alphas
    Number of values in regularization strength search grid. This method determines the optimal regularization strength by testing a number of different strength values between the minimum and maximum value. The running time increases with a finer search grid, but the found solution may be slightly better regularized when using a fine grid (e.g., 100 values) instead of a coarse grid (e.g., 20 values).

    • verbose name: Number Of Values In Reg. Search Grid
    • default value: 100
    • port type: IntPort
    • value type: int (can be None)
  • num_folds
    Number of cross-validation folds for parameter search. Cross-validation proceeds by splitting up the data into this many blocks of trials, and then tests the method on each block. For each fold, the method is re-trained on all the other blocks, excluding the test block (therefore, the total running time is proportional to the number of folds). This is not a randomized cross-validation, but a blockwise cross-validation, which is usually the correct choice if the data stem from a time series. If there are few trials in the data, one can use a higher number here (e.g., 10) to ensure that more data is available for training. Setting this to one will yield leave-one-out CV, and if a group field was specified, then leave-one-group-out CV.

    • verbose name: Number Of Cross-Validation Folds
    • default value: 5
    • port type: IntPort
    • value type: int (can be None)
  • l1_ratio
    Tradeoff parameter between the l1 and l2 penalties. This parameter controls the balance between the l1 regularization term (as in LASSO) and the l2 regularization term (as in ridge regression). A value of 0 leads to exclusive use of l2, and a value of 1 leads to exclusive use of l1. This is a list of candidate values, the best of which is found via an exhaustive search (i.e., each value is tested one by one, and therefore the total running time is proportional to the number of values listed here). In fact, for each setting of this parameter, the entire list of possible values for the regularization strength is tested, so the running time is also proportional to the number of those values. The details of the parameter search can be controlled via the search metric and number of folds parameters. The default search range is intentionally coarse for quick running times; refine it to smaller steps to obtain potentially better solutions, but do not expect massive gains from refining.

    • verbose name: Regularization Type Tradeoff Parameter
    • default value: [0.1, 0.5, 0.7, 0.9, 0.95, 0.99, 1]
    • port type: ListPort
    • value type: list (can be None)
  • max_iter
    Maximum number of iterations. This is one of the stopping criteria to limit the compute time. The default is usually fine, and gains from increasing the number of iterations will be minimal (it can be worth experimenting with lower iteration numbers if the algorithm must finish in a fixed time budget, at a cost of potentially less accurate solutions).

    • verbose name: Maximum Number Of Iterations
    • default value: 1000
    • port type: IntPort
    • value type: int (can be None)
  • tolerance
    Convergence tolerance. This is the desired error tolerance, or acceptable inaccuracy, in the solution. Using larger values gives less accurate results, but will lead to faster compute times. Note that, for biosignal-driven machine learning systems, one often does not need very small tolerances.

    • verbose name: Tolerance
    • default value: 0.0001
    • port type: FloatPort
    • value type: float (can be None)
  • num_jobs
    Number of parallel compute jobs. This value only affects the running time and not the results. Values between 1 and twice the number of CPU cores make sense to expedite computation, but may temporarily reduce the responsiveness of the machine. The value of -1 stands for all available CPU cores.

    • verbose name: Number Of Parallel Jobs
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • verbosity
    Verbosity level. Higher numbers will produce more extensive diagnostic output.

    • verbose name: Verbosity Level
    • default value: 0
    • port type: IntPort
    • value type: int (can be None)
  • include_bias
    Include a bias term. If false, your features need to be centered, or include a dummy feature set to 1.

    • verbose name: Include Bias Term
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • normalize_features
    Normalize features. Should only be disabled if the data comes with a predictable scale (e.g., normalized in some other way).

    • verbose name: Normalize Features
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • min_alpha
    Minimum regularization strength. This is expressed as a factor of the maximum regularization strength, which is calculated from the data. By default, the optimal strength will be searched between this value and the maximum.

    • verbose name: Minimum Regularization Strength
    • default value: 0.001
    • port type: FloatPort
    • value type: float (can be None)
  • alphas
    Override regularization strength. Optionally, the default regularization search grid can be overridden by giving an explicit list of values here, although this is rarely necessary or very helpful. This is a list of candidate values, the best of which is found via an exhaustive search (i.e., each value is tested one by one, and therefore the total running time is proportional to the number of values listed here). The details of the parameter search can be controlled via the search metric and number of folds parameters. Larger values cause stronger regularization, that is, less risk of the method over-fitting to random details of the data, and thus better generalization to new data. A value of 0 means no regularization, and there is no upper limit on the values that can be given here -- however, depending on the scale of the data and the number of trials, there is a cutoff beyond which all features are weighted by zero, and are thus unused. Often one covers a range between 0.1 and 10, and at times 0.01 to 100. Typically the values here are not linearly spaced, but follow an exponential progression (e.g., 0.25, 0.5, 1, 2, 4, 8, etc.). The default search range is intentionally coarse for quick running times; refine it to smaller steps to obtain potentially better solutions, but do not expect massive gains from refining.

    • verbose name: Override Regularization Strength
    • default value: None
    • port type: ListPort
    • value type: list (can be None)
  • selection
    Parameter update schedule. Using random here can be significantly faster when higher tolerance values are used.

    • verbose name: Parameter Update Schedule
    • default value: cyclic
    • port type: EnumPort
    • value type: str (can be None)
  • positivity_constraint
    Constrain weights to be positive. This is a special (and rarely-used) flavor of this method, in which the learned weights are constrained to be positive.

    • verbose name: Constrain Weights To Be Positive
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • precompute
    Precompute shared data. Precompute some shared data that is reused during parameter search. Aside from 'auto', this can be set to True or False. Auto attempts to determine the best choice automatically.

    • verbose name: Precompute Shared Data
    • default value: auto
    • port type: Port
    • value type: object (can be None)
  • random_seed
    Random seed. Different values may give slightly different outcomes.

    • verbose name: Random Seed
    • default value: 12345
    • port type: IntPort
    • value type: int (can be None)
  • cv_group_field
    Optionally a field indicating the group from which each trial is sourced. If given, the data will be split such that test sets contain unseen groups. Example groups are SubjectID, SessionID, etc.

    • verbose name: Grouping Field (Cross-Validation)
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • cv_stratified
    Optionally perform stratified cross-validation. This means that all the folds have the same relative percentage of trials with each label. Note that this requires labels to be quantized or binned to be meaningful.

    • verbose name: Stratified Cross-Validation
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • initialize_once
    Calibrate the model only once. If set to False, then this node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • dont_reset_model
    Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model will be incompatible if the input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

EnsemblePredictor

A model that combines outputs from multiple models trained on subsets of the data.

The node accepts a wired-in base machine-learning pipeline (via the method input), which is instantiated multiple times and trained with different subsets of the data and/or random seeds. This node then combines the predictions of the submodels using the specified rule to form a consensus prediction. Such models tend to be less prone to overfitting the training data. Note that this node can be configured to reproduce the strategy commonly known as bagging (when setting sampling with replacement to True, using a subset size of 1, and using the mean or voting integration rule), the strategy known as pasting (when setting sampling with replacement to False, using a subset size of 0.5-0.8, and using either the mean or voting rule), as well as a robust approach wherein the median is used in conjunction with a subset size of 0.2-0.4 (this is conceptually similar to Theil-Sen estimation); however, note that this last strategy is often outperformed by natively robust methods and is therefore mainly useful if there is no other robust option available. Since the computational cost is higher by a factor of the number of models, this node is often used only in the final stages of model development, when the underlying base model is largely finalized; however, it can also be used during development if the base model is otherwise prone to overfitting or has robustness issues.
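
The following numpy sketch shows the core idea under stated assumptions (Model is a hypothetical base estimator standing in for the wired-in pipeline); with sampling without replacement and a 0.75 subset, this corresponds to pasting:

    import numpy as np

    rng = np.random.default_rng(12345)
    n, num_models, subset = len(y), 10, 0.75
    models = []
    for _ in range(num_models):
        idx = rng.choice(n, size=int(subset * n), replace=False)  # True = bagging
        models.append(Model().fit(X[idx], y[idx]))  # hypothetical base estimator

    preds = np.stack([m.predict(X_test) for m in models])
    consensus = np.mean(preds, axis=0)  # 'mean' rule; np.median for 'median',
                                        # or a majority vote for 'voting'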

Version 0.7.1

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: AnyNumeric (can be None)
    • data direction: INOUT
  • method
    Underlying base method.

    • verbose name: Method
    • default value: None
    • port type: GraphPort
    • value type: Graph
  • method__signature
    Argument names of an underlying base model. Your base model is a subgraph that must contain at least one Placeholder node whose slotname must match the argument name listed here. The placeholder then acts as the entry point for any data that is passed into the pipeline when it is invoked by the ensembling node. Your pipeline's final node (which typically produces the predictions) is then wired to the ensembling node's "method" input port. In graphical UIs, this edge will be displayed in dotted style to indicate that this is not normal forward data flow, but that a subgraph (your pipeline) runs under the control of the ensembling node. In summary, your pipeline starts with a Placeholder that is followed by some processing nodes (in the simplest case just a single machine-learning node, such as Linear Discriminant Analysis). The final node of your pipeline is the one whose outputs are taken to be the pipeline's predictions, and this node is wired into the "method" input of this node. As always in NeuroPype, any "loose ends" downstream of your placeholder are also considered to be part of the pipeline but do not contribute to the result (they may be used for other purposes, such as printing progress information). Your pipeline will generally be instantiated multiple times, and at training time, each instance will be given a different part of the data. Your pipeline may optionally have additional placeholders, as follows: an is_training placeholder (used to indicate whether your pipeline is currently being called on training data or test data), a random_seed placeholder (which receives a different random seed than the respective other models), and an index placeholder (indicating the 0-based index of the current model).

    • verbose name: Method [Signature]
    • default value: (data)
    • port type: Port
    • value type: object (can be None)
  • enabled
    Whether to enable ensembling. If this is unset, the node will have essentially no effect (other than passing the configured random seed directly to the model, if the model has such a placeholder).

    • verbose name: Enabled
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • num_models
    Number of models to train.

    • verbose name: Num Models
    • default value: 10
    • port type: IntPort
    • value type: int (can be None)
  • data_subset
    Fraction of the data to use for training each model. Can also be given as a negative integer, in which case the value encodes the number of data points to hold out. The typical setup when using sampling without replacement is 50-80% of the data, which helps primarily with overfitting when used in conjunction with the mean or voting rule. It is possible to configure the approach to be additionally robust to a small number of outliers in the data by using the median integration rule and choosing a dataset subset that is strictly smaller than the breakdown point of the median estimator (i.e., less than 0.5), for example 0.33. This is the same idea also exploited in the Theil-Sen estimator (when used with random subsets).

    • verbose name: Data Subset
    • default value: 0.75
    • port type: FloatPort
    • value type: float (can be None)
  • rule
    The rule used to combine predictions across models. The mean is the typical approach for regression or probabilistic outputs, which helps with overfitting. The median is a robust alternative to the mean that is less affected by outliers, but note that you may need a low subset fraction to make this work well (e.g., 0.25-0.33). Voting predicts the most likely class label according to each submodel and then takes the majority vote; this approach is also fairly robust to outliers, but note that it erases the probabilistic information in the predictions.

    • verbose name: Rule
    • default value: mean
    • port type: EnumPort
    • value type: str (can be None)
  • with_replacement
    Whether to sample with replacement. If set, this implements bootstrap aggregation ("bagging"), and if unset, it implements a technique that has been called "pasting". Both techniques have pros and cons; bagging is more commonly used, but can cause problems with certain types of models, for example those that perform internal cross-validations (e.g., ParameterOptimization, ProbabilityCalibration) or ones that aggregate trials within subject (e.g., TrialAggregatePredictor).

    • verbose name: Sample With Replacement
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • stratified
    Whether to stratify the sampling by class labels. This helps ensure that the proportion of class labels in each subset is representative of the proportion on the full dataset. This is only implemented for the case where each item (e.g., packet) has a single label. This can be useful for small datasets.

    • verbose name: Stratified
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • cond_field
    The name of the instance data field that contains the conditions (classes) to be discriminated; used for stratified sampling.

    • verbose name: Condition Field
    • default value: TargetValue
    • port type: StringPort
    • value type: str (can be None)
  • num_procs
    Number of processes to use for parallel computation. If None, the global setting NUM_PROC, which defaults to the number of CPUs on the system, will be used.

    • verbose name: Max Parallel Processes
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • num_threads_per_proc
    Number of threads to use for each process. This can be used to limit the number of threads used by each process to mitigate potential churn.

    • verbose name: Threads Per Process
    • default value: 4
    • port type: IntPort
    • value type: int (can be None)
  • compute_backends
    GPU compute backends that may be used by the pipeline. If you include GPU compute backends here, workloads using those backends will be farmed out across multiple GPUs (if available) when training ensemble models in parallel. The 'auto' mode will attempt to auto-detect any backend settings in the given pipeline's nodes, but note that this will only catch nodes where this is explicit in the node's properties, and GPU workloads missed in this fashion will run by default on GPU 0.

    • verbose name: Compute Backends
    • default value: ['auto']
    • port type: SubsetPort
    • value type: list (can be None)
  • num_procs_per_gpu
    Number of processes to use per GPU. This is only relevant if you have GPU compute backends enabled. If your GPU(s) are under-utilized during training, you can increase this to run this many models on each GPU.

    • verbose name: Processes Per Gpu
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • multiprocess_backend
    Backend to use for farming out computation across multiple (CPU) processes. Multiprocessing is the simple Python default, which is not a bad start. Nestable is a version of multiprocessing that allows your pipeline to itself use parallel computation. Loky is a fast and fairly stable backend, but it does not support nested parallelism and has different limitations than multiprocessing. It can be helpful to try either if you are running into an issue trying to run something in parallel. Serial means to not run things in parallel but instead in series (even if num_procs is >1), which can help with debugging. Threading uses Python threads in the same process, but this is not recommended for most use cases due to what is known as GIL contention.

    • verbose name: Multiprocess Backend
    • default value: serial
    • port type: EnumPort
    • value type: str (can be None)
  • serial_if_debugger
    If True, then if the Python debugger is detected, the node will run in serial mode, even if multiprocess_backend is set to something else. This is useful for debugging, since the debugger does not work well with parallel processes. This can be disabled if certain steps should nevertheless run in parallel (e.g., to reach a breakpoint more quickly).

    • verbose name: Serial If Debugger
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • initialize_once
    Calibrate the model only once. If set to False, then this node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • dont_reset_model
    Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model will be incompatible if the input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • random_seed
    Seed for any pseudo-random choices during training. This can be either a splittable seed as generated by Create Random Seed or a plain integer seed.

    • verbose name: Random Seed
    • default value: 12345
    • port type: Port
    • value type: AnyNumeric
  • verbosity
    Verbosity level for diagnostics.

    • verbose name: Verbosity
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

HierarchicalDiscriminantComponentAnalysis

Use Hierarchical Discriminant Component Analysis (HDCA) to classify data instances.

This node learns a hierarchical linear model that is usually applied to band-pass filtered and segmented neural data, for example EEG. When applied to such data, the method learns a model of the time-domain (event-related potential like) signal to explain the dependent variable (e.g., to classify the EEG). HDCA can also be thought of as a hierarchical version of linear discriminant analysis (LDA), where LDA is first applied to each time slice individually, reducing the data to a single dimension per time slice, and then a second-level classifier (usually again an LDA) is applied to the resulting set of 1-d features. HDCA is known to perform well on EEG segments at full resolution (i.e., little to no reduction in channel density and little to no reduction in sampling rate) compared to many other methods, since it is agnostic to (often spurious) correlations between channels at different time points. In NeuroPype, HDCA is usually used in conjunction with shrinkage regularization (using shrinkage LDA, or sLDA) as the building block and has good defaults. It can also be generalized to use other types of linear models, such as logistic regression or ridge regression, which may have a different robustness profile or can be used as a stepping stone to simulate the behavior of more complex models (e.g., using the ConvexModel node). The regression mode is useful for use with continuous labels. Another generalization is to multi-stream data (e.g., from multiple modalities such as EEG and eye tracking), where the data is fused either at the level of temporal features or at the level of streams (the latter yields a 3-level hierarchy).
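
A conceptual two-level sketch of the hierarchy (binary case), using scikit-learn's shrinkage LDA for both levels; the node itself supports more model types, automatic shrinkage, and multi-stream fusion:

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA

    # X: segmented data of shape (n_trials, n_channels, n_times); y: class labels
    n_trials, n_channels, n_times = X.shape
    per_slice = [LDA(solver='lsqr', shrinkage=0.01).fit(X[:, :, t], y)
                 for t in range(n_times)]        # first level: one sLDA per time slice
    scores = np.column_stack([m.decision_function(X[:, :, t])  # one 1-d feature per slice
                              for t, m in enumerate(per_slice)])
    top = LDA(solver='lsqr', shrinkage=0.01).fit(scores, y)    # second level across slices
    proba = top.predict_proba(scores)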

Version 1.1.1

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • shrinkage_within
    Regularization strength within each time slice. If using 'auto', the parameter is estimated automatically (using the Ledoit-Wolf method for discriminant-type models, GCV for regression, and k-fold cross-validation for logistic models). Otherwise, this is a number (0-1 for discriminant or typically 1e-3 to 1e3 for the other models).

    • verbose name: Shrinkage Within
    • default value: 0.01
    • port type: Port
    • value type: object
  • shrinkage_across
    Regularization strength across time slices and/or modalities. Other details same as documented in the shrinkage within parameter.

    • verbose name: Shrinkage Across
    • default value: 0.01
    • port type: Port
    • value type: object
  • model_within
    Type of linear model used within each time slice. See also model_across for model types that operate across time slices. The discriminant option is a shrinkage linear discriminant analysis (sLDA), while the regression option is a ridge regression formulation, and logistic is l2-regularized logistic regression. Note that, even though discriminant and logistic typically output discrete values, in this setting only the learned linear mapping is applied, yielding continuous-valued features. Note that for models other than discriminant, it can be useful to scale features prior to this node, especially if the feature scale is very far from unity or very uneven. Huber, Theil-Sen, RANSAC, and RLDA are robust variants of regression with different performance characteristics, of which Huber should be the first choice. The parameter for Huber is the cutoff in standard deviations beyond which residuals are treated as outliers, and defaults to a value that yields 95% statistical efficiency. The parameters for Theil-Sen are the maximum number of subpopulations to consider and the number of subsamples to draw; this estimator can be quite slow (e.g., 10x slower than Huber). The parameters for RANSAC are the name of the underlying base estimator, which can be 'ols', 'discriminant', 'regression', or 'logistic', optionally the minimum fraction or count of samples to use (unless this is 'ols', this argument must be given), and optionally the threshold in terms of target value error at or below which a data point is flagged as an inlier (defaulting to MAD(TargetValue)). The minimum fraction is a fairly important parameter since it cannot be too small (yielding an ill-posed estimator) or too large (being unlikely to avoid outliers). RLDA is an experimental robust LDA method whose parameters are the covariance estimator ('mcd', 'sgd', 'batch'), the huber threshold (ignored by the mcd method), and optionally an algorithm-specific parameter, which is the base learning rate for the sgd method, the block size for the batch method, and the maximum contamination fraction for the mcd method. One may also specify a separate value per stream by giving this parameter as a dictionary with modality names as keys and model types as values. One may also use the '~' key to specify a catch-all model type for stream types not otherwise listed, as in {'eeg': 'discriminant', '~': 'huber(1.35)'}.

    • verbose name: Model Within
    • default value: discriminant
    • port type: ComboPort
    • value type: str (can be None)
  • model_across
    Type of (generalized) linear model to learn across time slices and/or modalities; see also model_within for model types that apply within each time slice. The discriminant option is sLDA (see model_within), regression is a ridge regression, and logistic is a logistic regression. Of these, only the ridge regression is suitable for continuous labels. Huber is a robust variant of regression. This can also be a per-stream setting; in this case, and assuming that there are multiple streams, there will be a hierarchy layer that goes across streams (whether temporal when doing temporal multi-stream fusion, or late when doing late fusion); in either case, the model form for this multi-stream "pseudo-modality" can be specified using a dedicated key named '*' (otherwise this defaults to logistic).

    • verbose name: Model Across
    • default value: discriminant
    • port type: ComboPort
    • value type: str (can be None)
  • fusion
    Multi-stream fusion strategy. The temporal strategy will fuse multiple streams (if any) at the level of temporal features, and allow the second-level classifier to learn interactions between streams in terms of their detailed time course. The late strategy will fuse streams at the end ('late fusion'), in which case each stream is reduced to a single score, and a third-level classifier is applied to those scores. If each stream stems from a different modality, then this is the multi-modal fusion strategy. The skip option causes the node to skip the fusion step and output the features generated by the temporal level.

    • verbose name: Multi-Stream Fusion
    • default value: temporal
    • port type: EnumPort
    • value type: str (can be None)
  • probabilistic
    Output probabilities instead of class labels.

    • verbose name: Probabilistic
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • class_weights
    Per-class weight (dictionary). This is formatted as a Python-style dictionary that assigns to each numeric class (e.g., 0, 1) a weighting (e.g., 1.0). The weights represent the a priori ("prior") probability of encountering a specific class that the model shall assume. The weights will be renormalized so that they add up to one. Example syntax: {'0': 0.5, '1': 0.5} (note the quotes before the colons).

    • verbose name: Class Weights
    • default value: None
    • port type: Port
    • value type: object (can be None)
  • cond_field
    The name of the instance data field that contains the conditions to be discriminated. This parameter will be ignored if the packet has previously been processed by a DescribeStatisticalDesign node.

    • verbose name: Cond Field
    • default value: TargetValue
    • port type: StringPort
    • value type: str (can be None)
  • tolerance
    Convergence tolerance. Smaller values result in more accurately estimated models at the cost of performance. The exact meaning depends on the model type used (for discriminant analysis, this is the accuracy of rank estimation in the SVD and is an absolute threshold, meaning that it depends on the scale of the data).

    • verbose name: Tolerance
    • default value: None
    • port type: FloatPort
    • value type: float (can be None)
  • solver_type
    Type of solver to use. Auto is usually a good choice for EEG data. For logistic regression, other solvers, including newton-cg, newton-cholesky, and liblinear, are available and may improve performance.

    • verbose name: Solver Type
    • default value: auto
    • port type: ComboPort
    • value type: str (can be None)
  • solver_maxiter
    Max number of iterations for use with an iterative solver. This currently applies only to the logistic formulation, in which case the default is 100.

    • verbose name: Max Iterations
    • default value: None
    • port type: IntPort
    • value type: int (can be None)
  • solver_extra
    Extra solver options. This is a dictionary that can be used to pass additional options to the solver. The exact options depend on the solver used.

    • verbose name: Solver Extra
    • default value: None
    • port type: DictPort
    • value type: dict (can be None)
  • initialize_once
    If set to True, the model is trained only once, on the first calibration data received. If set to False, the node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Initialize Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • dont_reset_model
    Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model is incompatible if the input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • backend
    Compute backend to use. This is only used by a few solvers (specifically the rlda method); selecting torch will use a multi-core implementation that can be faster than numpy, but is not necessarily more efficient in terms of compute per core.

    • verbose name: Backend
    • default value: numpy
    • port type: EnumPort
    • value type: str (can be None)
  • verbose
    Enable verbose output.

    • verbose name: Verbose
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
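
As referenced in the model_within description above, the dictionary-valued parameters of this node are written as ordinary Python literals. The following sketch merely restates the syntax documented above and is not tied to any particular pipeline API:

    # Per-stream model types: shrinkage LDA for EEG streams, and Huber
    # regression (outlier cutoff at 1.35 standard deviations) for any stream
    # type not otherwise listed ('~' is the catch-all key).
    model_within = {'eeg': 'discriminant', '~': 'huber(1.35)'}

    # When fusing multiple streams, the model form for the cross-stream
    # "pseudo-modality" is set via the dedicated '*' key (defaults to
    # logistic if omitted).
    model_across = {'eeg': 'discriminant', '*': 'logistic'}

    # Per-class prior probabilities; note the quoted numeric class labels.
    # The weights are renormalized so that they add up to one.
    class_weights = {'0': 0.5, '1': 0.5}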

InjectCalibrationData

Insert calibration data into the stream.

This node is useful when you have a pipeline that receives some data from a data source (e.g., an LSL Input node) and performs some processing on the data (e.g., making predictions using some machine learning nodes) -- and this pipeline requires calibration on some previously recorded calibration data. This node can be inserted between the data source and the first adaptive processing step in your pipeline, and what it allows you to do is as follows: the node has a second input port, which you can wire to some node(s) that import your calibration data (e.g., Import XDF). Then, the Inject Calibration Data node will first let through the calibration recording, which will then trickle down your subsequent processing pipeline, giving every node a chance to calibrate itself. After that, the Inject Calibration Data node will let through the regular streaming data from your actual data source, so that the pipeline can do its regular processing, now that it is calibrated. Note that, since on the first tick this node outputs the calibration data, the question arises as to what happens to any streaming data that also came into the Inject Calibration Data node on that same first tick -- you can choose to either emit it on the next tick, and thus delay this and all subsequent streaming packets by one tick (1/25th of a second at NeuroPype's default update rate), or you can choose to drop it. Tip: if you want to collect the calibration data on the fly at the beginning of a real-time session instead of using a previous recording, you can instead use the Accumulate Calibration Data node.
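
Schematically, the wiring described above looks as follows. This is a topology sketch with hypothetical constructor and connection helpers (this document does not specify the pipeline-construction API); only the port names streaming_data, calib_data, and data are taken from the port list below:

    # Hypothetical wiring sketch; the constructor and connect() helpers are
    # illustrative only, while the port names match this node's ports.
    source = LSLInput()                          # live data source
    calib = ImportXDF(filename='calib.xdf')      # previously recorded calibration data
    inject = InjectCalibrationData()
    model = LinearDiscriminantAnalysis()         # first adaptive processing step

    connect(source.data, inject.streaming_data)  # regular streaming input
    connect(calib.data, inject.calib_data)       # calibration recording
    connect(inject.data, model.data)             # calibration passes through first;
                                                 # streaming data follows once calibrated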

Version 1.0.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • streaming_data
    Streaming data.

    • verbose name: Streaming Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: IN
  • calib_data
    Calibration data.

    • verbose name: Calib Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: IN
  • data
    Output data.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: OUT
  • delay_streaming_packets
    Whether streaming packets should be delayed by one tick, or whether the first streaming packet should be dropped. If enabled, the first streaming packet will be buffered by this node, and emitted on the next tick (since on the first tick this node outputs the calibration data); all subsequent streaming packets will naturally also have to be delayed by one tick. If disabled, then the first streaming packet will be dropped, and there is no delay. For streaming processing, it is usually best to drop the packet, since incoming streaming data can generally not be acted on before the pipeline is finished calibrating. However, if the data on the streaming input port is actually a single packet holding a whole recording that shall be processed in an offline fashion, then it must be delayed by one tick, since dropping it would drop all the data.

    • verbose name: Delay Streaming Packets
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

KNNImputation

Impute missing data with an (optionally weighted) average of the k nearest neighbors (KNN).

This method is useful for filling in missing data (encoded by the presence of NaNs) in a multivariate manner. The method is stateful and will by default learn an imputation model on the first data that it is called with, and then apply this model on subsequent invocations.
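
As a behavioral reference, the core computation is analogous to scikit-learn's KNNImputer; the sketch below illustrates that analogue only and does not capture this node's axis-aware semantics or statefulness:

    # Behavioral sketch using scikit-learn; NaNs mark missing entries in a 2D
    # (samples x features) array. This node's defaults correspond to 5
    # neighbors with uniform weighting; weights='distance' selects the
    # inverse-distance scheme instead.
    import numpy as np
    from sklearn.impute import KNNImputer

    X = np.array([[1.0, 2.0, np.nan],
                  [3.0, 4.0, 3.0],
                  [np.nan, 6.0, 5.0],
                  [8.0, 8.0, 7.0]])

    imputer = KNNImputer(n_neighbors=2, weights='uniform')
    X_filled = imputer.fit_transform(X)  # each NaN replaced by a neighbor average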

More Info...

Version 0.8.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • domain_axes
    Axes across which information may be "moved" when imputing missing data. These represent the input domain across which k-NN imputation is applied. Multiple axes may be given by providing a comma-separated list here; see also the predefined drop-down options. The special value (all others) stands for all axes that are not listed in either of the other two listings.

    • verbose name: Impute Across Axes
    • default value: (all others)
    • port type: ComboPort
    • value type: str (can be None)
  • aggregate_axes
    Axes that have the statistical observations in them. The elements along these axes are treated as the trials or samples that provide redundant observations of the same underlying distribution. See also the previous setting. This is almost always the instance axis (especially if the data has already been segmented, i.e., if the Segmentation node was used), but in some cases it may also be the time axis, or occasionally other axes.

    • verbose name: Treat Elements As Trials/samples Along Axes
    • default value: instance
    • port type: ComboPort
    • value type: str (can be None)
  • separate_axes
    Axes along which to learn separate imputation models. This can be used if it is known that data at one element of this axis is independent from and shares no information with data at another element, so that imputation is best performed separately for each element along this axis. This can also be used if correlations are merely weak across this axis, so that separate models are more accurate.

    • verbose name: Compute Separate Models Along Axes
    • default value:
    • port type: ComboPort
    • value type: str (can be None)
  • num_neighbors
    Number of neighbors to use for imputation. This is the k in k-NN.

    • verbose name: Number Of Neighbors
    • default value: 5
    • port type: IntPort
    • value type: int (can be None)
  • weighting
    Weighting scheme for neighbors. The uniform scheme gives equal weight to all neighbors, while the distance scheme gives weight inversely proportional to the distance to the neighbor. Distance can be more accurate, but in high-dimensional models or with low numbers of neighbors, this can perform less well than the uniform scheme.

    • verbose name: Weighting
    • default value: uniform
    • port type: EnumPort
    • value type: str (can be None)
  • initialize_once
    Calibrate the model only once. If set to False, then this node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • dont_reset_model
    Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model is incompatible if the input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

LASSORegression

Estimate a continuous output value from features using LASSO Regression.

LASSO regression is a straightforward and principled statistical technique to learn a linear mapping between input data and desired output values from training data. LASSO differs from ridge regression in that LASSO will identify and select a small number of relevant features, whereas ridge regression will usually weight every feature to some extent, however small. As a result, LASSO regression can be used when there is a very large number of irrelevant features in the data, as long as the relevant features are relatively few. This is also called sparsity regularization. If there are very few trials, or some extensive stretches of the data exhibit only one class, the procedure used to find the optimal regularization strength (cross-validation) can fail with an error that there were too few or no trials of a given class present. LASSO regression assumes that both inputs and outputs are Gaussian distributed, that is, have no or very few major statistical outliers. If the output follows a radically different distribution, for instance between 0 and 1, or nonnegative, or discrete values, then different methods may be more appropriate (for instance, classification methods for discrete values). To ameliorate the issue of outliers in the data, the raw data can be cleaned of artifacts with various artifact removal methods. To the extent that the assumptions hold true, this method is highly competitive with other (sparse) linear methods. There are several other nodes that implement similar methods that have advantages in specific circumstances. Like all machine learning methods, this method needs to be calibrated ("trained") before it can make any predictions on data. For this, the method requires training instances and associated training labels. The typical way to get such labels associated with time-series data is to make sure that a marker stream is included in the data, which is usually imported together with the data using one of the Import nodes, or received over the network alongside the data, e.g., using the LSL Input node (with a non-empty marker query). These markers are then annotated with target labels using the Assign Targets node. To generate instances of training data for each of the training markers, one usually uses the Segmentation node to extract segments from the continuous time series around each marker. Since this machine learning method is not capable of being trained incrementally on streaming data, the method requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data node, where it passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, the trainable state of this node can be saved to a model file and later loaded for continued use.
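
For reference, LASSO finds a weight vector w minimizing ||y - Xw||^2 + alpha*||w||_1, where the l1 penalty drives the weights of irrelevant features to exactly zero. The training step is closely analogous to scikit-learn's LassoCV, sketched below under the assumed parameter correspondence num_alphas -> n_alphas, min_alpha -> eps, num_folds -> cv, and tolerance -> tol (a reference sketch, not this node's actual implementation):

    # Reference sketch: cross-validated LASSO on synthetic data in which only
    # 3 of 50 features are relevant, so the selected model should be sparse.
    import numpy as np
    from sklearn.linear_model import LassoCV

    rng = np.random.default_rng(12345)
    X = rng.standard_normal((200, 50))       # 200 trials, 50 features
    w = np.zeros(50)
    w[:3] = [1.5, -2.0, 1.0]                 # only the first 3 features matter
    y = X @ w + 0.1 * rng.standard_normal(200)

    # cv=5 without shuffling splits the trials into contiguous blocks,
    # matching the blockwise cross-validation described below.
    model = LassoCV(n_alphas=100, eps=1e-3, cv=5, max_iter=1000, tol=1e-4,
                    n_jobs=1).fit(X, y)
    print(model.alpha_)                      # selected regularization strength
    print(np.flatnonzero(model.coef_))       # indices of the retained features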

More Info...

Version 1.1.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • num_alphas
    Number of values in regularization search grid. This method determines the optimal regularization strength by testing a number of different strength values between the minimum and maximum value. The running time increases with a finer search grid, but the found solution may be slightly better regularized when using a fine grid (e.g., 100 values) instead of a coarse grid (e.g., 20 values).

    • verbose name: Number Of Values In Reg. Search Grid
    • default value: 100
    • port type: IntPort
    • value type: int (can be None)
  • num_folds
    Number of cross-validation folds for parameter search. Cross-validation proceeds by splitting up the data into this many blocks of trials, and then tests the method on each block. For each fold, the method is re-trained on all the other blocks, excluding the test block (therefore, the total running time is proportional to the number of folds). This is not a randomized cross-validation, but a blockwise cross-validation, which is usually the correct choice if the data stem from a time series. If there are few trials in the data, one can use a higher number here (e.g., 10) to ensure that more data is available for training.

    • verbose name: Number Of Cross-Validation Folds
    • default value: 5
    • port type: IntPort
    • value type: int (can be None)
  • max_iter
    Maximum number of iterations. This is one of the stopping criteria to limit the compute time. The default is usually fine, and gains from increasing the number of iterations will be minimal (it can be worth experimenting with lower iteration numbers if the algorithm must finish in a fixed time budget, at a cost of potentially less accurate solutions).

    • verbose name: Maximum Number Of Iterations
    • default value: 1000
    • port type: IntPort
    • value type: int (can be None)
  • tolerance
    Convergence tolerance. This is the desired error tolerance, or acceptable inaccuracy, in the solution. Using larger values gives less accurate results, but will lead to faster compute times. Note that, for biosignal-driven machine learning systems, one often does not need very small tolerances.

    • verbose name: Tolerance
    • default value: 0.0001
    • port type: FloatPort
    • value type: float (can be None)
  • num_jobs
    Number of parallel compute jobs. This value only affects the running time and not the results. Values between 1 and twice the number of CPU cores make sense to expedite computation, but may temporarily reduce the responsiveness of the machine. The value of -1 stands for all available CPU cores.

    • verbose name: Number Of Parallel Jobs
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • cv_group_field
    Optionally a field indicating the group from which each trial is sourced. If given, then data will be split such that test sets contain unseen groups. Example groups are SubjectID, SessionID, etc.

    • verbose name: Grouping Field (Cross-Validation)
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • cv_stratified
    Optionally perform stratified cross-validation. This means that all the folds have the same relative percentage of trials with each label. Note that this requires labels to be quantized or binned to be meaningful.

    • verbose name: Stratified Cross-Validation
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • verbosity
    Verbosity level. Higher numbers will produce more extensive diagnostic output.

    • verbose name: Verbosity Level
    • default value: 0
    • port type: IntPort
    • value type: int (can be None)
  • include_bias
    Include a bias term. If false, your features need to be centered, or include a dummy feature set to 1.

    • verbose name: Include Bias Term
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • normalize_features
    Normalize features. Should only be disabled if the data comes with a predictable scale (e.g., normalized in some other way).

    • verbose name: Normalize Features
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • min_alpha
    Minimum regularization strength. This is expressed as a factor of the maximum regularization strength, which is calculated from the data. By default, the optimal strength will be searched between this value and the maximum.

    • verbose name: Minimum Regularization Strength
    • default value: 0.001
    • port type: FloatPort
    • value type: float (can be None)
  • alphas
    Override regularization strength. Optionally, the default regularization search grid can be overridden by giving an explicit list of values here, although this is rarely necessary or very helpful. This is a list of candidate values, the best of which is found via an exhaustive search (i.e., each value is tested one by one, and therefore the total running time is proportional to the number of values listed here). The details of the parameter search can be controlled via the search metric and number of folds parameters. Larger values cause stronger regularization, that is, less risk of the method over-fitting to random details of the data, and thus better generalization to new data. A value of 0 means no regularization, and there is no upper limit to how large the values given here may be -- however, depending on the scale of the data and the number of trials, there is a cutoff beyond which all features receive zero weight and are thus unused. Often one covers a range between 0.1 and 10, and at times 0.01 to 100. Typically the values here are not linearly spaced, but follow an exponential progression (e.g., 0.25, 0.5, 1, 2, 4, 8, etc.). The default search range is intentionally coarse for quick running times; refine it to smaller steps to obtain potentially better solutions, but do not expect massive gains from refining.

    • verbose name: Override Regularization Strength
    • default value: None
    • port type: ListPort
    • value type: list (can be None)
  • selection
    Parameter update schedule. Using random here can be significantly faster when higher tolerance values are used.

    • verbose name: Parameter Update Schedule
    • default value: cyclic
    • port type: EnumPort
    • value type: str (can be None)
  • positivity_constraint
    Constrain weights to be positive. This is a special (and rarely-used) flavor of this method, in which the learned weights are constrained to be positive.

    • verbose name: Constrain Weights To Be Positive
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • precompute
    Precompute shared data that is reused during parameter search. Aside from 'auto', this can be set to True or False; auto attempts to determine the best choice automatically.

    • verbose name: Precompute Shared Data
    • default value: auto
    • port type: Port
    • value type: object (can be None)
  • random_seed
    Random seed. Different values may give slightly different outcomes.

    • verbose name: Random Seed
    • default value: 12345
    • port type: IntPort
    • value type: int (can be None)
  • initialize_once
    Calibrate the model only once. If set to False, then this node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • dont_reset_model
    Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model is incompatible if the input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

LarsRegression

Estimate a continuous output value from features using Least-Angle (LARS) Regression.

LARS regression is a straightforward and principled statistical technique to learn a linear mapping between input data and desired output values from training data. LARS is very similar to LASSO, but tends to be faster when the number of features is much higher than the number of data points. Like LASSO, it can be used when there is a very large number of irrelevant features in the data, as long as the relevant features are relatively few. This is also called sparsity regularization. If there are very few trials, or some extensive stretches of the data exhibit only one class, the procedure used to find the optimal regularization strength (cross-validation) can fail with an error that there were too few or no trials of a given class present. LARS regression assumes that both inputs and outputs are Gaussian distributed, that is, have no or very few major statistical outliers. If the output follows a radically different distribution, for instance between 0 and 1, or nonnegative, or discrete values, then different methods may be more appropriate (for instance, classification methods for discrete values). To ameliorate the issue of outliers in the data, the raw data can be cleaned of artifacts with various artifact removal methods. To the extent that the assumptions hold true, this method is highly competitive with other (sparse) linear methods. There are several other nodes that implement similar methods that have advantages in specific circumstances. Like all machine learning methods, this method needs to be calibrated ("trained") before it can make any predictions on data. For this, the method requires training instances and associated training labels. The typical way to get such labels associated with time-series data is to make sure that a marker stream is included in the data, which is usually imported together with the data using one of the Import nodes, or received over the network alongside the data, e.g., using the LSL Input node (with a non-empty marker query). These markers are then annotated with target labels using the Assign Targets node. To generate instances of training data for each of the training markers, one usually uses the Segmentation node to extract segments from the continuous time series around each marker. Since this machine learning method is not capable of being trained incrementally on streaming data, the method requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data node, where it passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, the trainable state of this node can be saved to a model file and later loaded for continued use.
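
A reference sketch using scikit-learn's LarsCV, under the same assumed parameter correspondences as in the LASSO node above (num_folds -> cv, max_iter -> max_iter), in the regime where LARS shines, namely many more features than data points:

    # Reference sketch (not this node's actual implementation): LARS with
    # cross-validation where features greatly outnumber trials.
    import numpy as np
    from sklearn.linear_model import LarsCV

    rng = np.random.default_rng(12345)
    X = rng.standard_normal((50, 500))       # 50 trials, 500 features
    w = np.zeros(500)
    w[:3] = [1.5, -2.0, 1.0]                 # only the first 3 features matter
    y = X @ w + 0.1 * rng.standard_normal(50)

    model = LarsCV(cv=5, max_iter=500).fit(X, y)
    print(np.flatnonzero(model.coef_))       # small set of selected features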

More Info...

Version 1.1.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • num_folds
    Number of cross-validation folds for parameter search, or instance variable to split by (LOO). Cross-validation proceeds by splitting up the data into this many blocks of trials, and then tests the method on each block. For each fold, the method is re-trained on all the other blocks, excluding the test block (therefore, the total running time is proportional to the number of folds). This is not a randomized cross-validation, but a blockwise cross-validation, which is usually the correct choice if the data stem from a time series. If there are few trials in the data, one can use a higher number here (e.g., 10) to ensure that more data is available for training. Setting this to one will yield leave-one-out CV, and if a group field was specified, leave-one-group-out CV. A now-deprecated use was to instead pass the name of the group field directly into num_folds to achieve the same.

    • verbose name: Number Of Cross-Validation Folds
    • default value: 5
    • port type: Port
    • value type: object (can be None)
  • cv_group_field
    Optionally a field indicating the group from which each trial is sourced. If given, then data will be split such that test sets contain unseen groups. Example groups are SubjectID, SessionID, etc.

    • verbose name: Grouping Field (Cross-Validation)
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • cv_stratified
    Optionally perform stratified cross-validation. This means that all the folds have the same relative percentage of trials with each label. Note that this requires labels to be quantized or binned to be meaningful.

    • verbose name: Stratified Cross-Validation
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • num_alphas
    Number of values in regularization search grid. This method determines the optimal regularization strength by testing a number of different strength values between the minimum and maximum value. The running time increases with a finer search grid, but the found solution may be slightly better regularized when using a fine grid (e.g., 100 values) instead of a coarse grid (e.g., 20 values).

    • verbose name: Number Of Values In Reg. Search Grid
    • default value: 100
    • port type: IntPort
    • value type: int (can be None)
  • max_iter
    Maximum number of iterations. This is one of the stopping criteria to limit the compute time. The default is usually fine, and gains from increasing the number of iterations will be minimal (it can be worth experimenting with lower iteration numbers if the algorithm must finish in a fixed time budget, at a cost of potentially less accurate solutions).

    • verbose name: Maximum Number Of Iterations
    • default value: 500
    • port type: IntPort
    • value type: int (can be None)
  • num_jobs
    Number of parallel compute jobs. This value only affects the running time and not the results. Values between 1 and twice the number of CPU cores make sense to expedite computation, but may temporarily reduce the responsiveness of the machine. The value of -1 stands for all available CPU cores.

    • verbose name: Number Of Parallel Jobs
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • verbosity
    Verbosity level. Higher numbers will produce more extensive diagnostic output.

    • verbose name: Verbosity Level
    • default value: 0
    • port type: IntPort
    • value type: int (can be None)
  • normalize_features
    Normalize features. Should only be disabled if the data comes with a predictable scale (e.g., normalized in some other way).

    • verbose name: Normalize Features
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • include_bias
    Include a bias term. If false, your features need to be centered, or include a dummy feature set to 1.

    • verbose name: Include Bias Term
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • precompute
    Precompute shared data that is reused during parameter search. Aside from 'auto', this can be set to True or False; auto attempts to determine the best choice automatically.

    • verbose name: Precompute Shared Data
    • default value: auto
    • port type: Port
    • value type: object (can be None)
  • epsilon
    Degeneracy regularization. This parameter can be used to ensure that the underlying computation does not fail with singular results. Can be increased in cases where the number of features is very high compared to the number of observations.

    • verbose name: Epsilon
    • default value: 2.220446049250313e-16
    • port type: FloatPort
    • value type: float (can be None)
  • initialize_once
    Calibrate the model only once. If set to False, then this node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • dont_reset_model
    Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model is incompatible if the input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

LearningMethod

Represents a learning method implemented by a graph.

This is a somewhat ancient node that is not used much at this time; it is kept for backward compatibility and will eventually be superseded by an import system. This node can be used to instantiate a sub-graph that implements a learning method inside another graph. The sub-graph must have an input and an output port node named 'data'.

Version 1.0.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • filename
    Name of the patch file to load.

    • verbose name: Filename
    • default value: None
    • port type: StringPort
    • value type: str (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

LinearDiscriminantAnalysis

Classify data instances using Linear Discriminant Analysis (LDA).

The LDA method is a fast statistical method that learns a linear mapping from input data to discrete category labels. LDA assumes that the data are Gaussian-distributed, that is, have no or very few major statistical outliers. To the extent that these assumptions hold true, this method is highly competitive with other linear (or generalized linear) methods. To ameliorate the outlier issue, the raw data can be cleaned of artifacts with various artifact removal methods. This implementation uses shrinkage regularization by default, which allows it to handle large numbers of features quite gracefully, compared to the unregularized variant, which will overfit or fail on more than a few dozen features. This method can be implemented using a number of different numerical approaches which have different tradeoffs -- it is worth experimenting with different choices of the solver parameter if you cannot get the result that you seek. By default, this method will return not the most likely class label for each trial it is given, but instead, for each class (i.e., category of labels), the probability that the trial is actually of that class. This method can also optionally use a regularization parameter that is tuned using an internal cross-validation on the data. If there are very few trials, or some extensive stretches of the data exhibit only one class, this cross-validation can fail with an error that there were too few or no trials of a given class present. Like all machine learning methods, this method needs to be calibrated ("trained") before it can make any predictions on data. For this, the method requires training instances and associated training labels. The typical way to get such labels associated with time-series data is to make sure that a marker stream is included in the data, which is usually imported together with the data using one of the Import nodes, or received over the network alongside the data, e.g., using the LSL Input node (with a non-empty marker query). These markers are then annotated with target labels using the Assign Targets node. To generate instances of training data for each of the training markers, one usually uses the Segmentation node to extract segments from the continuous time series around each marker. Since this machine learning method is not capable of being trained incrementally on streaming data, the method requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data node, where it passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, the trainable state of this node can be saved to a model file and later loaded for continued use.
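
As a reference point, the default configuration (eigen solver, shrinkage regularization, probabilistic outputs) behaves much like scikit-learn's LinearDiscriminantAnalysis, sketched below (there, too, the svd solver does not support shrinkage):

    # Reference sketch: shrinkage LDA with per-class probability outputs.
    # shrinkage='auto' corresponds to the fast Ledoit-Wolf estimate; this node
    # would instead search a list of values between 0 and 1 if one is given.
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(12345)
    X = np.vstack([rng.standard_normal((100, 20)) + 0.5,   # class 0 trials
                   rng.standard_normal((100, 20)) - 0.5])  # class 1 trials
    y = np.array([0] * 100 + [1] * 100)

    lda = LinearDiscriminantAnalysis(solver='eigen', shrinkage='auto').fit(X, y)
    print(lda.predict_proba(X[:3]))  # per-class probabilities (probabilistic=True)
    print(lda.predict(X[:3]))        # most likely class labels otherwise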

More Info...

Version 1.1.1

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • class_weights
    Per-class weights. Optionally this is a mapping from class label to weight. The weights represent the a priori ("prior") probability of encountering a specific class that the model shall assume. The weights will be renormalized so that they add up to one. Example syntax: {'0': 0.5, '1': 0.5} (note the quotes before the colons).

    • verbose name: Per-Class Weight
    • default value: None
    • port type: Port
    • value type: object (can be None)
  • cond_field
    The name of the instance data field that contains the conditions to be discriminated. This parameter will be ignored if the packet has previously been processed by a DescribeStatisticalDesign node.

    • verbose name: Cond Field
    • default value: TargetValue
    • port type: StringPort
    • value type: str (can be None)
  • shrinkage
    Shrinkage regularization strength. If using 'auto', then a fast automatic method is used to determine the regularization parameter (using the Ledoit-Wolf method). However, if given as a list of numbers between 0 and 1 (formatted as in, e.g., [0.1, 0.2, 0.3]) where 0 is no regularization and 1 is maximal regularization, then the best parameter is searched, which can be slow. The details of the parameter search can be controlled via the search metric and number of folds parameters.

    • verbose name: Shrinkage Level
    • default value: auto
    • port type: Port
    • value type: object (can be None)
  • feature_selection
    Feature selection criterion to use if sparse solutions are desired. If set to None, no feature selection is enabled. If set to rfe, recursive feature elimination is used. If set to anova, features with the highest ANOVA f-score are selected. In either case, the number of features is automatically tuned using a cross-validation, and can be capped using max_feature_select.

    • verbose name: Feature Selection
    • default value: none
    • port type: EnumPort
    • value type: str (can be None)
  • max_feature_select
    Maximum number of features to retain, if feature selection is used. Using a value that is less than the number of features can be much faster than searching over all possible counts.

    • verbose name: Max Number Of Features
    • default value: None
    • port type: IntPort
    • value type: int (can be None)
  • feature_sel_group_size
    Number of successive features that form a group, for grouped feature selection.

    • verbose name: Group Size (Feature Sel.)
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • feature_sel_group_op
    Operation to use to calculate groupwise feature importance.

    • verbose name: Group Measure (Feature Sel.)
    • default value: max
    • port type: EnumPort
    • value type: str (can be None)
  • search_metric
    Parameter search metric to optimize certain hyper-parameters of the node. When certain parameters (e.g., shrinkage or robust_gamma) are used and given as a list of values, then the method will run a cross-validation for each possible parameter value and use this metric to score how well the method performs in each case, in order to select the best parameter. Therefore, when a search is done the running time of the method is multiplied by the number of parameter values and the number of folds in the cross-validation, which can be slow. While 'accuracy' is usually a good default, some other metrics can be useful under special circumstances, e.g., roc_auc for highly imbalanced ratios of trials from different classes.

    • verbose name: Scoring Metric
    • default value: accuracy
    • port type: EnumPort
    • value type: str (can be None)
  • num_folds
    Number of cross-validation folds for parameter search. Cross-validation proceeds by splitting up the data into this many blocks of trials, and then tests the method on each block. For each fold, the method is re-trained on all the other blocks, excluding the test block. This is not a randomized cross-validation, but a blockwise cross-validation, which is usually the correct choice if the data stem from a time series. If there are few trials in the data, one can use a higher number here (e.g., 10) to ensure that more data is available for training.

    • verbose name: Number Of Cross-Validation Folds
    • default value: 5
    • port type: IntPort
    • value type: int (can be None)
  • cv_group_field
    Optionally a field indicating the group from which each trial is sourced. If given, then data will be split such that test sets contain unseen groups. Example groups are Subject_id, Session_id, etc.

    • verbose name: Grouping Field (Cross-Validation)
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • cv_stratified
    Optionally perform stratified cross-validation. This means that all the folds have the same relative percentage of trials with each label. Note that this requires labels to be quantized or binned to be meaningful.

    • verbose name: Stratified Cross-Validation
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • num_jobs
    Number of parallel compute jobs. This is only in effect when a parameter search is applied (e.g., in robust mode, or when passing shrinkage as a list). This value only affects the running time and not the results. Values between 1 and twice the number of CPU cores make sense to expedite computation, but may temporarily reduce the responsiveness of the machine. The value of -1 stands for all available CPU cores.

    • verbose name: Number Of Parallel Jobs
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • solver
    Solver to use. This node supports formulations based on a least-squares solution (lsqr), eigenvalue decomposition (eigen), and singular-value decomposition (svd). Some of these methods are known to have numerical issues under various circumstances -- consider trying different settings if you cannot achieve the desired results. Note: the svd method can handle many features, but does not support shrinkage-type regularization.

    • verbose name: Solver To Use
    • default value: eigen
    • port type: EnumPort
    • value type: str (can be None)
  • tolerance
    Threshold for rank estimation in SVD. Using a larger value will more aggressively prune features, but can make the difference between it working or not.

    • verbose name: Rank Estimation Threshold (Svd Solver)
    • default value: 0.0001
    • port type: FloatPort
    • value type: float (can be None)
  • robust_method
    Type of robust estimator to use, if any. This applies to the covariance matrix only at this point. The MCD method is a slow but accurate estimator using the minimum covariance determinant. The SGD method is a fast estimator using stochastic gradient descent, but requires some parameter tuning (automatic tuning is slow, but a single value can usually be determined for a given study and then used going forward). The SAG method uses stochastic average gradient and tends to be more resilient to bad parameter settings, but is otherwise similar to SGD. Both SGD and SAG are configured via the huber_threshold and robust_gamma parameters. Thus one may use MCD first to determine if robustness is helpful on the given data, and if fast runtime is needed, one can switch to SAG or SGD and tune the parameters until a similar performance is achieved. Note that, if shrinkage is set to 'auto', the shrinkage amount will generally be estimated in a non-robust fashion, which may cause some amount of over- or under-shrinkage.

    • verbose name: Robust Method
    • default value: none
    • port type: EnumPort
    • value type: str (can be None)
  • huber_threshold
    Huber threshold; applies only when using the SAG/SGD robust estimators. If set to 0, the geometric (or l1) median is used (which is maximally robust but has low statistical efficiency). If set to 'auto' or the empty list, a value will be searched. And if given as a list of values, the list will be used as the search range.

    • verbose name: Huber Threshold (Sag/sgd Method)
    • default value: 0
    • port type: Port
    • value type: object (can be None)
  • robust_gamma
    Search range for the gamma parameter when using the SAG/SGD robust estimators. Note that typically a much smaller range, or even a single value (which requires no built-in grid search), will do, but the value needs to be empirically determined.

    • verbose name: Learning Rate (Sag/sgd Method)
    • default value: [1, 1.25, 1.5, 2, 2.5, 5, 10, 20, 40, 80, 160]
    • port type: ListPort
    • value type: list (can be None)
  • robust_max_contamination
    Maximum contaminated data fraction for MCD robust estimator. This is a tradeoff between resistance to larger proportions of bad data (the maximum is around 0.5, and can be used by setting the parameter to its default None value), and higher statistical efficiency since more data is used for the estimate.

    • verbose name: Max Contamination (Mcd Method)
    • default value: None
    • port type: FloatPort
    • value type: float (can be None)
  • initialize_once
    Calibrate the model only once. If set to False, then this node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • dont_reset_model
    Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model is incompatible if the input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • dimensionality_reduction
    Reduce dimensionality of data to the given number of dimensions. If given, this overrides the regular outputs of LDA (governed by the probabilistic flag), and causes it to produce data of the desired number of dimensions.

    • verbose name: Dimensionality Reduction
    • default value: None
    • port type: IntPort
    • value type: int (can be None)
  • probabilistic
    Use probabilistic outputs. If enabled, the node will output for each class the probability that a given trial is of that class; otherwise it will output the most likely class label.

    • verbose name: Output Probabilities
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • verbosity
    Verbosity level. Higher numbers will produce more extensive diagnostic output.

    • verbose name: Verbosity Level
    • default value: 0
    • port type: IntPort
    • value type: int (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

LinearSupportVectorClassification

Use linear support vector machines to classify data instances.

This is the linear version of support vector classification. As such, it will not be able to exploit non-linear structure, but on data that has little to no dominant non-linear features, it will perform just as well as non-linear ("kernel") SVMs, and do so in a computationally more efficient way. Linear SVMs are quite similar to logistic regression, and can be more robust in the presence of outliers (when using the default hinge loss), though on the other hand the probability estimates produced by SVMs are not as theoretically straightforward and well-motivated as those of logistic regression. This implementation uses regularization by default, which allows it to handle large numbers of features very well. Importantly, there are two types of regularization that one can choose from: the default l2 regularization is a fine choice for most data (it is closely related to shrinkage in LDA or a Gaussian prior in Bayesian methods). The alternative l1 regularization is unique in that it can learn to identify a sparse subset of features that is relevant while pruning out all other features as irrelevant. This sparsity regularization is statistically very efficient and can deal with an extremely large number of irrelevant features. To determine the optimal regularization strength, a list of candidate parameter settings can be given, which is then searched by the method using an internal cross-validation on the data to find the best value. If there are very few trials, or some extensive stretches of the data exhibit only one class, this cross-validation can fail with an error that there were too few or no trials of a given class present. Also, the default search grid for regularization (i.e., the list of candidate values) is deliberately rather coarse to keep the method fast. For higher-quality results, use a finer-grained list of values (which will be correspondingly slower). This method can be implemented using a number of different numerical approaches which have different running times depending on the number of data points and features. If you are re-solving the problem a lot, it can make sense to try out the various solvers to find the fastest one. Like all machine learning methods, this method needs to be calibrated ("trained") before it can make any predictions on data. For this, the method requires training instances and associated training labels. The typical way to get such labels associated with time-series data is to make sure that a marker stream is included in the data, which is usually imported together with the data using one of the Import nodes, or received over the network alongside the data, e.g., using the LSL Input node (with a non-empty marker query). These markers are then annotated with target labels using the Assign Targets node. To generate instances of training data for each of the training markers, one usually uses the Segmentation node to extract segments from the continuous time series around each marker.
Since this machine learning method is not capable of being trained incrementally on streaming data, the method requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data node, where it passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, the trainable state of this node can be saved to a model file and later loaded for continued use.
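
The cost search described above amounts to an exhaustive grid search with blockwise cross-validation, roughly as in the following scikit-learn sketch (a reference sketch only, not this node's implementation; note that scikit-learn pairs l1 regularization with the squared hinge loss):

    # Reference sketch: linear SVM with l2 regularization and an exhaustive,
    # blockwise cross-validated search over the coarse default cost grid
    # documented below.
    import numpy as np
    from sklearn.model_selection import GridSearchCV, KFold
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(12345)
    X = np.vstack([rng.standard_normal((100, 20)) + 0.5,
                   rng.standard_normal((100, 20)) - 0.5])
    y = np.array([0] * 100 + [1] * 100)

    search = GridSearchCV(
        LinearSVC(penalty='l2', loss='hinge'),
        param_grid={'C': [0.01, 0.1, 1.0, 10.0, 100]},
        scoring='accuracy',
        cv=KFold(n_splits=5, shuffle=False))   # contiguous blocks, not randomized
    search.fit(X, y)
    print(search.best_params_)                 # selected cost value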

More Info...

Version 1.0.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • class_weights
    Per-class weights. Optionally this is a mapping from class label to weight. The weights represent the a priori ("prior") probability of encountering a specific class that the model shall assume. The weights will be renormalized so that they add up to one. Example syntax: {'0': 0.5, '1': 0.5} (note the quotes before the colons).

    • verbose name: Per-Class Weight
    • default value: None
    • port type: Port
    • value type: object (can be None)
  • loss
    Loss function to use. This selects the data term, i.e., what assumptions are being imposed on the data. l1 is the hinge loss (standard SVM), which has a certain level of robustness to outliers. l2 is the squared hinge loss which is like hinge but is quadratically penalized and therefore in some sense more regression-like.

    • verbose name: Data Term
    • default value: l1
    • port type: EnumPort
    • value type: str (can be None)
  • regularizer
    Regularization type. The default l2 regularization is a good choice for most data. The alternative l1 regularization results in a type of "feature selection", that is, only a sparse set of features will have a non-zero weight, and all other features will remain effectively unused. This sparsity regularization is very useful if only a small number of features are relevant. It is nevertheless a good idea to compare to the performance of using l2 regularization, since excessive sparsity can sometimes degrade performance.

    • verbose name: Regularization Type
    • default value: l2
    • port type: EnumPort
    • value type: str (can be None)
  • include_bias
    Include a bias term. If false, your features need to be centered, or include a dummy feature set to 1.

    • verbose name: Include Bias Term
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • cost
SVM cost parameter. This value determines the degree to which solutions are penalized that would mis-classify data points. Higher values result in models that are less likely to mis-classify the training data, but at the expense of potentially worse generalization to new data (less margin for error when slightly different trials are encountered in future test data). This is a list of candidate values, the best of which is found via an exhaustive search (i.e., each value is tested one by one, and therefore the total running time is proportional to the number of values listed here). The details of the parameter search can be controlled via the search metric and number of folds parameters. Smaller values cause stronger regularization, that is, less risk of the method over-fitting to random details of the data, and thus potentially better generalization to new data. A very small value means effectively no penalty on training errors, and there is no upper limit on the values that can be given here -- the useful range depends on the scale of the data and the number of trials. Often one covers a range between 0.1 and 10, and at times 0.01 to 100. Typically the values here are not linearly spaced, but follow an exponential progression (e.g., 0.25, 0.5, 1, 2, 4, 8, etc.). The default search range is intentionally coarse for quick running times; refine it to smaller steps to obtain potentially better solutions, but do not expect massive gains from refining.

    • verbose name: Cost
    • default value: [0.01, 0.1, 1.0, 10.0, 100]
    • port type: Port
    • value type: list (can be None)
  • search_metric
    Parameter search metric. When the regularization parameter is given as a list of values, then the method will run a cross-validation for each possible parameter value and use this metric to score how well the method performs in each case, in order to select the best parameter. While 'accuracy' is usually a good default, some other metrics can be useful under special circumstances, e.g., roc_auc for highly imbalanced ratios of trials from different classes.

    • verbose name: Scoring Metric
    • default value: accuracy
    • port type: EnumPort
    • value type: str (can be None)
  • num_folds
Number of cross-validation folds for parameter search. Cross-validation proceeds by splitting up the data into this many blocks of trials, and then tests the method on each block. For each fold, the method is re-trained on all the other blocks, excluding the test block (therefore, the total running time is proportional to the number of folds). This is not a randomized cross-validation, but a blockwise cross-validation, which is usually the correct choice if the data stem from a time series (see the sketch after this node's property list for how such contiguous blocks can be formed). If there are few trials in the data, one can use a higher number here (e.g., 10) to ensure that more data is available for training.

    • verbose name: Number Of Cross-Validation Folds
    • default value: 5
    • port type: IntPort
    • value type: int (can be None)
  • cv_group_field
Optionally a field indicating the group from which each trial is sourced. If given, then data will be split such that test sets contain unseen groups. Example groups are SubjectID, SessionID, etc.

    • verbose name: Grouping Field (Cross-Validation)
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • cv_stratified
    Optionally perform stratified cross-validation. This means that all the folds have the same relative percentage of trials in each class.

    • verbose name: Stratified Cross-Validation
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • max_iter
    Maximum number of iterations. This is one of the stopping criteria to limit the compute time. The default is usually fine, and gains from increasing the number of iterations will be minimal (it can be worth experimenting with lower iteration numbers if the algorithm must finish in a fixed time budget, at a cost of potentially less accurate solutions).

    • verbose name: Maximum Number Of Iterations
    • default value: 1000
    • port type: IntPort
    • value type: int (can be None)
  • tolerance
Convergence tolerance. This is the desired error tolerance, i.e., the acceptable inaccuracy in the solution. Using larger values gives less accurate results, but will lead to faster compute times. Note that, for biosignal-driven machine learning systems, one often does not need very small tolerances.

    • verbose name: Tolerance
    • default value: 0.0001
    • port type: FloatPort
    • value type: float (can be None)
  • dual_formulation
Use dual formulation. This is an alternative way to solve the problem. If enabled, it can be faster when the number of trials is smaller than the number of features, but it is not supported with l1 regularization.

    • verbose name: Use Alternative Dual Formulation
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • initialize_once
    Calibrate the model only once. If set to False, then this node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • dont_reset_model
Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model is incompatible when the input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • bias_scaling
    Scale for bias term. Since this implementation applies the regularization to the bias term too (which is usually not ideal, although rarely a significant issue), you can use this scale to counter the effect.

    • verbose name: Bias Scaling
    • default value: 1.0
    • port type: FloatPort
    • value type: float (can be None)
  • random_seed
Random seed. Different values may give slightly different outcomes. It is not recommended to ever touch this, since manipulating the random seed generally does not result in robust improvements.

    • verbose name: Random Seed
    • default value: 12345
    • port type: IntPort
    • value type: int (can be None)
  • verbosity
    Verbosity level. Higher numbers will produce more extensive diagnostic output.

    • verbose name: Verbosity Level
    • default value: 0
    • port type: IntPort
    • value type: int (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
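
As referenced under num_folds above, the cross-validation used for the parameter search is blockwise rather than randomized. A minimal sketch of how such contiguous folds can be formed (illustrative only; the node's internal splitting logic may differ):

    import numpy as np

    def blockwise_folds(num_trials, num_folds):
        """Yield (train_idx, test_idx) pairs over contiguous trial blocks."""
        blocks = np.array_split(np.arange(num_trials), num_folds)
        for k, test_block in enumerate(blocks):
            train_idx = np.concatenate([b for i, b in enumerate(blocks) if i != k])
            yield train_idx, test_block

    for train_idx, test_idx in blockwise_folds(20, 5):
        print(test_idx)  # each test set is one contiguous run of trials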

LinearToProbabilities

Convert linear predictions to (two-class) pseudo-probabilities.

This node can be chained after e.g., ApplyLinearTransform if that node is meant to implement a linear classifier. The resulting data format is then the same as generated by the other classification nodes. Note that this node assumes that your linear scores are signed, i.e., negative scores encode that the first or negative class ("class 0") is more probable and positive scores encode that the second or positive class ("class 1") is more probable, and that they are in a somewhat reasonable range (e.g., the two class means mapping to -1 and +1). The resulting scores will be in a 0-1 range and somewhat behave like probabilities, but be aware that they will not be properly calibrated and thus this procedure should be viewed as a cheap "trick" to get probabilities. For an accurate transformation, use the Probability Calibration node instead.
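
As a sketch of the kind of mapping described above (assuming the usual logistic squashing; the exact function used by this node may differ), signed linear scores can be converted to two-class pseudo-probabilities as follows:

    import numpy as np

    def scores_to_pseudo_probabilities(s):
        """Map signed scores to a (class 0, class 1) pair in [0, 1]."""
        p1 = 1.0 / (1.0 + np.exp(-np.asarray(s)))   # logistic squashing
        return np.stack([1.0 - p1, p1], axis=-1)

    print(scores_to_pseudo_probabilities([-1.0, 0.0, 1.0]))
    # a score of 0 maps to (0.5, 0.5); note these values are not calibrated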

More Info...

Version 0.8.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • replace_axis
    Replace prior feature axis (if applicable). This will check if the data contains a prior one-element (i.e., dummy) feature axis and replace it with the new two-class axis. This can be necessary for downstream nodes to recognize the output as well-formed two-class probabilistic predictions.

    • verbose name: Replace Prior Axis
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

LogisticRegression

Classify data instances using regularized Logistic Regression.

The logistic regression method is a versatile and principled statistical method that learns a generalized linear mapping from input data to the probability of that data belonging to one of several possible classes. This method is highly competitive with other linear or generalized linear methods, such as LDA and SVM. Logistic regression makes relatively weak assumptions on the distribution of the data, so that it is moderately tolerant to outliers, especially when compared to relatively brittle methods like LDA. Logistic regression is not considered to be quite as robust as SVM (therefore it can make sense to remove artifacts beforehand using the appropriate nodes if the data are very noisy) -- however, the main benefit of logistic regression over SVM is that the outputs can be straightforwardly interpreted as probabilities without having to resort to 'tricks' such as probability calibration. This implementation uses regularization by default, which allows it to handle large numbers of features very well. Importantly, there are two types of regularization that one can choose from: the default l2 regularization is a fine choice for most data (it is closely related to shrinkage in LDA or a Gaussian prior in Bayesian methods). The alternative l1 regularization is unique in that it can learn to identify a sparse subset of features that is relevant while pruning out all other features as irrelevant. This sparsity regularization is statistically very efficient and can deal with an extremely large number of irrelevant features. To determine the optimal regularization strength, a list of candidate parameter settings can be given, which is then searched by the method using an internal cross-validation on the data to find the best value. If there are very few trials, or some extensive stretches of the data exhibit only one class, this cross-validation can fail with an error that there were too few or no trials of a given class present. Also, the default search grid for regularization (i.e., the list of candidate values) is deliberately rather coarse to keep the method fast. For higher-quality results, use a finer-grained list of values (which will be correspondingly slower). There are also several other implementations of logistic regression with different regularization terms or different performance characteristics in other nodes. This method can be implemented using a number of different numerical approaches which have different running times depending on the number of data points and features. If you are re-solving the problem a lot, it can make sense to try out the various solvers to find the fastest one. By default, this method will return not the most likely class label for each trial it is given, but instead the probabilities for each class (i.e., category of labels) that the trial is actually of that class. Like all machine learning methods, this method needs to be calibrated ("trained") before it can make any predictions on data. For this, the method requires training instances and associated training labels. The typical way to get such labels associated with time-series data is to make sure that a marker stream is included in the data, which is usually imported together with the data using one of the Import nodes, or received over the network alongside the data, e.g., using the LSL Input node (with a non-empty marker query). These markers are then annotated with target labels using the Assign Targets node.
To generate instances of training data for each of the training markers, one usually uses the Segmentation node to extract segments from the continuous time series around each marker. Since this machine learning method is not capable of being trained incrementally on streaming data, the method requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data node; it then passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, its trainable state can be saved to a model file and later loaded for continued use.
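
The following minimal sketch illustrates the overall computation using scikit-learn's LogisticRegressionCV (an assumption for illustration; the node's actual backend and parameter handling may differ). Note that scikit-learn's C is an inverse regularization strength, so the node's alphas grid is inverted here:

    # Illustrative sketch only -- assumes a scikit-learn-style backend.
    import numpy as np
    from sklearn.linear_model import LogisticRegressionCV
    from sklearn.model_selection import KFold

    X = np.random.randn(200, 32)
    y = np.random.randint(0, 2, size=200)

    alphas = [0.1, 0.5, 1.0, 5.0, 10.0]       # the node's default search grid
    clf = LogisticRegressionCV(Cs=[1.0 / a for a in alphas], penalty='l2',
                               scoring='accuracy', max_iter=100,
                               cv=KFold(n_splits=5, shuffle=False))
    clf.fit(X, y)
    proba = clf.predict_proba(X)   # per-class probabilities (default output)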

More Info...

Version 1.1.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • feature_scaling
Feature scaling to use. If set to auto, then scaling will default to robust if the sag or saga solver is used, and otherwise to std. If features are not already standardized beforehand, enabling this tends to ensure that the explored range of the alphas parameter is well matched to the data, and it will also ensure that features are treated equally by the regularization instead of the penalty being scale-dependent. The -scale variants only scale the data and do not also shift (center) it.

    • verbose name: Feature Scaling
    • default value: none
    • port type: EnumPort
    • value type: str (can be None)
  • class_weights
    Per-class weights. Optionally this is a mapping from class label to weight. The weights represent the a priori ("prior") probability of encountering a specific class that the model shall assume. The weights will be renormalized so that they add up to one. Example syntax: {'0': 0.5, '1': 0.5} (note the quotes before the colons).

    • verbose name: Per-Class Weight
    • default value: None
    • port type: Port
    • value type: object (can be None)
  • cond_field
    The name of the instance data field that contains the conditions to be discriminated. This parameter will be ignored if the packet has previously been processed by a DescribeStatisticalDesign node.

    • verbose name: Cond Field
    • default value: TargetValue
    • port type: StringPort
    • value type: str (can be None)
  • regularizer
    Regularization type. The default l2 regularization is a good choice for most data. The alternative l1 regularization results in a type of "feature selection", that is, only a sparse set of features will have a non-zero weight, and all other features will remain effectively unused. This sparsity regularization is very useful if only a small number of features are relevant. It is nevertheless a good idea to compare to the performance of using l2 regularization, since excessive sparsity can sometimes degrade performance.

    • verbose name: Regularization Type
    • default value: l2
    • port type: EnumPort
    • value type: str (can be None)
  • alphas
Regularization strength. This is a list of candidate values, the best of which is found via an exhaustive search (i.e., each value is tested one by one, and therefore the total running time is proportional to the number of values listed here). The details of the parameter search can be controlled via the search metric and number of folds parameters. Larger values cause stronger regularization, that is, less risk of the method over-fitting to random details of the data, and thus better generalization to new data. A value of 0 means no regularization, and there is no upper limit on the values that can be given here -- however, depending on the scale of the data and the number of trials, there is a cutoff beyond which all features are weighted zero, and are thus unused. Often one covers a range between 0.1 and 10, and at times 0.01 to 100. Typically the values here are not linearly spaced, but follow an exponential progression (e.g., 0.25, 0.5, 1, 2, 4, 8, etc.). The default search range is intentionally coarse for quick running times; refine it to smaller steps to obtain potentially better solutions, but do not expect massive gains from refining.

    • verbose name: Regularization Strength
    • default value: [0.1, 0.5, 1.0, 5, 10.0]
    • port type: ListPort
    • value type: list (can be None)
  • l1_ratios
Elastic-net mixing ratio. 0 is equivalent to l2, and 1 is equivalent to l1. In-between values yield combinations of the two. Defaults to [1/4, 2/4, 3/4].

    • verbose name: L1/l2 Ratio (Elasticnet Regularizer)
    • default value: None
    • port type: ListPort
    • value type: list (can be None)
  • search_metric
    Parameter search metric. When the regularization parameter is given as a list of values, then the method will run a cross-validation for each possible parameter value and use this metric to score how well the method performs in each case, in order to select the best parameter. While 'accuracy' is usually a good default, some other metrics can be useful under special circumstances, e.g., roc_auc for highly imbalanced ratios of trials from different classes.

    • verbose name: Scoring Metric
    • default value: accuracy
    • port type: EnumPort
    • value type: str (can be None)
  • num_folds
    Number of cross-validation folds for parameter search. Cross-validation proceeds by splitting up the data into this many blocks of trials, and then tests the method on each block. For each fold, the method is re-trained on all the other blocks, excluding the test block (therefore, the total running time is proportional to the number of folds). This is not a randomized cross-validation, but a blockwise cross-validation, which is usually the correct choice if the data stem from a time series. If there are few trials in the data, one can use a higher number here (e.g., 10) to ensure that more data is available for training.

    • verbose name: Number Of Cross-Validation Folds
    • default value: 5
    • port type: IntPort
    • value type: int (can be None)
  • num_jobs
    Number of parallel compute jobs. This value only affects the running time and not the results. Values between 1 and twice the number of CPU cores make sense to expedite computation, but may temporarily reduce the responsiveness of the machine. The value of -1 stands for all available CPU cores.

    • verbose name: Number Of Parallel Jobs
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • cv_group_field
Optionally a field indicating the group from which each trial is sourced. If given, then data will be split such that test sets contain unseen groups. Example groups are SubjectID, SessionID, etc.

    • verbose name: Grouping Field (Cross-Validation)
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • cv_stratified
    Optionally perform stratified cross-validation. This means that all the folds have the same relative percentage of trials in each class.

    • verbose name: Stratified Cross-Validation
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • solver
    Solver to use. Not all solvers support all regularizers -- specifically, newton-cg, sag, and lbfgs only support l2. Elasticnet is only supported by saga. Also, different algorithms have different running times depending on the number of features, and the number of trials. It can make sense to identify the fastest method if this node is used a lot on similarly-sized data.

    • verbose name: Solver To Use
    • default value: auto
    • port type: EnumPort
    • value type: str (can be None)
  • dual_formulation
Use dual formulation. This is an alternative way to solve the problem. If enabled, it can be faster when the number of trials is smaller than the number of features, but it is not supported with l1 regularization.

    • verbose name: Use Alternative Dual Formulation
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • max_iter
    Maximum number of iterations. This is one of the stopping criteria to limit the compute time. The default is usually fine, and gains from increasing the number of iterations will be minimal (it can be worth experimenting with lower iteration numbers if the algorithm must finish in a fixed time budget, at a cost of potentially less accurate solutions).

    • verbose name: Maximum Number Of Iterations
    • default value: 100
    • port type: IntPort
    • value type: int (can be None)
  • tolerance
Convergence tolerance. This is the desired error tolerance, i.e., the acceptable inaccuracy in the solution. Using larger values gives less accurate results, but will lead to faster compute times. Note that, for biosignal-driven machine learning systems, one often does not need very small tolerances.

    • verbose name: Tolerance
    • default value: 0.0001
    • port type: FloatPort
    • value type: float (can be None)
  • multiclass
    Technique to use when classifying more than two classes of data. These formulations will usually have little impact on the results, but multinomial can be solved in one step, whereas ovr (which stands for one-vs-rest) requires one run per class, which can be slower. Multinomial is only supported when using the lbfgs solver. Auto will auto-select based on the chosen solver.

    • verbose name: Type Of Multiclass Approach
    • default value: auto
    • port type: EnumPort
    • value type: str (can be None)
  • include_bias
    Include a bias term. If false, your features need to be centered, or include a dummy feature set to 1.

    • verbose name: Include Bias Term
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • bias_scaling
    Scale for bias term. Since this logistic regression implementation applies the regularization to the bias term too (which is usually not ideal, although rarely a significant issue), you can use this scale to counter the effect.

    • verbose name: Bias Scaling
    • default value: 1.0
    • port type: FloatPort
    • value type: float (can be None)
  • random_seed
    Random seed. Different values may give slightly different outcomes.

    • verbose name: Random Seed
    • default value: 12345
    • port type: IntPort
    • value type: int (can be None)
  • initialize_once
    Calibrate the model only once. If set to False, then this node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • dont_reset_model
Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model is incompatible when the input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • probabilistic
    Use probabilistic outputs. If enabled, the node will output for each class the probability that a given trial is of that class; otherwise it will output the most likely class label.

    • verbose name: Output Probabilities
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • verbosity
    Verbosity level. Higher numbers will produce more extensive diagnostic output.

    • verbose name: Verbosity Level
    • default value: 0
    • port type: IntPort
    • value type: int (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

MeasureLoss

Measure discrepancy between some predicted labels and true labels.

This is a fully-featured statistics node that is typically used to measure the performance of a model; it can compute a number of error metrics, aggregate them over successive ticks, and generally expects target label information to be encoded in fields of the instance axis. For pure mathematical loss functions (which can also be directly optimized for), see the other Loss nodes in this category. This node accepts as input some data that has both estimates or predictions (e.g., from a machine learning node), as well as ground-truth labels (desired outputs). Using these inputs, the node computes a loss function that measures the disagreement between the predictions and the labels. The node supports a large variety of losses that can be used in different circumstances (based on different assumptions about the type of estimates or the severity of different kinds of errors), such as misclassification rate, mean squared error, or area under the receiver operating characteristic (ROC) curve. Be sure to inform yourself about the most appropriate loss metric for your use case before trusting the numbers blindly, so as to avoid relying on misleading results (e.g., from inapplicable metrics) when making decisions. The node is capable of accumulating results across multiple trials or data chunks, for instance during real-time processing. It is also possible to do the same across multiple datasets in offline or batch processing, by setting the appropriate parameters in the node. This node passes the computed loss out through its loss port (so wire that one into the next node). This output will contain either a statistics axis or a feature axis (depending on the output_format parameter) with the following six items: 'N Batches', 'N Trials', 'value', 'p_val', 'se_stat', 't_stat', where 'value' will be the selected loss metric. If more than one loss metric is selected, the data tensor is a matrix, and a feature axis (or second feature axis, depending on the output_format) will hold the values for each metric. See the documentation for the output_format parameter for more details.
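
To make a few of the metric definitions concrete, here is an offline sketch of MCR, MSE, and the (negative) AUC using scikit-learn's metric functions (illustrative only; the node computes these internally and adds batching and statistics):

    import numpy as np
    from sklearn.metrics import accuracy_score, mean_squared_error, roc_auc_score

    y_true = np.array([0, 1, 1, 0, 1])               # ground-truth labels
    p_class1 = np.array([0.2, 0.9, 0.6, 0.4, 0.3])   # predicted P(class 1)
    y_pred = (p_class1 > 0.5).astype(int)            # hard decisions

    mcr = 1.0 - accuracy_score(y_true, y_pred)       # misclassification rate
    mse = mean_squared_error(y_true, p_class1)       # mean-squared error
    auc = -roc_auc_score(y_true, p_class1)           # negative area under ROC
    print(mcr, mse, auc)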

More Info...

Version 1.5.2

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • loss
    Loss estimates.

    • verbose name: Loss
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: OUT
  • data
Data to process. Deprecation notice: The incoming data used to be passed through this same port unchanged, but that behavior has changed and this port is now set to IN only.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: IN
  • loss_metrics
Loss metrics. The following measures can be calculated both offline and online: MCR is the misclassification rate (aka error rate), MSE is the mean-squared error, MAE is the mean absolute error, Sign is the sign error, Bias is the prediction bias. The following measures are currently only available during offline computations: SMAE is the standardized mean absolute error, SMSE is the standardized mean-squared error, max is the maximum error, RMSE is the root mean squared error, MedSE is the median squared error, MedAE is the median absolute error, SMedAE is the standardized median absolute error, AUC is the negative area under the ROC curve, R2 is the R-squared loss, CrossEnt is the cross-entropy loss, ExpVar is the negative explained variance, CM is the confusion matrix.

    • verbose name: Loss Metrics
    • default value: []
    • port type: SubsetPort
    • value type: list (can be None)
  • batching
    Batch size for computing measures. If set to None (default), the trials are batched as they come in. If set to 0, all trials are put into a single batch. Other values will use batches of the given trial count. Batching is mostly useful if you are interested in the standard deviation of some measures, or if a measure cannot be computed reliably from few instances (e.g., in case of median, AUC, etc).

    • verbose name: Batching
    • default value: None
    • port type: IntPort
    • value type: int (can be None)
  • output_format
Format of the loss output packet. If set to stats-axis, the loss packet will contain a statistics axis with six items ('N Batches', 'N Trials', 'value', 'p_val', 'se_stat', 't_stat') and a feature axis with one item per selected metric in loss_metrics (with the same name, i.e., 'MSE'). If set to 2-feature-axes, a feature axis with label statistics_types will be created in place of the statistics axis described above, with the same items, and a second feature axis with label explanatory_variables will be created with one item per computed measure. In legacy mode, a single feature axis will be created with one item per computed measure plus an 'N Trials' item, and no statistics.

    • verbose name: Output Format
    • default value: legacy
    • port type: EnumPort
    • value type: str (can be None)
  • accumulate_offline
    Accumulate statistics across offline data packets, as well. Normally this node will only accumulate statistics on real-time (streaming) data. By enabling this, it will also accumulate across multiple non-streaming data sets, e.g., in batch processing.

    • verbose name: Accumulate Offline
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • ignore_resets
    Keep accumulating throughout state resets of the pipeline. For instance, after rewiring or changing files. This is most relevant in batch processing, where a pipeline iterates over multiple files. If you intend to aggregate statistics across multiple files, you need to enable this option.

    • verbose name: Ignore Resets
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • cond_field
    The name of the instance data field that contains the true target values. This parameter will be ignored if the packet has previously been processed by a DescribeStatisticalDesign node.

    • verbose name: Cond Field
    • default value: TargetValue
    • port type: StringPort
    • value type: str (can be None)
  • ttest_level
    Loss value against which to test for significant differences in a t-test. This only applies if there were multiple batches.

    • verbose name: Ttest Level
    • default value: 0.5
    • port type: FloatPort
    • value type: float (can be None)
  • significance_level
    Significance level for t-test.

    • verbose name: Significance Level
    • default value: 0.05
    • port type: FloatPort
    • value type: float (can be None)
  • log_results
    Log results.

    • verbose name: Log Results
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • suppress_nans
    Suppress NaNs in the loss calculation. Depending on the loss, a NaN prediction may raise an exception.

    • verbose name: Suppress Nans
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • output_for_stats_table
    Output the data in such a way that it can be parsed by the ParseStatsTable node, for creating stats tables. This creates two feature axes instead of one. You can now use the output_format to get more format choices, and additional metrics.

    • verbose name: Output For Stats Table
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • loss_metric
    Legacy loss metric. Will become deprecated in favor of loss_metrics (allowing multiple metrics). Note that you cannot specify both loss_metric and loss_metrics.

    • verbose name: Loss Metric (Legacy)
    • default value:
    • port type: EnumPort
    • value type: str (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

NaiveBayes

Use naive Bayes to classify data instances.

This is a very basic classifier that can work well in situations where one would also use LDA or QDA. This method can handle large numbers of features without problems since it does not account for correlations between those features. Like all machine learning methods, this method needs to be calibrated ("trained") before it can make any predictions on data. For this, the method requires training instances and associated training labels. The typical way to get such labels associated with time-series data is to make sure that a marker stream is included in the data, which is usually imported together with the data using one of the Import nodes, or received over the network alongside the data, e.g., using the LSL Input node (with a non-empty marker query). These markers are then annotated with target labels using the Assign Targets node. To generate instances of training data for each of the training markers, one usually uses the Segmentation node to extract segments from the continuous time series around each marker. Since this machine learning method is not capable of being trained incrementally on streaming data, the method requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data node; it then passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, its trainable state can be saved to a model file and later loaded for continued use.
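
A minimal sketch of the underlying model, expressed with scikit-learn's GaussianNB (an assumption for illustration; the node's actual backend may differ). Because each feature gets its own independent per-class Gaussian, large feature counts pose no particular problem:

    import numpy as np
    from sklearn.naive_bayes import GaussianNB

    X = np.random.randn(100, 500)          # many features are unproblematic here
    y = np.random.randint(0, 2, size=100)

    clf = GaussianNB().fit(X, y)           # per-feature Gaussians, no correlations
    proba = clf.predict_proba(X)           # probabilistic outputs (default mode)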

More Info...

Version 1.0.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • probabilistic
    Use probabilistic outputs. If enabled, the node will output for each class the probability that a given trial is of that class; otherwise it will output the most likely class label.

    • verbose name: Output Probabilities
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • initialize_once
    Calibrate the model only once. If set to False, then this node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • dont_reset_model
Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model is incompatible when the input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

ProbabilityCalibration

Calibrate continuous outputs of a method to be valid probabilities.

This node acts as a "wrapper" for a machine learning pipeline that produces some sort of continuous output that has a monotonic relationship with the probability of a class. The node learns a one-dimensional mapping from the continuous output to a probability, and then acts as a drop-in replacement for the original method. The node is used by wiring a pipeline into the "method" input port, where the pipeline is a subgraph that usually starts with a Placeholder with slotname set to "data"; see also the documentation tooltip for the method signature parameter. For technical reasons, probability calibration should not be performed on the same data that a classifier was trained on, since the classifier will generate atypically high confidence scores on its own training data that will not be representative of the classifier's outputs on new data. Therefore, the calibration node generally uses a cross-validation approach to train the underlying classifier on different data than what is used to fit the calibration mapping (see also the Crossvalidation node for full details on how cross-validation is generally done). Since this naturally results in k calibrated classifiers (for k folds), a straightforward way to combine these classifiers is to average their outputs, which is the default behavior of the node (ensemble set to True). Alternatively, to reduce computational cost at prediction time, one can disable the ensemble option, which will then use a single classifier that is trained on all the data but still uses the calibration mapping trained via the cross-validation; note, however, that using a classifier trained on more data can introduce some bias (e.g., in test-set confidence) into the predictions and also does not enjoy the robustness conferred by the ensemble approach. Like all machine learning methods, this method needs to be calibrated ("trained") before it can make any predictions on data. For this, the method requires training instances and associated training labels. The typical way to get such labels associated with time-series data is to make sure that a marker stream is included in the data, which is usually imported together with the data using one of the Import nodes, or received over the network alongside the data, e.g., using the LSL Input node (with a non-empty marker query). These markers are then annotated with target labels using the Assign Targets node. To generate instances of training data for each of the training markers, one usually uses the Segmentation node to extract segments from the continuous time series around each marker. Since this machine learning method is not capable of being trained incrementally on streaming data, the method requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data node; it then passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, its trainable state can be saved to a model file and later loaded for continued use.
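
The cross-validated calibration scheme described above is closely analogous to scikit-learn's CalibratedClassifierCV, shown here as an illustrative stand-in (the node wraps an arbitrary pipeline subgraph rather than a single estimator):

    import numpy as np
    from sklearn.calibration import CalibratedClassifierCV
    from sklearn.svm import LinearSVC

    X = np.random.randn(300, 16)
    y = np.random.randint(0, 2, size=300)

    calib = CalibratedClassifierCV(LinearSVC(), method='sigmoid',
                                   cv=5, ensemble=True)
    calib.fit(X, y)                 # per-fold training plus calibration mapping
    proba = calib.predict_proba(X)  # calibrated per-class probabilities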

More Info...

Version 0.5.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • method
    Method to calibrate.

    • verbose name: Method
    • default value: None
    • port type: GraphPort
    • value type: Graph
  • method__signature
Argument names of the pipeline being calibrated. Your pipeline is a subgraph that must contain at least one Placeholder node whose slotname must match the argument name listed here. The placeholder then acts as the entry point for any data that is passed into the pipeline when it is invoked by the calibration node. Your pipeline's final node (which typically produces the predictions) is then wired to the calibration node's "method" input port. In graphical UIs, this edge will be displayed in dotted style to indicate that this is not normal forward data flow, but that a subgraph (your pipeline) runs under the control of the calibration node. In summary, your pipeline starts with a Placeholder that is followed by some processing nodes (in the simplest case just a single machine-learning node, such as Linear Discriminant Analysis). The final node of your pipeline is the one whose outputs are taken to be the pipeline's predictions, and this node is wired into the "method" input of this calibration node. Any "loose ends" downstream of your placeholder are also considered to be part of the pipeline but do not contribute to the result (they may be used for other purposes, such as printing progress information). Your pipeline may optionally have a second placeholder, which should by convention have slotname set to is_training, and then is_training must be listed as the second argument here. This second placeholder is used to indicate whether your pipeline is currently being called on training data or test data. Regardless of whether you expose this parameter or not, the calibration node will execute your pipeline like a cross-validation, meaning that, for each fold in the cross-validation, your pipeline graph is instantiated from its default (uninitialized) state, and is then called with the training set of that fold. Then, the same graph is called again, but this time with the test set of that fold; it is then up to any adaptive nodes in your pipeline (e.g., machine learning nodes) to adapt themselves on the first call and to make predictions (usually without adapting again) on the second call. The pipeline is discarded after each fold and a new pipeline graph is instantiated (to avoid any unintended train/test leakage).

    • verbose name: Method [Signature]
    • default value: (data)
    • port type: Port
    • value type: object (can be None)
  • mapping
Type of output mapping. The default 'sigmoid' mapping is a logistic function that maps scores to probabilities (logistic regression). The alternative 'isotonic' mapping is a non-parametric method that fits a piecewise-constant, strictly increasing function to the data. The latter requires a fairly large number of data points to work well (e.g., >1000), but is more flexible and can capture more complex relationships between scores and true probabilities. A sketch comparing the two mappings appears after this node's property list.

    • verbose name: Probability Mapping
    • default value: sigmoid
    • port type: EnumPort
    • value type: str (can be None)
  • ensemble
Use an ensemble of classifiers for prediction. If this is set, the method will train one classifier per fold, calibrate it on the respective test fold, and at prediction time average the predictions of all the classifiers. This is the recommended setting as it can result in somewhat better calibration (e.g., lower bias) and robustness, but is also more costly at prediction time. If disabled, a single classifier is used that is trained on all the data, together with a single calibration mapping trained via the cross-validation.

    • verbose name: Use Ensemble
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • cond_field
    The name of the instance data field that contains the conditions to be discriminated. This parameter will be ignored if the packet has previously been processed by a DescribeStatisticalDesign node.

    • verbose name: Cond Field
    • default value: TargetValue
    • port type: StringPort
    • value type: str (can be None)
  • enabled
Whether to enable probability calibration. If disabled, the predictions of the underlying model will be passed through unmodified.

    • verbose name: Enabled
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • num_folds
    Number of cross-validation folds for parameter search. Cross-validation proceeds by splitting up the data into this many blocks of trials, and then tests the method on each block. For each fold, the method is re-trained on all the other blocks, excluding the test block (therefore, the total running time is proportional to the number of folds). This is not a randomized cross-validation, but a blockwise cross-validation, which is usually the correct choice if the data stem from a time series. If there are few trials in the data, one can use a higher number here (e.g., 10) to ensure that more data is available for training. This can also be set to 0 if no cross-validation is desired, in which case the method and calibration mapping are trained on all the data, or to 1 to use a (perhaps costly) leave-one-out cross-validation.

    • verbose name: Number Of Cross-Validation Folds
    • default value: 5
    • port type: IntPort
    • value type: int (can be None)
  • cv_group_field
Optionally a field indicating the group from which each trial is sourced. If given, then data will be split such that test sets contain unseen groups. Example groups are SubjectID, SessionID, etc.

    • verbose name: Grouping Field (Cross-Validation)
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • cv_stratified
    Optionally perform stratified cross-validation. This means that all the folds have the same relative percentage of trials in each class.

    • verbose name: Stratified Cross-Validation
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • initialize_once
    Calibrate the model only once. If set to False, then this node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • dont_reset_model
Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model is incompatible when the input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
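
As referenced under the mapping property above, the two mapping types can be pictured as follows: 'sigmoid' fits a one-dimensional logistic function from score to probability, while 'isotonic' fits a piecewise-constant, monotonically increasing function. An illustrative sketch on held-out scores (assuming scikit-learn-style fitting routines):

    import numpy as np
    from sklearn.isotonic import IsotonicRegression
    from sklearn.linear_model import LogisticRegression

    scores = np.random.randn(1000)                     # classifier outputs
    labels = (scores + 0.5 * np.random.randn(1000) > 0).astype(int)

    # 'sigmoid': parametric 1-D logistic fit from score to P(class 1)
    sig = LogisticRegression().fit(scores.reshape(-1, 1), labels)
    p_sig = sig.predict_proba(scores.reshape(-1, 1))[:, 1]

    # 'isotonic': non-parametric, piecewise-constant increasing fit
    iso = IsotonicRegression(out_of_bounds='clip').fit(scores, labels)
    p_iso = iso.predict(scores)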

QuadraticDiscriminantAnalysis

Classify data instances using Quadratic Discriminant Analysis (QDA).

The QDA method is a relatively simple statistical method that learns a quadratic mapping (i.e., a basic non-linear mapping) from input data to discrete category labels. QDA assumes that the data are Gaussian-distributed, that is, have no or very few major statistical outliers. To the extent that these assumptions are violated, this method will likely underperform. To ameliorate the outlier issue, the raw data can be cleaned of artifacts with various artifact removal methods. This implementation uses shrinkage regularization by default, which allows it to handle large numbers of features quite gracefully, compared to the unregularized variant, which will overfit or fail on more than a few dozen features. This method relies on a regularization parameter that is tuned using an internal cross-validation on the data. If there are very few trials, or some extensive stretches of the data exhibit only one class, this cross-validation can fail with an error that there were too few or no trials of a given class present. By default, this method will return not the most likely class label for each trial it is given, but instead the probabilities for each class (i.e., category of labels) that the trial is actually of that class. Like all machine learning methods, this method needs to be calibrated ("trained") before it can make any predictions on data. For this, the method requires training instances and associated training labels. The typical way to get such labels associated with time-series data is to make sure that a marker stream is included in the data, which is usually imported together with the data using one of the Import nodes, or received over the network alongside the data, e.g., using the LSL Input node (with a non-empty marker query). These markers are then annotated with target labels using the Assign Targets node. To generate instances of training data for each of the training markers, one usually uses the Segmentation node to extract segments from the continuous time series around each marker. Since this machine learning method is not capable of being trained incrementally on streaming data, the method requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data node; it then passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, its trainable state can be saved to a model file and later loaded for continued use.
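
A minimal sketch of the computation using scikit-learn's QuadraticDiscriminantAnalysis with a searched regularization parameter (an assumption for illustration; the node's actual backend and shrinkage handling may differ):

    import numpy as np
    from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
    from sklearn.model_selection import GridSearchCV, KFold

    X = np.random.randn(200, 16)
    y = np.random.randint(0, 2, size=200)

    grid = {'reg_param': [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.8]}  # default grid
    clf = GridSearchCV(QuadraticDiscriminantAnalysis(), grid,
                       scoring='accuracy', cv=KFold(n_splits=5, shuffle=False))
    clf.fit(X, y)
    proba = clf.predict_proba(X)   # per-class probabilities (default output)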

More Info...

Version 1.0.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • shrinkage
    Regularization strength. Given as a list of numbers between 0 and 1 where 0 is no regularization and 1 is maximal regularization. The best parameter is searched, which can be slow. The details of the parameter search can be controlled via the search metric and number of folds parameters.

    • verbose name: Regularization Strength
    • default value: [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.8]
    • port type: ListPort
    • value type: list (can be None)
  • search_metric
    Parameter search metric. This method will run a cross-validation for each possible shrinkage parameter value and use this metric to score how well the method performs in each case, in order to select the best parameter. Therefore, the running time of the method is proportional to the number of parameter values times the number of folds in the cross-validation, which can be slow. While 'accuracy' is usually a good default, some other metrics can be useful under special circumstances, e.g., roc_auc for highly imbalanced ratios of trials from different classes.

    • verbose name: Scoring Metric For Parameter Search
    • default value: accuracy
    • port type: EnumPort
    • value type: str (can be None)
  • num_folds
Number of cross-validation folds for parameter search. Cross-validation proceeds by splitting up the data into this many blocks of trials, and then tests the method on each block. For each fold, the method is re-trained on all the other blocks, excluding the test block. This is not a randomized cross-validation, but a blockwise cross-validation, which is usually the correct choice if the data stem from a time series. If there are few trials in the data, one can use a higher number here (e.g., 10) to ensure that more data is available for training.

    • verbose name: Number Of Cross-Validation Folds
    • default value: 5
    • port type: IntPort
    • value type: int (can be None)
  • cv_group_field
Optionally a field indicating the group from which each trial is sourced. If given, then data will be split such that test sets contain unseen groups. Example groups are SubjectID, SessionID, etc.

    • verbose name: Grouping Field (Cross-Validation)
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • cv_stratified
    Optionally perform stratified cross-validation. This means that all the folds have the same relative percentage of trials in each class.

    • verbose name: Stratified Cross-Validation
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • probabilistic
    Use probabilistic outputs. If enabled, the node will output for each class the probability that a given trial is of that class; otherwise it will output the most likely class label.

    • verbose name: Output Probabilities
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • class_weights
    Per-class weights. Optionally this is a mapping from class label to weight. The weights represent the a priori ("prior") probability of encountering a specific class that the model shall assume. The weights will be renormalized so that they add up to one. Example syntax: {'0': 0.5, '1': 0.5} (note the quotes before the colons).

    • verbose name: Per-Class Weight
    • default value: None
    • port type: Port
    • value type: object (can be None)
  • initialize_once
    Calibrate the model only once. If set to False, then this node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • verbosity
    Verbosity level. Higher numbers will produce more extensive diagnostic output.

    • verbose name: Verbosity Level
    • default value: 0
    • port type: IntPort
    • value type: int (can be None)
  • dont_reset_model
Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model is incompatible when the input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

RegularizedLogisticRegression

Classify data instances using regularized Logistic Regression with complex regularization terms.

The logistic regression method is a versatile and principled statistical method that learns a generalized linear mapping from input data to the probability of that data belonging to one of several possible classes. This method is highly competitive with other linear or generalized linear methods, such as LDA and SVM. In comparison to the simple logistic regression node, the emphasis of this implementation is on the support for complex regularization terms, in particular group sparsity and trace-norm regularization. When these features are not needed, it is recommended to consider using the simple node instead, as the training time may be shorter, and the parameter settings may allow for more timely or robust convergence on a wide range of data. Logistic regression makes relatively weak assumptions on the distribution of the data, so that it is moderately tolerant to outliers, especially when compared to relatively brittle methods like LDA. Logistic regression is not considered to be quite as robust as SVM (therefore it can make sense to remove artifacts beforehand using the appropriate nodes if the data are very noisy) -- however, the main benefit of logistic regression over SVM is that the outputs can be straightforwardly interpreted as probabilities without having to resort to 'tricks' such as probability calibration. This implementation uses regularization by default, which allows it to handle large numbers of features very well. This implementation supports several types of regularization, including l1, l2, l1/l2, and trace. The basic l2 regularization is a fine choice for most data (it is closely related to shrinkage in LDA or a Gaussian prior in Bayesian methods). The alternative l1 regularization is unique in that it can learn to identify a sparse subset of features that is relevant while pruning out all other features as irrelevant. This sparsity regularization is statistically very efficient and can deal with an extremely large number of irrelevant features. The l1/l2 regularization is a combination of the two, which can identify groups of features that are irrelevant (also known as group sparsity). In this implementation, sparsity acts on the features as if they formed a matrix, and will prune entire rows or columns of the matrix. In order to be able to do this, the input data tensor should have more than one axis that can be used as features, and these axes need to be identified via the group_axes parameter. The trace regularization mode also views the features as a matrix, but it will learn a weighting such that the corresponding matrix of weights per feature is low rank. As such, it will learn a weighting that can be decomposed into a sum of a small number of pairs of weight profiles for the two matrix axes, e.g., spatial and temporal profiles if features were space by time, which are multiplied together to each form a rank-1 matrix. To determine the optimal regularization strength, a list of candidate parameter settings can be given, which is then searched by the method using an internal cross-validation on the data to find the best value. If there are very few trials, or some extensive stretches of the data exhibit only one class, this cross-validation can fail with an error that there were too few or no trials of a given class present. Also, the default search grid for regularization (i.e., the list of candidate values) is deliberately rather coarse to keep the method fast.
For higher-quality results, use a finer-grained list of values (which will be correspondingly slower). There are also several other implementations of logistic regression with different regularization terms or different performance characteristics in other nodes. This method can be implemented using a number of different numerical approaches, which have different running times depending on the number of data points and features. If you are re-solving the problem a lot, it can make sense to try out the various solvers to find the fastest one. By default, this method will return not the most likely class label for each trial it is given, but instead the probabilities, for each class (i.e., category of labels), that the trial actually belongs to that class. Like all machine learning methods, this method needs to be calibrated ("trained") before it can make any predictions on data. For this, the method requires training instances and associated training labels. The typical way to get such labels associated with time-series data is to make sure that a marker stream is included in the data, which is usually imported together with the data using one of the Import nodes, or received over the network alongside the data, e.g., using the LSL Input node (with a non-empty marker query). These markers are then annotated with target labels using the Assign Targets node. To generate instances of training data for each of the training markers, one usually uses the Segmentation node to extract segments from the continuous time series around each marker. Since this machine learning method is not capable of being trained incrementally on streaming data, the method requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data node, where it passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, its trainable state can be saved to a model file and later loaded for continued use.
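
For intuition on the trace-norm mode, the following is a minimal standalone Python sketch (illustrative only, not this node's actual solver) of singular-value soft-thresholding, the proximal operator that trace-norm-regularized solvers apply to the per-feature weight matrix to drive it toward low rank:

    import numpy as np

    def prox_trace_norm(W, tau):
        # Proximal operator of tau * ||W||_* : soft-threshold the singular values.
        # Applied repeatedly inside a solver, this drives the weight matrix W
        # (e.g., space-by-time feature weights) toward a low-rank solution.
        U, s, Vt = np.linalg.svd(W, full_matrices=False)
        return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

    W = np.random.randn(8, 10)  # hypothetical space-by-time weight matrix
    print(np.linalg.matrix_rank(W))                        # full rank (8)
    print(np.linalg.matrix_rank(prox_trace_norm(W, 2.0)))  # typically lower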

More Info...

Version 1.0.2

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • penalty
    Regularization type. The default trace regularization is a good choice for most data. The trace norm yields low-rank solutions, and l1/l2 yields group-wise sparse solutions. The others are mostly included for completeness; it is more efficient to use LogisticRegression to apply these.

    • verbose name: Regularization Type
    • default value: trace
    • port type: EnumPort
    • value type: str (can be None)
  • group_axes
    Axes over which the penalty shall group. For the trace norm, 2 axes must be given. WARNING: this parameter is not safe to use with untrusted strings from the network.

    • verbose name: Group Axes
    • default value: space, time
    • port type: StringPort
    • value type: str (can be None)
  • lambdas
    Regularization strength. Stronger regularization makes it possible to fit a more complex model (more parameters), or to use less data to fit a model of the same complexity.

    • verbose name: Regularization Strength
    • default value: [0.1, 1.0, 5.0, 10.0]
    • port type: Port
    • value type: list (can be None)
  • search_metric
    Parameter search metric. When the regularization parameter is given as a list of values, then the method will run a cross-validation for each possible parameter value and use this metric to score how well the method performs in each case, in order to select the best parameter. While 'accuracy' is usually a good default, some other metrics can be useful under special circumstances, e.g., roc_auc for highly imbalanced ratios of trials from different classes.

    • verbose name: Scoring Metric If Searching Parameters
    • default value: accuracy
    • port type: EnumPort
    • value type: str (can be None)
  • num_folds
    Number of cross-validation folds for parameter search. Cross-validation proceeds by splitting up the data into this many blocks of trials, and then tests the method on each block. For each fold, the method is re-trained on all the other blocks, excluding the test block (therefore, the total running time is proportional to the number of folds). This is not a randomized cross-validation, but a blockwise cross-validation, which is usually the correct choice if the data stem from a time series. If there are few trials in the data, one can use a higher number here (e.g., 10) to ensure that more data is available for training.

    • verbose name: Number Of Cross-Validation Folds
    • default value: 5
    • port type: IntPort
    • value type: int (can be None)
  • cv_group_field
    Optionally a field indicating the group from which each trial is sourced. If given, then data will be split such that test sets contain unseen groups. Example groups are SubjectID, SessionID, etc.

    • verbose name: Grouping Field (Cross-Validation)
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • cv_stratified
    Optionally perform stratified cross-validation. This means that all the folds have the same relative percentage of trials in each class.

    • verbose name: Stratified Cross-Validation
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • num_jobs
    Number of parallel compute jobs. This value only affects the running time and not the results. Values between 1 and twice the number of CPU cores make sense to expedite computation, but may temporarily reduce the responsiveness of the machine. The value of -1 stands for all available CPU cores.

    • verbose name: Number Of Parallel Jobs
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • probabilistic
    Use probabilistic outputs. If enabled, the node will output for each class the probability that a given trial is of that class; otherwise it will output the most likely class label.

    • verbose name: Output Probabilities
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • verbosity
    Verbosity level. Higher numbers will produce more extensive diagnostic output.

    • verbose name: Verbosity Level
    • default value: 0
    • port type: IntPort
    • value type: int (can be None)
  • max_iter
    Maximum number of iterations. This is one of the stopping criteria to limit the compute time. The default is usually fine, and gains from increasing the number of iterations will be minimal (it can be worth experimenting with lower iteration numbers if the algorithm must finish in a fixed time budget, at a cost of potentially less accurate solutions).

    • verbose name: Maximum Number Of Iterations
    • default value: 1000
    • port type: IntPort
    • value type: int (can be None)
  • inner_gtol
    "LBFGS convergence tolerance. Larger values give less accurate results but faster solution time.

    • verbose name: Inner Gtol
    • default value: 0.0001
    • port type: FloatPort
    • value type: float (can be None)
  • inner_max_iter
    LBFGS maximum number of iterations for inner solver. Additional stopping criterion to limit compute time.

    • verbose name: Inner Max Iter
    • default value: 10
    • port type: IntPort
    • value type: int (can be None)
  • abs_tol
    Absolute convergence tolerance. Smaller values will lead the solver to a closer approximation of the optimal solution, at the cost of increased running time. See also relative tolerance.

    • verbose name: Absolute Tolerance
    • default value: 1e-05
    • port type: FloatPort
    • value type: float (can be None)
  • rel_tol
    Relative convergence tolerance. Smaller values will lead the solver to a closer approximation of the optimal solution, at the cost of increased running time. In contrast to the absolute tolerance, this value is relative to the magnitude of the regression weights. Note that the method used works best when the desired accuracy is not excessive, and merely a good approximation is sought.

    • verbose name: Relative Tolerance
    • default value: 0.01
    • port type: FloatPort
    • value type: float (can be None)
  • lbfgs_memory
    LBFGS memory length. This is the maximum number of variable-metric corrections tracked to approximate the inverse Hessian matrix.

    • verbose name: Lbfgs Memory
    • default value: 10
    • port type: IntPort
    • value type: int (can be None)
  • init_rho
    Initial value of augmented Lagrangian parameter. This parameter, which is specific to the solver used, can be auto-tuned, which makes the method relatively robust to the initial value. Still, slight adjustments in the 1 to 30 range can reduce the number of iterations required for convergence and thereby the running time. Choosing a grossly inappropriate value, however, can cause the method to fail to converge, which is easily diagnosed by a solution that is essentially non-sparse.

    • verbose name: Solver Rho
    • default value: 4
    • port type: FloatPort
    • value type: float (can be None)
  • update_rho
    Auto-tune rho parameter. Whether to update the solver's rho parameter dynamically. This can be used to achieve faster convergence times in highly time-sensitive setups, but there is a modest risk that on some data the solution can 'blow up', although this could potentially be overcome by tuning the other solver parameters related to the rho update logic. A sketch of the standard update rule is given after this ports list.

    • verbose name: Auto-Tune Solver Rho
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • rho_threshold
    Rho update trigger threshold. This determines how frequently updates to the solver rho parameter can be triggered. A larger value will lead to rho changing less frequently. This parameter essentially trades off faster adaptation to settings that are optimal for convergence (lower values) against keeping the settings from changing so erratically that they prompt stalls or convergence failures on difficult data (higher values).

    • verbose name: Solver Rho Update Threshold
    • default value: 10
    • port type: FloatPort
    • value type: float (can be None)
  • rho_incr
    Rho update increment factor. When rho is being increased, this is the factor by which it is multiplied. Larger values can lead to quicker adaptation if the initial value is off or solutions change rapidly, at an increased risk of overshooting.

    • verbose name: Solver Rho Increment Factor
    • default value: 2
    • port type: FloatPort
    • value type: float (can be None)
  • rho_decr
    Rho update decrement factor. When rho is being decreased, this is the factor by which it is divided. Larger values can lead to quicker adaptation if the initial value is off or solutions change rapidly, at an increased risk of overshooting.

    • verbose name: Solver Rho Decrement Factor
    • default value: 2
    • port type: FloatPort
    • value type: float (can be None)
  • over_relaxation
    ADMM over-relaxation parameter. 1.0 is no over-relaxation.

    • verbose name: Over Relaxation
    • default value: 1.0
    • port type: FloatPort
    • value type: float (can be None)
  • initialize_once
    Calibrate the model only once. If set to False, then this node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • dont_reset_model
    Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model is incompatible if the input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
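
The rho-related solver parameters above match the standard ADMM residual-balancing scheme (Boyd et al., 2011); the following Python sketch shows that textbook rule and how it plausibly maps onto rho_threshold, rho_incr, and rho_decr (the mapping is an assumption made for illustration, not a description of this node's internals):

    def update_rho(rho, r_norm, s_norm, threshold=10.0, incr=2.0, decr=2.0):
        # r_norm: primal residual norm; s_norm: dual residual norm.
        if r_norm > threshold * s_norm:
            return rho * incr  # primal residual dominates: increase rho
        elif s_norm > threshold * r_norm:
            return rho / decr  # dual residual dominates: decrease rho
        return rho             # residuals are balanced: leave rho unchanged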

RidgeRegression

Estimate a continuous output value from features using Ridge Regression.

Ridge regression is a straightforward and principled statistical technique to learn a linear mapping between input data and desired output values from training data. Ridge regression assumes that both inputs and outputs are Gaussian distributed, that is, have no or very few major statistical outliers. If the output follows a radically different distribution, for instance between 0 and 1, or nonnegative, or discrete values, then different methods may be more appropriate (for instance, classification methods for discrete values). To ameliorate the issue of outliers in the data, the raw data can be cleaned of artifacts with various artifact removal methods. To the extent that the assumptions hold true, this method is highly competitive with other linear methods, such as support vector regression (SVR). This method uses shrinkage regularization by default, which allows it to handle large numbers of features quite well. The regularization depends on a parameter that is automatically tuned using an internal cross-validation procedure (see the tooltips of the controlling parameters for more details). Like all machine learning methods, this method needs to be calibrated ("trained") before it can make any predictions on data. For this, the method requires training instances and associated training labels. The typical way to get such labels associated with time-series data is to make sure that a marker stream is included in the data, which is usually imported together with the data using one of the Import nodes, or received over the network alongside the data, e.g., using the LSL Input node (with a non-empty marker query). These markers are then annotated with target labels using the Assign Targets node. To generate instances of training data for each of the training markers, one usually uses the Segmentation node to extract segments from the continuous time series around each marker. Since this machine learning method is not capable of being trained incrementally on streaming data, the method requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data node, where it passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, its trainable state can be saved to a model file and later loaded for continued use.
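
As a rough standalone analogue (using scikit-learn rather than this node), RidgeCV performs the same kind of search over a list of regularization strengths, with efficient leave-one-out GCV by default; the data shapes below are made up for illustration:

    import numpy as np
    from sklearn.linear_model import RidgeCV
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 32))  # 100 trials x 32 features
    y = X @ rng.standard_normal(32) + 0.1 * rng.standard_normal(100)

    model = make_pipeline(
        StandardScaler(),                            # cf. normalize_features
        RidgeCV(alphas=[0.1, 0.5, 1.0, 5.0, 10.0]),  # cf. the alphas port
    )
    model.fit(X, y)                     # "calibration" on the training data
    predictions = model.predict(X[:5])  # continuous-valued outputs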

More Info...

Version 1.1.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • alphas
    Regularization strength. This is a list of candidate values, the best of which is found via an exhaustive search (i.e., each value is tested one by one, and therefore the total running time is proportional to the number of values listed here). Larger values cause stronger regularization, that is, less risk of the method over-fitting to random details of the data, and thus better generalization to new data. A value of 0 means no regularization, and there is no upper limit on the values that may be given here -- however, depending on the scale of the data and the number of trials, there is a cutoff beyond which all features are weighted by zero, and are thus unused. Often one covers a range between 0.1 and 10, and at times 0.01 to 100. Typically the values here are not linearly spaced, but follow an exponential progression (e.g., 0.25, 0.5, 1, 2, 4, 8, etc.). The default search range is intentionally coarse for quick running times; refine it to smaller steps to obtain potentially better solutions, but do not expect massive gains from refining.

    • verbose name: Regularization Strength
    • default value: [0.1, 0.5, 1.0, 5, 10.0]
    • port type: ListPort
    • value type: list (can be None)
  • search_metric
    Parameter search metric. When the regularization parameter is given as a list of values, then the method will run a cross-validation for each possible parameter value and use this metric to score how well the method performs in each case, in order to select the best parameter. While 'neg_mean_squared_error' is usually a good default, some other metrics can be useful under some circumstances, for instance 'neg_mean_absolute_error', which penalizes large deviations less strongly than mse.

    • verbose name: Scoring Metric For Parameter Search
    • default value: neg_mean_squared_error
    • port type: EnumPort
    • value type: str (can be None)
  • num_folds
    Number of cross-validation folds for parameter search. Cross-validation proceeds by splitting up the data into this many blocks of trials, and then tests the method on each block. For each fold, the method is re-trained on all the other blocks, excluding the test block (therefore, the total running time is proportional to the number of folds). This is not a randomized cross-validation, but a blockwise cross-validation, which is usually the correct choice if the data stem from a time series. If there are few trials in the data, one can use a higher number here (e.g., 10) to ensure that more data is available for training. Setting this to one will yield leave-one-out CV (which, in the case of ridge regression, is extremely efficient using GCV), and if a group field was specified, then leave-one-group-out CV.

    • verbose name: Number Of Cross-Validation Folds
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • cv_group_field
    Optionally a field indicating the group from which each trial is sourced. If given, then data will be split such that test sets contain unseen groups. Example groups are SubjectID, SessionID, etc.

    • verbose name: Grouping Field (Cross-Validation)
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • cv_stratified
    Optionally perform stratified cross-validation. This means that all the folds have the same relative percentage of trials with each label. Note that this requires labels to be quantized or binned to be meaningful.

    • verbose name: Stratified Cross-Validation
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • normalize_features
    Normalize features. Should only be disabled if the data comes with a predictable scale (e.g., normalized in some other way).

    • verbose name: Normalize Features
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • include_bias
    Include a bias term. If false, your features need to be centered, or include a dummy feature set to 1.

    • verbose name: Include Bias Term
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • initialize_once
    Calibrate the model only once. If set to False, then this node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • dont_reset_model
    Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model is incompatible if the input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

SparseBayesianRegression

Estimate a continuous output value from features using Sparse Bayesian Regression.

Sparse Bayesian regression is an elegant Bayesian method to learn a linear mapping between input data and desired output values from training data, and is closely related to LASSO regression. The main difference is that LASSO has a tunable parameter that controls how strongly the method is regularized. This parameter controls how "sparse" the solution is, that is, how many features the model relies on versus prunes as irrelevant; it thereby controls the complexity of the solution to prevent over-fitting to random details of the data and to improve generalization to new data. In LASSO, this parameter is tuned using cross-validation, that is, by empirical testing on held-out data. In the Bayesian variant, such a parameter exists as well; however, its optimal value is estimated from the data in a theoretically clean and principled fashion. This method assumes that both inputs and outputs are Gaussian distributed, that is, have no or very few major statistical outliers. If the output follows a radically different distribution, for instance between 0 and 1, or nonnegative, or discrete values, then different methods may be more appropriate (for instance, classification methods for discrete values). To ameliorate the issue of outliers in the data, the raw data can be cleaned of artifacts with various artifact removal methods. To the extent that the assumptions hold true, this method is highly competitive with other (sparse) linear methods. Like all machine learning methods, this method needs to be calibrated ("trained") before it can make any predictions on data. For this, the method requires training instances and associated training labels. The typical way to get such labels associated with time-series data is to make sure that a marker stream is included in the data, which is usually imported together with the data using one of the Import nodes, or received over the network alongside the data, e.g., using the LSL Input node (with a non-empty marker query). These markers are then annotated with target labels using the Assign Targets node. To generate instances of training data for each of the training markers, one usually uses the Segmentation node to extract segments from the continuous time series around each marker. Since this machine learning method is not capable of being trained incrementally on streaming data, the method requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data node, where it passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, its trainable state can be saved to a model file and later loaded for continued use.
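
A rough standalone analogue of the ARD flavor (assuming scikit-learn's ARDRegression, which is not this node's own implementation) illustrates how the prior and pruning parameters listed below plausibly map onto a concrete solver:

    import numpy as np
    from sklearn.linear_model import ARDRegression

    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 50))   # 200 trials x 50 features
    w_true = np.zeros(50)
    w_true[:5] = rng.standard_normal(5)  # only 5 features carry signal
    y = X @ w_true + 0.1 * rng.standard_normal(200)

    model = ARDRegression(
        alpha_1=1e-6, alpha_2=1e-6,    # uninformative noise prior (shape/rate)
        lambda_1=1e-6, lambda_2=1e-6,  # uninformative weight prior (shape/rate)
        threshold_lambda=1e4,          # ~ 1 / pruning_threshold (assumed mapping)
    )
    model.fit(X, y)
    kept = np.flatnonzero(np.abs(model.coef_) > 1e-3)
    print(kept)  # ideally close to [0 1 2 3 4]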

Version 1.1.1

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • method
    Method to use. RVM is the Relevance Vector Machine, and ARD is the newer Automatic Relevance Determination approach.

    • verbose name: Method
    • default value: RVM
    • port type: EnumPort
    • value type: str (can be None)
  • max_iter
    Maximum number of iterations. This is one of the stopping criteria to limit the compute time. The default is usually fine, and gains from increasing the number of iterations will be minimal (it can be worth experimenting with lower iteration numbers if the algorithm must finish in a fixed time budget, at a cost of potentially less accurate solutions).

    • verbose name: Maximum Number Of Iterations
    • default value: 300
    • port type: IntPort
    • value type: int (can be None)
  • tolerance
    Convergence tolerance. This is the desired error tolerance, i.e., the acceptable inaccuracy in the solution. Using larger values gives less accurate results, but will lead to faster compute times. Note that, for biosignal-driven machine learning systems, one often does not need very small tolerances.

    • verbose name: Tolerance
    • default value: 0.001
    • port type: FloatPort
    • value type: float (can be None)
  • verbosity
    Verbosity level. Higher numbers will produce more extensive diagnostic output.

    • verbose name: Verbosity Level
    • default value: 0
    • port type: IntPort
    • value type: int (can be None)
  • include_bias
    Include a bias term. If false, your features need to be centered, or include a dummy feature set to 1.

    • verbose name: Include Bias Term
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • normalize_features
    Normalize features. Should only be disabled if the data comes with a predictable scale (e.g., normalized in some other way).

    • verbose name: Normalize Features
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • alpha_shape
    Alpha shape parameter. This is only included for completeness and usually does not have to be adjusted. This is the shape parameter for the Gamma distribution prior over the alpha parameter (inverse noise variance). By default, this is an uninformative prior.

    • verbose name: Alpha Shape Parameter
    • default value: 1e-06
    • port type: FloatPort
    • value type: float (can be None)
  • alpha_rate
    Alpha rate parameter. This is only included for completeness and usually does not have to be adjusted. This is the rate parameter for the Gamma distribution prior over the alpha parameter (inverse noise variance). By default, this is an uninformative prior.

    • verbose name: Alpha Rate Parameter
    • default value: 1e-06
    • port type: FloatPort
    • value type: float (can be None)
  • lambda_shape
    Lambda shape parameter. This is only included for completeness and usually does not have to be adjusted. This is the shape parameter for the Gamma distribution prior over the lambda parameter (inverse variance of the weights). By default, this is an uninformative prior.

    • verbose name: Lambda Shape Parameter
    • default value: 1e-06
    • port type: FloatPort
    • value type: float (can be None)
  • lambda_rate
    Lambda rate parameter. This is only included for completeness and usually does not have to be adjusted. This is the rate parameter for the Gamma distribution prior over the lambda parameter (inverse variance of the weights). By default, this is an uninformative prior.

    • verbose name: Lambda Rate Parameter
    • default value: 1e-06
    • port type: FloatPort
    • value type: float (can be None)
  • pruning_threshold
    Pruning threshold for small weights. Only used by the ARD flavor (note this is the inverse of threshold_lambda in the implementation).

    • verbose name: Pruning Threshold
    • default value: 0.0001
    • port type: FloatPort
    • value type: float (can be None)
  • initialize_once
    Calibrate the model only once. If set to False, then this node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • dont_reset_model
    Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model is incompatible if the input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

StochasticGradientDescentClassification

Classify data instances using models trained via stochastic gradient descent.

Stochastic gradient descent is an optimization technique that is very fast for very large amounts of training data, since it does not have to touch all the data in each update iteration. This node is rather flexible in that it can implement a variety of data terms, including logistic regression, support vector machines, certain robust classifiers, and some rarely-used variants. These can be combined with a variety of regularization terms to control the complexity of the solution, including shrinkage/ridge regularization (l2), sparsity (l1), and elastic net (l1+l2). In this sense, this node can mimic the features of several other nodes in NeuroPype, with different tradeoffs in running time that depend on the number of data points and the number of features; this node is mainly of interest for very large numbers of data points (i.e., trials). See also the tooltips of the parameters for toggling these features for more information on the involved tradeoffs, as well as the documentation of the other nodes (Logistic Regression, and Support Vector Classification). One unique feature of this node is the ability to implement a robust classifier, which theoretically can perform better than most other methods if the data do indeed contain outliers. For this reason, removing artifacts from the raw data will make less of a difference when this method is used. Like the other methods, this node includes tuning parameters that are automatically tuned using cross-validation. If there are very few trials, or some extensive stretches of the data exhibit only one class, the procedure used to find the regularization parameters (cross-validation) can fail with an error that there were too few or no trials of a given class present. By default, this method will return not the most likely class label for each trial it is given, but instead the probabilities, for each class (i.e., category of labels), that the trial actually belongs to that class. Like all machine learning methods, this method needs to be calibrated ("trained") before it can make any predictions on data. For this, the method requires training instances and associated training labels. The typical way to get such labels associated with time-series data is to make sure that a marker stream is included in the data, which is usually imported together with the data using one of the Import nodes, or received over the network alongside the data, e.g., using the LSL Input node (with a non-empty marker query). These markers are then annotated with target labels using the Assign Targets node. To generate instances of training data for each of the training markers, one usually uses the Segmentation node to extract segments from the continuous time series around each marker. Since this machine learning method is not capable of being trained incrementally on streaming data, the method requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data node, where it passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, its trainable state can be saved to a model file and later loaded for continued use.
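
As a hedged standalone sketch of the same idea (using scikit-learn's SGDClassifier rather than this node), the following searches the regularization strength with blockwise (unshuffled) cross-validation and returns per-class probabilities via the log_loss data term; the data are synthetic:

    import numpy as np
    from sklearn.linear_model import SGDClassifier
    from sklearn.model_selection import GridSearchCV, KFold

    rng = np.random.default_rng(0)
    X = rng.standard_normal((500, 20))
    y = (X[:, 0] + 0.5 * rng.standard_normal(500) > 0).astype(int)

    sgd = SGDClassifier(loss='log_loss',  # named 'log' in scikit-learn < 1.1
                        penalty='l2',
                        learning_rate='invscaling', eta0=0.01, power_t=0.25,
                        max_iter=1000, tol=1e-3, random_state=12345)
    search = GridSearchCV(sgd, {'alpha': [0.1, 0.5, 1.0, 5.0, 10.0]},
                          scoring='accuracy',
                          cv=KFold(n_splits=5, shuffle=False))  # blockwise folds
    search.fit(X, y)
    proba = search.best_estimator_.predict_proba(X[:3])  # per-class probabilities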

More Info...

Version 1.2.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • loss
    Loss function to use. This selects the data term, i.e., what assumptions are being imposed on the data. Hinge yields an SVM, log_loss yields logistic regression, modified_huber is another smooth loss that enables probability outputs and tolerance to outliers, and squared_hinge is like hinge but quadratically penalized. Perceptron is the loss used by the perceptron algorithm. Epsilon_insensitive is equivalent to support vector regression, and squared_epsilon_insensitive is a rarely used hybrid between linear and support vector regression; note that these last two are regression losses, used here to classify data instances.

    • verbose name: Data Term
    • default value: log_loss
    • port type: EnumPort
    • value type: str (can be None)
  • regularizer
    Regularization term. Selecting l2 (default) is the most commonly used regularization, and is very effective in preventing over-fitting to random details in the training data, thus helping generalization to new data. The l1 method is applicable when only a small number of features in the data are relevant and most features are irrelevant, and this regularization is capable of identifying and selecting the relevant features (although if the optimal model is not sparse, then this will work less well than l2). The elasticnet version is a weighted combination of the l1 and l2, which addresses an issue with l1 where highly correlated features would receive rather arbitrary relative weights.

    • verbose name: Regularization Term
    • default value: l2
    • port type: EnumPort
    • value type: str (can be None)
  • alphas
    Regularization strength. This is a list of candidate values, the best of which is found via an exhaustive search (i.e., each value is tested one by one, and therefore the total running time is proportional to the number of values listed here). The details of the parameter search can be controlled via the search metric and number of folds parameters. Larger values cause stronger regularization, that is, less risk of over-fitting. A value of 0 means no regularization, and there is no upper limit on the values that may be given here -- however, depending on the scale of the data and the number of trials, there is a cutoff beyond which all features are weighted by zero, and are thus unused. Often one covers a range between 0.1 and 10, and at times 0.01 to 100. Typically the values here are not linearly spaced, but follow an exponential progression (e.g., 0.25, 0.5, 1, 2, 4, 8, etc.). The default search range is intentionally coarse for quick running times; refine it to smaller steps to obtain potentially better solutions, but do not expect massive gains from refining.

    • verbose name: Regularization Strength
    • default value: [0.1, 0.5, 1.0, 5, 10.0]
    • port type: ListPort
    • value type: list (can be None)
  • l1_ratio
    Tradeoff parameter between l1 and l2 penalties when using the elasticnet regularization term. This parameter controls the balance between the l1 regularization term (as in LASSO) and the l2 regularization term (as in ridge regression). A value of 0 leads to exclusive use of l2, and a value of 1 leads to exclusive use of l1. This is a list of candidate values, the best of which is found via an exhaustive search (i.e., each value is tested one by one, and therefore the total running time is proportional to the number of values listed here). In fact, for each setting of this parameter, the entire list of possible values for the regularization strength is tested, so the running time is also proportional to the number of those values. The details of the parameter search can be controlled via the search metric and number of folds parameters. By default this is not searched, but a reasonable choice would be [0.1, 0.5, 0.7, 0.9, 0.95, 0.99, 1].

    • verbose name: Regularization Type Tradeoff Parameter
    • default value: [0.5]
    • port type: ListPort
    • value type: list (can be None)
  • feature_scaling
    Feature scaling to use. If set to auto, then scaling will default to robust if a robust loss is used, and otherwise std. If features are not already standardized beforehand, enabling this tends to ensure that features are treated equally by the regularization instead of it being scale-dependent.

    • verbose name: Feature Scaling
    • default value: none
    • port type: EnumPort
    • value type: str (can be None)
  • tolerance
    Convergence tolerance. This is the desired error tolerance, i.e., the acceptable inaccuracy in the solution. Using larger values gives less accurate results, but will lead to faster compute times. Note that, for biosignal-driven machine learning systems, one often does not need very small tolerances.

    • verbose name: Tolerance
    • default value: 0.001
    • port type: FloatPort
    • value type: float (can be None)
  • max_iter
    Maximum number of iterations. This is one of the stopping criteria to limit the compute time. The default is usually fine, and gains from increasing the number of iterations will be minimal (it can be worth experimenting with lower iteration numbers if the algorithm must finish in a fixed time budget, at a cost of potentially less accurate solutions).

    • verbose name: Maximum Number Of Iterations
    • default value: 1000
    • port type: IntPort
    • value type: int (can be None)
  • num_jobs
    Number of parallel compute jobs. This value only affects the running time and not the results. Values between 1 and twice the number of CPU cores make sense to expedite computation, but may temporarily reduce the responsiveness of the machine. The value of -1 stands for all available CPU cores.

    • verbose name: Number Of Parallel Jobs
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • search_metric
    Parameter search metric. When the regularization parameter is given as a list of values, then the method will run a cross-validation for each possible parameter value and use this metric to score how well the method performs in each case, in order to select the best parameter. While 'accuracy' is usually a good default, some other metrics can be useful under special circumstances, e.g., roc_auc for highly imbalanced ratios of trials from different classes.

    • verbose name: Scoring Metric If Searching Parameters
    • default value: accuracy
    • port type: EnumPort
    • value type: str (can be None)
  • num_folds
    Number of cross-validation folds for parameter search. Cross-validation proceeds by splitting up the data into this many blocks of trials, and then tests the method on each block. For each fold, the method is re-trained on all the other blocks, excluding the test block (therefore, the total running time is proportional to the number of folds). This is not a randomized cross-validation, but a blockwise cross-validation, which is usually the correct choice if the data stem from a time series. If there are few trials in the data, one can use a higher number here (e.g., 10) to ensure that more data is available for training.

    • verbose name: Number Of Cross-Validation Folds
    • default value: 5
    • port type: IntPort
    • value type: int (can be None)
  • cv_group_field
    Optionally a field indicating the group from which each trial is sourced. If given, then data will be split such that test sets contain unseen groups. Example groups are SubjectID, SessionID, etc.

    • verbose name: Grouping Field (Cross-Validation)
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • cv_stratified
    Optionally perform stratified cross-validation. This means that all the folds have the same relative percentage of trials in each class.

    • verbose name: Stratified Cross-Validation
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • shuffle
    Shuffle data after each epoch. This is useful if the data comes from a time series with highly correlated successive trials, and is worth experimenting with to reach faster convergence. It is disabled by default mainly to match the behavior of traditional SGD, which does not shuffle.

    • verbose name: Shuffle Data
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • probabilistic
    Use probabilistic outputs. If enabled, the node will output for each class the probability that a given trial is of that class; otherwise it will output the most likely class label.

    • verbose name: Output Probabilities
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • warm_start
    Start from previous solution. This allows for online updating of the model.

    • verbose name: Warm Start
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • verbosity
    Verbosity level. Higher numbers will produce more extensive diagnostic output.

    • verbose name: Verbosity Level
    • default value: 0
    • port type: IntPort
    • value type: int (can be None)
  • averaging
    Use averaging. This is an alternative way of performing stochastic gradient descent, which can lead to faster convergence. Can also be an integer greater than 1; in this case, this is the number of samples seen after which averaging kicks in.

    • verbose name: Averaging
    • default value: False
    • port type: Port
    • value type: object (can be None)
  • include_bias
    Include a bias term. If false, your features need to be centered, or include a dummy feature set to 1.

    • verbose name: Include Bias Term
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • epsilon
    Epsilon for epsilon-insensitive and Huber data terms. Note that this depends strongly on the scale of the data. Can be interpreted as the cutoff in data units beyond which data values are treated more robustly (i.e., as potential outliers).

    • verbose name: Epsilon
    • default value: 0.1
    • port type: FloatPort
    • value type: float (can be None)
  • learning_rate_schedule
    Schedule for adapting the learning rate. This is a highly technical setting that allows implementing different forms of stochastic gradient descent. The options correspond to the following rules for computing the learning rate (step size) eta of the method as a function of the current iteration number t: 'constant' means eta=eta0, 'optimal' yields the formula eta=1.0/(t+t0), and 'invscaling' corresponds to the formula eta=eta0/pow(t, power_t). The main reason to change this would be to replicate some other published implementation.

    • verbose name: Learning Rate Schedule
    • default value: invscaling
    • port type: EnumPort
    • value type: str (can be None)
  • eta0
    Initial learning rate. This is the parameter eta0 in the learning rate schedule formula.

    • verbose name: Initial Learning Rate (Step Size)
    • default value: 0.01
    • port type: FloatPort
    • value type: float (can be None)
  • power_t
    Exponent in invscaling schedule. This is the exponent power_t in the invscaling learning rate schedule.

    • verbose name: Exponential Falloff (Invscaling Only)
    • default value: 0.25
    • port type: FloatPort
    • value type: float (can be None)
  • random_seed
    Random seed. Different values may give slightly different outcomes.

    • verbose name: Random Seed
    • default value: 12345
    • port type: IntPort
    • value type: int (can be None)
  • initialize_once
    Calibrate the model only once. If set to False, then this node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • class_weights
    Per-class weights. Optionally this is a mapping from class label to weight. The weights represent the a priori ("prior") probability of encountering a specific class that the model shall assume. The weights will be renormalized so that they add up to one. Example syntax: {'0': 0.5, '1': 0.5} (note the quotes before the colons).

    • verbose name: Per-Class Weight
    • default value: None
    • port type: Port
    • value type: object (can be None)
  • dont_reset_model
    Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model is incompatible if the input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

StochasticGradientDescentRegression

Estimate a continuous output value from features using linear regression trained via stochastic gradient descent.

Stochastic gradient descent is an optimization technique that is very fast for very large amounts of training data, since it does not have to touch all the data in each update iteration. This node is rather flexible in that it can implement a variety of data terms, including traditional linear regression, robust linear regression, support vector regression, and hybrid variants, which can be combined with a variety of regularization terms to control the complexity of the solution, including shrinkage/ridge regularization (l2), sparsity (l1), and elastic net (l1+l2). In this sense, this node can mimic the features of several other nodes in NeuroPype, with different tradeoffs in running time that depend on the number of data points and the number of features; this node is mainly of interest for very large numbers of data points (i.e., trials). See also the tooltips of the parameters for toggling these features for more information on the involved tradeoffs, as well as the documentation of the other nodes (Ridge Regression, LASSO Regression, Elastic Net Regression, Support Vector Regression), specifically regarding the regularization terms. One unique feature of this node is the ability to implement robust linear regression, which is robust to outliers in the data, and therefore will perform better than most other methods if the data do indeed contain outliers. For this reason, removing artifacts from the raw data will make less of a difference for this method. Like the other methods, this node includes tuning parameters that are automatically tuned using cross-validation. If there are very few trials, or some extensive stretches of the data exhibit only one class, the procedure used to find the regularization parameters (cross-validation) can fail with an error that there were too few or no trials of a given class present. Like all machine learning methods, this method needs to be calibrated ("trained") before it can make any predictions on data. For this, the method requires training instances and associated training labels. The typical way to get such labels associated with time-series data is to make sure that a marker stream is included in the data, which is usually imported together with the data using one of the Import nodes, or received over the network alongside the data, e.g., using the LSL Input node (with a non-empty marker query). These markers are then annotated with target labels using the Assign Targets node. To generate instances of training data for each of the training markers, one usually uses the Segmentation node to extract segments from the continuous time series around each marker. Since this machine learning method is not capable of being trained incrementally on streaming data, the method requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data node, where it passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, its trainable state can be saved to a model file and later loaded for continued use.
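
A minimal standalone sketch of the robust-regression use case (assuming scikit-learn's SGDRegressor, not this node itself): the huber data term limits the influence of the injected outliers, which the squared_error term would not:

    import numpy as np
    from sklearn.linear_model import SGDRegressor
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)
    X = rng.standard_normal((400, 10))
    y = X @ rng.standard_normal(10) + 0.1 * rng.standard_normal(400)
    y[:20] += 50.0  # gross outliers in the target values

    model = make_pipeline(
        StandardScaler(),  # cf. the feature_scaling port
        SGDRegressor(loss='huber', epsilon=0.1,  # robust data term
                     penalty='l2', alpha=0.1,
                     max_iter=1000, tol=1e-3, random_state=12345),
    )
    model.fit(X, y)
    predictions = model.predict(X[:5])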

More Info...

Version 1.2.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • loss
    Loss function to use. This selects the data term, i.e., what assumptions are being imposed on the data. Squared_error corresponds to traditional linear regression, huber yields robust regression (the data may contain outliers), epsilon_insensitive is equivalent to support vector regression, and squared_epsilon_insensitive is a rarely used hybrid between linear and support vector regression.

    • verbose name: Data Term
    • default value: squared_error
    • port type: EnumPort
    • value type: str (can be None)
  • regularizer
    Regularization term. Selecting l2 (default) is the most commonly used regularization, and is very effective in preventing over-fitting to random details in the training data, thus helping generalization to new data. The l1 method is applicable when only a small number of features in the data are relevant and most features are irrelevant, and this regularization is capable of identifying and selecting the relevant features (although if the optimal model is not sparse, then this will work less well than l2). The elasticnet version is a weighted combination of the l1 and l2, which addresses an issue with l1 where highly correlated features would receive rather arbitrary relative weights.

    • verbose name: Regularization Term
    • default value: l2
    • port type: EnumPort
    • value type: str (can be None)
  • alphas
    Regularization strength. This is a list of candidate values, the best of which is found via an exhaustive search (i.e., each value is tested one by one, and therefore the total running time is proportional to the number of values listed here). The details of the parameter search can be controlled via the search metric and number of folds parameters. Larger values cause stronger regularization, that is, less risk of over-fitting. A value of 0 means no regularization, and there is no upper limit on the values that may be given here -- however, depending on the scale of the data and the number of trials, there is a cutoff beyond which all features are weighted by zero, and are thus unused. Often one covers a range between 0.1 and 10, and at times 0.01 to 100. Typically the values here are not linearly spaced, but follow an exponential progression (e.g., 0.25, 0.5, 1, 2, 4, 8, etc.). The default search range is intentionally coarse for quick running times; refine it to smaller steps to obtain potentially better solutions, but do not expect massive gains from refining.

    • verbose name: Regularization Strength
    • default value: [0.1, 0.5, 1.0, 5, 10.0]
    • port type: ListPort
    • value type: list (can be None)
  • l1_ratio
    Tradeoff parameter between l1 and l2 penalties when using the elasticnet regularization term. This parameter controls the balance between the l1 regularization term (as in LASSO) and the l2 regularization term (as in ridge regression). A value of 0 leads to exclusive use of l2, and a value of 1 leads to exclusive use of l1. This is a list of candidate values, the best of which is found via an exhaustive search (i.e., each value is tested one by one, and therefore the total running time is proportional to the number of values listed here). In fact, for each setting of this parameter, the entire list of possible values for the regularization strength is tested, so the running time is also proportional to the number of those values. The details of the parameter search can be controlled via the search metric and number of folds parameters. By default this is not searched, but a reasonable choice would be [0.1, 0.5, 0.7, 0.9, 0.95, 0.99, 1].

    • verbose name: Regularization Type Tradeoff Parameter
    • default value: [0.5]
    • port type: ListPort
    • value type: list (can be None)
  • feature_scaling
    Feature scaling to use. If set to auto, then scaling will default to robust if a robust loss is used, and otherwise std. If features are not already standardized beforehand, enabling this tends to ensure that features are treated equally by the regularization instead of it being scale-dependent.

    • verbose name: Feature Scaling
    • default value: none
    • port type: EnumPort
    • value type: str (can be None)
  • tolerance
    Convergence tolerance. This is the desired error tolerance, i.e., the acceptable inaccuracy in the solution. Using larger values gives less accurate results, but will lead to faster compute times. Note that, for biosignal-driven machine learning systems, one often does not need very small tolerances.

    • verbose name: Tolerance
    • default value: 0.001
    • port type: FloatPort
    • value type: float (can be None)
  • max_iter
    Maximum number of iterations. This is one of the stopping criteria to limit the compute time. The default is usually fine, and gains from increasing the number of iterations will be minimal (it can be worth experimenting with lower iteration numbers if the algorithm must finish in a fixed time budget, at a cost of potentially less accurate solutions).

    • verbose name: Maximum Number Of Iterations
    • default value: 1000
    • port type: IntPort
    • value type: int (can be None)
  • shuffle
    Shuffle data after each epoch. This is useful if the data comes from a time series with highly correlated successive trials, and is worth experimenting with to reach faster convergence. It is disabled by default mainly to match the behavior of traditional SGD, which does not shuffle.

    • verbose name: Shuffle Data
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • warm_start
    Start from previous solution. This allows for online updating of the model.

    • verbose name: Warm Start
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • search_metric
    Parameter search metric. When the regularization parameter is given as a list of values, then the method will run a cross-validation for each possible parameter value and use this metric to score how well the method performs in each case, in order to select the best parameter. While 'neg_mean_squared_error' is usually a good default, some other metrics can be useful under some circumstances, for instance 'neg_mean_absolute_error', which penalizes large deviations less strongly than mse.

    • verbose name: Scoring Metric For Parameter Search
    • default value: neg_mean_squared_error
    • port type: EnumPort
    • value type: str (can be None)
  • num_folds
    Number of cross-validation folds for parameter search. Cross-validation proceeds by splitting up the data into this many blocks of trials, and then tests the method on each block. For each fold, the method is re-trained on all the other blocks, excluding the test block (therefore, the total running time is proportional to the number of folds). This is not a randomized cross-validation, but a blockwise cross-validation, which is usually the correct choice if the data stem from a time series. If there are few trials in the data, one can use a higher number here (e.g., 10) to ensure that more data is available for training.

    • verbose name: Number Of Cross-Validation Folds
    • default value: 5
    • port type: IntPort
    • value type: int (can be None)
  • cv_group_field
    Optionally a field indicating the group from which each trial is sourced. If given, then data will be split such that test sets contain unseen groups. Example groups are SubjectID, SessionID, etc. See the sketch after this entry.

    • verbose name: Grouping Field (Cross-Validation)
    • default value:
    • port type: StringPort
    • value type: str (can be None)
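
    To illustrate what group-aware splitting means, here is a minimal standalone sketch using scikit-learn's GroupKFold (the subject identifiers are made up for illustration):

      import numpy as np
      from sklearn.model_selection import GroupKFold

      X = np.random.randn(12, 4)
      y = np.random.randint(0, 2, 12)
      groups = np.repeat(['S01', 'S02', 'S03'], 4)  # e.g., a SubjectID field

      for train_idx, test_idx in GroupKFold(n_splits=3).split(X, y, groups):
          # each test set contains only groups unseen during training
          print('test subjects:', sorted(set(groups[test_idx])))
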
  • cv_stratified
    Optionally perform stratified cross-validation. This means that all the folds have the same relative percentage of trials with each label. Note that this requires labels to be quantized or binned to be meaningful.

    • verbose name: Stratified Cross-Validation
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • verbosity
    Verbosity level. Higher numbers will produce more extensive diagnostic output.

    • verbose name: Verbosity Level
    • default value: 0
    • port type: IntPort
    • value type: int (can be None)
  • averaging
    Use averaging. This is an alternative way of performing stochastic gradient descent, which can lead to faster convergence. Can also be an integer greater than 1; in this case, this is the number of samples seen after which averaging kicks in. See the sketch after this entry.

    • verbose name: Averaging
    • default value: False
    • port type: Port
    • value type: object (can be None)
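
    This mirrors the averaged-SGD option in scikit-learn, sketched below for illustration (the correspondence to the node's internals is an assumption):

      from sklearn.linear_model import SGDRegressor

      plain = SGDRegressor(average=False)  # conventional SGD (the default here)
      asgd = SGDRegressor(average=True)    # average the weights over all updates
      late = SGDRegressor(average=1000)    # averaging kicks in after 1000 samples seen
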
  • include_bias
    Include a bias term. If false, your features need to be centered, or include a dummy feature set to 1.

    • verbose name: Include Bias Term
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • epsilon
    Epsilon for epsilon-insensitive and Huber data terms. Note that this depends strongly on the scale of the data. Can be interpreted as the cutoff in data units beyond which data values are treated more robustly (i.e., as potential outliers).

    • verbose name: Epsilon
    • default value: 0.1
    • port type: FloatPort
    • value type: float (can be None)
  • learning_rate_schedule
    Schedule for adapting the learning rate. This is a highly technical setting that makes it possible to implement different forms of stochastic gradient descent. The options correspond to the following rules for computing the learning rate (step size) eta of the method as a function of the current iteration number t: 'constant' means eta=eta0, 'optimal' yields the formula eta=1.0/(t+t0), and 'invscaling' corresponds to the formula eta=eta0/pow(t, power_t). The main reason to change this would be to replicate some other published implementation. See the sketch after this entry.

    • verbose name: Learning Rate Schedule
    • default value: invscaling
    • port type: EnumPort
    • value type: str (can be None)
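
    The three rules amount to a few lines of arithmetic; this sketch simply evaluates the formulas above (t0 is an internal constant chosen by the optimizer, shown as a plain argument here):

      def learning_rate(schedule, t, eta0=0.01, power_t=0.25, t0=1.0):
          """Step size at iteration t (t >= 1) under the given schedule."""
          if schedule == 'constant':
              return eta0
          elif schedule == 'optimal':
              return 1.0 / (t + t0)            # decays like 1/t
          elif schedule == 'invscaling':
              return eta0 / pow(t, power_t)    # polynomial decay, default exponent 0.25
          raise ValueError(schedule)
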
  • eta0
    Initial learning rate. This is the parameter eta0 in the learning rate schedule formula.

    • verbose name: Initial Learning Rate (Step Size)
    • default value: 0.01
    • port type: FloatPort
    • value type: float (can be None)
  • power_t
    Exponent in invscaling schedule. This is the exponent power_t in the invscaling learning rate schedule.

    • verbose name: Exponential Falloff (Invscaling Only)
    • default value: 0.25
    • port type: FloatPort
    • value type: float (can be None)
  • random_seed
    Random seed. Different values may give slightly different outcomes.

    • verbose name: Random Seed
    • default value: 12345
    • port type: IntPort
    • value type: int (can be None)
  • initialize_once
    Calibrate the model only once. If set to False, then this node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • dont_reset_model
    Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model is incompatible when the input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

SupportVectorClassification

Classification using support vector machines.

This is the traditional kernel support vector machine, which is very efficient at finding non-linear structure in the data. This method has two important regularization parameters (cost and gamma), which are by default searched over a relatively coarse interval. For best results you may have to use a finer range (at some performance cost). Note that it is typically not very reasonable to use a non-linear classifier like this on the channel signal without any sort of spatial filtering (such as CSP or ICA). Like all machine learning methods, this method needs to be calibrated ("trained") before it can make any predictions on data. For this, the method requires training instances and associated training labels. The typical way to get such labels associated with time-series data is to make sure that a marker stream is included in the data, which is usually imported together with the data using one of the Import nodes, or received over the network alongside the data, e.g., using the LSL Input node (with a non-empty marker query). These markers are then annotated with target labels using the Assign Targets node. To generate instances of training data for each of the training markers, one usually uses the Segmentation node to extract segments from the continuous time series around each marker. Since this machine learning method is not capable of being trained incrementally on streaming data, the method requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data node, where it passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, the trainable state of this node can be saved to a model file and later loaded for continued use.

More Info...

Version 1.1.1

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • kernel
    Kernel type to use. This is a non-linear transform of the feature space, allowing for non-linear decision boundaries between classes. The available kernels are: Linear (a trivial do-nothing kernel); Polynomial, which yields components that are all possible polynomial combinations of input features up to the desired degree; Radial-Basis Functions (rbf), which is one of the most commonly-used non-linear kernels; and Sigmoid.

    • verbose name: Kernel Type
    • default value: rbf
    • port type: EnumPort
    • value type: str (can be None)
  • feature_selection
    Feature selection criterion to use. If set to None, no feature selection is performed. If set to anova, features with the highest ANOVA f-scores are selected; this is a good option for simple kernels like linear SVM. If set to mi, features with the highest mutual information are selected, which is better for highly non-linear kernels like rbf. In either case, the number of features is automatically selected using a cross-validation, and can be constrained using num_features. See the sketch after this entry.

    • verbose name: Feature Selection
    • default value: none
    • port type: EnumPort
    • value type: str (can be None)
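
    As a standalone illustration of the two criteria, the sketch below scores features with scikit-learn; the node performs the analogous selection internally and picks the feature count by cross-validation:

      import numpy as np
      from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif

      X = np.random.randn(100, 32)
      y = np.random.randint(0, 2, 100)

      anova_sel = SelectKBest(f_classif, k=8).fit(X, y)         # 'anova': good for linear kernels
      mi_sel = SelectKBest(mutual_info_classif, k=8).fit(X, y)  # 'mi': better for rbf kernels
      X_reduced = anova_sel.transform(X)                        # keep the 8 highest-scoring features
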
  • probabilistic
    Use probabilistic outputs. If enabled, the node will output for each class the probability that a given trial is of that class; otherwise it will output the most likely class label.

    • verbose name: Output Probabilities
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • tolerance
    Convergence tolerance. This is the desired error tolerance, or acceptable inaccuracy in the solution. Using larger values gives less accurate results, but will lead to faster compute times. Note that, for biosignal-driven machine learning systems, one often does not need very small tolerances.

    • verbose name: Tolerance
    • default value: 0.001
    • port type: FloatPort
    • value type: float (can be None)
  • max_iter
    Maximum number of iterations. This is one of the stopping criteria to limit the compute time. The default is usually fine, and gains from increasing the number of iterations will be minimal (it can be worth experimenting with lower iteration numbers if the algorithm must finish in a fixed time budget, at a cost of potentially less accurate solutions). If set to -1, no limit is in effect.

    • verbose name: Maximum Number Of Iterations
    • default value: -1
    • port type: IntPort
    • value type: int (can be None)
  • cost
    SVM cost parameter. This value determines the degree to which solutions that mis-classify data points are penalized. Higher values result in models that are less likely to mis-classify the training data, but at the expense of potentially worse generalization to new data (less margin for error when slightly different trials are encountered in future test data). This is a list of candidate values, the best of which is found via an exhaustive search (i.e., each value is tested one by one, and therefore the total running time is proportional to the number of values listed here). The details of the parameter search can be controlled via the search metric and number of folds parameters. Smaller values cause stronger regularization, that is, less risk of the method over-fitting to random details of the data, and thus better generalization to new data. A very small value means effectively no penalty for mis-classification, and there is no upper limit to how large the values given here may be; however, useful values depend on the scale of the data and the number of trials. Often one covers a range between 0.1 and 10, and at times 0.01 to 100. Typically the values here are not linearly spaced, but follow an exponential progression (e.g., 0.25, 0.5, 1, 2, 4, 8, etc.). The default search range is intentionally coarse for quick running times; refine it to smaller steps to obtain potentially better solutions, but do not expect massive gains from refining. A sketch of such a refined search is shown after this entry.

    • verbose name: Cost
    • default value: [0.01, 0.1, 1.0, 10.0, 100]
    • port type: Port
    • value type: list (can be None)
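
    The exponential spacing mentioned above is conveniently generated with numpy; below is a minimal sketch of refining the default search using scikit-learn (illustrative grids, not the node's internals):

      import numpy as np
      from sklearn.model_selection import GridSearchCV
      from sklearn.svm import SVC

      X = np.random.randn(120, 10)         # stand-in data
      y = np.random.randint(0, 2, 120)

      costs = np.logspace(-2, 2, num=9)    # 0.01, ~0.03, 0.1, ..., 100 (finer than the default)
      gammas = np.logspace(-4, 1, num=6)   # matches the default gamma range
      search = GridSearchCV(SVC(kernel='rbf'), {'C': costs, 'gamma': gammas},
                            scoring='accuracy', cv=5).fit(X, y)
      # running time grows with len(costs) * len(gammas) * n_folds
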
  • poly_degree
    Degree of the polynomial kernel. Ignored by other kernel types. This is the maximum degree of polynomial combinations of features that are generated. This is also a list of possible values, and is searched in a grid search just like the cost parameter (see the cost parameter for details on this procedure).

    • verbose name: Degree (Polynomial Kernel Only)
    • default value: [1, 2, 3]
    • port type: Port
    • value type: list (can be None)
  • gamma
    Gamma parameter of the RBF kernel. This parameter controls the scale of the kernel mapping, where lower scales can capture smaller-scale structure in the data. When left at the default, it resolves to 1 divided by the number of features. This is a list of possible values, and is searched in a grid search just like the cost parameter (see cost parameter for details on this procedure).

    • verbose name: Scale (Rbf Kernel Only)
    • default value: [0.0001, 0.001, 0.01, 0.1, 1.0, 10.0]
    • port type: Port
    • value type: list (can be None)
  • coef0
    Constant term in kernel function. Only used in polynomial and sigmoid kernels.

    • verbose name: Constant (Poly Or Sigmoid Kernels Only)
    • default value: [0.0, 1.0]
    • port type: Port
    • value type: list (can be None)
  • num_features
    How many features to select. Given as a list of possible values to search over alongside all other parameters. Can also be a single value, which is equivalent to the full range up to this number. If not given, this defaults to max_feature_select if set, and otherwise to the number of features in the data.

    • verbose name: Num Features
    • default value: None
    • port type: ListPort
    • value type: list (can be None)
  • max_feature_select
    Maximum number of features to consider. This is applied using ANOVA f-scores, and can be used to speed up the mutual information based feature-selection mode.

    • verbose name: Max Feature Select (Anova)
    • default value: None
    • port type: IntPort
    • value type: int (can be None)
  • search_metric
    Parameter search metric. This method will run a cross-validation for each possible parameter value and use this metric to score how well the method performs in each case, in order to select the best value. Therefore, the running time of the method is proportional to the number of parameter values times the number of folds in the cross-validation, which can be slow. While 'accuracy' is usually a good default, some other metrics can be useful under special circumstances, e.g., roc_auc for highly imbalanced ratios of trials from different classes.

    • verbose name: Scoring Metric For Parameter Search
    • default value: accuracy
    • port type: EnumPort
    • value type: str (can be None)
  • num_folds
    Number of cross-validation folds for parameter search. Cross-validation proceeds by splitting up the data into this many blocks of trials, and then tests the method on each block. For each fold, the method is re-trained on all the other blocks, excluding the test block. This is not a randomized cross-validation, but a blockwise cross-validation, which is usually the correct choice if the data stem from a time series. If there are few trials in the data, one can use a higher number here (e.g., 10) to ensure that more data is available for training.

    • verbose name: Number Of Cross-Validation Folds
    • default value: 5
    • port type: IntPort
    • value type: int (can be None)
  • cv_group_field
    Optionally a field indicating the group from which each trial is sourced. If given, then data will be split such that test sets contain unseen groups. Example groups are SubjectID, SessionID, etc.

    • verbose name: Grouping Field (Cross-Validation)
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • cv_stratified
    Optionally perform stratified cross-validation. This means that all the folds have the same relative percentage of trials in each class.

    • verbose name: Stratified Cross-Validation
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • verbosity
    Verbosity level. Higher numbers will produce more extensive diagnostic output.

    • verbose name: Verbosity Level
    • default value: 0
    • port type: IntPort
    • value type: int (can be None)
  • class_weights
    Per-class weights. Optionally this is a mapping from class label to weight. The weights represent the a priori ("prior") probability of encountering a specific class that the model shall assume. The weights will be renormalized so that they add up to one. Example syntax: {'0': 0.5, '1': 0.5} (note the quotes before the colons).

    • verbose name: Per-Class Weight
    • default value: None
    • port type: Port
    • value type: object (can be None)
  • shrinking
    Use shrinking heuristic.

    • verbose name: Shrinking
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • cache_size
    "Cache size in MB.

    • verbose name: Cache Size
    • default value: 200
    • port type: IntPort
    • value type: int (can be None)
  • random_seed
    Random seed. Different values may give slightly different outcomes.

    • verbose name: Random Seed
    • default value: 12345
    • port type: IntPort
    • value type: int (can be None)
  • initialize_once
    Calibrate the model only once. If set to False, then this node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • dont_reset_model
    Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model is incompatible when the input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • cond_field
    The name of the instance data field that contains the conditions to be discriminated. This parameter will be ignored if the packet has previously been processed by a BakeDesignMatrix node.

    • verbose name: Cond Field
    • default value: TargetValue
    • port type: StringPort
    • value type: str (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

SupportVectorRegression

Regression using support vector machines.

This method is very similar to Support Vector Classification, the difference being that the desired target output can be a continuous value (as opposed to a class label or probability), as with all regression methods. Like all machine learning methods, this method needs to be calibrated ("trained") before it can make any predictions on data. For this, the method requires training instances and associated training labels. The typical way to get such labels associated with time-series data is to make sure that a marker stream is included in the data, which is usually imported together with the data using one of the Import nodes, or received over the network alongside the data, e.g., using the LSL Input node (with a non-empty marker query). These markers are then annotated with target labels using the Assign Targets node. To generate instances of training data for each of the training markers, one usually uses the Segmentation node to extract segments from the continuous time series around each marker. Since this machine learning method is not capable of being trained incrementally on streaming data, the method requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data node, where it passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, the trainable state of this node can be saved to a model file and later loaded for continued use.

More Info...

Version 1.0.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • kernel
    Kernel type to use. This is a non-linear transform of the feature space, allowing for non-linear relationships between the features and the predicted output. The available kernels are: Linear (a trivial do-nothing kernel); Polynomial, which yields components that are all possible polynomial combinations of input features up to the desired degree; Radial-Basis Functions (rbf), which is one of the most commonly-used non-linear kernels; and Sigmoid.

    • verbose name: Kernel Type
    • default value: rbf
    • port type: EnumPort
    • value type: str (can be None)
  • cost
    SVM cost parameter. This value determines the degree to which deviations of training points from the fitted function (beyond the epsilon margin) are penalized. Higher values result in models that fit the training data more closely, but at the expense of potentially worse generalization to new data (less margin for error when slightly different trials are encountered in future test data). This is a list of candidate values, the best of which is found via an exhaustive search (i.e., each value is tested one by one, and therefore the total running time is proportional to the number of values listed here). The details of the parameter search can be controlled via the search metric and number of folds parameters. Smaller values cause stronger regularization, that is, less risk of the method over-fitting to random details of the data, and thus better generalization to new data. A very small value means effectively no penalty, and there is no upper limit to how large the values given here may be; however, useful values depend on the scale of the data and the number of trials. Often one covers a range between 0.1 and 10, and at times 0.01 to 100. Typically the values here are not linearly spaced, but follow an exponential progression (e.g., 0.25, 0.5, 1, 2, 4, 8, etc.); such a grid can be generated as in the sketch under the corresponding Support Vector Classification entry. The default search range is intentionally coarse for quick running times; refine it to smaller steps to obtain potentially better solutions, but do not expect massive gains from refining.

    • verbose name: Cost
    • default value: [0.01, 0.1, 1.0, 10.0, 100]
    • port type: ListPort
    • value type: list (can be None)
  • poly_degree
    Degree of the polynomial kernel. Ignored by other kernel types. This is the maximum degree of polynomial combinations of features that are generated. This is also a list of possible values, and is searched in a grid search just like the cost parameter (see the cost parameter for details on this procedure).

    • verbose name: Degree (Polynomial Kernel Only)
    • default value: [1, 2, 3]
    • port type: ListPort
    • value type: list (can be None)
  • gamma
    Gamma parameter of the RBF kernel. This parameter controls the scale of the kernel mapping, where lower scales can capture smaller-scale structure in the data. When left at the default, it resolves to 1 divided by the number of features. This is a list of possible values, and is searched in a grid search just like the cost parameter (see cost parameter for details on this procedure).

    • verbose name: Scale (Rbf Kernel Only)
    • default value: [0.0001, 0.001, 0.01, 0.1, 1.0, 10.0]
    • port type: ListPort
    • value type: list (can be None)
  • coef0
    Constant term in kernel function. Only used in polynomial and sigmoid kernels.

    • verbose name: Constant (Poly Or Sigmoid Kernels Only)
    • default value: [0.0, 1.0]
    • port type: ListPort
    • value type: list (can be None)
  • epsilon
    Epsilon for epsilon-insensitive and Huber data terms. Note that this depends strongly on the scale of the data. Can be interpreted as the cutoff in data units beyond which data values are treated more robustly (i.e., as potential outliers). See the sketch after this entry.

    • verbose name: Epsilon
    • default value: 0.1
    • port type: FloatPort
    • value type: float (can be None)
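
    To make the "cutoff in data units" interpretation concrete, this sketch evaluates the epsilon-insensitive loss by hand (an illustrative formula, not the node's code):

      import numpy as np

      def eps_insensitive_loss(y_true, y_pred, epsilon=0.1):
          """Residuals within +/- epsilon incur no loss; beyond that, linear loss."""
          return np.maximum(0.0, np.abs(y_true - y_pred) - epsilon)

      # a residual of 0.05 costs nothing at epsilon=0.1; a residual of 0.5 costs 0.4
      print(eps_insensitive_loss(np.array([1.05, 1.5]), np.array([1.0, 1.0])))
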
  • tolerance
    Convergence tolerance. This is the desired error tolerance, or acceptable inaccuracy in the solution. Using larger values gives less accurate results, but will lead to faster compute times. Note that, for biosignal-driven machine learning systems, one often does not need very small tolerances.

    • verbose name: Tolerance
    • default value: 0.001
    • port type: FloatPort
    • value type: float (can be None)
  • max_iter
    Maximum number of iterations. This is one of the stopping criteria to limit the compute time. The default is usually fine, and gains from increasing the number of iterations will be minimal (it can be worth experimenting with lower iteration numbers if the algorithm must finish in a fixed time budget, at a cost of potentially less accurate solutions). If set to -1, no limit is in effect.

    • verbose name: Maximum Number Of Iterations
    • default value: -1
    • port type: IntPort
    • value type: int (can be None)
  • search_metric
    Parameter search metric. When the regularization parameter is given as a list of values, then the method will run a cross-validation for each possible parameter value and use this metric to score how well the method performs in each case, in order to select the best parameter. While 'neg_mean_squared_error' is usually a good default, some other metrics can be useful under some circumstances, for instance 'neg_mean_absolute_error', which penalizes large deviations less strongly than mse.

    • verbose name: Scoring Metric For Parameter Search
    • default value: neg_mean_squared_error
    • port type: EnumPort
    • value type: str (can be None)
  • num_folds
    Number of cross-validation folds for parameter search. Cross-validation proceeds by splitting up the data into this many blocks of trials, and then tests the method on each block. For each fold, the method is re-trained on all the other blocks, excluding the test block (therefore, the total running time is proportional to the number of folds). This is not a randomized cross-validation, but a blockwise cross-validation, which is usually the correct choice if the data stem from a time series. If there are few trials in the data, one can use a higher number here (e.g., 10) to ensure that more data is available for training.

    • verbose name: Number Of Cross-Validation Folds
    • default value: 5
    • port type: IntPort
    • value type: int (can be None)
  • cv_group_field
    Optionally a field indicating the group from which each trial is sourced. If given, then data will be split such that test sets contain unseen groups. Example groups are SubjectID, SessionID, etc.

    • verbose name: Grouping Field (Cross-Validation)
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • cv_stratified
    Optionally perform stratified cross-validation. This means that all the folds have the same relative percentage of trials with each label. Note that this requires labels to be quantized or binned to be meaningful.

    • verbose name: Stratified Cross-Validation
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • verbosity
    Verbosity level. Higher numbers will produce more extensive diagnostic output.

    • verbose name: Verbosity Level
    • default value: 0
    • port type: IntPort
    • value type: int (can be None)
  • shrinking
    Use shrinking heuristic.

    • verbose name: Shrinking
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • cache_size
    Cache size in MB.

    • verbose name: Cache Size
    • default value: 200
    • port type: IntPort
    • value type: int (can be None)
  • initialize_once
    Calibrate the model only once. If set to False, then this node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • dont_reset_model
    Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model is incompatible when the input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

TrialAggregatePredictor

A model that makes aggregate predictions based on multiple trials.

The node accepts a wired-in single-trial machine-learning pipeline (via the method input), which is trained and tested on single trials, and which may output probabilities or other continuous-valued outputs. This node then aggregates the single-trial outputs into a single prediction for each group of trials, based on the provided group field and integration rule. The node furthermore uses probability calibration to ensure that the single-trial outputs are valid probabilities that accurately quantify the confidence at the trial level, which is not available from the raw single-trial predictions. Currently this node is only suitable for binary classification tasks. For technical reasons, probability calibration should usually not be performed on the same data that a classifier was trained on, since the classifier will generate confidence scores on its own training data that are not representative of the classifier's outputs on new data. Therefore, the calibration generally uses a cross-validation approach to train the underlying classifier on different data than what is used to fit the calibration mapping (see also the Crossvalidation node for full details on how cross-validation is generally done).
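
A rough standalone sketch of this out-of-sample calibration idea, using scikit-learn for illustration (the node's actual procedure additionally handles trial grouping and aggregation, which are omitted here):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_predict
    from sklearn.svm import SVC

    X = np.random.randn(200, 8)         # stand-in single-trial features
    y = np.random.randint(0, 2, 200)

    # 1) out-of-fold confidence scores: each trial is scored by a model that
    #    never saw it during training, avoiding optimistic training-set scores
    scores = cross_val_predict(SVC(), X, y, cv=5, method='decision_function')

    # 2) fit a calibration mapping (here a logistic regression) from the
    #    held-out scores to probabilities
    calibrator = LogisticRegression().fit(scores.reshape(-1, 1), y)
    probs = calibrator.predict_proba(scores.reshape(-1, 1))[:, 1]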

Version 0.5.0

Ports/Properties

  • metadata
    User-definable meta-data associated with the node. Usually reserved for technical purposes.

    • verbose name: Metadata
    • default value: {}
    • port type: DictPort
    • value type: dict (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • method
    Single-trial scoring model.

    • verbose name: Method
    • default value: None
    • port type: GraphPort
    • value type: Graph
  • method__signature
    Argument names of an underlying single-trial pipeline. Your pipeline is a subgraph that must contain at least one Placeholder node whose slotname must match the argument name listed here. The placeholder then acts as the entry point for any data that is passed into the pipeline when it is invoked by the aggregation node. Your pipeline's final node (which typically produces the predictions) is then wired to the aggregation node's "method" input port. In graphical UIs, this edge will be displayed in dotted style to indicate that this is not normal forward data flow, but that a subgraph (your pipeline) runs under the control of the aggregation node. In summary, your pipeline starts with a Placeholder that is followed by some processing nodes (in the simplest case just a single machine-learning node, such as Linear Discriminant Analysis). The final node of your pipeline is the one whose outputs are taken to be the pipeline's predictions, and this node is wired into the "method" input of this node. As always in NeuroPype, any "loose ends" downstream of your placeholder are also considered to be part of the pipeline but do not contribute to the result (they may be used for other purposes, such as printing progress information). Your pipeline may optionally have a second placeholder, which should by convention have slotname set to is_training, and then is_training must be listed as the second argument here. This second placeholder is used to indicate whether your pipeline is currently being called on training data or test data. During training, the aggregation node will execute your pipeline like a cross-validation, meaning that, for each fold in the cross-validation, your pipeline graph is instantiated from its default (uninitialized) state, and is then called with the training set of that fold. Then, the same graph is called again, but this time with the test set of that fold; it is then up to any adaptive nodes in your pipeline (e.g., machine learning nodes) to adapt themselves on the first call and to make predictions (usually without adapting again) on the second call. The pipeline is discarded after each fold and a new pipeline graph is instantiated (to avoid any unintended train/test leakage).

    • verbose name: Method [Signature]
    • default value: (data)
    • port type: Port
    • value type: object (can be None)
  • group_field
    The name of the instance data field that contains the group identifiers. Trials with the same group identifier will be aggregated together at prediction time. It is assumed that within a group the condition is the same.

    • verbose name: Group Field
    • default value: SubjectID
    • port type: StringPort
    • value type: str (can be None)
  • cond_field
    The name of the instance data field that contains the conditions to be discriminated. This parameter will be ignored if the packet has previously been processed by a DescribeStatisticalDesign node.

    • verbose name: Cond Field
    • default value: TargetValue
    • port type: StringPort
    • value type: str (can be None)
  • class_weights
    Per-class weights. Optionally this is a mapping from class label to weight. The weights represent the a priori ("prior") probability of encountering a specific class that the model shall assume. The weights will be renormalized so that they add up to one. Example syntax: {'0': 0.5, '1': 0.5} (note the quotes before the colons).

    • verbose name: Per-Class Weight
    • default value: None
    • port type: Port
    • value type: object (can be None)
  • aggregate
    Whether to enable aggregation. If disabled, the single-trial responses of the underlying model will be passed through.

    • verbose name: Perform Aggregation
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • rule
    The rule used to integrate the single-trial responses; these rules mostly control a tradeoff between statistical efficiency and robustness (to outlier trials). The default is to average the probabilities, but other methods are available that may be more robust to outliers or more sensitive to the tails of the distribution. The 'mean' and 'median' rules are the arithmetic mean and median, respectively. The 'huber', 'trim_mean' and 'winsorize' options are the Huber mean, trimmed mean, and winsorized mean, respectively. These use the outlier cutoff parameter, which may need to be set appropriately (see the sketch after the outlier cutoff entry).

    • verbose name: Rule
    • default value: trim_mean
    • port type: EnumPort
    • value type: str (can be None)
  • outlier_cutoff
    Outlier cutoff when aggregating robustly across trials. When using the trim_mean or winsorize rules, this is the proportion of trials to trim (default 0.1), and when using the huber rule, this is the cutoff point in (robust) standard deviations from the median (also known as the Huber constant); default 1.345.

    • verbose name: Outlier Cutoff
    • default value: None
    • port type: FloatPort
    • value type: float (can be None)
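
    For reference, most of the named rules can be reproduced with scipy over a vector of single-trial probabilities; a minimal sketch (the 'huber' rule, an iteratively reweighted mean with default cutoff 1.345, is omitted for brevity):

      import numpy as np
      from scipy.stats import trim_mean
      from scipy.stats.mstats import winsorize

      p = np.array([0.52, 0.55, 0.57, 0.58, 0.60,
                    0.61, 0.62, 0.63, 0.65, 0.98])           # one outlier trial

      agg_mean = np.mean(p)
      agg_median = np.median(p)
      agg_trim = trim_mean(p, proportiontocut=0.1)           # drop extreme 10% per tail
      agg_winsor = np.mean(winsorize(p, limits=(0.1, 0.1)))  # clamp extreme 10% per tail
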
  • model_is_probabilistic
    Whether the underlying model outputs probabilities. If left at auto, the node will attempt to auto-detect this, and will report an error if detection fails.

    • verbose name: Model Is Probabilistic
    • default value: auto
    • port type: EnumPort
    • value type: str (can be None)
  • max_probability
    Maximum probability value from the underlying model. This is used to clip the probabilities to avoid numerical instability when the model outputs very high probabilities. Only used if the model is probabilistic.

    • verbose name: Max Probability
    • default value: 0.99
    • port type: FloatPort
    • value type: float (can be None)
  • calibrate_probabilities
    Whether to perform probability calibration of the groupwise outputs. If this is not enabled, the generated probabilities will typically be wildly conservative, as if a single-trial prediction was made.

    • verbose name: Calibrate Probabilities
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • num_folds
    Number of cross-validation folds for out-of-sample calibration. Note that this cross-validation generally acts on the groups of trials, not on individual trials; the default is to leave one group out, but if there are many groups, this can be set to an explicit number of folds such as 5 or 10. Cross-validation proceeds by splitting up the data into this many blocks of trials, and then tests the method on each block. For each fold, the method is re-trained on all the other blocks, excluding the test block (therefore, the total running time is proportional to the number of folds). This is not a randomized cross-validation, but a blockwise cross-validation, which is usually the correct choice if the data stem from a time series. If there are few trials in the data, one can use a higher number here (e.g., 10) to ensure that more data is available for training. This can also be set to 0 if no cross-validation is desired, in which case the method and calibration mapping are trained on all the data, or to 1 to use a (perhaps costly) leave-one-out cross-validation.

    • verbose name: Number Of Cross-Validation Folds
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • cv_stratified
    Optionally perform stratified cross-validation. This means that all the folds have the same relative percentage of trials in each class.

    • verbose name: Stratified Cross-Validation
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • refit_bias
    Whether to refit the bias term in the calibration mapping. When this is disabled, only the confidence of the classifier is recalibrated, which is usually what is desired when switching from single trials to groupwise predictions. However, one may additionally rebias the classifier by enabling this option, but note that the number of data points available for this is typically much lower than the number of trials, so the bias is likely to be somewhat less accurately estimated than by the underlying single-trial model. Nevertheless, if the underlying model is biased in out-of-sample predictions, or very close to chance level, then this may have to be enabled, since otherwise the calibration may end up predicting the opposite of the underlying model (this can be spotted by enabling the sanity checks).

    • verbose name: Refit Bias Term
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • calibrator
    Type of calibration classifier to use. 'logreg' is a simple logistic regression, and 'bayes-logreg' is a Bayesian logistic regression. The latter has more sensible priors on the weights and can therefore be more robust, especially when the underlying single-trial predictor is very weak (e.g., close to chance level).

    • verbose name: Calibrator
    • default value: logreg
    • port type: EnumPort
    • value type: str (can be None)
  • sanity_checks
    Check the accuracy of the underlying model on the training data. If the model is worse than chance, a warning will be issued. This is useful to catch bugs in the underlying model.

    • verbose name: Sanity Checks
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • num_procs
    Number of processes to use for parallel computation. If None, the global setting NUM_PROC, which defaults to the number of CPUs on the system, will be used.

    • verbose name: Max Parallel Processes
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • num_threads_per_proc
    Number of threads to use for each process. This can be used to limit the number of threads by each process to mitigate potential churn.

    • verbose name: Threads Per Process
    • default value: 4
    • port type: IntPort
    • value type: int (can be None)
  • compute_backends
    GPU compute backends that may be used by the pipeline. If you include GPU compute backends here, workloads using those backends will be farmed out across multiple GPUs (if available) when running cross-validation folds in parallel. The 'auto' mode will attempt to auto-detect any backend settings in the given pipeline's nodes, but note that this will only catch nodes where this is explicit in the node's properties, and GPU workloads missed in this fashion will run by default on GPU 0.

    • verbose name: Compute Backends
    • default value: ['auto']
    • port type: SubsetPort
    • value type: list (can be None)
  • num_procs_per_gpu
    Number of processes to use per GPU. This is only relevant if you have GPU compute backends enabled. If your GPU(s) are under-utilized during cross-validation, you can increase this to run this many CV folds on each GPU.

    • verbose name: Processes Per Gpu
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • multiprocess_backend
    Backend to use for farming out computation across multiple (CPU) processes. Multiprocessing is the simple Python default, which is not a bad start. Nestable is a version of multiprocessing that allows your pipeline to itself use parallel computation. Loky is a fast and fairly stable backend, but it does not support nested parallelism and has different limitations than multiprocessing. It can be helpful to try either if you are running into an issue trying to run something in parallel. Serial means to not run things in parallel but instead in series (even if num_procs is >1), which can help with debugging. Threading uses Python threads in the same process, but this is not recommended for most use cases due to what is known as GIL contention.

    • verbose name: Multiprocess Backend
    • default value: serial
    • port type: EnumPort
    • value type: str (can be None)
  • serial_if_debugger
    If True, then if the Python debugger is detected, the node will run in serial mode, even if multiprocess_backend is set to something else. This is useful for debugging, since the debugger does not work well with parallel processes. This can be disabled if certain steps should nevertheless run in parallel (e.g., to reach a breakpoint more quickly).

    • verbose name: Serial If Debugger
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • initialize_once
    Calibrate the model only once. If set to False, then this node will recalibrate itself whenever a non-streaming data chunk is received that has both training labels and associated training instances.

    • verbose name: Calibrate Only Once
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • dont_reset_model
    Do not reset the model when the preceding graph is changed. Normally, when certain parameters of preceding nodes are changed, the model will be reset. If this is enabled, the model will persist, but there is a chance that the model is incompatible when the input data format to this node has changed.

    • verbose name: Do Not Reset Model
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • verbosity
    Verbosity level for diagnostics.

    • verbose name: Verbosity
    • default value: 2
    • port type: IntPort
    • value type: int (can be None)
  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)