Module: feature_extraction

Feature extraction algorithms and related nodes.

These nodes extract various kinds of features from data. Most nodes in this category expect inbound chunks to have an instance axis, and their output will usually have a feature axis and an instance axis. Some nodes are adaptive and will only adapt themselves when a chunk is flagged as not streaming. A few of these nodes overlap with supervised machine learning in that their processing is informed by target labels.

CCA

Perform Canonical Correlation Analysis (CCA) and calculate matrices of scores for CCA.

The inputs of this node are two streams (X, Y) whose data have the same number of points in time (N); apart from that, X and Y may have different shapes. The outputs of this node are three chunks carrying the matrices A, B, and D, respectively. The columns of A and B contain the canonical coefficients for X and Y, and D is a vector of sample canonical correlation values.
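
For illustration, the computation is roughly equivalent to the following scikit-learn sketch (the node's actual backend is not documented here, so treat the estimator choice, data shapes, and variable names as assumptions):

    import numpy as np
    from sklearn.cross_decomposition import CCA

    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 5))   # N=100 points in time, 5 features
    Y = rng.standard_normal((100, 3))   # same N, different shape

    k = 2
    cca = CCA(n_components=k).fit(X, Y)
    A = cca.x_rotations_                # canonical coefficients for X
    B = cca.y_rotations_                # canonical coefficients for Y
    Xc, Yc = cca.transform(X, Y)        # canonical variates (scores)
    D = np.array([np.corrcoef(Xc[:, i], Yc[:, i])[0, 1] for i in range(k)])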

Version 1.0.0

Ports/Properties

  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • X
    Data X to process.

    • verbose name: X
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: IN
  • Y
    Data Y to process.

    • verbose name: Y
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: IN
  • A
    Canonical coefficients corresponding to X.

    • verbose name: A
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: OUT
  • B
    Canonical coefficients corresponding to Y.

    • verbose name: B
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: OUT
  • D
    Sample canonical correlation values.

    • verbose name: D
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: OUT

CollapseToFeatures

Collapse all non-instance axes of a stream into a single feature axis.

This can be used to get the data into a simple, unified format that most machine learning methods can handle. Specifically, this node takes a chunk with any combination of axes (e.g., time, space, frequency, instance) and returns a chunk that has only a flat feature axis (subsuming all original axes) and an instance axis. Chunks that have no instance axis are left unchanged. Technically, if a chunk has multiple instance axes, all but the first such axis are collapsed into the feature axis as well.
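
As a rough illustration on a plain array (the node itself operates on chunk/Packet structures with named axes, so this NumPy sketch is only an analogy with a hypothetical axis layout):

    import numpy as np

    # hypothetical segmented data with axes (instance, space, time)
    data = np.random.randn(40, 32, 128)
    n_instances = data.shape[0]
    # collapse all non-instance axes into one flat feature axis
    collapsed = data.reshape(n_instances, -1)   # shape (40, 32*128), i.e., (instance, feature)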

Version 1.0.0

Ports/Properties

  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT

DictionaryLearning

Find a sparse representation of the data using Dictionary Learning.

This method will learn components, that is, linear combinations of input features, such that the transformed data has features whose activations are maximally sparse. This is also known as sparse coding, and is closely related to Independent Component Analysis (ICA). As such, it can be used for identifying statistically independent or otherwise meaningful and/or interpretable features. These features can therefore be highly useful in subsequent processing stages, for instance non-linear feature extraction or sparse machine learning techniques. In contrast to PCA or ICA, this method can easily learn more features than there were data dimensions in the input data. Also, unlike most other feature extractors, this method will attempt to estimate the transformed data in accordance with the sparse modeling assumption, instead of simply linearly transforming it (at some extra computational cost). This node offers multiple algorithms both for estimating the model and for transforming (reconstructing) the output data given the model.

Important: This node is adaptive to the data, that is, it will learn a transformation of the data that depends on the input data. In order to learn this transformation, the node requires a reasonable amount of input data for calibration or "training" (otherwise it will yield an ill-fitting or noisy model). Since this feature extraction method is not capable of being trained incrementally on streaming data, it requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data node, where it passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, its trainable state can be saved to a model file and later loaded for continued use.

Like most other feature extraction nodes, this node can compute features between elements of an axis of your choice while treating elements of another axis as the observations, trials, or samples. It can also optionally compute multiple separate models on different slices of the data along some axis of choice. It is also possible to pool multiple axes for any of these roles.
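
The port names below closely mirror scikit-learn's DictionaryLearning estimator; assuming that backend (an assumption), an offline calibration step is roughly equivalent to this sketch with hypothetical data:

    import numpy as np
    from sklearn.decomposition import DictionaryLearning

    X = np.random.randn(200, 16)          # hypothetical (instance, feature) calibration data
    dl = DictionaryLearning(
        n_components=32,                  # may exceed the number of input features
        alpha=1.0,                        # degree of sparsity
        max_iter=1000,
        fit_algorithm="lars",
        transform_algorithm="lasso_lars",
        transform_alpha=1.0,
        random_state=12345,
    )
    codes = dl.fit_transform(X)           # sparse activations, shape (200, 32)
    dictionary = dl.components_           # learned components, shape (32, 16)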

More Info...

Version 1.0.0

Ports/Properties

  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • num_components
    Number of components to keep. If left unspecified, all components are kept, that is, the number of output features will correspond to the number of input dimensions.

    • verbose name: Number Of Components
    • default value: None
    • port type: IntPort
    • value type: int (can be None)
  • alpha
    Degree of sparsity. Higher values yield components with more sparse activity.

    • verbose name: Degree Of Sparsity
    • default value: 1.0
    • port type: FloatPort
    • value type: float (can be None)
  • transform_alpha
    Degree of sparsity during transformation. Does not apply to 'lars' case.

    • verbose name: Degree Of Sparsity Of Transformed Data
    • default value: 1.0
    • port type: FloatPort
    • value type: float (can be None)
  • max_iter
    Maximum number of iterations. This is one of the stopping criteria to limit the compute time. The default is usually fine, and gains from increasing the number of iterations will be minimal (it can be worth experimenting with lower iteration numbers if the algorithm must finish in a fixed time budget, at a cost of potentially less accurate solutions).

    • verbose name: Maximum Number Of Iterations
    • default value: 1000
    • port type: IntPort
    • value type: int (can be None)
  • num_jobs
    Number of parallel compute jobs. This value only affects the running time and not the results. Values between 1 and twice the number of CPU cores make sense to expedite computation, but may temporarily reduce the responsiveness of the machine. The value of -1 stands for all available CPU cores.

    • verbose name: Number Of Parallel Jobs
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • verbosity
    Verbosity level. Higher numbers will produce more extensive diagnostic output.

    • verbose name: Verbosity Level
    • default value: 0
    • port type: IntPort
    • value type: int (can be None)
  • domain_axes
    Axes which form the input domain of the transformation. Features are computed between elements along these axes (or in other words, elements along these axes will be combined with each other to yield features). This is a comma-separated list of axis names (for example: "space, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. For time-series data, this is usually the space axis, and if features have already been extracted from the data through some other method, it would be the features axis. In rare cases it can also include other axes, such as frequency, lag, and time.

    • verbose name: Compute Features Between Axes
    • default value: (all others)
    • port type: StringPort
    • value type: str (can be None)
  • aggregate_axes
    Axes to aggregate statistics over. The elements along these axes are treated as the "trials", "samples", or, equivalently, "observations". Adaptive feature extractors will compute statistics along the elements of these axes. This is a comma-separated list of axis names (for example: "time, instance"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. This is almost always the instance axis (especially if the data has already been segmented, i.e., if the Segmentation node was used), but in some cases it may also be the time axis, or occasionally other axes.

    • verbose name: Treat Elements As Trials/samples Along Axes
    • default value: instance
    • port type: StringPort
    • value type: str (can be None)
  • separate_axes
    Axes along which to learn separate models. It is possible to use multiple separate feature-extraction models, each of which operates on a different slice of the data. This node does not combine data between elements along these axes in any way (although features between these elements may of course be combined in later stages, for instance in a classifier node). This is a comma-separated list of axis names (for example: "time, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes.

    • verbose name: Compute Separate Models Along Axes
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • transform_nonzeroes
    Targeted number of non-zeroes per column in the solution. Only used by the lars method.

    • verbose name: Transform Nonzeroes
    • default value: None
    • port type: IntPort
    • value type: int (can be None)
  • tolerance
    Convergence tolerance. This is the desired error tolerance, or acceptable inaccuracy, in the solution. Using larger values gives less accurate results, but will lead to faster compute times. Note that, for biosignal-driven machine learning systems, one often does not need very small tolerances.

    • verbose name: Tolerance
    • default value: 1e-08
    • port type: FloatPort
    • value type: float (can be None)
  • fit_algorithm
    Method used during fitting. The lars method is faster than coordinate descent if the components are sparse.

    • verbose name: Fit Algorithm
    • default value: lars
    • port type: EnumPort
    • value type: object (can be None)
  • transform_algorithm
    Method used during transform. lasso_lars is fastest when components are sparse.

    • verbose name: Transform Algorithm
    • default value: lasso_lars
    • port type: EnumPort
    • value type: object (can be None)
  • split_sign
    Split the sparse feature vector into the concatenation of its negative and positive part.

    • verbose name: Split Sign
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • random_seed
    Random seed. Different values may yield slightly different results.

    • verbose name: Random Seed
    • default value: 12345
    • port type: IntPort
    • value type: int (can be None)

ExtractComponentWeights

Some decomposition nodes save their transformation matrix (or matrices) in the meta-data properties.

This node converts those properties into a data block which can then be visualized by other nodes.

Version 0.1.0

Ports/Properties

  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • get_inverse
    For visualizing component weights on the scalp, you probably want the inverse of the weights matrix.

    • verbose name: Get Inverse
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • reverse_polarity
    Component weight signs are arbitrary; this option flips the sign of all weights. TODO: support flipping individual components.

    • verbose name: Reverse Polarity
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)

FactorAnalysis

Perform Factor Analysis (FA) on the given data.

This method extracts latent components from the data that explain its variability better than the original channels or features.

Important: This node is adaptive to the data, that is, it will learn a transformation of the data that depends on the input data. In order to learn this transformation, the node requires a reasonable amount of input data for calibration or "training" (otherwise it will yield an ill-fitting or noisy model). Since this feature extraction method is not capable of being trained incrementally on streaming data, it requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data node, where it passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, its trainable state can be saved to a model file and later loaded for continued use.

Like most other feature extraction nodes, this node can compute features between elements of an axis of your choice while treating elements of another axis as the observations, trials, or samples. It can also optionally compute multiple separate models on different slices of the data along some axis of choice. It is also possible to pool multiple axes for any of these roles.
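
Assuming a scikit-learn FactorAnalysis backend (the parameter names below suggest this, but it is an assumption), an offline fit on hypothetical data looks roughly like:

    import numpy as np
    from sklearn.decomposition import FactorAnalysis

    X = np.random.randn(500, 24)          # hypothetical (instance, feature) calibration data
    fa = FactorAnalysis(n_components=8, tol=0.01, max_iter=1000,
                        svd_method="randomized", iterated_power=3,
                        random_state=12345)
    scores = fa.fit_transform(X)          # latent factor scores, shape (500, 8)
    loadings = fa.components_             # factor loadings, shape (8, 24)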

More Info...

Version 1.0.0

Ports/Properties

  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • num_components
    Number of components to keep. If left unspecified, all components are kept, that is, the number of output features will correspond to the number of input dimensions.

    • verbose name: Number Of Components
    • default value: None
    • port type: IntPort
    • value type: int (can be None)
  • tolerance
    Convergence tolerance. This is the desired error tolerance, or acceptable inaccuracy, in the solution. Using larger values gives less accurate results, but will lead to faster compute times. Note that, for biosignal-driven machine learning systems, one often does not need very small tolerances.

    • verbose name: Tolerance
    • default value: 0.01
    • port type: FloatPort
    • value type: float (can be None)
  • max_iter
    Maximum number of iterations. This is one of the stopping criteria to limit the compute time. The default is usually fine, and gains from increasing the number of iterations will be minimal (it can be worth experimenting with lower iteration numbers if the algorithm must finish in a fixed time budget, at a cost of potentially less accurate solutions).

    • verbose name: Maximum Number Of Iterations
    • default value: 1000
    • port type: IntPort
    • value type: int (can be None)
  • noise_variance_init
    Optional initial noise variance estimates per feature.

    • verbose name: Initial Noise Variance
    • default value: None
    • port type: ListPort
    • value type: list (can be None)
  • svd_method
    SVD method to use. Lapack can achieve higher accuracy, but randomized is quite a bit faster.

    • verbose name: Svd Method
    • default value: randomized
    • port type: EnumPort
    • value type: object (can be None)
  • iterated_power
    Number of iterations for the power method. Only used in randomized mode.

    • verbose name: Iterated Power
    • default value: 3
    • port type: IntPort
    • value type: int (can be None)
  • random_seed
    Random seed. Different values may yield slightly different results.

    • verbose name: Random Seed
    • default value: 12345
    • port type: IntPort
    • value type: int (can be None)
  • domain_axes
    Axes which form the input domain of the transformation. Features are computed between elements along these axes (or in other words, elements along these axes will be combined with each other to yield features). This is a comma-separated list of axis names (for example: "space, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. For time-series data, this is usually the space axis, and if features have already been extracted from the data through some other method, it would be the features axis. In rare cases it can also include other axes, such as frequency, lag, and time.

    • verbose name: Compute Features Between Axes
    • default value: (all others)
    • port type: StringPort
    • value type: str (can be None)
  • aggregate_axes
    Axes to aggregate statistics over. The elements along these axes are treated as the "trials", "samples", or, equivalently, "observations". Adaptive feature extractors will compute statistics along the elements of these axes. This is a comma-separated list of axis names (for example: "time, instance"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. This is almost always the instance axis (especially if the data has already been segmented, i.e., if the Segmentation node was used), but in some cases it may also be the time axis, or occasionally other axes.

    • verbose name: Treat Elements As Trials/samples Along Axes
    • default value: instance
    • port type: StringPort
    • value type: str (can be None)
  • separate_axes
    Axes along which to learn separate models. It is possible to use multiple separate feature-extraction models, each of which operates on a different slice of the data. This node does not combine data between elements along these axes in any way (although features between these elements may of course be combined in later stages, for instance in a classifier node). This is a comma-separated list of axis names (for example: "time, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes.

    • verbose name: Compute Separate Models Along Axes
    • default value:
    • port type: StringPort
    • value type: str (can be None)

FastICA

Independent component analysis using the FastICA method.

ICA will produce components, that is, linear combinations of input features, such that the values in the different output features are maximally statistically independent from each other (that is, the samples in one output dimension share as little information as possible with the samples in any other). In contrast to PCA, this method is usually not used for dimensionality reduction, but for identifying independent features in the data. Such features can be assumed to be more interpretable than other projections, since they are likely to relate to independent underlying processes that generated the data. As such, these are features that can be highly useful in subsequent processing stages, for instance non-linear feature extraction or sparse machine learning techniques. One weakness of ICA is that it is quite sensitive to outliers in the data (and to a lesser extent to noise sources), resulting in brittle or bad solutions when the input data is not already reasonably clean.

Important: This node is adaptive to the data, that is, it will learn a transformation of the data that depends on the input data. In order to learn this transformation, the node requires a reasonable amount of input data for calibration or "training" (otherwise it will yield an ill-fitting or noisy model). Since this feature extraction method is not capable of being trained incrementally on streaming data, it requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data node, where it passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, its trainable state can be saved to a model file and later loaded for continued use.

Like most other feature extraction nodes, this node can compute features between elements of an axis of your choice while treating elements of another axis as the observations, trials, or samples. It can also optionally compute multiple separate models on different slices of the data along some axis of choice. It is also possible to pool multiple axes for any of these roles.
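
A minimal sketch of the underlying decomposition, assuming a scikit-learn FastICA backend and hypothetical data:

    import numpy as np
    from sklearn.decomposition import FastICA

    X = np.random.randn(1000, 16)         # hypothetical (instance, feature) calibration data
    ica = FastICA(n_components=16, algorithm="parallel", fun="logcosh",
                  max_iter=750, tol=1e-4, random_state=12345)
    sources = ica.fit_transform(X)        # maximally independent components, shape (1000, 16)
    unmixing = ica.components_            # unmixing matrix
    mixing = ica.mixing_                  # mixing matrix (e.g., for inspecting topographies)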

More Info...

Version 1.0.0

Ports/Properties

  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • max_iter
    Maximum number of iterations. This is one of the stopping criteria to limit the compute time. The default is usually fine, and gains from increasing the number of iterations will be minimal (it can be worth experimenting with lower iteration numbers if the algorithm must finish in a fixed time budget, at a cost of potentially less accurate solutions).

    • verbose name: Maximum Number Of Iterations
    • default value: 750
    • port type: IntPort
    • value type: int (can be None)
  • domain_axes
    Axes which form the input domain of the transformation. Features are computed between elements along these axes (or in other words, elements along these axes will be combined with each other to yield features). This is a comma-separated list of axis names (for example: "space, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. For time-series data, this is usually the space axis, and if features have already been extracted from the data through some other method, it would be the features axis. In rare cases it can also include other axes, such as frequency, lag, and time.

    • verbose name: Compute Features Between Axes
    • default value: (all others)
    • port type: StringPort
    • value type: str (can be None)
  • aggregate_axes
    Axes to aggregate statistics over. The elements along these axes are treated as the "trials", "samples", or, equivalently, "observations". Adaptive feature extractors will compute statistics along the elements of these axes. This is a comma-separated list of axis names (for example: "time, instance"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. This is almost always the instance axis (especially if the data has already been segmented, i.e., if the Segmentation node was used), but in some cases it may also be the time axis, or occasionally other axes.

    • verbose name: Treat Elements As Trials/samples Along Axes
    • default value: instance
    • port type: StringPort
    • value type: str (can be None)
  • separate_axes
    Axes along which to learn separate models. It is possible to use multiple separate feature-extraction models, each of which operates on a different slice of the data. This node does not combine data between elements along these axes in any way (although features between these elements may of course be combined in later stages, for instance in a classifier node). This is a comma-separated list of axis names (for example: "time, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes.

    • verbose name: Compute Separate Models Along Axes
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • num_components
    Number of components to keep. If left unspecified, all components are kept, that is, the number of output features will correspond to the number of input dimensions.

    • verbose name: Number Of Components
    • default value: None
    • port type: IntPort
    • value type: int (can be None)
  • algorithm
    Optimization approach. For highly noisy data the parallel approach is usually better, but can be computationally somewhat more costly. Note that this does not mean that multiple CPU cores are used; rather, multiple components are optimized simultaneously.

    • verbose name: Algorithm
    • default value: parallel
    • port type: EnumPort
    • value type: object (can be None)
  • whiten
    Pre-whiten the data. If disabled, the data must already have been whitened by other means.

    • verbose name: Pre-Whiten
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • fun
    Non-linearity to use. The logcosh function corresponds to Infomax ICA, which yields the best quality, but other non-linearities, particularly cube, are faster to calculate.

    • verbose name: Non-Linearity To Use
    • default value: logcosh
    • port type: EnumPort
    • value type: object (can be None)
  • tolerance
    Convergence tolerance. This is the desired error tolerance, or acceptable inaccuracy, in the solution. Using larger values gives less accurate results, but will lead to faster compute times. Note that, for biosignal-driven machine learning systems, one often does not need very small tolerances.

    • verbose name: Tolerance
    • default value: 0.0001
    • port type: FloatPort
    • value type: float (can be None)
  • random_seed
    Random seed. Different values may yield slightly different results.

    • verbose name: Random Seed
    • default value: 12345
    • port type: IntPort
    • value type: int (can be None)

IncrementalPrincipalComponentAnalysis

Reduce dimensionality using Incremental Principal Component Analysis (PCA).

PCA is one of the most commonly-used dimensionality reduction techniques, and will produce components, that is, linear combinations of input features, such that the first component captures the direction of largest variance in the data, the second component the next-largest (orthogonal) direction, and so on. Note that, since PCA is not aware of any "labels" of the data, and as such only performs what is known as unsupervised learning, there is no guarantee that PCA will not remove data dimensions that would have been informative about those labels, that is, useful to a subsequent supervised learning method. Nevertheless, dimensionality reduction can greatly speed up subsequent data processing, or make it tractable in the first place. Also, since the components are sorted by the amount of variance in the data that they explain, in some settings they may yield individually interpretable or otherwise meaningful features.

Important: This node is adaptive to the data, that is, it will learn a transformation of the data that depends on the input data. In contrast to basic PCA, this node can update itself incrementally on streaming data as well as in one shot on offline data. When applying this node to streaming data, keep in mind that, as the node keeps updating its model, the output space will keep changing, especially at the beginning, on the first few data points. If subsequent processing assumes that the data space remains fixed, this can cause problems. The most common 'abuse' would be to follow this node with any node that buffers calibration data for a period of time and then does a one-shot calibration of some state. A better setup would be to perform the buffering prior to the incremental PCA node, or to use a regular static PCA, to avoid changing the data 'under the feet' of some other adaptive node. Once this node is calibrated, its trainable state can be saved to a model file and later loaded for continued use.

Like most other feature extraction nodes, this node can compute features between elements of an axis of your choice while treating elements of another axis as the observations, trials, or samples. It can also optionally compute multiple separate models on different slices of the data along some axis of choice. It is also possible to pool multiple axes for any of these roles.
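
The incremental behavior is roughly analogous to scikit-learn's IncrementalPCA with partial_fit (an assumed backend; chunk shapes are hypothetical):

    import numpy as np
    from sklearn.decomposition import IncrementalPCA

    ipca = IncrementalPCA(n_components=8, whiten=False)
    for _ in range(10):                   # hypothetical stream of chunks
        chunk = np.random.randn(50, 32)   # (instance, feature) per chunk
        ipca.partial_fit(chunk)           # the model keeps updating on each chunk...
    reduced = ipca.transform(np.random.randn(50, 32))   # ...so the output space drifts early on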

More Info...

Version 1.0.0

Ports/Properties

  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • num_components
    Number of components to keep. If left unspecified, all components are kept, that is, the number of output features will correspond to the number of input dimensions.

    • verbose name: Number Of Components
    • default value: None
    • port type: IntPort
    • value type: int (can be None)
  • domain_axes
    Axes which form the input domain of the transformation. Features are computed between elements along these axes (or in other words, elements along these axes will be combined with each other to yield features). This is a comma-separated list of axis names (for example: "space, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. For time-series data, this is usually the space axis, and if features have already been extracted from the data through some other method, it would be the features axis. In rare cases it can also include other axes, such as frequency, lag, and time.

    • verbose name: Compute Features Between Axes
    • default value: (all others)
    • port type: StringPort
    • value type: str (can be None)
  • aggregate_axes
    Axes to aggregate statistics over. The elements along these axes are treated as the "trials", "samples", or, equivalently, "observations". Adaptive feature extractors will compute statistics along the elements of these axes. This is a comma-separated list of axis names (for example: "time, instance"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. This is almost always the instance axis (especially if the data has already been segmented, i.e., if the Segmentation node was used), but in some cases it may also be the time axis, or occasionally other axes.

    • verbose name: Treat Elements As Trials/samples Along Axes
    • default value: instance
    • port type: StringPort
    • value type: str (can be None)
  • separate_axes
    Axes along which to learn separate models. It is possible to use multiple separate feature-extraction models, each of which operates on a different slice of the data. This node does not combine data between elements along these axes in any way (although features between these elements may of course be combined in later stages, for instance in a classifier node). This is a comma-separated list of axis names (for example: "time, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes.

    • verbose name: Compute Separate Models Along Axes
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • whiten
    Normalize (whiten) outputs. This will decorrelate the features.

    • verbose name: Whiten
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • batch_size
    Batch size. Optionally the number of samples to use for each mini-batch. This is a tradeoff between high performance if enough data is available during a given update (using larger batches) and low memory use (using smaller batches).

    • verbose name: Batch Size
    • default value: None
    • port type: IntPort
    • value type: int (can be None)

KernelPrincipalComponentAnalysis

Reduce dimensionality using Kernel Principal Component Analysis (Kernel PCA).

Kernel PCA is an advanced non-linear dimensionality reduction technique, which produces components that are non-linear combinations of the input features. As such, it can reveal dominant non-linear structure in the input data, and can also serve as pre-processing for linear machine learning techniques. Note that, since this method is not aware of any "labels" of the data, and as such only performs what is known as unsupervised learning, there is no guarantee that the method will not remove data dimensions that would have been informative about those labels, that is, useful to a subsequent supervised learning method. Given the right data, the components produced by Kernel PCA can be interpretable or otherwise meaningful features, which can enable subsequent machine learning methods to make good use of them.

Important: This node is adaptive to the data, that is, it will learn a transformation of the data that depends on the input data. In order to learn this transformation, the node requires a reasonable amount of input data for calibration or "training" (otherwise it will yield an ill-fitting or noisy model). Since this feature extraction method is not capable of being trained incrementally on streaming data, it requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data node, where it passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, its trainable state can be saved to a model file and later loaded for continued use.

Like most other feature extraction nodes, this node can compute features between elements of an axis of your choice while treating elements of another axis as the observations, trials, or samples. It can also optionally compute multiple separate models on different slices of the data along some axis of choice. It is also possible to pool multiple axes for any of these roles.
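
Assuming a scikit-learn KernelPCA backend (an assumption consistent with the parameter names below), an offline fit on hypothetical data looks roughly like:

    import numpy as np
    from sklearn.decomposition import KernelPCA

    X = np.random.randn(300, 20)          # hypothetical (instance, feature) calibration data
    kpca = KernelPCA(n_components=10, kernel="rbf",
                     gamma=None,          # None resolves to 1 / num_features
                     degree=3, coef0=1.0, remove_zero_eig=False)
    Z = kpca.fit_transform(X)             # non-linear components, shape (300, 10)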

More Info...

Version 1.0.0

Ports/Properties

  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • num_components
    Number of components to keep. If left unspecified, all non-zero components are kept.

    • verbose name: Number Of Components
    • default value: None
    • port type: IntPort
    • value type: int (can be None)
  • kernel
    Kernel type to use. This is a non-linear transform of the feature space, allowing for non-linear dimensionality reduction. The different kernels are Linear kernel (a trivial do-nothing kernel), Polynomial, which yields components that are all possible polynomial combinations of input features up to the desired degree, Radial-Basis Functions, which is one of the most commonly-used non-linear kernels, the Sigmoid kernel, and the Cosine kernel.

    • verbose name: Kernel
    • default value: rbf
    • port type: EnumPort
    • value type: object (can be None)
  • poly_degree
    Degree of the polynomial kernel. Ignored by other kernel types. This is the maximum degree of the polynomial combinations of features that are generated.

    • verbose name: Degree (Polynomial Kernel Only)
    • default value: 3
    • port type: IntPort
    • value type: int (can be None)
  • gamma
    Gamma parameter of the RBF kernel. This parameter controls the scale of the kernel mapping, where lower scales can capture smaller-scale structure in the data. When left at the default, it resolves to 1 divided by the number of features.

    • verbose name: Scale (Rbf Kernel Only)
    • default value: None
    • port type: FloatPort
    • value type: float (can be None)
  • coef0
    Constant term in kernel function. Only used in polynomial and sigmoid kernels.

    • verbose name: Constant (Poly Or Sigmoid Kernels Only)
    • default value: 1.0
    • port type: FloatPort
    • value type: float (can be None)
  • remove_zero_eig
    Remove zero-variance components. This can result in fewer than the desired number of components being returned if the data has zero scale along some of the projections.

    • verbose name: Prune Zero Components
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • domain_axes
    Axes which form the input domain of the transformation. Features are computed between elements along these axes (or in other words, elements along these axes will be combined with each other to yield features). This is a comma-separated list of axis names (for example: "space, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. For time-series data, this is usually the space axis, and if features have already been extracted from the data through some other method, it would be the features axis. In rare cases it can also include other axes, such as frequency, lag, and time.

    • verbose name: Compute Features Between Axes
    • default value: (all others)
    • port type: StringPort
    • value type: str (can be None)
  • aggregate_axes
    Axes to aggregate statistics over. The elements along these axes are treated as the "trials", "samples", or, equivalently, "observations". Adaptive feature extractors will compute statistics along the elements of these axes. This is a comma-separated list of axis names (for example: "time, instance"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. This is almost always the instance axis (especially if the data has already been segmented, i.e., if the Segmentation node was used), but in some cases it may also be the time axis, or occasionally other axes.

    • verbose name: Treat Elements As Trials/samples Along Axes
    • default value: instance
    • port type: StringPort
    • value type: str (can be None)
  • separate_axes
    Axes along which to learn separate models. It is possible to use multiple separate feature-extraction models, each of which operates on a different slice of the data. This node does not combine data between elements along these axes in any way (although features between these elements may of course be combined in later stages, for instance in a classifier node). This is a comma-separated list of axis names (for example: "time, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes.

    • verbose name: Compute Separate Models Along Axes
    • default value:
    • port type: StringPort
    • value type: str (can be None)

NonNegativeMatrixFactorization

Decompose non-negative data using non-negative matrix factorization (NNMF).

NNMF is usually applied to data that is strictly positive, under the assumption that the data can be separated into additive (super-imposed) components that are themselves also positive. This holds, for instance, for the frequency spectra of music. NNMF will produce components, that is, linear combinations of input features, that can satisfy different criteria, such as sparsity. NNMF can be used to separate sources in non-negative data, or to identify sparse components, similar to ICA or sparse PCA on signed data. Given the right data, such features can be more interpretable or otherwise meaningful than other projections. As such, these are features that can be useful in subsequent processing stages, for instance additional feature transformations or machine learning techniques.

Important: This node is adaptive to the data, that is, it will learn a transformation of the data that depends on the input data. In order to learn this transformation, the node requires a reasonable amount of input data for calibration or "training" (otherwise it will yield an ill-fitting or noisy model). Since this feature extraction method is not capable of being trained incrementally on streaming data, it requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data node, where it passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, its trainable state can be saved to a model file and later loaded for continued use.

Like most other feature extraction nodes, this node can compute features between elements of an axis of your choice while treating elements of another axis as the observations, trials, or samples. It can also optionally compute multiple separate models on different slices of the data along some axis of choice. It is also possible to pool multiple axes for any of these roles.
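
A minimal sketch with scikit-learn's NMF on hypothetical non-negative data (an assumed backend; the sparseness/beta/eta options listed below follow an older interface and are omitted here):

    import numpy as np
    from sklearn.decomposition import NMF

    X = np.abs(np.random.randn(200, 64))  # hypothetical strictly non-negative data
    nmf = NMF(n_components=10, init="nndsvd", max_iter=200, tol=1e-4,
              random_state=12345)
    W = nmf.fit_transform(X)              # non-negative activations, shape (200, 10)
    H = nmf.components_                   # non-negative components, shape (10, 64)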

More Info...

Version 1.0.0

Ports/Properties

  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • num_components
    Number of components to keep. If left unspecified, all components are kept, that is, the number of output features will correspond to the number of input dimensions.

    • verbose name: Number Of Components
    • default value: None
    • port type: IntPort
    • value type: int (can be None)
  • max_iter
    Maximum number of iterations. This is one of the stopping criteria to limit the compute time. The default is usually fine, and gains from increasing the number of iterations will be minimal (it can be worth experimenting with lower iteration numbers if the algorithm must finish in a fixed time budget, at a cost of potentially less accurate solutions).

    • verbose name: Maximum Number Of Iterations
    • default value: 200
    • port type: IntPort
    • value type: int (can be None)
  • max_iter_nls
    "Maximum number of iterations in NLS sub-problem. This controls a stopping criterion for the inner loop of the solver.

    • verbose name: Max Iter Nls
    • default value: 2000
    • port type: IntPort
    • value type: int (can be None)
  • sparseness
    Type of sparsity. This determines where to enforce sparsity in the model. Components means that the components will use only a few of the input features, data means that the data values themselves are sparse (have a tendency to be zero), and none disables sparsity.

    • verbose name: Sparseness
    • default value: none
    • port type: EnumPort
    • value type: object (can be None)
  • tolerance
    Convergence tolerance. This is the desired error tolerance, or acceptable inaccuracy, in the solution. Using larger values gives less accurate results, but will lead to faster compute times. Note that, for biosignal-driven machine learning systems, one often does not need very small tolerances.

    • verbose name: Tolerance
    • default value: 0.0001
    • port type: FloatPort
    • value type: float (can be None)
  • beta
    Degree of sparsity. Only applies if the type of sparsity is not set to none.

    • verbose name: Degree Of Sparsity (If Sparse)
    • default value: 1.0
    • port type: FloatPort
    • value type: float (can be None)
  • eta
    Degree of correctness to maintain. Only applies if the type of sparsity is not set to none. Smaller values allow for larger error.

    • verbose name: Degree Of Correctness (If Sparse)
    • default value: 0.1
    • port type: FloatPort
    • value type: float (can be None)
  • domain_axes
    Axes which form the input domain of the transformation. Features are computed between elements along these axes (or in other words, elements along these axes will be combined with each other to yield features). This is a comma-separated list of axis names (for example: "space, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. For time-series data, this is usually the space axis, and if features have already been extracted from the data through some other method, it would be the features axis. In rare cases it can also include other axes, such as frequency, lag, and time.

    • verbose name: Compute Features Between Axes
    • default value: (all others)
    • port type: StringPort
    • value type: str (can be None)
  • aggregate_axes
    Axes to aggregate statistics over. The elements along these axes are treated as the "trials", "samples", or, equivalently, "observations". Adaptive feature extractors will compute statistics along the elements of these axes. This is a comma-separated list of axis names (for example: "time, instance"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. This is almost always the instance axis (especially if the data has already been segmented, i.e., if the Segmentation node was used), but in some cases it may also be the time axis, or occasionally other axes.

    • verbose name: Treat Elements As Trials/samples Along Axes
    • default value: instance
    • port type: StringPort
    • value type: str (can be None)
  • separate_axes
    Axes along which to learn separate models. It is possible to use multiple separate feature-extraction models, each of which operates on a different slice of the data. This node does not combine data between elements along these axes in any way (although features between these elements may of course be combined in later stages, for instance in a classifier node). This is a comma-separated list of axis names (for example: "time, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes.

    • verbose name: Compute Separate Models Along Axes
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • init
    Method to initialize the procedure. nndsvd stands for non-negative double singular value decomposition; the 'a' variant fills zero entries with the average of the data, and the 'r' variant fills them with small random values.

    • verbose name: Initialization Type
    • default value: auto
    • port type: EnumPort
    • value type: object (can be None)
  • random_seed
    Random seed. Different values may yield slightly different results.

    • verbose name: Random Seed
    • default value: 12345
    • port type: IntPort
    • value type: int (can be None)

OnlineDictionaryLearning

Find a sparse representation of the data using Online Dictionary Learning.

This method will learn components, that is, linear combinations of input features, such that the transformed data has features whose activations are maximally sparse. This is also known as sparse coding, and is closely related to Independent Component Analysis (ICA). As such, it can be used for identifying statistically independent or otherwise meaningful and/or interpretable features. These features can therefore be highly useful in subsequent processing stages, for instance non-linear feature extraction or sparse machine learning techniques. In contrast to PCA or ICA, this method can easily learn more features than there were data dimensions in the input data. Also, unlike most other feature extractors, this method will attempt to estimate the transformed data in accordance with the sparse modeling assumption, instead of simply linearly transforming it (at some extra computational cost). This node offers multiple algorithms both for estimating the model and for transforming (reconstructing) the output data given the model.

Important: This node is adaptive to the data, that is, it will learn a transformation of the data that depends on the input data. In contrast to regular Dictionary Learning, this node can update itself incrementally on streaming data as well as in one shot on offline data. When applying this node to streaming data, keep in mind that, as the node keeps updating its model, the output space will keep changing, especially at the beginning, on the first few data points. If subsequent processing assumes that the data space remains fixed, this can cause problems. The most common 'abuse' would be to follow this node with any node that buffers calibration data for a period of time and then does a one-shot calibration of some state. A better setup would be to perform the buffering prior to the Online Dictionary Learning node, or to use regular Dictionary Learning, to avoid changing the data 'under the feet' of some other adaptive node. Once this node is calibrated, its trainable state can be saved to a model file and later loaded for continued use.

Like most other feature extraction nodes, this node can compute features between elements of an axis of your choice while treating elements of another axis as the observations, trials, or samples. It can also optionally compute multiple separate models on different slices of the data along some axis of choice. It is also possible to pool multiple axes for any of these roles.
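
The incremental variant roughly corresponds to scikit-learn's MiniBatchDictionaryLearning with partial_fit (an assumed backend; chunk shapes are hypothetical):

    import numpy as np
    from sklearn.decomposition import MiniBatchDictionaryLearning

    mbdl = MiniBatchDictionaryLearning(n_components=32, alpha=1.0, batch_size=3,
                                       fit_algorithm="lars",
                                       transform_algorithm="lasso_lars",
                                       shuffle=False, random_state=12345)
    for _ in range(20):                   # hypothetical stream of chunks
        chunk = np.random.randn(30, 16)   # (instance, feature) per chunk
        mbdl.partial_fit(chunk)           # the dictionary keeps updating online
    codes = mbdl.transform(np.random.randn(30, 16))   # sparse codes for new data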

More Info...

Version 1.0.0

Ports/Properties

  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • num_components
    Number of components to keep. If left unspecified, all components are kept, that is, the number of output features will correspond to the number of input dimensions.

    • verbose name: Number Of Components
    • default value: None
    • port type: IntPort
    • value type: int (can be None)
  • alpha
    Degree of sparsity. Higher values yield components with more sparse activity.

    • verbose name: Degree Of Sparsity
    • default value: 1.0
    • port type: FloatPort
    • value type: float (can be None)
  • transform_alpha
    Degree of sparsity during transformation. Does not apply to 'lars' case.

    • verbose name: Degree Of Sparsity Of Transformed Data
    • default value: 1.0
    • port type: FloatPort
    • value type: float (can be None)
  • batch_size
    Number of samples in each mini-batch.

    • verbose name: Batch Size
    • default value: 3
    • port type: IntPort
    • value type: int (can be None)
  • max_iter
    Maximum number of iterations. This is one of the stopping criteria to limit the compute time. The default is usually fine, and gains from increasing the number of iterations will be minimal (it can be worth experimenting with lower iteration numbers if the algorithm must finish in a fixed time budget, at a cost of potentially less accurate solutions).

    • verbose name: Maximum Number Of Iterations
    • default value: 1000
    • port type: IntPort
    • value type: int (can be None)
  • num_jobs
    Number of parallel compute jobs. This value only affects the running time and not the results. Values between 1 and twice the number of CPU cores make sense to expedite computation, but may temporarily reduce the responsiveness of the machine. The value of -1 stands for all available CPU cores.

    • verbose name: Number Of Parallel Jobs
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • verbosity
    Verbosity level. Higher numbers will produce more extensive diagnostic output.

    • verbose name: Verbosity Level
    • default value: 0
    • port type: IntPort
    • value type: int (can be None)
  • domain_axes
    Axes which form the input domain of the transformation. Features are computed between elements along these axes (or in other words, elements along these axes will be combined with each other to yield features). This is a comma-separated list of axis names (for example: "space, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. For time-series data, this is usually the space axis, and if features have already been extracted from the data through some other method, it would be the features axis. In rare cases it can also include other axes, such as frequency, lag, and time.

    • verbose name: Compute Features Between Axes
    • default value: (all others)
    • port type: StringPort
    • value type: str (can be None)
  • aggregate_axes
    Axes to aggregate statistics over. The elements along these axes are treated as the "trials", "samples", or, equivalently, "observations". Adaptive feature extractors will compute statistics along the elements of these axes. This is a comma-separated list of axis names (for example: "time, instance"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. This is almost always the instance axis (especially if the data has already been segmented, i.e., if the Segmentation node was used), but in some cases it may also be the time axis, or occasionally other axes.

    • verbose name: Treat Elements As Trials/samples Along Axes
    • default value: instance
    • port type: StringPort
    • value type: str (can be None)
  • separate_axes
    Axes along which to learn separate models. It is possible to use multiple separate feature-extraction models, each of which operates on a different slice of the data. This node does not combine data between elements along these axes in any way (although features between these elements may of course be combined in later stages, for instance in a classifier node). This is a comma-separated list of axis names (for example: "time, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes.

    • verbose name: Compute Separate Models Along Axes
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • shuffle
    Shuffle samples before forming batches. This can improve the convergence rate.

    • verbose name: Shuffle
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • transform_nonzeroes
    Targeted number of non-zero coefficients per column of the solution. This is only used when the transform algorithm is lars.

    • verbose name: Transform Nonzeroes
    • default value: None
    • port type: IntPort
    • value type: int (can be None)
  • fit_algorithm
    Method used during fitting. The lars method is faster than coordinate descent if the components are sparse.

    • verbose name: Fit Algorithm
    • default value: lars
    • port type: EnumPort
    • value type: object (can be None)
  • transform_algorithm
    Method used during transform. lasso_lars is fastest when components are sparse.

    • verbose name: Transform Algorithm
    • default value: lasso_lars
    • port type: EnumPort
    • value type: object (can be None)
  • split_sign
    Split the sparse feature vector into the concatenation of its negative and positive parts.

    • verbose name: Split Sign
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • random_seed
    Random seed. Different values may yield slightly different results.

    • verbose name: Random Seed
    • default value: 12345
    • port type: IntPort
    • value type: int (can be None)
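
A rough mapping of the ports above onto a concrete implementation can help when prototyping outside of a pipeline. The following is a minimal sketch assuming the node's behavior corresponds to scikit-learn's MiniBatchDictionaryLearning (an assumption; the node's actual backend is not stated here), with a hypothetical choice of 64 components:

```python
# Hedged sketch: assumes scikit-learn's MiniBatchDictionaryLearning as the backend.
from sklearn.decomposition import MiniBatchDictionaryLearning
import numpy as np

X = np.random.randn(500, 32)               # 500 instances x 32 input features

model = MiniBatchDictionaryLearning(
    n_components=64,                        # hypothetical; may exceed the input dimensionality
    batch_size=3,                           # batch_size port
    max_iter=1000,                          # max_iter port (recent scikit-learn versions)
    shuffle=False,                          # shuffle port
    fit_algorithm="lars",                   # fit_algorithm port
    transform_algorithm="lasso_lars",       # transform_algorithm port
    transform_n_nonzero_coefs=None,         # transform_nonzeroes port
    split_sign=False,                       # split_sign port
    n_jobs=1,                               # num_jobs port
    random_state=12345,                     # random_seed port
)
codes = model.fit_transform(X)              # sparse codes, shape (500, 64)
```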

PolynomialKernel

Generate polynomial combinations of features.

This node will compute all polynomial combinations between the values in the given data up to the given degree. For instance, if you have two values a and b in the input data and you use degree 3, this will output the values a, b, a*a, a*b, b*b, a*a*a, a*a*b, a*b*b, b*b*b. Since this is a non-linear feature extractor, using a linear classifier (or regression method) on these features will result in a classification rule that is non-linear in the original data (for instance, as with polynomial-kernel Support Vector Machines). This is most useful when combined with classifiers that do not already have a polynomial mode built into them, since using such a built-in mode can be more efficient than doing it in two steps. Also note that the number of output features will be combinatorially larger than the number of values in the input data, especially for higher degrees -- as a result, using this node can easily become intractable or exceed the amount of free memory. At the very least, the large number of features needs to be counteracted by strong regularization in the classifier, such as, for instance, sparsity, since otherwise the classifier will likely overfit to accidentally correlated features. Like most other feature extraction nodes, this node can compute features between elements of an axis of your choice while treating elements of another axis as the observations, trials, or samples. It can also optionally compute multiple separate models on different slices of the data along some axis of choice. It is also possible to pool multiple axes for any of these roles.
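
As a concrete illustration of the expansion above, here is a minimal sketch assuming behavior analogous to scikit-learn's PolynomialFeatures (an assumption; the node's actual backend is not stated):

```python
# Hedged sketch: degree-3 polynomial expansion of two features a and b.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[2.0, 3.0]])                       # one instance with a=2, b=3
poly = PolynomialFeatures(degree=3, interaction_only=False, include_bias=False)
print(poly.fit_transform(X))
# columns: a, b, a*a, a*b, b*b, a*a*a, a*a*b, a*b*b, b*b*b
# values:  2, 3, 4,   6,   9,   8,     12,    18,    27
```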

More Info...

Version 1.0.0

Ports/Properties

  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • degree
    Polynomial degree. Compute polynomial features up to this degree.

    • verbose name: Polynomial Degree
    • default value: 2
    • port type: IntPort
    • value type: int (can be None)
  • interaction_only
    Generate only interaction terms. If enabled, univariate powers of each feature (e.g., x^3) are not included.

    • verbose name: Generate Only Interaction Terms
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • include_bias
    Include a bias feature (constant 1). This should only be done when the subsequent machine learning node (if any) does not already include a bias term -- if it does, adding another one can result in a broken model.

    • verbose name: Include Bias Term
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • domain_axes
    Axes which form the input domain of the transformation. Features are computed between elements along these axes (or in other words, elements along these axes will be combined with each other to yield features). This is a comma-separated list of axis names (for example: "space, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. For time-series data, this is usually the space axis, and if features have already been extracted from the data through some other method, it would be the features axis. In rare cases it can also include other axes, such as frequency, lag, and time.

    • verbose name: Compute Features Between Axes
    • default value: (all others)
    • port type: StringPort
    • value type: str (can be None)
  • aggregate_axes
    Axes to aggregate statistics over. The elements along these axes are treated as the "trials", "samples", or, equivalently, "observations". Adaptive feature extractors will compute statistics along the elements of these axes. This is a comma-separated list of axis names (for example: "time, instance"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. This is almost always the instance axis (especially if the data has already been segmented, i.e., if the Segmentation node was used), but in some cases it may also be the time axis, or occasionally other axes.

    • verbose name: Treat Elements As Trials/samples Along Axes
    • default value: instance
    • port type: StringPort
    • value type: str (can be None)
  • separate_axes
    Axes along which to learn separate models. It is possible to use multiple separate feature-extraction models, each of which operates on a different slice of the data. This node does not combine data between elements along these axes in any way (although features between these elements may of course be combined in later stages, for instance in a classifier node). This is a comma-separated list of axis names (for example: "time, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes.

    • verbose name: Compute Separate Models Along Axes
    • default value:
    • port type: StringPort
    • value type: str (can be None)

PrincipalComponentAnalysis

Reduce dimensionality using Principal Component Analysis (PCA).

PCA is one of the most commonly-used dimensionality reduction techniques, and will produce components, that is, linear combinations of input features, such that the first component captures the direction of largest variance in the data, the second component the next-largest (orthogonal) direction, and so on. Note that, since PCA is not aware of any "labels" of the data, and as such only performs what is known as unsupervised learning, there is no guarantee that PCA will not remove data dimensions that would have been informative about those labels, that is, useful to a subsequent supervised learning method. Nevertheless, dimensionality reduction can greatly speed up subsequent data processing, or make it tractable in the first place. Also, since the components are sorted by the amount of variance in the data that they explain, in some settings they may yield individually interpretable or otherwise meaningful features. Important: This node is adaptive to the data, that is, it will learn a transformation of the data that depends on the input data. In order to learn this transformation, the node requires a reasonable amount of input data for calibration or "training" (otherwise it will yield an ill-fitting or noisy model). Since this feature extraction method is not capable of being trained incrementally on streaming data, the method requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data, where it passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, the trainable state of this node can be saved to a model file and later loaded for continued use. Like most other feature extraction nodes, this node can compute features between elements of an axis of your choice while treating elements of another axis as the observations, trials, or samples. It can also optionally compute multiple separate models on different slices of the data along some axis of choice. It is also possible to pool multiple axes for any of these roles.
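
As a rough point of reference, the behavior described above can be prototyped with scikit-learn's PCA (an assumption; the node's actual backend is not stated), using hypothetical data and 10 retained components:

```python
# Hedged sketch: reduce 64 input features to 10 principal components.
import numpy as np
from sklearn.decomposition import PCA

X = np.random.randn(1000, 64)          # instances x features
pca = PCA(n_components=10,             # num_components port
          whiten=False)                # whiten port
Z = pca.fit_transform(X)               # shape (1000, 10), sorted by explained variance
print(pca.explained_variance_ratio_)   # fraction of variance captured per component
```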

More Info...

Version 1.0.0

Ports/Properties

  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • num_components
    Number of components to keep. If left unspecified, all components are kept, that is, the number of output features will correspond to the number of input dimensions.

    • verbose name: Number Of Components
    • default value: None
    • port type: IntPort
    • value type: int (can be None)
  • whiten
    Normalize (whiten) the outputs. The retained components are rescaled to unit variance, yielding decorrelated outputs on a common scale.

    • verbose name: Whiten
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • domain_axes
    Axes which form the input domain of the transformation. Features are computed between elements along these axes (or in other words, elements along these axes will be combined with each other to yield features). This is a comma-separated list of axis names (for example: "space, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. For time-series data, this is usually the space axis, and if features have already been extracted from the data through some other method, it would be the features axis. In rare cases it can also include other axes, such as frequency, lag, and time.

    • verbose name: Compute Features Between Axes
    • default value: (all others)
    • port type: StringPort
    • value type: str (can be None)
  • aggregate_axes
    Axes to aggregate statistics over. The elements along these axes are treated as the "trials", "samples", or, equivalently, "observations". Adaptive feature extractors will compute statistics along the elements of these axes. This is a comma-separated list of axis names (for example: "time, instance"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. This is almost always the instance axis (especially if the data has already been segmented, i.e., if the Segmentation node was used), but in some cases it may also be the time axis, or occasionally other axes.

    • verbose name: Treat Elements As Trials/samples Along Axes
    • default value: instance
    • port type: StringPort
    • value type: str (can be None)
  • separate_axes
    Axes along which to learn separate models. It is possible to use multiple separate feature-extraction models, each of which operates on a different slice of the data. This node does not combine data between elements along these axes in any way (although features between these elements may of course be combined in later stages, for instance in a classifier node). This is a comma-separated list of axis names (for example: "time, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes.

    • verbose name: Compute Separate Models Along Axes
    • default value:
    • port type: StringPort
    • value type: str (can be None)

RandomProjections

Generate features from random linear projections of the input data.

This node will essentially generate a random matrix, and transform the data by that matrix. Typically, this is used either to reduce the dimensionality of the data, or to generate novel combinations of features when used with subsequent non-linear transformations. By default this node will auto-determine the number of components (and thus the number of output features) from the data. Like most other feature extraction nodes, this node can compute features between elements of an axis of your choice while treating elements of another axis as the observations, trials, or samples. It can also optionally compute multiple separate models on different slices of the data along some axis of choice. It is also possible to pool multiple axes for any of these roles.
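
For a rough idea of how the output dimensionality is auto-determined, the following sketch assumes behavior similar to scikit-learn's GaussianRandomProjection and its Johnson-Lindenstrauss bound (an assumption; the node's actual backend is not stated):

```python
# Hedged sketch: project high-dimensional data through a random Gaussian matrix.
import numpy as np
from sklearn.random_projection import GaussianRandomProjection, johnson_lindenstrauss_min_dim

X = np.random.randn(2000, 10000)                         # instances x features
# With n_components="auto", the target dimensionality follows from the number
# of samples and the distortion parameter eps (cf. the epsilon port below).
print(johnson_lindenstrauss_min_dim(n_samples=2000, eps=0.1))
rp = GaussianRandomProjection(n_components="auto", eps=0.1, random_state=12345)
Z = rp.fit_transform(X)                                  # data multiplied by a random matrix
```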

More Info...

Version 1.0.0

Ports/Properties

  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • num_components
    Number of components to generate. If left to the default, the number of components will be determined based on the input data. This can be fairly conservative, i.e., the number of output components may be larger than strictly necessary to represent the input well enough.

    • verbose name: Number Of Components
    • default value: None
    • port type: IntPort
    • value type: int (can be None)
  • domain_axes
    Axes which form the input domain of the transformation. Features are computed between elements along these axes (or in other words, elements along these axes will be combined with each other to yield features). This is a comma-separated list of axis names (for example: "space, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. For time-series data, this is usually the space axis, and if features have already been extracted from the data through some other method, it would be the features axis. In rare cases it can also include other axes, such as frequency, lag, and time.

    • verbose name: Compute Features Between Axes
    • default value: (all others)
    • port type: StringPort
    • value type: str (can be None)
  • aggregate_axes
    Axes to aggregate statistics over. The elements along these axes are treated as the "trials", "samples", or, equivalently, "observations". Adaptive feature extractors will compute statistics along the elements of these axes. This is a comma-separated list of axis names (for example: "time, instance"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. This is almost always the instance axis (especially if the data has already been segmented, i.e., if the Segmentation node was used), but in some cases it may also be the time axis, or occasionally other axes.

    • verbose name: Treat Elements As Trials/samples Along Axes
    • default value: instance
    • port type: StringPort
    • value type: str (can be None)
  • separate_axes
    Axes along which to learn separate models. It is possible to use multiple separate feature-extraction models, each of which operates on a different slice of the data. This node does not combine data between elements along these axes in any way (although features between these elements may of course be combined in later stages, for instance in a classifier node). This is a comma-separated list of axis names (for example: "time, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes.

    • verbose name: Compute Separate Models Along Axes
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • epsilon
    Embedding quality. Only used when the number of components is auto-deduced. Smaller values yield higher-quality embeddings and higher-dimensional results.

    • verbose name: Epsilon
    • default value: 0.1
    • port type: FloatPort
    • value type: float (can be None)
  • random_seed
    Random seed. Different values yield different projections.

    • verbose name: Random Seed
    • default value: 12345
    • port type: IntPort
    • value type: int (can be None)

SparsePrincipalComponentAnalysis

Reduce dimensionality using Sparse Principal Component Analysis (Sparse PCA).

Basic PCA is one of the most commonly-used dimensionality reduction techniques, and will produce components, that is, linear combinations of input features, such that the first component captures the direction of largest variance in the data, the second component the next-largest (orthogonal) direction, and so on. The difference in Sparse PCA is that these components are additionally based on only a small ("sparse") subset of the input features, rather than combining all features together as PCA does. The tradeoff between sparsity and how well the components capture the directions of maximum variance in the data can be adjusted using the sparsity parameter. Note that, since this method is not aware of any "labels" of the data, and as such only performs what is known as unsupervised learning, there is no guarantee that the method will not remove data dimensions that would have been informative about those labels, that is, useful to a subsequent supervised learning method. Nevertheless, dimensionality reduction can greatly speed up subsequent data processing, or make it tractable in the first place. Also, the components produced by Sparse PCA can, given the right data, produce interpretable or otherwise meaningful features, which can enable subsequent machine learning methods to make good use of them. Important: This node is adaptive to the data, that is, it will learn a transformation of the data that depends on the input data. In order to learn this transformation, the node requires a reasonable amount of input data for calibration or "training" (otherwise it will yield an ill-fitting or noisy model). Since this feature extraction method is not capable of being trained incrementally on streaming data, the method requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data, where it passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, the trainable state of this node can be saved to a model file and later loaded for continued use. Like most other feature extraction nodes, this node can compute features between elements of an axis of your choice while treating elements of another axis as the observations, trials, or samples. It can also optionally compute multiple separate models on different slices of the data along some axis of choice. It is also possible to pool multiple axes for any of these roles.
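
A minimal sketch of how the parameters listed below fit together, assuming the node wraps scikit-learn's SparsePCA (an assumption; the actual backend is not stated):

```python
# Hedged sketch: 10 sparse components, each supported on few input features.
import numpy as np
from sklearn.decomposition import SparsePCA

X = np.random.randn(500, 64)
spca = SparsePCA(
    n_components=10,       # num_components port (hypothetical value)
    alpha=1.0,             # alpha port (degree of sparsity)
    ridge_alpha=0.01,      # ridge_alpha port
    max_iter=1000,         # max_iter port
    tol=1e-4,              # tolerance port
    method="lars",         # method port
    n_jobs=1,              # num_jobs port
    random_state=12345,    # random_seed port
)
Z = spca.fit_transform(X)
# Each row of spca.components_ has only a small number of non-zero entries.
```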

More Info...

Version 1.0.0

Ports/Properties

  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • num_components
    Number of components to keep. If left unspecified, all components are kept, that is, the number of output features will correspond to the number of input dimensions.

    • verbose name: Number Of Components
    • default value: None
    • port type: IntPort
    • value type: int (can be None)
  • alpha
    Degree of sparsity. Higher values yield components with more sparse support.

    • verbose name: Degree Of Sparsity
    • default value: 1.0
    • port type: FloatPort
    • value type: float (can be None)
  • max_iter
    Maximum number of iterations. This is one of the stopping criteria to limit the compute time. The default is usually fine, and gains from increasing the number of iterations will be minimal (it can be worth experimenting with lower iteration numbers if the algorithm must finish in a fixed time budget, at a cost of potentially less accurate solutions).

    • verbose name: Maximum Number Of Iterations
    • default value: 1000
    • port type: IntPort
    • value type: int (can be None)
  • num_jobs
    Number of parallel compute jobs. This value only affects the running time and not the results. Values between 1 and twice the number of CPU cores make sense to expedite computation, but may temporarily reduce the responsiveness of the machine. The value of -1 stands for all available CPU cores.

    • verbose name: Number Of Parallel Jobs
    • default value: 1
    • port type: IntPort
    • value type: int (can be None)
  • verbosity
    Verbosity level. Higher numbers will produce more extensive diagnostic output.

    • verbose name: Verbosity Level
    • default value: 0
    • port type: IntPort
    • value type: int (can be None)
  • domain_axes
    Axes which form the input domain of the transformation. Features are computed between elements along these axes (or in other words, elements along these axes will be combined with each other to yield features). This is a comma-separated list of axis names (for example: "space, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. For time-series data, this is usually the space axis, and if features have already been extracted from the data through some other method, it would be the features axis. In rare cases it can also include other axes, such as frequency, lag, and time.

    • verbose name: Compute Features Between Axes
    • default value: (all others)
    • port type: StringPort
    • value type: str (can be None)
  • aggregate_axes
    Axes to aggregate statistics over. The elements along these axes are treated as the "trials", "samples", or, equivalently, "observations". Adaptive feature extractors will compute statistics along the elements of these axes. This is a comma-separated list of axis names (for example: "time, instance"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. This is almost always the instance axis (especially if the data has already been segmented, i.e., if the Segmentation node was used), but in some cases it may also be the time axis, or occasionally other axes.

    • verbose name: Treat Elements As Trials/samples Along Axes
    • default value: instance
    • port type: StringPort
    • value type: str (can be None)
  • separate_axes
    Axes along which to learn separate models. It is possible to use multiple separate feature-extraction models, each of which operates on a different slice of the data. This node does not combine data between elements along these axes in any way (although features between these elements may of course be combined in later stages, for instance in a classifier node). This is a comma-separated list of axis names (for example: "time, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes.

    • verbose name: Compute Separate Models Along Axes
    • default value:
    • port type: StringPort
    • value type: str (can be None)
  • ridge_alpha
    Amount of ridge shrinkage applied when computing the transformed data, to improve conditioning.

    • verbose name: Ridge Alpha
    • default value: 0.01
    • port type: FloatPort
    • value type: float (can be None)
  • tolerance
    Convergence tolerance. This is the desired error tolerance, i.e., the acceptable inaccuracy in the solution. Using larger values gives less accurate results, but will lead to faster compute times. Note that, for biosignal-driven machine learning systems, one often does not need very small tolerances.

    • verbose name: Tolerance
    • default value: 0.0001
    • port type: FloatPort
    • value type: float (can be None)
  • method
    Method for optimization. The lars method is faster than coordinate descent if the components are sparse.

    • verbose name: Method
    • default value: lars
    • port type: EnumPort
    • value type: object (can be None)
  • random_seed
    Random seed. Different values may yield slightly different results.

    • verbose name: Random Seed
    • default value: 12345
    • port type: IntPort
    • value type: int (can be None)

TensorDecomposition

Decompose a tensor into a number of rank-1 tensors.

This is analogous to a tensor version of the matrix SVD. The algorithm used is the canonical polyadic decomposition (also known as PARAFAC) via alternating least-squares. For example, a tensor with dimensions N neurons, T time points, and I instances (trials) can be decomposed into R latent factors, where each factor is rank-1 and can be thought of as a linear combination of neurons, time points, and trials. Important: This node is adaptive to the data, that is, it will learn a transformation of the data that depends on the input data. In order to learn this transformation, the node requires a reasonable amount of input data for calibration or "training" (otherwise it will yield an ill-fitting or noisy model). Since this feature extraction method is not capable of being trained incrementally on streaming data, the method requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data, where it passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, the trainable state of this node can be saved to a model file and later loaded for continued use.
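
A minimal sketch of such a decomposition using the tensorly library (an assumption; the node's actual backend is not stated):

```python
# Hedged sketch: CP/PARAFAC decomposition of a 3-way tensor via alternating least-squares.
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

X = tl.tensor(np.random.randn(30, 100, 50))   # e.g., neurons x time points x instances
cp = parafac(X, rank=3, tol=1e-6)             # rank ~ num_components port, tol ~ tol port
weights, factors = cp                         # one factor matrix per axis:
                                              # shapes (30, 3), (100, 3), (50, 3)
X_hat = tl.cp_to_tensor(cp)                   # rank-3 reconstruction of X
```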

More Info...

Version 1.0.0

Ports/Properties

  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • num_components
    Number of components to include in the model.

    • verbose name: Number Of Components
    • default value: 3
    • port type: IntPort
    • value type: int (can be None)
  • tol
    Optimization convergence criterion.

    • verbose name: Tol
    • default value: 1e-06
    • port type: FloatPort
    • value type: float (can be None)
  • aggregate_axes
    Axes to aggregate statistics over. The elements along these axes are treated as the "trials", "samples", or, equivalently, "observations". Adaptive feature extractors will compute statistics along the elements of these axes. This is a comma-separated list of axis names (for example: ["time", "instance"]). This is almost always the instance axis (especially if the data has already been segmented, i.e., if the Segmentation node was used), but in some cases it may also be the time axis, or occasionally other axes. In this Node, these are the axes to preserve in the output. Use ["none"] if no axes are to be preserved.

    • verbose name: Treat Elements As Trials/samples Along Axes
    • default value: ['instance']
    • port type: ListPort
    • value type: list (can be None)

WhiteningTransform

Whiten (decorrelate and normalize) the given data without rotation.

A whitening transformation can be used to remove correlations between features. In addition, the features will also be normalized to unit variance, and optionally centered to zero mean. Whitening is a comprehensive type of normalization that can be used as pre-processing for a wide variety of subsequent processing stages. Important: This node is adaptive to the data, that is, it will learn a transformation of the data that depends on the input data. In order to learn this transformation, the node requires a reasonable amount of input data for calibration or "training" (otherwise it will yield an ill-fitting or noisy model). Since this feature extraction method is not capable of being trained incrementally on streaming data, the method requires a data packet that contains the entire training data; this training data packet can either be accumulated online and then released in one shot using the Accumulate Calibration Data node, or it can be imported from a separate calibration recording and then spliced into the processing pipeline using the Inject Calibration Data, where it passes through the same nodes as the regular data until it reaches the machine learning node, where it is used for calibration. Once this node is calibrated, the trainable state of this node can be saved to a model file and later loaded for continued use. Like most other feature extraction nodes, this node can compute features between elements of an axis of your choice while treating elements of another axis as the observations, trials, or samples. It can also optionally compute multiple separate models on different slices of the data along some axis of choice. It is also possible to pool multiple axes for any of these roles.
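
The underlying math can be illustrated with a small numpy sketch of shrinkage-regularized ZCA whitening (whitening "without rotation"); this is illustrative only and not necessarily the node's exact implementation:

```python
# Hedged sketch: center, decorrelate, and normalize data without rotating it (ZCA).
import numpy as np

def zca_whiten(X, shrinkage=0.01, center=True):
    """X: samples x features. Returns whitened data of the same shape."""
    if center:
        X = X - X.mean(axis=0, keepdims=True)
    C = np.cov(X, rowvar=False)
    # Shrink the covariance toward a scaled identity to avoid degenerate solutions.
    n = C.shape[0]
    C = (1.0 - shrinkage) * C + shrinkage * (np.trace(C) / n) * np.eye(n)
    d, V = np.linalg.eigh(C)
    W = V @ np.diag(1.0 / np.sqrt(d)) @ V.T   # ZCA: rotate, rescale, rotate back
    return X @ W

Xw = zca_whiten(np.random.randn(1000, 8))
print(np.round(np.cov(Xw, rowvar=False), 2))  # close to the identity matrix
```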

More Info...

Version 1.0.0

Ports/Properties

  • set_breakpoint
    Set a breakpoint on this node. If this is enabled, your debugger (if one is attached) will trigger a breakpoint.

    • verbose name: Set Breakpoint (Debug Only)
    • default value: False
    • port type: BoolPort
    • value type: bool (can be None)
  • data
    Data to process.

    • verbose name: Data
    • default value: None
    • port type: DataPort
    • value type: Packet (can be None)
    • data direction: INOUT
  • shrinkage
    Regularization strength. This is primarily to prevent degenerate solutions.

    • verbose name: Regularization Strength
    • default value: 0.01
    • port type: FloatPort
    • value type: float (can be None)
  • center
    Center data before whitening. This will remove the mean.

    • verbose name: Center Data
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • decorrelate
    Decorrelate the data. If disabled, only normalization will be performed.

    • verbose name: Decorrelate Data
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • retain_axes
    Retain original axes. If disabled, the domain axes will be replaced by a feature axis.

    • verbose name: Retain Axes
    • default value: True
    • port type: BoolPort
    • value type: bool (can be None)
  • domain_axes
    Axes which form the input domain of the transformation. Features are computed between elements along these axes (or in other words, elements along these axes will be combined with each other to yield features). This is a comma-separated list of axis names (for example: "space, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. For time-series data, this is usually the space axis, and if features have already been extracted from the data through some other method, it would be the features axis. In rare cases it can also include other axes, such as frequency, lag, and time.

    • verbose name: Compute Features Between Axes
    • default value: (all others)
    • port type: StringPort
    • value type: str (can be None)
  • aggregate_axes
    Axes to aggregate statistics over. The elements along these axes are treated as the "trials", "samples", or, equivalently, "observations". Adaptive feature extractors will compute statistics along the elements of these axes. This is a comma-separated list of axis names (for example: "time, instance"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes. This is almost always the instance axis (especially if the data has already been segmented, i.e., if the Segmentation node was used), but in some cases it may also be the time axis, or occasionally other axes.

    • verbose name: Treat Elements As Trials/samples Along Axes
    • default value: instance
    • port type: StringPort
    • value type: str (can be None)
  • separate_axes
    Axes along which to learn separate models. It is possible to use multiple separate feature-extraction models, each of which operates on a different slice of the data. This node does not combine data between elements along these axes in any way (although features between these elements may of course be combined in later stages, for instance in a classifier node). This is a comma-separated list of axis names (for example: "time, frequency"), possibly empty, or the special string "(all others)", which stands for all axes that are not listed in the other two lists of axes.

    • verbose name: Compute Separate Models Along Axes
    • default value:
    • port type: StringPort
    • value type: str (can be None)