matclustering.core package

Submodules

matclustering.core.AbstractTrajectoryClustering module

MAT-Tools: Python Framework for Multiple Aspect Trajectory Data Mining

This application offers a tool to support the user in clustering multiple aspect trajectory data. It is part of a unified framework for multiple aspect trajectories and, more generally, for multidimensional sequence data mining methods. Copyright (C) 2022, MIT license (this portion of code is subject to licensing from source project distribution)

Created in April 2024. Copyright (C) 2024, License GPL Version 3 or later (see LICENSE file)

Authors:
  • Tarlis Portela

class matclustering.core.AbstractTrajectoryClustering.HSTrajectoryClustering(name='NAME?', random_state=1, n_jobs=1, verbose=False)[source]

Bases: TrajectoryClustering

Class for hyperparameter search in multiple-aspect trajectory clustering algorithms.

This class extends the TrajectoryClustering class to include hyperparameter optimization (tuning through grid search) and model validation. It implements methods for training, testing, and saving models, along with detailed reporting of the results.

best_config

The best hyperparameter configuration found during training.

Type:

list

prepare_input(X):

Prepares the input data for training (to be implemented by subclasses).

if_config(config=None):

Returns the current configuration or default configuration.

train(dir_validation='.'):

Trains the model using grid search over hyperparameters and returns the training report.

test(rounds=1, dir_evaluation='.'):

Tests the best model on the dataset and returns evaluation metrics.

save(dir_path='.', modelfolder='model'):

Saves the model and its training/testing results to the specified directory.

training_report():

Returns the training report DataFrame.

testing_report():

Returns the testing report DataFrame.
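The grid-search training described above can be pictured with a small standalone sketch. It does not use matclustering: `cluster_by_gap`, `pair_agreement`, the `max_gap` hyperparameter, and the toy data are all hypothetical, and the best configuration is kept as a dict here purely for readability. The idea shown is only the general pattern of evaluating every configuration, recording a report row for each, and keeping the best one.

```python
from itertools import product


def cluster_by_gap(points, max_gap):
    """Toy 1-D clustering: open a new cluster when the gap to the previous point exceeds max_gap."""
    order = sorted(range(len(points)), key=lambda i: points[i])
    labels = [0] * len(points)
    current = 0
    for prev, idx in zip(order, order[1:]):
        if points[idx] - points[prev] > max_gap:
            current += 1
        labels[idx] = current
    return labels


def pair_agreement(y_true, y_pred):
    """Fraction of point pairs on which the two labelings agree (same cluster vs. different cluster)."""
    n = len(y_true)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    agree = sum((y_true[i] == y_true[j]) == (y_pred[i] == y_pred[j]) for i, j in pairs)
    return agree / len(pairs)


def train(points, y_true, grid):
    """Evaluate every configuration in the grid and return (best_config, report rows)."""
    names = list(grid)
    report = []
    for values in product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        labels = cluster_by_gap(points, **config)
        report.append({**config, "score": pair_agreement(y_true, labels)})
    best = max(report, key=lambda row: row["score"])
    best_config = {k: v for k, v in best.items() if k != "score"}
    return best_config, report


# Hypothetical toy data: three well-separated groups on a line.
points = [1.0, 1.2, 1.4, 5.0, 5.3, 9.8, 10.0]
y_true = [0, 0, 0, 1, 1, 2, 2]
best_config, report = train(points, y_true, {"max_gap": [0.1, 1.0, 6.0]})
print(best_config)
```

In this sketch the report is a plain list of dicts, one per configuration; the class documented here returns its reports as DataFrames instead.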

abstract if_config(config=None)[source]
abstract prepare_input(X)[source]
save(dir_path='.', modelfolder='model')[source]
test(rounds=1, dir_evaluation='.')[source]
testing_report()[source]
train(dir_validation='.')[source]
training_report()[source]
class matclustering.core.AbstractTrajectoryClustering.TrajectoryClustering(name='NAME?', random_state=1, n_jobs=1, verbose=False)[source]

Bases: ABC

Abstract base class for trajectory clustering algorithms.

This class provides a framework for clustering multiple-aspect trajectory data. It allows the configuration of clustering parameters, generates hyperparameter combinations for grid search optimization (see the HSTrajectoryClustering class), and evaluates clustering results using various metrics.
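As a rough illustration of the abstract-base-class pattern, the sketch below defines a hypothetical `ToyClustering` base whose `fit` method relies on two abstract hooks, `prepare_input` and `create`, that a subclass must supply. The class names and the parity "model" are invented for this example and are not part of matclustering.

```python
from abc import ABC, abstractmethod


class ToyClustering(ABC):
    """Hypothetical stand-in for an abstract clustering base class."""

    def __init__(self, name="toy", verbose=False):
        self.name = name
        self.isverbose = verbose
        self.config = {}
        self.model = None

    @abstractmethod
    def prepare_input(self, X):
        """Convert raw input into the form the model expects."""

    @abstractmethod
    def create(self, config=None):
        """Build and return the model (here: any callable mapping data to labels)."""

    def fit(self, X, config=None):
        # The shared workflow lives in the base class; subclasses only fill in the hooks.
        data = self.prepare_input(X)
        self.model = self.create(config)
        return self.model(data)


class ParityClustering(ToyClustering):
    """Toy subclass: groups integers by parity."""

    def prepare_input(self, X):
        return [int(x) for x in X]

    def create(self, config=None):
        return lambda data: [x % 2 for x in data]


labels = ParityClustering().fit(["1", "2", "3", "4"])
print(labels)  # [1, 0, 1, 0]
```

Because `ToyClustering` still has unimplemented abstract methods, calling `ToyClustering()` directly raises `TypeError`; only concrete subclasses can be instantiated.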

name

Name of the clustering model.

Type:

str

isverbose

Flag indicating whether to print verbose output.

Type:

bool

save_results

Flag to indicate if results should be saved.

Type:

bool

config

Configuration dictionary to hold hyperparameters and settings.

Type:

dict

model

The clustering model instance.

Type:

object

report

Report of clustering evaluation metrics.

Type:

DataFrame

test_report

Report of clustering evaluation metrics on test data.

Type:

DataFrame

add_config(**kwargs):

Updates the configuration dictionary with new parameters.

grid_search(*args):

Generates combinations of hyperparameters for grid search.

duration():

Returns the elapsed time, in milliseconds, since the model was initialized.

clear():

Clears the current model instance.

message(pbar, text):

Displays a message during model training/testing.

prepare_input(X, metric=None, dataset_descriptor=None):

Prepares the input data for clustering (to be implemented by subclasses).

create(config=None):

Creates and returns the clustering model instance (to be implemented by subclasses).

score(y_test, y_pred, X=None):

Calculates and returns various clustering evaluation metrics as a DataFrame.

summary():

Returns a summary of the clustering results.

fit(X, config=None):

Fits the clustering model to the input data and returns the report and cluster labels.

save(dir_path='.', modelfolder='model'):

Saves the clustering model and its results to the specified directory.

cluestering_report():

Generates a DataFrame of cluster predictions and labels.
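To make the roles of add_config and grid_search more concrete, here is a minimal standalone sketch of the same idea: hyperparameter candidates are stored in a dictionary, and all combinations are enumerated with `itertools.product`. The free functions below only mirror the method names; their bodies, the return type, and the `n_clusters` and `linkage` keys are assumptions, not the library's implementation.

```python
from itertools import product


def add_config(config, **kwargs):
    """Merge new parameters into the configuration dictionary."""
    config.update(kwargs)
    return config


def grid_search(*args):
    """Return every combination of the given lists of candidate values."""
    return list(product(*args))


# Hypothetical hyperparameters, each with a list of candidate values.
config = {"n_clusters": [2, 3]}
add_config(config, linkage=["average", "complete"])

combos = grid_search(config["n_clusters"], config["linkage"])
for n_clusters, linkage in combos:
    print(n_clusters, linkage)
```

Two candidate lists of two values each give four combinations, ordered with the last argument varying fastest.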

add_config(**kwargs)[source]
clear()[source]
cluestering_report()[source]
abstract create(config=None)[source]
duration()[source]
fit(X, config=None)[source]
message(pbar, text)[source]
abstract prepare_input(X, metric=None, dataset_descriptor=None)[source]
save(dir_path='.', modelfolder='model')[source]
score(y_test, y_pred, X=None)[source]
summary()[source]
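The documentation for score(y_test, y_pred, X=None) does not list which metrics are computed. As an illustration only, the sketch below computes two common external clustering measures, the Rand index and purity, from reference labels and predicted cluster labels. The choice of metrics and the dict return type are assumptions; the documented method returns a DataFrame.

```python
from collections import Counter
from itertools import combinations


def rand_index(y_true, y_pred):
    """Fraction of point pairs on which both labelings agree (together vs. apart)."""
    pairs = list(combinations(range(len(y_true)), 2))
    agree = sum((y_true[i] == y_true[j]) == (y_pred[i] == y_pred[j]) for i, j in pairs)
    return agree / len(pairs)


def purity(y_true, y_pred):
    """Share of points that carry the majority reference label of their predicted cluster."""
    clusters = {}
    for true_label, cluster in zip(y_true, y_pred):
        clusters.setdefault(cluster, Counter())[true_label] += 1
    majority_total = sum(c.most_common(1)[0][1] for c in clusters.values())
    return majority_total / len(y_true)


def score(y_true, y_pred):
    """Collect the metrics in one mapping (a dict here, for simplicity)."""
    return {"rand_index": rand_index(y_true, y_pred), "purity": purity(y_true, y_pred)}


metrics = score(["a", "a", "b", "b"], [0, 0, 1, 0])
print(metrics)  # {'rand_index': 0.5, 'purity': 0.75}
```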

matclustering.core.SimilarityClustering module

MAT-Tools: Python Framework for Multiple Aspect Trajectory Data Mining

This application offers a tool to support the user in clustering multiple aspect trajectory data. It is part of a unified framework for multiple aspect trajectories and, more generally, for multidimensional sequence data mining methods. Copyright (C) 2022, MIT license (this portion of code is subject to licensing from source project distribution)

Created in April 2024. Copyright (C) 2024, License GPL Version 3 or later (see LICENSE file)

Authors:
  • Tarlis Portela

class matclustering.core.SimilarityClustering.SimilarityClustering(name, random_state=1, n_jobs=1, verbose=False)[source]

Bases: HSTrajectoryClustering

Similarity-based clustering for multiple-aspect trajectory data.

This class extends the HSTrajectoryClustering class to provide clustering functionality based on similarity metrics for trajectory data. It includes methods to prepare input data and compute similarity matrices using various metrics.

name

Name of the clustering model.

Type:

str

metric

Similarity metric used for clustering.

Type:

object

X

Similarity matrix of the input trajectories.

Type:

array-like

labels

List of labels associated with each trajectory in the input data.

Type:

list

default_metric(dataset_descriptor):

Initializes and returns the default similarity metric (MUITAS) for the dataset.

prepare_input(X, metric=None, dataset_descriptor=None):

Prepares the input data by converting it to trajectories and calculating the similarity matrix.
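The similarity-matrix step can be sketched without the library as follows. The `jaccard` function is a deliberately simple stand-in and is not MUITAS; the trajectories are hypothetical lists of visited places, and the sketch assumes a symmetric metric for which a trajectory's similarity to itself is 1.

```python
def jaccard(a, b):
    """Toy similarity: share of distinct visited places that two trajectories have in common."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


def similarity_matrix(trajectories, metric=jaccard):
    """Build a symmetric n x n matrix of pairwise similarities."""
    n = len(trajectories)
    matrix = [[1.0] * n for _ in range(n)]  # diagonal: self-similarity assumed to be 1
    for i in range(n):
        for j in range(i + 1, n):
            # Compute each pair once and mirror it, since the metric is symmetric.
            matrix[i][j] = matrix[j][i] = metric(trajectories[i], trajectories[j])
    return matrix


# Hypothetical trajectories described only by the places visited.
trajectories = [
    ["home", "work", "gym"],
    ["home", "work"],
    ["park", "cafe"],
]
X = similarity_matrix(trajectories)
for row in X:
    print(row)
```

A matrix built this way can be handed to any clustering algorithm that accepts precomputed similarities or, after conversion, distances.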

default_metric(dataset_descriptor)[source]
prepare_input(X, metric=None, dataset_descriptor=None)[source]

Module contents