matclustering.core package
Submodules
matclustering.core.AbstractTrajectoryClustering module
MAT-Tools: Python Framework for Multiple Aspect Trajectory Data Mining
The present application offers a tool to support the user in the clustering of multiple aspect trajectory data. It integrates into a unique framework for multiple aspect trajectories and, more generally, for multidimensional sequence data mining methods. Copyright (C) 2022, MIT license (this portion of code is subject to licensing from source project distribution)
Created on Apr, 2024 Copyright (C) 2024, License GPL Version 3 or superior (see LICENSE file)
- Authors:
Tarlis Portela
- class matclustering.core.AbstractTrajectoryClustering.HSTrajectoryClustering(name='NAME?', random_state=1, n_jobs=1, verbose=False)[source]
Bases:
TrajectoryClustering
Class for hyperparameter search in multiple-aspect trajectory clustering algorithms.
This class extends the TrajectoryClustering class to include hyperparameter optimization (tuning through grid search) and model validation. It implements methods for training, testing, and saving models, along with detailed reporting of the results.
- best_config
The best hyperparameter configuration found during training.
- Type:
list
- prepare_input(X):
Prepares the input data for training (to be implemented by subclasses).
- if_config(config=None):
Returns the current configuration or default configuration.
- train(dir_validation='.'):
Trains the model using grid search over hyperparameters and returns the training report.
- test(rounds=1, dir_evaluation='.'):
Tests the best model on the dataset and returns evaluation metrics.
- save(dir_path='.', modelfolder='model'):
Saves the model and its training/testing results to the specified directory.
- training_report():
Returns the training report DataFrame.
- testing_report():
Returns the testing report DataFrame.
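The grid-search training described above can be illustrated with a minimal, standalone sketch. It does not use matclustering itself: the helper names (expand_grid, train, toy_evaluate), the parameter names, and the use of plain dictionaries for configurations are assumptions made for illustration only, and the library's own train() may select and report the best configuration differently.

```python
from itertools import product


def expand_grid(grid):
    """Expand {'param': [values, ...]} into a list of configuration dicts."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


def train(grid, evaluate):
    """Score every configuration; return (report rows, best configuration)."""
    report = []
    for config in expand_grid(grid):
        report.append({**config, "score": evaluate(config)})
    best = max(report, key=lambda row: row["score"])
    best_config = {k: v for k, v in best.items() if k != "score"}
    return report, best_config


def toy_evaluate(config):
    """Hypothetical objective that peaks at k=3 with 'average' linkage."""
    bonus = 0.5 if config["linkage"] == "average" else 0.0
    return -abs(config["k"] - 3) + bonus


report, best_config = train(
    {"k": [2, 3, 4], "linkage": ["single", "average"]}, toy_evaluate
)
print(best_config)
```

In this sketch the report is a list of rows (one per configuration), which plays the role of the training report DataFrame; a real subclass would replace toy_evaluate with an actual clustering run and validation metric.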
- class matclustering.core.AbstractTrajectoryClustering.TrajectoryClustering(name='NAME?', random_state=1, n_jobs=1, verbose=False)[source]
Bases:
ABC
Abstract base class for trajectory clustering algorithms.
This class provides a framework for clustering multiple-aspect trajectory data. It allows the configuration of clustering parameters, generates hyperparameter combinations for grid search optimization (see the HSTrajectoryClustering class), and evaluates clustering results using various metrics.
- name
Name of the clustering model.
- Type:
str
- isverbose
Flag indicating whether to print verbose output.
- Type:
bool
- save_results
Flag to indicate if results should be saved.
- Type:
bool
- config
Configuration dictionary to hold hyperparameters and settings.
- Type:
dict
- model
The clustering model instance.
- Type:
object
- report
Report of clustering evaluation metrics.
- Type:
DataFrame
- test_report
Report of clustering evaluation metrics on test data.
- Type:
DataFrame
- add_config(**kwargs):
Updates the configuration dictionary with new parameters.
- grid_search(*args):
Generates combinations of hyperparameters for grid search.
- duration():
Returns the elapsed time, in milliseconds, since the model was initialized.
- clear():
Clears the current model instance.
- message(pbar, text):
Displays a message during model training/testing.
- prepare_input(X, metric=None, dataset_descriptor=None):
Prepares the input data for clustering (to be implemented by subclasses).
- create(config=None):
Creates and returns the clustering model instance (to be implemented by subclasses).
- score(y_test, y_pred, X=None):
Calculates and returns various clustering evaluation metrics as a DataFrame.
- summary():
Returns a summary of the clustering results.
- fit(X, config=None):
Fits the clustering model to the input data and returns the report and cluster labels.
- save(dir_path='.', modelfolder='model'):
Saves the clustering model and its results to the specified directory.
- cluestering_report():
Generates a DataFrame of cluster predictions and labels.
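The division of work between the abstract hooks (create, prepare_input) and the concrete fit method can be sketched with a small, self-contained example. The classes ToyClustering and ThresholdClustering are hypothetical stand-ins written for this illustration, not part of matclustering, and unlike the documented fit, this simplified version returns only the cluster labels and no report.

```python
from abc import ABC, abstractmethod


class ToyClustering(ABC):
    """Hypothetical, simplified stand-in for an abstract clustering base class."""

    def __init__(self, name, verbose=False):
        self.name = name
        self.isverbose = verbose
        self.config = {}
        self.model = None

    def add_config(self, **kwargs):
        # Merge new hyperparameters into the configuration dictionary.
        self.config.update(kwargs)

    @abstractmethod
    def create(self, config=None):
        """Build and return the model; subclasses must implement this."""

    def fit(self, X, config=None):
        # Fall back to the stored configuration when none is passed in.
        self.model = self.create(config or self.config)
        return self.model(X)


class ThresholdClustering(ToyClustering):
    """Toy subclass: assigns label 1 to values above a threshold, else 0."""

    def create(self, config=None):
        threshold = config["threshold"]
        return lambda X: [int(x > threshold) for x in X]


clusterer = ThresholdClustering("toy")
clusterer.add_config(threshold=5)
labels = clusterer.fit([1, 2, 8, 9])
print(labels)
```

Because create is marked abstract, instantiating ToyClustering directly raises a TypeError; only subclasses that supply a model can be fitted.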
matclustering.core.SimilarityClustering module
MAT-Tools: Python Framework for Multiple Aspect Trajectory Data Mining
The present application offers a tool to support the user in the clustering of multiple aspect trajectory data. It integrates into a unique framework for multiple aspect trajectories and, more generally, for multidimensional sequence data mining methods. Copyright (C) 2022, MIT license (this portion of code is subject to licensing from source project distribution)
Created on Apr, 2024 Copyright (C) 2024, License GPL Version 3 or superior (see LICENSE file)
- Authors:
Tarlis Portela
- class matclustering.core.SimilarityClustering.SimilarityClustering(name, random_state=1, n_jobs=1, verbose=False)[source]
Bases:
HSTrajectoryClustering
Similarity-based clustering for multiple-aspect trajectory data.
This class extends the HSTrajectoryClustering class to provide clustering functionality based on similarity metrics for trajectory data. It includes methods to prepare input data and compute similarity matrices using various metrics.
- name
Name of the clustering model.
- Type:
str
- metric
Similarity metric used for clustering.
- Type:
object
- X
Similarity matrix of the input trajectories.
- Type:
array-like
- labels
List of labels associated with each trajectory in the input data.
- Type:
list
- default_metric(dataset_descriptor):
Initializes and returns the default similarity metric (MUITAS) for the dataset.
- prepare_input(X, metric=None, dataset_descriptor=None):
Prepares the input data by converting it to trajectories and calculating the similarity matrix.
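As a rough illustration of what computing a similarity matrix over trajectories involves, the sketch below builds a symmetric pairwise matrix in plain Python. The similarity function used here is a deliberately simple placeholder (the fraction of aligned points whose aspects all match); it is not MUITAS, and the function names and the tuple-based trajectory representation are assumptions for this example rather than the library's data model.

```python
def toy_similarity(t1, t2):
    """Fraction of aligned points that match exactly (placeholder, NOT MUITAS)."""
    n = max(len(t1), len(t2))
    if n == 0:
        return 1.0
    matches = sum(1 for p, q in zip(t1, t2) if p == q)
    return matches / n


def similarity_matrix(trajectories, similarity=toy_similarity):
    """Build a symmetric matrix of pairwise similarities with 1.0 on the diagonal."""
    n = len(trajectories)
    matrix = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = similarity(trajectories[i], trajectories[j])
            matrix[i][j] = matrix[j][i] = s
    return matrix


# Each trajectory is a list of points; each point is a tuple of aspects
# (here: a place name and a weather condition, both hypothetical).
trajectories = [
    [("home", "sunny"), ("work", "sunny"), ("gym", "rain")],
    [("home", "sunny"), ("work", "rain"), ("gym", "rain")],
    [("park", "rain"), ("bar", "rain")],
]
X = similarity_matrix(trajectories)
for row in X:
    print(row)
```

Only the upper triangle is computed and then mirrored, which halves the number of similarity evaluations when the metric is symmetric. A matrix of this shape is the kind of input a similarity-based clustering algorithm would then consume.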