matclustering.methods.similarity package

Submodules

matclustering.methods.similarity.TSAgglomerative module

MAT-Tools: Python Framework for Multiple Aspect Trajectory Data Mining

The present application offers a tool to support the user in the clustering of multiple aspect trajectory data. It integrates into a unified framework for multiple aspect trajectories and, more generally, for multidimensional sequence data mining methods. Copyright (C) 2022, MIT license (this portion of code is subject to licensing from source project distribution)

Created on Apr, 2024 Copyright (C) 2024, License GPL Version 3 or superior (see LICENSE file)

Authors:
  • Tarlis Portela

  • Yuri Santos

class matclustering.methods.similarity.TSAgglomerative.TSAgglomerative(k=5, linkage='single', random_state=1, n_jobs=1, verbose=False)[source]

Bases: SimilarityClustering

Hierarchical Agglomerative Clustering for trajectory data using a similarity matrix.

Parameters:
  • k (int, default=5) – The number of clusters to find.

  • linkage (str, default='single') – The linkage criterion to use; must be one of ['single', 'complete', 'average']. 'single' measures the distance between two clusters by their closest pair of elements, 'complete' by their furthest pair of elements, and 'average' by the mean distance between each point in one cluster and every point in the other. At each step, the two clusters with the smallest such distance are merged.

  • random_state (int, default=1) – Seed for reproducibility.

  • n_jobs (int, default=1) – Number of parallel jobs to run. Default is 1 (no parallelism).

  • verbose (bool, default=False) – If True, enables verbose output during the clustering process.
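The three linkage criteria can be illustrated on a small pairwise matrix. The following is a minimal sketch, not the TSAgglomerative implementation: the helper `cluster_distance` and the toy values are hypothetical, and it assumes the matrix already holds distances (a similarity matrix would first need converting).

```python
from itertools import product


def cluster_distance(dist, a, b, linkage="single"):
    """Distance between clusters a and b (lists of indices) under a linkage criterion.

    Hypothetical helper for illustration; not part of matclustering.
    """
    pairwise = [dist[i][j] for i, j in product(a, b)]
    if linkage == "single":      # closest pair of elements
        return min(pairwise)
    if linkage == "complete":    # furthest pair of elements
        return max(pairwise)
    if linkage == "average":     # mean over all cross-cluster pairs
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage: {linkage!r}")


# Toy symmetric distance matrix for four trajectories (made-up values).
dist = [
    [0.0, 0.2, 0.9, 0.7],
    [0.2, 0.0, 0.6, 0.8],
    [0.9, 0.6, 0.0, 0.1],
    [0.7, 0.8, 0.1, 0.0],
]
a, b = [0, 1], [2, 3]
single = cluster_distance(dist, a, b, "single")
complete = cluster_distance(dist, a, b, "complete")
average = cluster_distance(dist, a, b, "average")
```

Whichever criterion is chosen, agglomerative clustering repeatedly merges the pair of clusters for which this value is smallest until k clusters remain.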

create(config=None)[source]

Creates an instance of the AgglomerativeClustering model using the provided configuration.

if_config(config=None)[source]

matclustering.methods.similarity.TSBirch module

Created on Apr, 2024 Copyright (C) 2024, License GPL Version 3 or superior (see LICENSE file)

Authors:
  • Tarlis Portela

  • Yuri Santos

class matclustering.methods.similarity.TSBirch.TSBirch(k=None, random_state=1, n_jobs=1, verbose=False)[source]

Bases: SimilarityClustering

BIRCH Clustering for trajectory data using a similarity matrix.

Parameters:
  • k (int or list of int, optional) – The number of clusters to find. Can be a single value or a list of values for grid search.

  • random_state (int, default=1) – Seed for reproducibility. Not actively used in the BIRCH algorithm.

  • n_jobs (int, default=1) – Number of parallel jobs to run. Default is 1 (no parallelism).

  • verbose (bool, default=False) – If True, enables verbose output during the clustering process.

create(config=None)[source]

Creates an instance of the Birch model using the provided configuration.

if_config(config=None)[source]

matclustering.methods.similarity.TSDBSCAN module

Created on Apr, 2024 Copyright (C) 2024, License GPL Version 3 or superior (see LICENSE file)

Authors:
  • Tarlis Portela

  • Yuri Santos

class matclustering.methods.similarity.TSDBSCAN.TSDBSCAN(eps=0.5, min_samples=5, random_state=1, n_jobs=1, verbose=False)[source]

Bases: SimilarityClustering

Trajectory Density-Based Spatial Clustering of Applications with Noise (DBSCAN) using a similarity matrix.

The TSDBSCAN class implements the DBSCAN clustering algorithm, which is a density-based clustering method designed to discover clusters in large spatial datasets while effectively identifying noise. This implementation allows configuration of key parameters such as epsilon (eps) and minimum samples.

References

Ester, M., Kriegel, H. P., Sander, J., & Xu, X. (1996, August). A density-based algorithm for discovering clusters in large spatial databases with noise. In KDD (Vol. 96, No. 34, pp. 226-231). <https://www.aaai.org/Papers/KDD/1996/KDD96-037.pdf>

Parameters:
  • eps (float or list of floats, optional) – The maximum distance between two samples for one to be considered in the neighborhood of the other. Can also be a list for grid search. Default is 0.5.

  • min_samples (int, optional) – The number of samples (or total weight) in a neighborhood for a point to be considered as a core point. Default is 5.

  • random_state (int, optional) – Seed for random number generation, primarily for compatibility with other components. Default is 1.

  • n_jobs (int, optional) – The number of jobs to run in parallel for both fit and predict. Default is 1.

  • verbose (bool, optional) – If True, prints verbose output during processing. Default is False.
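How eps and min_samples interact can be shown with a short sketch that finds core points in a precomputed matrix. This is an illustration under stated assumptions, not the TSDBSCAN code: the helper `core_points` is hypothetical, the matrix is assumed to contain distances, and the neighborhood is assumed to include the point itself and to use `<= eps`, following the scikit-learn convention.

```python
def core_points(dist, eps=0.5, min_samples=5):
    """Return indices whose eps-neighborhood holds at least min_samples points.

    Hypothetical helper for illustration; the point itself counts as a neighbor.
    """
    cores = []
    for i, row in enumerate(dist):
        neighbors = sum(1 for d in row if d <= eps)
        if neighbors >= min_samples:
            cores.append(i)
    return cores


# Toy symmetric distance matrix: three close trajectories and two isolated ones.
dist = [
    [0.0, 0.1, 0.2, 0.9, 0.8],
    [0.1, 0.0, 0.15, 0.85, 0.9],
    [0.2, 0.15, 0.0, 0.7, 0.95],
    [0.9, 0.85, 0.7, 0.0, 0.6],
    [0.8, 0.9, 0.95, 0.6, 0.0],
]
cores = core_points(dist, eps=0.3, min_samples=3)
```

With these values only the first three trajectories qualify as core points; the remaining two have no neighbors within eps and would be treated as noise unless they fall inside a core point's neighborhood.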

create(config=None):

Initializes and returns a DBSCAN model with the specified parameters.

create(config=None)[source]
if_config(config=None)[source]

matclustering.methods.similarity.TSKMeans module

Created on Apr, 2024 Copyright (C) 2024, License GPL Version 3 or superior (see LICENSE file)

Authors:
  • Tarlis Portela

  • Yuri Santos

class matclustering.methods.similarity.TSKMeans.TSKMeans(k=5, random_state=1, n_jobs=1, verbose=False)[source]

Bases: SimilarityClustering

Trajectory K-Means Clustering using a similarity matrix.

The TSKMeans class applies the K-Means clustering algorithm to trajectory data. The number of clusters (k) is configurable, and a list of values can be supplied for grid search during hyperparameter tuning.

Parameters:
  • k (int or list of int, optional) – The number of clusters to form. Can also be a list for grid search. Default is 5.

  • random_state (int, optional) – Seed for random number generation, ensuring reproducibility. Default is 1.

  • n_jobs (int, optional) – The number of jobs to run in parallel for both fit and predict. Default is 1.

  • verbose (bool, optional) – If True, enables verbose output during processing. Default is False.
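When a parameter such as k is given as a list, a grid search evaluates one configuration per value. How matclustering builds these configurations is not documented here; the sketch below only illustrates the general idea with a hypothetical helper, `expand_grid`, that expands scalar and list values into one dictionary per combination.

```python
from itertools import product


def expand_grid(params):
    """Expand a dict of scalars or lists into one dict per parameter combination.

    Hypothetical helper for illustration; not part of matclustering.
    """
    keys = list(params)
    # Wrap scalars in a one-element list so every value can be iterated.
    value_lists = [v if isinstance(v, list) else [v] for v in params.values()]
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]


configs = expand_grid({"k": [3, 5, 8], "random_state": 1})
for config in configs:
    print(config)
```

Each resulting dictionary is the kind of object that could be passed as the `config` argument of `create`, although the exact structure the library expects is an assumption.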

create(config=None):

Initializes and returns a KMeans model with the specified parameters.

create(config=None)[source]
if_config(config=None)[source]

matclustering.methods.similarity.TSKMedoids module

Created on Apr, 2024 Copyright (C) 2024, License GPL Version 3 or superior (see LICENSE file)

Authors:
  • Tarlis Portela

  • Yuri Santos

class matclustering.methods.similarity.TSKMedoids.TSKMedoids(k=5, init=None, max_iter=300, random_state=1, n_jobs=1, verbose=False)[source]

Bases: SimilarityClustering

Trajectory K-Medoids Clustering using a similarity matrix.

The TSKMedoids class implements the K-Medoids clustering algorithm, which is a robust alternative to K-Means, especially in the presence of noise and outliers.

References

Park, H. S., & Jun, C. H. (2009). A simple and fast algorithm for K-medoids clustering. Expert systems with applications, 36(2), 3336-3341. <https://www.sciencedirect.com/science/article/pii/S095741740800081X>

Parameters:
  • k (int, optional) – The number of clusters to form. Default is 5.

  • init (array-like or str, optional) – Initial medoids. If None, medoids will be chosen randomly. If ‘park’, uses the method proposed by Park and Jun (2009). Default is None.

  • max_iter (int, optional) – Maximum number of iterations for the algorithm to run. Default is 300.

  • random_state (int, optional) – Seed for random number generation, ensuring reproducibility. Default is 1.

  • n_jobs (int, optional) – The number of jobs to run in parallel for both fit and predict. Default is 1.

  • verbose (bool, optional) – If True, enables verbose output during processing. Default is False.

create(config=None):

Initializes and returns a K-Medoids model with the specified parameters.

fit(X, config=None):

Runs the K-Medoids clustering algorithm on the input data X.
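The two steps that K-Medoids alternates between, assigning points to their nearest medoid and re-selecting each cluster's medoid, can be sketched as follows. This is not the TSKMedoids `fit` implementation: the helpers `medoid` and `assign` are hypothetical, the values are made up, and the matrix is assumed to hold distances.

```python
def medoid(dist, members):
    """Return the member with the smallest total distance to the other members."""
    return min(members, key=lambda i: sum(dist[i][j] for j in members))


def assign(dist, medoids):
    """Assign every point to its nearest medoid; returns one medoid index per point."""
    return [min(medoids, key=lambda m: dist[i][m]) for i in range(len(dist))]


# Toy symmetric distance matrix for five trajectories (made-up values).
dist = [
    [0.0, 0.1, 0.2, 0.9, 0.8],
    [0.1, 0.0, 0.15, 0.85, 0.9],
    [0.2, 0.15, 0.0, 0.7, 0.95],
    [0.9, 0.85, 0.7, 0.0, 0.6],
    [0.8, 0.9, 0.95, 0.6, 0.0],
]
labels = assign(dist, medoids=[1, 3])
center = medoid(dist, members=[0, 1, 2])
```

Because a medoid is always an actual data point, the algorithm needs only pairwise distances, which is why it suits a precomputed matrix and is less sensitive to outliers than a mean-based centroid.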

create(config=None)[source]
fit(X, config=None)[source]
if_config(config=None)[source]

matclustering.methods.similarity.TSpectral module

Created on Apr, 2024 Copyright (C) 2024, License GPL Version 3 or superior (see LICENSE file)

Authors:
  • Tarlis Portela

  • Yuri Santos

class matclustering.methods.similarity.TSpectral.TSpectral(k=5, assign_labels='discretize', random_state=1, n_jobs=1, verbose=False)[source]

Bases: SimilarityClustering

Trajectory Spectral Clustering using a similarity matrix.

Parameters:
  • k (int, optional) – The number of clusters to form. Default is 5.

  • assign_labels ({'kmeans', 'discretize', 'cluster_qr'}, optional) – Method of assigning labels to the clusters: 'kmeans' uses K-Means, 'discretize' uses discretization, and 'cluster_qr' uses QR clustering. Default is 'discretize'.

  • random_state (int, optional) – Seed for random number generation, ensuring reproducibility. Default is 1.

  • n_jobs (int, optional) – The number of jobs to run in parallel for both fit and predict. Default is 1.

  • verbose (bool, optional) – If True, enables verbose output during processing. Default is False.

create(config=None):

Initializes and returns a Spectral Clustering model with the specified parameters.

create(config=None)[source]
if_config(config=None)[source]

Module contents