Generator
MAT-Tools: Python Framework for Multiple Aspect Trajectory Data Mining
This application offers a tool to support the user in preprocessing multiple aspect trajectory data. It integrates into a unified framework for multiple aspect trajectories and, more generally, for multidimensional sequence data mining methods. Copyright (C) 2022, MIT license (this portion of code is subject to licensing from source project distribution)
Created in Dec 2023. Copyright (C) 2023, License GPL Version 3 or superior (see LICENSE file)
- Authors:
Tarlis Portela
- matdata.generator.randomGenerator(N=10, M=50, L=10, C=10, random_seed=1, fileprefix='random', fileposfix='train', attr_desc=None, save_to=False, outformats=['csv'])
Function to generate trajectories based on random data.
Parameters:
- N : int, optional
Number of trajectories (default 10)
- M : int, optional
Size of trajectories (default 50)
- L : int, optional
Number of attributes (default 10)
- C : int, optional
Number of classes (default 10)
- random_seed : int, optional
Random seed (default 1)
- attr_desc : list of dict, optional
Data types and intervals used to generate attributes, given as a list of descriptive dicts or as a list of AttributeGenerator instances. Default: None (uses default types)
- save_to : str or bool, optional
Destination folder for the output files, or False to skip saving (default False)
- fileprefix : str, optional
Output filename prefix (default 'random')
- fileposfix : str, optional
Output filename postfix (default 'train')
- outformats : list, optional
Output file formats for saving (default ['csv'])
Returns:
- pandas.DataFrame
The generated dataset.
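A minimal usage sketch, assuming the matdata package is installed and importable under the module path shown in the signature above. The argument values are arbitrary, and the import guard exists only so the snippet also runs where the package is missing.

```python
import importlib.util

# Keyword arguments taken from the documented signature; values are arbitrary.
kwargs = dict(
    N=5,             # number of trajectories
    M=20,            # size of each trajectory
    L=4,             # number of attributes
    C=2,             # number of classes
    random_seed=42,  # fixed seed so repeated runs use the same seed
    save_to=False,   # keep the result in memory, write no files
)

df = None
if importlib.util.find_spec("matdata") is not None:
    from matdata.generator import randomGenerator

    # Documented to return a pandas.DataFrame with the generated dataset.
    df = randomGenerator(**kwargs)
    print(df.head())
else:
    print("matdata is not installed; nothing was generated")
```

The column layout of the returned DataFrame is not described here, so the sketch only prints the first rows rather than relying on specific column names.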
- matdata.generator.samplerGenerator(N=10, M=50, C=1, random_seed=1, fileprefix='sample', fileposfix='train', cols_for_sampling=['space', 'time', 'day', 'rating', 'price', 'weather', 'root_type', 'type'], save_to=False, base_data=None, outformats=['csv'])
Function to generate trajectories based on real data.
Parameters:
- N : int, optional
Number of trajectories (default 10)
- M : int, optional
Size of trajectories, in number of points (default 50)
- C : int, optional
Number of classes (default 1)
- random_seed : int, optional
Random seed (default 1)
- cols_for_sampling : list, optional
Columns to add in the generated dataset. Default: ['space', 'time', 'day', 'rating', 'price', 'weather', 'root_type', 'type'].
- save_to : str or bool, optional
Destination folder for the output files, or False to skip saving (default False)
- fileprefix : str, optional
Output filename prefix (default 'sample')
- fileposfix : str, optional
Output filename postfix (default 'train')
- base_data : DataFrame, optional
DataFrame of trajectories to use as a base for sampling data. Default: None (uses example data)
- outformats : list, optional
Output file formats for saving (default ['csv'])
Returns:
- pandas.DataFrame
The generated dataset.
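A short sketch of sampling from the bundled example data, assuming matdata is installed. Restricting cols_for_sampling to a subset of the documented default columns is an assumption; the text above does not say whether every subset is accepted.

```python
import importlib.util

# Documented default value of cols_for_sampling.
DEFAULT_COLS = ['space', 'time', 'day', 'rating', 'price', 'weather', 'root_type', 'type']

# Assumption: a subset of the default columns is a valid argument.
cols = [c for c in DEFAULT_COLS if c in ('space', 'time', 'weather')]

df = None
if importlib.util.find_spec("matdata") is not None:
    from matdata.generator import samplerGenerator

    df = samplerGenerator(
        N=5, M=20, C=2, random_seed=42,
        cols_for_sampling=cols,
        base_data=None,  # None falls back to the example data
        save_to=False,   # do not write any files
    )
    print(df.head())
else:
    print("matdata is not installed; nothing was generated")
```

To sample from your own trajectories instead, pass a pandas DataFrame as base_data; the columns it must contain are not specified here.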
- matdata.generator.scalerRandomGenerator(Ns=[100, 10], Ms=[10, 10], Ls=[8, 10], Cs=[2, 10], random_seed=1, fileprefix='scalability', fileposfix='train', attr_desc=None, save_to=None, save_desc_files=True, outformats=['csv'])
Function to generate trajectory datasets based on random data.
Parameters:
- Ns : list of int, optional
Parameters to scale the number of trajectories. List of 2 values: starting number, number of elements (default [100, 10])
- Ms : list of int, optional
Parameters to scale the size of trajectories. List of 2 values: starting number, number of elements (default [10, 10])
- Ls : list of int, optional
Parameters to scale the number of attributes (* doubles the columns). List of 2 values: starting number, number of elements (default [8, 10])
- Cs : list of int, optional
Parameters to scale the number of classes. List of 2 values: starting number, number of elements (default [2, 10])
- random_seed : int, optional
Random seed (default 1)
- attr_desc : list, optional
Data types and intervals used to generate attributes, given as a list of descriptive dicts. Default: None (uses default types)
- save_to : str or bool, optional
Destination folder for the output files, or False to skip saving (default None)
- fileprefix : str, optional
Output filename prefix (default 'scalability')
- fileposfix : str, optional
Output filename postfix (default 'train')
- save_desc_files : bool, optional
True to save the .json description files, False otherwise (default True)
- outformats : list, optional
Output file formats for saving (default ['csv'])
Returns:
None
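The scaling parameters above are each a two-element list of the form [starting number, number of elements]. The sketch below checks that shape before calling the function. The helper check_scale is hypothetical and not part of matdata, and its requirement that both values be positive is an assumption. The call itself assumes matdata is installed and uses small, arbitrary values.

```python
import importlib.util
import tempfile


def check_scale(name, value):
    """Hypothetical helper: verify the documented [start, count] shape."""
    if len(value) != 2 or not all(isinstance(v, int) and v > 0 for v in value):
        raise ValueError(f"{name} must be a list of two positive ints, got {value!r}")
    return list(value)


scales = {
    "Ns": check_scale("Ns", [100, 3]),  # number of trajectories
    "Ms": check_scale("Ms", [10, 3]),   # size of trajectories
    "Ls": check_scale("Ls", [8, 3]),    # number of attributes
    "Cs": check_scale("Cs", [2, 3]),    # number of classes
}

if importlib.util.find_spec("matdata") is not None:
    from matdata.generator import scalerRandomGenerator

    with tempfile.TemporaryDirectory() as out_dir:
        # Returns None; the datasets are written under out_dir.
        scalerRandomGenerator(**scales, random_seed=1, save_to=out_dir)
else:
    print("matdata is not installed; only the arguments were validated")
```

How the sequence of sizes is derived from the starting number and the number of elements is not described here, so the sketch makes no claim about which dataset sizes are produced.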
- matdata.generator.scalerSamplerGenerator(Ns=[100, 10], Ms=[10, 10], Ls=[8, 10], Cs=[2, 10], random_seed=1, fileprefix='scalability', fileposfix='train', cols_for_sampling=['space', 'time', 'day', 'rating', 'price', 'weather', 'root_type', 'type'], save_to=None, base_data=None, save_desc_files=True, outformats=['csv'])
Generates trajectory datasets based on real data.
Parameters:
- Ns : list of int, optional
Parameters to scale the number of trajectories. List of 2 values: starting number, number of elements (default [100, 10])
- Ms : list of int, optional
Parameters to scale the size of trajectories. List of 2 values: starting number, number of elements (default [10, 10])
- Ls : list of int, optional
Parameters to scale the number of attributes (* doubles the columns). List of 2 values: starting number, number of elements (default [8, 10])
- Cs : list of int, optional
Parameters to scale the number of classes. List of 2 values: starting number, number of elements (default [2, 10])
- random_seed : int, optional
Random seed (default 1)
- fileprefix : str, optional
Output filename prefix (default 'scalability')
- fileposfix : str, optional
Output filename postfix (default 'train')
- cols_for_sampling : list or dict, optional
Columns to add in the generated dataset. Default: ['space', 'time', 'day', 'rating', 'price', 'weather', 'root_type', 'type']. A dictionary in the format {'aspectName1': 'type1', 'aspectName2': 'type2'} may be given instead; it is used when providing base_data and saving .MAT.
- save_to : str or bool, optional
Destination folder for the output files, or False to skip saving (default None)
- base_data : DataFrame, optional
DataFrame of trajectories to use as a base for sampling data. Default: None (uses example data)
- save_desc_files : bool, optional
True to save the .json description files, False otherwise (default True)
- outformats : list, optional
Output file formats for saving (default ['csv'])
Returns:
None
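Because this function returns None, its results are only available as files in the save_to folder. The sketch below, assuming matdata is installed, writes into a temporary folder and lists whatever was produced. The scale values are arbitrary and kept small; the names and number of output files are not documented here, so none are assumed.

```python
import importlib.util
import tempfile
from pathlib import Path

written = []
if importlib.util.find_spec("matdata") is not None:
    from matdata.generator import scalerSamplerGenerator

    with tempfile.TemporaryDirectory() as out_dir:
        scalerSamplerGenerator(
            Ns=[100, 2], Ms=[10, 2], Ls=[8, 2], Cs=[2, 2],
            save_to=out_dir,
            base_data=None,         # None falls back to the example data
            save_desc_files=False,  # skip the .json description files
            outformats=['csv'],
        )
        # Collect file names before the temporary folder is removed.
        written = sorted(p.name for p in Path(out_dir).rglob('*') if p.is_file())

print(written)
```

In real use, replace the temporary folder with a persistent path so the generated datasets remain available after the call.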