Generator

MAT-Tools: Python Framework for Multiple Aspect Trajectory Data Mining

This application provides a tool to support users in preprocessing multiple aspect trajectory data. It is part of a single framework for multiple aspect trajectory mining and, more generally, for multidimensional sequence data mining methods. Copyright (C) 2022, MIT license (this portion of code is subject to licensing from source project distribution)

Authors:

Tarlis Portela

matdata.generator.randomGenerator(N=10, M=50, L=10, C=10, random_seed=1, fileprefix='random', fileposfix='train', attr_desc=None, save_to=False, outformats=['csv'])

Generates trajectories based on random data.

Parameters:

N (int, optional): Number of trajectories (default 10)
M (int, optional): Size of trajectories, number of points (default 50)
L (int, optional): Number of attributes (default 10)
C (int, optional): Number of classes (default 10)
random_seed (int, optional): Random seed (default 1)
attr_desc (list, optional): Data type intervals used to generate attributes, given either as a list of descriptive dicts or as a list of AttributeGenerator instances. Default: None (uses default types)
save_to (str or bool, optional): Destination folder for the output files, or False to skip saving (default False)
fileprefix (str, optional): Output filename prefix (default 'random')
fileposfix (str, optional): Output filename postfix (default 'train')
outformats (list, optional): Output file formats for saving (default ['csv'])

Returns:

pandas.DataFrame: The generated dataset.
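
To make the roles of N, M, L, C and random_seed concrete, here is a minimal standard-library sketch of a seeded random trajectory generator. It is not matdata's implementation: the function name `sketch_random_trajectories`, the column names `tid`, `label` and `a1..aL`, the round-robin class assignment, and the use of uniform floats for every attribute are all assumptions made for illustration.

```python
import random


def sketch_random_trajectories(N=10, M=50, L=10, C=10, random_seed=1):
    """Illustrative only; hypothetical helper, NOT part of matdata."""
    rng = random.Random(random_seed)  # a fixed seed makes the output reproducible
    rows = []
    for tid in range(1, N + 1):
        # Assumption: classes are assigned to trajectories round-robin.
        label = f"C{(tid - 1) % C + 1}"
        for _ in range(M):  # M points per trajectory
            row = {"tid": tid, "label": label}
            for a in range(1, L + 1):  # L attributes per point
                row[f"a{a}"] = rng.random()  # assumption: uniform floats
            rows.append(row)
    return rows


rows = sketch_random_trajectories(N=3, M=4, L=2, C=2)
print(len(rows), sorted(rows[0]))
```

The result here is a list of dicts, one per trajectory point; the real function is documented as returning a pandas.DataFrame, which would carry the same kind of one-row-per-point layout.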

matdata.generator.samplerGenerator(N=10, M=50, C=1, random_seed=1, fileprefix='sample', fileposfix='train', cols_for_sampling=['space', 'time', 'day', 'rating', 'price', 'weather', 'root_type', 'type'], save_to=False, base_data=None, outformats=['csv'])

Generates trajectories based on real data.

Parameters:

N (int, optional): Number of trajectories (default 10)
M (int, optional): Size of trajectories, number of points (default 50)
C (int, optional): Number of classes (default 1)
random_seed (int, optional): Random seed (default 1)
cols_for_sampling (list, optional): Columns to add in the generated dataset. Default: ['space', 'time', 'day', 'rating', 'price', 'weather', 'root_type', 'type']
save_to (str or bool, optional): Destination folder for the output files, or False to skip saving (default False)
fileprefix (str, optional): Output filename prefix (default 'sample')
fileposfix (str, optional): Output filename postfix (default 'train')
base_data (DataFrame, optional): DataFrame of trajectories to use as a base for sampling data. Default: None (uses example data)
outformats (list, optional): Output file formats for saving (default ['csv'])

Returns:

pandas.DataFrame: The generated dataset.
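
The idea of building trajectories by sampling from existing data can be sketched with the standard library alone. This is not matdata's implementation: `sketch_sample_trajectories` is a hypothetical helper, the base data is a hand-written list of dicts rather than a DataFrame, and drawing whole rows with replacement is an assumption about how sampling might work.

```python
import random


def sketch_sample_trajectories(base_rows, N=10, M=50, C=1, random_seed=1,
                               cols_for_sampling=("space", "time")):
    """Illustrative only; hypothetical helper, NOT part of matdata."""
    rng = random.Random(random_seed)
    out = []
    for tid in range(1, N + 1):
        label = f"C{(tid - 1) % C + 1}"  # assumption: round-robin classes
        for _ in range(M):
            src = rng.choice(base_rows)  # assumption: sample rows with replacement
            row = {"tid": tid, "label": label}
            # Keep only the requested columns from the sampled base row.
            row.update({col: src[col] for col in cols_for_sampling})
            out.append(row)
    return out


# Tiny made-up base data standing in for base_data.
base = [
    {"space": "0 0", "time": 10, "weather": "Clear"},
    {"space": "1 2", "time": 20, "weather": "Rain"},
]
sampled = sketch_sample_trajectories(
    base, N=2, M=3, cols_for_sampling=("space", "weather")
)
print(len(sampled))
```

Only the columns named in cols_for_sampling appear in the output, which mirrors the documented role of that parameter.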

matdata.generator.scalerRandomGenerator(Ns=[100, 10], Ms=[10, 10], Ls=[8, 10], Cs=[2, 10], random_seed=1, fileprefix='scalability', fileposfix='train', attr_desc=None, save_to=None, save_desc_files=True, outformats=['csv'])

Generates trajectory datasets based on random data.

Parameters:

Ns (list of int, optional): Parameters to scale the number of trajectories. List of 2 values: starting number, number of elements (default [100, 10])
Ms (list of int, optional): Parameters to scale the size of trajectories. List of 2 values: starting number, number of elements (default [10, 10])
Ls (list of int, optional): Parameters to scale the number of attributes (* doubles the columns). List of 2 values: starting number, number of elements (default [8, 10])
Cs (list of int, optional): Parameters to scale the number of classes. List of 2 values: starting number, number of elements (default [2, 10])
random_seed (int, optional): Random seed (default 1)
attr_desc (list, optional): Data type intervals used to generate attributes, given as a list of descriptive dicts. Default: None (uses default types)
save_to (str or bool, optional): Destination folder for the output files, or False to skip saving (the signature default is None)
fileprefix (str, optional): Output filename prefix (default 'scalability')
fileposfix (str, optional): Output filename postfix (default 'train')
save_desc_files (bool, optional): Whether to save the .json description files (default True)
outformats (list, optional): Output file formats for saving (default ['csv'])

Returns:

None
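
Each scaling parameter is described as a pair (starting number, number of elements), but this page does not state how successive values are derived. The sketch below assumes a geometric progression that doubles at each step, purely for illustration; `expand_scale` and its `factor` argument are hypothetical and not part of matdata.

```python
def expand_scale(start, count, factor=2):
    """Illustrative only; hypothetical helper, NOT part of matdata.

    Expands a (start, count) pair into `count` values.
    Assumption: each value is the previous one multiplied by `factor`.
    """
    return [start * factor ** i for i in range(count)]


# Under the doubling assumption, Ns=[100, 3] would mean three dataset
# sizes in the series.
Ns = expand_scale(100, 3)
Cs = expand_scale(2, 10)
print(Ns)
print(len(Cs))
```

If the library uses a different progression (for example a linear step), only the body of the helper changes; the meaning of the pair as "where to start" and "how many values" stays the same.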

matdata.generator.scalerSamplerGenerator(Ns=[100, 10], Ms=[10, 10], Ls=[8, 10], Cs=[2, 10], random_seed=1, fileprefix='scalability', fileposfix='train', cols_for_sampling=['space', 'time', 'day', 'rating', 'price', 'weather', 'root_type', 'type'], save_to=None, base_data=None, save_desc_files=True, outformats=['csv'])

Generates trajectory datasets based on real data.

Parameters:

Ns (list of int, optional): Parameters to scale the number of trajectories. List of 2 values: starting number, number of elements (default [100, 10])
Ms (list of int, optional): Parameters to scale the size of trajectories. List of 2 values: starting number, number of elements (default [10, 10])
Ls (list of int, optional): Parameters to scale the number of attributes (* doubles the columns). List of 2 values: starting number, number of elements (default [8, 10])
Cs (list of int, optional): Parameters to scale the number of classes. List of 2 values: starting number, number of elements (default [2, 10])
random_seed (int, optional): Random seed (default 1)
fileprefix (str, optional): Output filename prefix (default 'scalability')
fileposfix (str, optional): Output filename postfix (default 'train')
cols_for_sampling (list or dict, optional): Columns to add in the generated dataset. Default: ['space', 'time', 'day', 'rating', 'price', 'weather', 'root_type', 'type']. A dictionary in the format {'aspectName': 'type', 'aspectName': 'type'} may be given instead; it is used when base_data is provided and the output is saved as .MAT.
save_to (str or bool, optional): Destination folder for the output files, or False to skip saving (the signature default is None)
base_data (DataFrame, optional): DataFrame of trajectories to use as a base for sampling data. Default: None (uses example data)
save_desc_files (bool, optional): Whether to save the .json description files (default True)
outformats (list, optional): Output file formats for saving (default ['csv'])

Returns:

None