Appendix B. Synthetic Data Generator

Table of Contents

B.1. Generator for PostgreSQL Types
B.2. Generator for PostGIS Types
B.3. Generator for MobilityDB Time and Box Types
B.4. Generator for MobilityDB Temporal Types
B.5. Generation of Tables with Random Values
B.6. Generator for Temporal Network Point Types

Evaluating alternative implementation approaches or running benchmarks often requires a test dataset. Such a dataset frequently has to meet particular requirements regarding its size or the intrinsic characteristics of its data. Even when a real-world dataset is available, it may not be ideal for such experiments for multiple reasons. Therefore, a synthetic data generator that can be customized to produce data according to the given requirements is often the best solution. Experiments with synthetic data should nevertheless be complemented with experiments with real-world data to gain a thorough understanding of the problem at hand.

MobilityDB provides a simple synthetic data generator that can be used for such purposes. In particular, this generator was used to create the database used for the regression tests in MobilityDB. The data generator is written in PL/pgSQL so that it can be easily customized. It is located in the directory datagen in the repository. In this appendix, we briefly introduce the basic functionality of the generator. We first list the functions that generate random values for the various PostgreSQL, PostGIS, and MobilityDB data types, and then give examples of how to use these functions to generate tables of such values. The parameters of the functions are not specified here; please refer to the source files, which contain detailed explanations of the various parameters.
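Because the generator is plain PL/pgSQL, a customized random-value function can be written in a few lines. The following is a minimal sketch of what such a function might look like. The name demo_random_int and its two bound parameters are hypothetical and are not taken from the datagen sources, so the functions shipped with MobilityDB may be implemented differently.

```sql
-- Hypothetical helper, NOT the MobilityDB implementation: returns a
-- uniformly distributed integer in the closed interval [low, high].
CREATE OR REPLACE FUNCTION demo_random_int(low int, high int)
  RETURNS int AS $$
BEGIN
  IF low > high THEN
    RAISE EXCEPTION 'low (%) must not exceed high (%)', low, high;
  END IF;
  -- random() returns a double precision value in [0, 1), so the
  -- scaled and floored value lies in [0, high - low].
  RETURN floor(random() * (high - low + 1))::int + low;
END;
$$ LANGUAGE plpgsql STRICT;

-- Example call: one random integer between 1 and 10.
SELECT demo_random_int(1, 10) AS value;
```

Keeping the bounds as parameters lets the same function serve different experiments without editing its body.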

B.1. Generator for PostgreSQL Types

  • random_bool: Generate a random boolean

  • random_int: Generate a random integer

  • random_int_array: Generate a random array of integers

  • random_intrange: Generate a random integer range

  • random_float: Generate a random float

  • random_float_array: Generate a random array of floats

  • random_floatrange: Generate a random float range

  • random_text: Generate a random text string

  • random_timestamptz: Generate a random timestamp with time zone

  • random_timestamptz_array: Generate a random array of timestamps with time zone

  • random_minutes: Generate a random interval of minutes

  • random_tstzrange: Generate a random timestamp with time zone range

  • random_tstzrange_array: Generate a random array of timestamp with time zone ranges
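
To give a feel for the kinds of values listed above, the following sketch produces a few rows using only PostgreSQL built-ins. It deliberately does not call the MobilityDB functions, whose parameters are documented in the source files. The column names, bounds, and dates are illustrative assumptions.

```sql
-- Plain-PostgreSQL stand-ins for some of the value kinds listed above.
-- These expressions are illustrative and are not the generator's code.
SELECT
  k,
  random() < 0.5                                        AS b,  -- boolean
  floor(random() * 100)::int + 1                        AS i,  -- integer in [1, 100]
  random() * 100                                        AS f,  -- float in [0, 100)
  timestamptz '2001-01-01 00:00:00+00'
    + random() * interval '365 days'                    AS t,  -- up to one year after 2001-01-01
  make_interval(mins => floor(random() * 60)::int + 1)  AS m   -- interval of 1 to 60 minutes
FROM generate_series(1, 5) AS k;
```

Each call to random() is evaluated once per row, so every row receives independent values.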