Using faker-js/SimpleFaker to produce test data in Oracle Database 23ai

After returning from an inspiring tech conference last week I had an agenda full of items to research. One of the talks I attended mentioned how simple it was for some popular open-source databases to generate sample data. Performing the same task with the Oracle Database was perceived to be quite tricky. This is only partially true as I hope to show in this post.

Generating realistic sample data using JavaScript

Oracle Database 23ai introduced much-enhanced support for In-Database JavaScript. Multilingual Engine (MLE), powered by GraalVM allows developers to use many open-source modules written for Deno or node.js inside the database. FakerJS is a very popular data generator, and it can be used in Oracle Database 23ai on Linux {x86-64/aarch64} to generate heaps of realistic-looking test data.

Please refer to the project’s GitHub project site for more details about its license and use implications. The article assumes your legal and IT Security departments (as well as any other party) agreed that using the module in your code is safe and compliant with your license. Using 3rd party code in your application typically requires specific compliance steps to be completed which are out of the scope of this article.

Loading SimpleFaker into the database

The first step is to get FakerJS from a Content Delivery Network (CDN) and load it into the database. Recent SQL Developer Command Line (SQLcl) feature the mle create-module command, greatly simplifying the way you load MLE modules into the database. Download faker-js/faker to a convenient location on your laptop. The module’s most current version can be found here:

https://cdn.jsdelivr.net/npm/@faker-js/faker/+esm

There are many other CDNs to choose from, pick the one that suits your needs best. Faker 9.5.1 was current at the time of writing, available here:

https://cdn.jsdelivr.net/npm/@faker-js/faker@9.5.1/+esm

After ensuring the source is trustworthy, the next important step is to ensure you download the ECMA Script (ESM) version of the module. If you used either of the above links you are fine. Fire up SQLcl and connect to your database to load the MLE module. Here is what this looks like on MacOS:

# download the source
$ curl -Lo faker.js 'https://cdn.jsdelivr.net/npm/@faker-js/faker@9.5.1/+esm'

# connect to the database and create the module
$ sql emily@localhost/freepdb1

SQLcl: Release 24.4 Production on Fri Mar 07 08:10:13 2025

Copyright (c) 1982, 2025, Oracle.  All rights reserved.

Password? (**********?) ***********
Connected to:
Oracle Database 23ai Free Release 23.0.0.0.0 - Develop, Learn, and Run for Free
Version 23.6.0.24.10

SQL> mle create-module -filename faker.js -module-name fakerjs_module -version 9.5.1
MLE Module fakerjs_module created

The MLE module has been created in your current schema.

If you’re getting errors at this stage ensure the requirements for creating MLE schema objects are met and that you have been granted execute privileges on SYS.JAVASCRIPT. Please refer to the MLE documentation to learn more about database privileges necessary for MLE.

It is good practice to provide the module’s version number as well. At the time of writing version 9.5.1 was the most recent, stable module version.

To keep this example short, let’s assume the only requirement is to get random …

  • numbers,
  • strings,
  • dates.

Each function should allow the end-user to provide a lower and an upper bound. In the case of the random string generation there should also be an option to indicate the resulting string’s case.

Exposing SimpleFaker to the database

This article uses the SimpleFaker API to generate test data. Recent MLE releases enhanced the Signature clause in call specification. The use of an “intermediate” MLE module is no longer required.

Let’s generate those random strings, numbers, and dates! If you browser SimpleFaker’s API reference, you can see that it exposes numbers, strings, and dates, just as required.

RequirementSimpleFaker API callAPI reference documentation
random numbersimpleFaker.number.int()https://fakerjs.dev/api/number.html#int
random stringsimpleFaker.string.alpha()https://fakerjs.dev/api/string.html#alpha
random datesimpleFaker.date.between()https://fakerjs.dev/api/date.html#between

Before you can use simpleFaker.number.int()simpleFaker.string.alpha() and simpleFaker.date.between() in SQL or PL/SQL you must create a call specification, mapping a SQL or PL/SQL call to the corresponding function in SimpleFaker. Here is an example:

create or replace package random_data_generator as

    -- see https://fakerjs.dev/api/number.html#int
    function random_number(
        p_max_value number
    ) return number
        as mle module fakerjs_module
        signature 'simpleFaker.number.int';
    
    function random_number(
        p_options json
    ) return number
        as mle module fakerjs_module
        signature 'simpleFaker.number.int';
    
    -- see https://fakerjs.dev/api/string.html#alpha
    function random_string(
        p_max_length number
    ) return varchar2
        as mle module fakerjs_module
        signature 'simpleFaker.string.alpha';
    
    function random_string(
        p_options json
    ) return varchar2
        as mle module fakerjs_module
        signature 'simpleFaker.string.alpha';
    
    -- see https://fakerjs.dev/api/date.html#between
    function random_date(
        p_options json
    ) return date
        as mle module fakerjs_module
        signature 'simpleFaker.date.between';

end random_data_generator;
/

From this point on, anyone who can call SQL and PL/SQL will have the ability to invoke the JavaScript code via its call specification.

Let’s try it

With the call-specification created, it’s time to create some sample data:

SQL> @random_data

SQL> -- generate a random number < 10
SQL> select
  2      random_data_generator.random_number(10);

   RANDOM_DATA_GENERATOR.RANDOM_NUMBER(10) 
__________________________________________ 
                                         0 

SQL> -- generate a random number ]10;20]
SQL> select
  2      random_data_generator.random_number(json('{ "min": 10, "max": 20 }'));

   RANDOM_DATA_GENERATOR.RANDOM_NUMBER(JSON('{"MIN":10,"MAX":20}')) 
___________________________________________________________________ 
                                                                 16 

SQL> -- generate a random string of 20 characters length
SQL> select
  2      random_data_generator.random_string(20);

RANDOM_DATA_GENERATOR.RANDOM_STRING(20)    
__________________________________________ 
cPqWexYrplzQzXtSlPvT                       

SQL> -- generate a random string, all lower case, between 5 and 10 characters in length
SQL> select
  2      random_data_generator.random_string(json('{ "length": { min: 5, max: 10 }, "casing": "lower" }'));

RANDOM_DATA_GENERATOR.RANDOM_STRING(JSON('{"LENGTH":{MIN:5,MAX:10},"CASING":"LOWER"}'))    
__________________________________________________________________________________________ 
wiwcpx                                                                                     

SQL> -- generate a random date between 01-JAN-2020 and 01-JAN-2030
SQL> select
  2      random_data_generator.random_date(
  3         json('{ "from": "2020-01-01T00:00:00.000Z", "to": "2030-01-01T00:00:00.000Z" }')
  4      );

RANDOM_DATA_GENERATOR.RANDOM_DATE(
       JSON('{ "FROM": "2020-01-01T00:00:00.000Z", "TO": "2030-01-01T00:00:00.000Z" }')
    )  
_________________________________________________________________________________________________________________________________ 
06-oct-2020

With the basic building block available, it’s easy to create lots of test data as well – for example, using PL/SQL for loops or Common Table Expressions (CTEs)

with lots_of_dates as (
    select
        random_data_generator.random_date(json('{ "from": "2020-01-01T00:00:00.000Z", "to": "2030-01-01T00:00:00.000Z" }')) rd
) 
select
    rd
from
    lots_of_dates
connect by level
    <= &how_much_data_do_you_want;

Summary

Generating simple test data isn’t particularly hard using SimpleFaker. Grab the module from your preferred CDN, load it into the database after carefully assessing license and security implications, expose the functionality you need, and off you go!