More good reasons to use bind variables in cursors in Oracle

Bind variables are a best practice in Oracle database development. They offer significant benefits in performance, security, and maintainability. One of the main reasons to use bind variables is protection against SQL injection, but performance plays an equally important role. This post is concerned with the latter.

What are bind variables?

As a database developer you may not really get into contact with bind variables a lot, they often are implicitly defined when you write a PL/SQL code unit. For example:

create or replace function get_employee_data_json(p_employee_id number)
return json as
    l_emp_data json;
begin
    select /* findme */
        json{*} 
        into l_emp_data
    from 
        employees
    where
        employee_id = p_employee_id;
    
    return l_emp_data;
end;
/

If you check the shared pool for the statement in question after executing the function, you will find that Oracle substituted p_employee_id with a bind variable:

SQL> select
  2      sql_id,
  3      sql_text
  4  from
  5      v$sql
  6  where
  7*     lower(sql_text) like 'select json{*}%';

SQL_ID           SQL_TEXT                                                  
________________ _________________________________________________________ 
2x477afqtgn9g    SELECT JSON{*} FROM EMPLOYEES WHERE EMPLOYEE_ID = :B1   

Exactly what you wanted: a secure, reusable cursor. The :B1 you see in the SQL text is the bind variable. It is replaced with the proper value at runtime. Binds aren’t always captured, but if they are, you can find them and their values in v$sql_bind_capture

Why should you use bind variables?

Here is a little refresher why bind variables are so useful. Using bind variables provides the following benefits, listed in no particular order:

  1. Improved Performance and Scalability:
    • Bind variables enable the Oracle optimizer to reuse parsed execution plans from the shared pool in the library cache. Without them, each unique literal value in a query triggers a hard parse, consuming CPU and memory.
    • This reduces parsing overhead, especially in high-volume applications, leading to faster query execution and better scalability under load. For example, in OLTP systems, it can dramatically lower CPU usage and contention.
  2. Enhanced Security:
    • They prevent SQL injection attacks by separating SQL code from user input. Literals embedded directly in queries can be manipulated, but bind variables treat input as data, not executable code.
    • This aligns with Oracle’s security guidelines, such as those in the Database Security Guide, promoting safer application design.
  3. Resource Efficiency:
    • Fewer hard parses mean less strain on the shared pool and reduced library cache latch contention, which is critical in multi-user environments.
    • In large-scale deployments, this helps maintain consistent response times and avoids performance degradation during peak usage.
  4. Code Maintainability and Portability:
    • Using binds makes SQL statements more readable and reusable across different inputs, simplifying debugging and testing.
    • It ensures consistent behavior across Oracle versions and tools (e.g., SQL*Plus, JDBC, or OCI), as the database handles type binding automatically.
  5. Compliance with Best Practices:
    • Oracle recommends binds in documentation like the PL/SQL User’s Guide and Performance Tuning Guide. They’re essential for meeting SLAs in enterprise environments, where predictability and efficiency are key.

Quantifying the effect of not using bind variables

It’s not too hard to quantify the impact of not using bind variables from a performance point of view. A small Java application demonstrates the both pros and cons. And yes, I am conscious of the fact that your Java code probably uses a framework rather than vanilla JDBC, under the covers they are all running SQL commands though – concepts should be identical across the stacks.

Consider the following function, an example how not to write scalable code:

    private void startTrouble(Connection conn, long minId, long maxId, long[] randomNumbers) {

        long startTime = System.currentTimeMillis();

        for (int i = 0; i < numIterations; i++) {

            try (Statement stmt = conn.createStatement()) {
                ResultSet rs = stmt.executeQuery("select username from todo_users where user_id = " + randomNumbers[i]);
                while (rs.next()) {
                    String username = rs.getString(1);
                }

                if (i % 2500 == 0)
                    log.info("\titerations completed: {}", i);
                rs.close();

            } catch (SQLException e) {
                log.error("something went wrong in iteration " + i);
            }
        }

        long endTime = System.currentTimeMillis();
        long duration = (endTime - startTime);
        log.info("Wall Clock Time elapsed to process {} iterations in mode {}: {} ", numIterations, "trouble", duration);
    }

As you can see in the highlighted line of code, there is a 1:1 correlation between parse and execute.

Let’s contrast that with a Java function that does adhere to most of the industry’s best known methods:

    private void showHowItShouldBeDone(Connection conn, long minId, long maxId, long[] randomNumbers) {
        long startTime = System.currentTimeMillis();

        PreparedStatement pstmt = null;
        try {
            pstmt = conn.prepareStatement("select username from todo_users where user_id = ?");
        } catch (SQLException e) {
            log.error("couldn't create the prepared statement - this is bad! {}", e.getMessage());
            throw new RuntimeException(e);
        }

        for (int i = 0; i < numIterations; i++) {

            try {
                pstmt.setLong(1, randomNumbers[i]);

                ResultSet rs = pstmt.executeQuery();
                while (rs.next()) {
                    String username = rs.getString(1);
                }

                if (i % 2500 == 0)
                    log.info("\titerations completed: {}", i);
                rs.close();

            } catch (SQLException e) {
                log.error("something went wrong in iteration " + i);
            }
        }

        long endTime = System.currentTimeMillis();
        long duration = (endTime - startTime);
        log.info("Wall Clock Time elapsed to process {} iterations in mode {}: {} ", numIterations, "trouble", duration);
    }

The second function implements the same logic, but with a couple of noteworthy differences:

  1. A PreparedStatement is used instead of a Statement
  2. The PreparedStatement is initialised outside the main loop and therefore parsed exactly once
  3. For each iteration, the only change is the call to bind another random number to the cursor

Looking at the results, there is a clear winner:

SQL> with literal_or_bind as (
  2  select 
  3      sql_text,
  4      sharable_mem,
  5      runtime_mem,
  6      persistent_mem,
  7      loads,
  8      parse_calls,
  9      executions,
 10      cpu_time,
 11      case when 
 12          regexp_like(sql_text, 'select username from todo_users where user_id = [0-9]+') then 'literal' 
 13          else 'bind' 
 14      end predicate
 15  from
 16      v$sql
 17  where
 18      sql_text like 'select username from todo_users where user_id =%'
 19  )
 20  select
 21      count(*),
 22      sum(sharable_mem + runtime_mem + persistent_mem) memory_used,
 23      sum(loads),
 24      sum(parse_calls),
 25      sum(executions),
 26      sum(cpu_time),
 27      predicate
 28  from
 29      literal_or_bind
 30  group by
 31*     all;

   COUNT(*)    MEMORY_USED    SUM(LOADS)    SUM(PARSE_CALLS)    SUM(EXECUTIONS)    SUM(CPU_TIME) PREDICATE    
___________ ______________ _____________ ___________________ __________________ ________________ ____________ 
       1000       38868541          1000               30000              30000           607397 literal      
          1          39172             1                   1              30000           130623 bind         

SQL> 

Summary

Looking at the results it’s immediately obvious that there are huge benefits using shared cursors for scalability. All important numbers – memory, CPU etc are lower. Using bind variables there is less contention for the shared pool, too. And this is just 1 session doing work, consider a huge OLTP system with hundreds of concurrent users and different statements. Not using bind variables is a recipe for disaster in these circumstances, killing scalability and ensuring a terrible experience for everyone involved.