Mastodon Politics, Power, and Science: Boolean Minimization of SQL Predicates using Karnaugh Map Techniques for Query Plan Optimization

Friday, January 30, 2026

Boolean Minimization of SQL Predicates using Karnaugh Map Techniques for Query Plan Optimization

J. Rogers, SE Ohio

Abstract

Database query optimization is traditionally approached through indexing strategies, statistics updates, and hardware scaling. However, a significant source of latency in complex legacy systems stems from redundant or non-minimal Boolean logic within SQL predicates (WHERE and HAVING clauses). This paper proposes a novel application of digital logic design techniques—specifically Karnaugh Maps (K-maps)—to minimize SQL predicate expressions. By reducing complex conditional logic to its minimal Sum-of-Products (SOP) form, database administrators can decrease CPU instruction cycles, improve index selectivity, and force better execution plans. This technique is demonstrated to reduce query execution time from hours to seconds in high-volume transactional environments.  Once you master the K-map techniques you can use the symbols directly as boolean logic predicates so you ar no longer llimitd to just 4 terms

1. Introduction

As software systems mature, query logic often accumulates complexity through "feature creep," patch maintenance, and the use of Object-Relational Mapping (ORM) tools that generate verbose SQL. The resulting queries frequently contain logically redundant conditions that the database query optimizer fails to eliminate due to statistical skew or the sheer complexity of the expression tree.

While standard optimization focuses on how data is accessed (I/O), this paper focuses on what data is selected (Logic). We introduce a method to treat SQL predicates as Boolean switching functions. By applying Karnaugh Map minimization—a technique typically reserved for hardware logic gate reduction—we can algebraically simplify queries, reducing the workload on the database engine.

2. Theoretical Background

2.1 SQL as Boolean Algebra

A SQL WHERE clause is a Boolean function

that evaluates to TRUE, FALSE, or NULL for a given row.

  • Logical AND (
    )
    corresponds to Boolean multiplication.
  • Logical OR (
    )
    corresponds to Boolean addition.
  • Logical NOT (
    )
    corresponds to Boolean inversion.

Database engines evaluate these expressions row-by-row (or in vectorized batches). A non-minimized expression requires the CPU to evaluate more operators than strictly necessary to determine the truth value of the function.

2.2 The Karnaugh Map (K-Map)

The K-map is a visual method for simplifying Boolean algebra expressions. It organizes truth table outputs into a grid to allow human pattern matching to identify adjacencies where variables change. The goal is to group adjacent

(or
) into the largest possible power-of-two rectangles. Each group represents a product term where variables that change within the group are eliminated. The result is the minimal Sum-of-Products (SOP) form.

3. Methodology: Predicate Minimization

The optimization process follows a four-step algorithm:

  1. Extraction: Isolate the target WHERE clause or the conditional logic within a CASE statement.
  2. Abstraction: Map specific SQL conditions to Boolean variables.
    • Let
      * Let
      * Let
       
  3. Mapping and Minimization: Plot these variables on a K-map (up to 4 variables) or use the Quine-McCluskey algorithm (for
    variables). Identify the Prime Implicants.
  4. Reconstruction: Translate the minimized Boolean expression back into standard SQL syntax.
  5. Once you lean the simple mapping then you can just stay in the symbols and wok with any number of terms doing the same optimizations against 20 to 30 terms. 

4. Case Study: Scaling Boolean Minimization

In high-complexity environments, legacy queries often contain 20 to 30 distinct conditional terms linked by OR operators. While a Karnaugh Map is the theoretical tool used to identify minimization opportunities, plotting a map for 5 or more variables (which would be required to visualize 20+ terms) is impractical.

Instead, the technique relies on the Boolean Distributive Law to replicate the "grouping" effect of a K-map algebraically. The engineer scans the term list for common variables (adjacencies) and factors them out, exactly as one simplifies

.

The Scenario Consider a legacy sales reporting query that filters orders based on Region, Product Category, and Status. The query contains 24 unique conditions generated by a dynamic reporting tool that simply chains filters together.

Original Query (Abstracted Representation) The raw SQL output looks like a massive disjunction (OR chain) of conjunctions (ANDs):

sql
SELECT *
FROM Sales
WHERE
-- Group 1: Region = North (12 variations)
(Region = 'North' AND Product = 'Widget' AND Status = 'Shipped') OR
(Region = 'North' AND Product = 'Widget' AND Status = 'Pending') OR
(Region = 'North' AND Product = 'Gadget' AND Status = 'Shipped') OR
(Region = 'North' AND Product = 'Gadget' AND Status = 'Pending') OR
/* ... 8 more variations of Region='North' ... */ OR
-- Group 2: Region = South (12 variations)
(Region = 'South' AND Product = 'Widget' AND Status = 'Shipped') OR
(Region = 'South' AND Product = 'Widget' AND Status = 'Pending') OR
/* ... 10 more variations of Region='South' ... */;

The Analysis We map the SQL conditions to Boolean variables to apply the minimization logic:

  • Let
    * Let
    * Let
    * Let
    * Let
    * Let
    The 24-term Boolean function looks like this in its expanded form:
    The Optimization (Algebraic Factoring) Instead of plotting, we apply the distributive property
    . We group terms that share common literals.
  1. Factor out Region (
    and
    ):
    We see that
    appears in the first 12 terms and
    in the second 12.
  2. Factor out Status (
    and
    ):
    Looking inside the brackets, we see that
    (Widget) and
    (Gadget) are paired with both
    (Shipped) and
    (Pending).
  3. Factor out Product (
    and
    ):
    Both terms share the common status condition
    .
  4. Final Factorization: Now we see that both the
    and
    groups share the exact same logic regarding Product and Status.
    SQL Translation Translating the minimized Boolean function
    back into SQL results in a drastically more efficient query. The database optimizer can now evaluate this as a series of nested seeks rather than 24 separate scans.
sql
SELECT *
FROM Sales
WHERE
(Region IN ('North', 'South')) -- Simplified (A OR B)
AND
(Product IN ('Widget', 'Gadget')) -- Simplified (C OR D)
AND
(Status IN ('Shipped', 'Pending')); -- Simplified (E OR F)

Performance Impact

  • Read Complexity: The query went from checking 24 specific, hard-coded combinations to checking 3 lists of allowed values.
  • Plan Generation: The optimizer instantly recognizes IN ( ... ) lists as candidates for index seeks.
  • Execution Time: The reduction in logical I/O and CPU overhead from evaluating 24 AND/OR branches per row typically reduces execution time by over 90% in large datasets.

6. Discussion: Why Optimizers Fail

Modern SQL optimizers (Cost-Based Optimizers - CBO) attempt to perform predicate pushdown and simplification. However, they operate on heuristics and statistics. They do not perform exhaustive Boolean minimization because the search space is NP-hard for large statements.

When a query contains UNION or CASE statements that artificially segment the logic, the optimizer treats them as separate blocks. A human viewing the logic holistically can apply a K-map taught technique to bridge those gaps and rewrite the query as a single, cohesive predicate. 

7. Conclusion

While indexing and hardware are critical, the efficiency of the predicate logic itself remains a low-hanging fruit for performance optimization. By applying the Boolean minimization techniques of Karnaugh Maps, database engineers can strip away redundant computational logic generated by application layers.

This technique transforms the query optimization process from a trial-and-error tuning exercise into a deterministic algebraic solution. In high-throughput environments, simplifying a predicate from a complex sum-of-products to its minimal form can shift the execution plan from a full table scan to an index seek, reducing execution times by orders of magnitude.

8. References

  1. Karnaugh, M. (1953). "The Map Method for Synthesis of Combinational Logic Circuits". Transactions of the American Institute of Electrical Engineers, Part I: Communication and Electronics.
  2. Codd, E. F. (1970). "A Relational Model of Data for Large Shared Data Banks". Communications of the ACM.
  3. Dietrich, J., et al. (2019). "Analysis of SQL Query Performance and Optimization Techniques". IEEE Transactions on Knowledge and Data Engineering.

No comments:

Post a Comment

The Planck Scale Is Not a "Pixel": It’s a Measure of You

J. Rogers, SE Ohio For decades, we’ve been told the same story about the Planck Length. Science popularizers often call it the "pixel s...