Structuring dynamic columns in relational databases

by Hasan Hasanzade, Senior Full-Stack Developer

Relational databases are rigid in terms of structure, which is exactly why they are a great tool for structured data: you define your schema, your columns are fixed, and everything is neat and predictable. But when designing database schemas, you will sometimes encounter data models that need to be flexible. What happens when that structure needs to be... well, less structured?

This is where the concept of dynamic columns comes into play.


Challenge 🧩

Consider any of the following scenarios:

  1. An e-commerce platform where different product categories have different attributes.

    • electronics - RAM, storage, screen size
    • clothes - size, color, material
    • books - ISBN, author, publication date

    In this scenario, more and more attributes can be added later on as new categories are created.

  2. A trading platform that supports predefined payout methods for users, each requiring additional information depending on the selected payout method type.

    • pix - key type, address
    • swift - iban, swift code, bank address, account holder
    • crypto - network, currency, address

    In this scenario, the methods and their corresponding columns are predefined and unlikely to change. No new payout methods or columns are expected.

There can be many other cases where you need to store attributes that vary between items and change more or less frequently. While relational databases weren't originally designed for this kind of flexibility, there are several common strategies for handling dynamic attributes effectively.

Let's explore some of the most popular approaches.


1. The Entity-Attribute-Value (EAV) Model

The EAV model is a classic approach to handling dynamic attributes. Instead of having predefined columns for every possible attribute, you use separate tables that store these attributes as rows.

Typically, an EAV setup involves three main tables: two parent tables and one pivot table with a value column:

  • Entities table: This table holds your primary items (e.g., Users, Products).
  • Attributes table: This table defines all possible attributes that can be associated with your entities (e.g., Color, Size, Weight, RAM).
  • Entity-Attribute-Values table: This is the core of the EAV model: a polymorphic pivot table that stores attribute values for all entity types. It links an entity to an attribute and stores the corresponding value, usually with columns like entity_type, entity_id, attribute_id, and value.
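
A minimal sketch of such a schema, applied to the product example (table and column names are illustrative, not prescribed by the article):

CREATE TABLE products (
    id INT PRIMARY KEY,
    name VARCHAR(255) NOT NULL
);

CREATE TABLE attributes (
    id INT PRIMARY KEY,
    name VARCHAR(255) NOT NULL  -- e.g. 'RAM', 'color', 'ISBN'
);

CREATE TABLE entity_attribute_values (
    entity_type VARCHAR(50) NOT NULL,   -- e.g. 'product'
    entity_id INT NOT NULL,             -- id within that entity's table
    attribute_id INT NOT NULL REFERENCES attributes(id),
    value TEXT,                         -- everything is stored as text
    PRIMARY KEY (entity_type, entity_id, attribute_id)
);

Note the trade-off: reading all attributes of a single product now requires joining and pivoting rows back into columns, which is the main complexity and performance cost of EAV.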

2. JSON/JSONB Columns

Modern versions of DBMSs like PostgreSQL, MySQL, and SQL Server support storing JSON directly in columns. This is a great option for semi-structured data.

The table definition looks like this:

CREATE TABLE products (
    id INT PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    attributes JSON
);

The attributes column itself, however, holds its own flexible, NoSQL-style structure.

After applying this approach to scenario 2, the relevant columns of the users table could look like this:

    payout_method_type | payout_method_account
    -------------------|----------------------
    pix                | {"key_type":"CPF","address":"12345678901"}
    swift              | {"iban":"asdf iban","swift_code":"asdf swift","bank_address":"asdf bank address","account_holder":"John Wick"}
    crypto             | {"network":"TRON","currency":"USDT","address":"123"}

There are RDBMS-specific details about whether to use the JSON or JSONB data type for this purpose. PostgreSQL can even index the data within a JSON column, and SQL syntax exists to query this type of column by keys in its nested structure.
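
In PostgreSQL, for example, querying and indexing could look like this (a sketch, assuming the payout_method_account column from scenario 2 is defined as JSONB):

-- extract a key as text with the ->> operator
SELECT *
FROM users
WHERE payout_method_type = 'swift'
  AND payout_method_account->>'account_holder' = 'John Wick';

-- a GIN index speeds up containment queries on JSONB
CREATE INDEX idx_users_payout_account
    ON users USING GIN (payout_method_account);

-- containment query: find users paying out over the TRON network
SELECT *
FROM users
WHERE payout_method_account @> '{"network": "TRON"}';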

While this approach strikes a neat balance between structure and flexibility, all data validation logic has to be handled at the application level.


3. Vertical partitioning

This approach is well-suited for scenarios where categories and their associated attributes are predefined and less likely to change as in scenario 2, but with stricter structural expectations. Since the number of categories is limited and known in advance, they can be represented as enums rather than stored as rows in a separate table.

Each category has its own dedicated table to store attributes, typically following a naming convention like %category%_attributes.

For example:

  • pix_attributes
  • swift_attributes
  • crypto_attributes

Within these category-specific tables, the columns for attributes are fixed and explicitly defined. This provides the benefit of strong typing, indexing, and better performance, while maintaining clarity and separation of concerns across different categories.
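
For scenario 2, two of these tables could be sketched as follows (column names and sizes are illustrative, and a users table with an integer primary key is assumed):

CREATE TABLE swift_attributes (
    user_id INT PRIMARY KEY REFERENCES users(id),
    iban VARCHAR(34) NOT NULL,
    swift_code VARCHAR(11) NOT NULL,
    bank_address VARCHAR(255) NOT NULL,
    account_holder VARCHAR(255) NOT NULL
);

CREATE TABLE crypto_attributes (
    user_id INT PRIMARY KEY REFERENCES users(id),
    network VARCHAR(50) NOT NULL,
    currency VARCHAR(10) NOT NULL,
    address VARCHAR(255) NOT NULL
);

Because every attribute is a real, typed column, the database itself can enforce NOT NULL constraints, foreign keys, and per-column indexes, none of which EAV or JSON columns give you for free.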


4. Sparse Columns

This approach boils down to predefining many nullable columns in advance. The column data types can be distributed proportionally based on the expected data types of future dynamic attributes. It is like reserving empty seats for incoming data.

CREATE TABLE products (
    id INT PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    -- nullable empty fields by data type
    text_field_1 VARCHAR(255),
    text_field_2 VARCHAR(255),
    text_field_3 VARCHAR(255),
    int_field_1 INT,
    int_field_2 INT,
    date_field_1 DATE
    -- ... many more optional fields
);

Though this approach keeps queries simple and performs well enough, it wastes space, offers limited flexibility, and requires schema changes when the reserved columns run out. RDBMSs also impose a limit on the number of columns per table. Most of the time you won't hit that limit, but the idea itself is inherently constrained.

Not all RDBMSs handle large numbers of nullable columns efficiently. Microsoft SQL Server stands out here with a dedicated feature for exactly this purpose, named sparse columns, which optimizes storage for NULL values.
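
In SQL Server, the table above could be declared with the SPARSE keyword (a sketch, shortened to a few fields):

CREATE TABLE products (
    id INT PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    -- SPARSE columns consume no storage when NULL,
    -- at the cost of slightly higher overhead for non-NULL values
    text_field_1 VARCHAR(255) SPARSE NULL,
    int_field_1 INT SPARSE NULL,
    date_field_1 DATE SPARSE NULL
);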


5. Hybrid approach

You can combine two or more of the approaches above to create a new one. Since different modules and services come with different requirements, using a hybrid approach across the whole system is almost always beneficial.

Except when strict structural consistency is your goal.
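
A common hybrid, for example, keeps well-known, frequently queried attributes as regular typed columns and pushes the long tail of category-specific attributes into a JSON column (a sketch, combining approaches 2 and 4's fixed-column idea):

CREATE TABLE products (
    id INT PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    price DECIMAL(10, 2) NOT NULL,  -- fixed, typed, easily indexed
    category VARCHAR(50) NOT NULL,  -- fixed, used for filtering
    attributes JSON                 -- flexible, category-specific extras
);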


Which is best to implement? 🤔

Correct answer: it depends!

There's no one-size-fits-all solution. The best approach depends on your specific needs:

  • For maximum flexibility and when attributes are highly unpredictable: EAV is the most flexible approach, but be aware of its complexity and performance implications.
  • For a good balance of flexibility, performance, and ease of use in modern databases: JSON columns are often the preferred method. They integrate well and offer powerful querying capabilities.
  • If you're using SQL Server and have many optional attributes: Sparse Columns can be an efficient choice.

Before settling on a strategy, consider the frequency of changes to the attribute set, the complexity of queries you'll need to run, performance requirements, and the capabilities of your chosen database system.


Conclusion ✍️

Relational databases aren't naturally dynamic — but with the right design patterns, like EAV, JSON, or a hybrid approach, you can introduce flexibility while still benefiting from structured storage, indexing, and transactions.

Which approach works best depends on your use case, performance needs, and how dynamic your data truly is.

Remember: the goal isn't to build the most flexible system possible, but to build the right balance of flexibility and maintainability for your specific use case. Sometimes a little rigidity in your schema can save you a lot of complexity in your application code.
