An overview of the EAV model as a solution to the problem of needing to define custom attributes on entities at runtime

The Problem

One problem that comes up often when working with relational databases is that one can’t dynamically alter the schema on the fly. Many software will eventually have users asking for a way to define custom attributes. They’ll want to add fields like billing rep and their own internal tracking id to their invoices, but the other department handles things differently and wants a different set of fields instead of those, and our company wants to sell this software to other companies that will want a completely different set of fields for their workflow. Oh and they want to be able to search, filter, and aggregate by those fields so they can find specific invoices, see all the invoices a particular billing rep handled, and how much $ that sales rep brought in but only on orders of a particular type.

We could try altering the schema on the fly to add/remove columns for each of these custom attributes – a tricky and error prone ordeal. But we’d end up needing to dynamically build our queries to select the correct custom attributes of an entity, and our tables would now have a large number of columns with rows having an absurd number of null fields if only a small number of attributes might apply to any one entity.

So in summary the problems we have with these requirements are:

Enter the EAV Data Model

In the EAV model the attributes of an entity (such as a product or employee) are stored as entity-attribute-value pairs. Which means only the applicable attributes for any one entity need to be stored. Since the attribute is simply a key or foreign key pointing to an attribute definition in another table, we can dynamically modify the attributes through common DB operations at run time without the need of modifying the DB schema itself.

The basic model can be layed out in 3 tables:

The Entities Table

The plain old business entities. Employees, Invoices, Products, etc.

Id Name
1 “Product1”

The Attributes Table

This table contains the definitions for all the possible attributes the entities can have.

Id Name Type
1 “Color” “string”

The Entity-Attribute-Values Table

Finally, this table tells us what attributes a particular entity has and the values for each of those attributes.

Entity Id Attribute Id Attribute Value
1 1 “purple”

With this model, we can define custom attributes for each entity at run time. That simply involves adding/removing entries to the attributes table. Setting the values for the attributes of an entity is accomplished by modifying entries in the A-V table. Any non-applicable attributes to an entity are simply omitted entries for in the A-V table.

Reading isn’t too bad either. The queries will be more verbose but usually won’t require any unusual SQL syntax. At worst, we’ll need to use pivots if we want to read each attribute as a column.

The Problems with the EAV Model

The EAV data model isn’t a silver bullet though. It has evident trade offs that lead to some even calling it an anti-pattern. The EAV model lacks many of the advantages of relational databases.

Sill the EAV model is usually the simpler solution. The above example was only the basic form of the model and it can be customized further (at the cost of additional complexity) to overcome some of its drawbacks and tailor it for the specific problem at hand. For instance, additional columns could be added to the A-V table for each type of value that can be described (int, string, etc.) to get a more strongly typed schema.

Alternatives