Below is some basic information about OR mappers, along with a few random thoughts.
Categorization of ORM
-
with the concept of a persistence context (unit of work, transparent synchronization)
Examples: Hibernate, NHibernate, JDO, EclipseLink (formerly TopLink), OpenJPA (formerly Kodo), the JPA specification
-
with metadata (statement oriented, any change must be explicitly stored)
Examples: ActiveRecord (Ruby on Rails), Ebean (www.avaje.org)
-
with query language
Example: LINQ (not limited to databases; it also supports updates, but its main purpose is querying)
-
lower-level, statement-oriented frameworks (these are not considered to be ORMs)
Example: iBatis SQL Maps
An ORM is not an object-oriented database view on top of a relational database, but ORMs do share some common points with OO databases (objects, criteria queries, ...).
Why should you consider an ORM?
-
The biggest advantage is the possibility of working with objects rather than SQL statements. JDBC is a very low-level API that requires many statements and explicit resource management. An OR mapper hides this complexity.
-
The type system of a database is static. Some databases allow user-defined types, but these are not as sophisticated as the types in programming languages. An ORM provides a type abstraction: it maps between language-specific types and database-specific types (for instance, for currency amounts).
-
Switching the database hidden behind an ORM is a rare case, but OR mappers are popular for products where the customer decides which underlying database will be used.
-
The query language (QL) works against objects, not database tables. With criteria queries or query by example there is not even a text-based QL, only an object-based query.
-
Most ORMs allow falling back to native SQL queries for low-level database tuning. The ORM then provides the mapping from the result set to objects.
-
There is some overhead, but it is usually not a showstopper with modern ORMs.
-
A good ORM generates reasonably efficient SQL, often better than the typical hand-coded SQL of an average developer. Not every developer is a database expert.
-
The unit of work also provides a first-level cache.
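The type mapping mentioned in the list above can be illustrated with a minimal plain-Java sketch. It assumes a currency amount is stored in a single text column such as "12.50 EUR". The `Money` and `MoneyColumnConverter` names and the column format are hypothetical and chosen only for illustration; real ORMs offer their own mechanisms for custom type mapping.

```java
import java.math.BigDecimal;
import java.util.Currency;

// Hypothetical value type used in the object model.
final class Money {
    final BigDecimal amount;
    final Currency currency;

    Money(BigDecimal amount, Currency currency) {
        this.amount = amount;
        this.currency = currency;
    }
}

// Hypothetical converter between Money and one text column ("<amount> <ISO code>").
final class MoneyColumnConverter {
    // Object model -> database representation.
    static String toColumn(Money money) {
        return money.amount.toPlainString() + " " + money.currency.getCurrencyCode();
    }

    // Database representation -> object model.
    static Money fromColumn(String column) {
        String[] parts = column.split(" ");
        return new Money(new BigDecimal(parts[0]), Currency.getInstance(parts[1]));
    }
}
```

The business code only ever sees `Money`; the column format stays a detail of the mapping layer.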
What are the drawbacks of an ORM, and what should you be aware of?
-
Full-blown ORMs are quite complex and require intimate knowledge within the team. Do not use one if the team doesn't have a specialist for the chosen ORM provider. They require metadata and the integration of the unit of work (transaction boundaries) into the code. Detailed knowledge of the OR mapper is required to understand all consequences and corner cases.
-
The generated queries may not be optimal for a given database, and they are not easy to optimize. Using an ORM still requires solid knowledge of the database's SQL dialect.
-
Using an ORM for a small number of tables or a simple application is not a good choice; it just means more libraries and more configuration options.
-
Most OR mappers don't manage bidirectional relations; both sides must be kept in sync from application code.
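The last point, keeping both sides of a bidirectional relation in sync from code, can be sketched in plain Java without any ORM. The `PurchaseOrder` and `OrderLine` classes and their methods are hypothetical names chosen for illustration. The idea is that a single method updates both ends of the association.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical "one" side of a one-to-many association.
class PurchaseOrder {
    private final List<OrderLine> lines = new ArrayList<>();

    // Updates both sides; detaches the line from its previous order, if any.
    void addLine(OrderLine line) {
        if (line.getOrder() == this) {
            return; // already attached to this order
        }
        if (line.getOrder() != null) {
            line.getOrder().lines.remove(line);
        }
        lines.add(line);
        line.setOrderInternal(this);
    }

    void removeLine(OrderLine line) {
        if (lines.remove(line)) {
            line.setOrderInternal(null);
        }
    }

    // Read-only view, so callers cannot bypass addLine/removeLine.
    List<OrderLine> getLines() {
        return Collections.unmodifiableList(lines);
    }
}

// Hypothetical "many" side holding the back reference.
class OrderLine {
    private final String product;
    private PurchaseOrder order;

    OrderLine(String product) {
        this.product = product;
    }

    String getProduct() { return product; }

    PurchaseOrder getOrder() { return order; }

    // Intended to be called only by PurchaseOrder.
    void setOrderInternal(PurchaseOrder order) { this.order = order; }
}
```

Exposing only an unmodifiable view of the collection is one possible design choice; it forces all changes through the methods that maintain the back reference.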
Object identity
Most OR mappers differentiate between components (value objects) and entities. An entity always has a constant id; in a relational DB this is the primary key column(s). The primary key is sometimes used as the object identity, but this is dangerous, because it may not be known until the new entity is persisted to the database.
Object identity is bound to the persistence context (unit of work, transaction), which contains the first-level cache of the ORM. This means that within the same persistence context the same object instance is always returned for a given entity. But once the persistence context is closed, or the object is removed from it, object identity is no longer guaranteed.
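The identity guarantee described above can be illustrated with a toy identity map in plain Java. This is only a sketch of the idea, not how any particular ORM implements it. The `Customer` and `ToyPersistenceContext` names are hypothetical, and the loader function stands in for a SELECT by primary key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical entity with a constant id.
class Customer {
    final long id;
    String name;

    Customer(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Toy persistence context: a first-level cache keyed by primary key.
class ToyPersistenceContext {
    private final Map<Long, Customer> firstLevelCache = new HashMap<>();
    private final Function<Long, Customer> loader; // stands in for a database read

    ToyPersistenceContext(Function<Long, Customer> loader) {
        this.loader = loader;
    }

    // Loads at most once per context; later calls return the cached instance.
    Customer find(long id) {
        return firstLevelCache.computeIfAbsent(id, loader);
    }

    // After eviction, the next find creates a new instance.
    void evict(long id) {
        firstLevelCache.remove(id);
    }
}
```

Two calls to `find` with the same id on one context return the same instance, while a second context, or the same context after `evict`, hands out a different instance for the same row.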
Objects first or database first?
When starting a new project, the objects-first approach seems to be the better solution, but it is still a good idea to talk to a database specialist about the mapping.
Even if an ORM is used, hide it behind a DAO
-
Use the DAO to define standard CRUD behavior and the way the ORM is used (direct access or proxies, session.merge vs session.saveOrUpdate in Hibernate).
-
Do not mix persistence specific code with business logic. Lazy loading is sometimes considered a backdoor to business logic, because it transparently triggers SQL.
-
A DAO allows switching the persistence provider without changes to the business logic (this doesn't happen very often in practice).
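The points above can be sketched as a small DAO layer in plain Java. All names here (`Account`, `AccountDao`, `InMemoryAccountDao`, `TransferService`) are hypothetical. The in-memory implementation merely stands in for an ORM-backed one, which would implement the same interface and keep its session handling out of the business code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical entity.
class Account {
    final String number;
    long balanceCents;

    Account(String number, long balanceCents) {
        this.number = number;
        this.balanceCents = balanceCents;
    }
}

// Persistence-neutral contract; business code depends only on this interface.
interface AccountDao {
    Optional<Account> findByNumber(String number);
    void save(Account account);
    void delete(String number);
}

// Stand-in implementation; an ORM-backed class would replace it.
class InMemoryAccountDao implements AccountDao {
    private final Map<String, Account> store = new HashMap<>();

    public Optional<Account> findByNumber(String number) {
        return Optional.ofNullable(store.get(number));
    }

    public void save(Account account) {
        store.put(account.number, account);
    }

    public void delete(String number) {
        store.remove(number);
    }
}

// Business logic without any persistence-specific code.
class TransferService {
    private final AccountDao dao;

    TransferService(AccountDao dao) {
        this.dao = dao;
    }

    void transfer(String from, String to, long cents) {
        Account source = dao.findByNumber(from)
                .orElseThrow(() -> new IllegalArgumentException("unknown account " + from));
        Account target = dao.findByNumber(to)
                .orElseThrow(() -> new IllegalArgumentException("unknown account " + to));
        if (source.balanceCents < cents) {
            throw new IllegalStateException("insufficient funds");
        }
        source.balanceCents -= cents;
        target.balanceCents += cents;
        dao.save(source);
        dao.save(target);
    }
}
```

Because `TransferService` sees only `AccountDao`, it can also be tested without a database.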
Caching
The unit of work (transaction) acts as the first-level cache.
A second-level cache is not the first thing to reach for; tune the queries first. Not all data benefits from a second-level cache (data that changes very often or that is mission critical should not be cached at all). Caches can become quite complicated when they are synchronized across a cluster and across transactions, or when the database is accessed by other applications at the same time.
Query caches are usually not that efficient; caches of entities and relations work better.
Stored procedures and ORM
Stored procedures are often the fastest way to get data. ORMs have limited support for stored procedures, and that support is rarely used; the two do not play well together. Stored procedures are very good for batch jobs, which is an area where ORMs do poorly.