Thursday, September 16, 2010

Generating ID sequences for Postgres using JPA annotations

This started out as yet another too-long blog post, so I'm going to refactor it by cutting to the chase:
@MappedSuperclass
public abstract class PostgresDomainObject {
@Id
@GeneratedValue(strategy=GenerationType.IDENTITY)
@Column(insertable=false, updatable=false,
columnDefinition="BigSerial not null")
public long id;
...
}
The BigSerial data type in Postgres is shorthand for the following:

CREATE SEQUENCE tablename_colname_seq;
CREATE TABLE tablename (
colname integer NOT NULL DEFAULT nextval('tablename_colname_seq')
);
ALTER SEQUENCE tablename_colname_seq OWNED BY tablename.colname

The key to making this work is GenerationType.IDENTITY. The insertable and updatable attributes don't seem to actually work - I've left them in as a indication of how the column behaves, but they don't seem to have any effect (at least not without IDENTITY). But using IDENTITY as the generation type causes Hibernate to not include the id column in insert statements, meaning that the default will get used.

The really beauty of this is that classes that extend PostgresDomainObject and declare @Entity will get their own dedicated sequence. If I had declared the column as a bigint and used GenerationType.AUTO, it would have created a single sequence, called "hibernate_sequence", and that sequence would have been used by every table. If I had declared the base class as @Entity and used the Hibernate/JPA table inheritance model (an atrocity I may rant about in some future post), I would again have been limited to using a single shared sequence.

There are two good reasons for wanting to have a dedicated sequence for each table instead of a single shared one:

1) Performance - having sequence generators dedicated to each to table helps to prevents contention when there are multiple concurrent requests for new sequence numbers

2) Maintainability - sequences are often allocated in batches (say, 50 at a time) and some errors can cause the entire batch to be skipped. If there is a recurring error on a busy system, it's even possible to exhaust the sequence. It's also possible to exhaust the sequence on a very large table that has a lot of turnover. In those cases, unlikely as they may be, it's a lot easier to track down the problem and fix it if it's a dedicated sequence.

If the database in question is a simple, low-volume data store, none of this matters much. But the systems I care about are at the other end of the spectrum - either high-volume OLTP systems or data warehouses that have tables with millions of rows and high levels of concurrent access.

15 comments:

  1. Awesome! Should be posted to SO.

    ReplyDelete
  2. Thanks! Now I can run my application on MySQL and Postgres without changing the GenerationType.

    ReplyDelete
  3. It is not working for primary and foreign key.
    Do you have any idea that how to generate sequence id for primary and foreign key table ?

    ReplyDelete
  4. I checked my code and found some mistake but It is working......
    Good article !!!!!!!!!!

    ReplyDelete
  5. Hi, I totally agree with you, the single sequence makes really no sense.
    Perfect explanation!
    Thanks

    ReplyDelete
  6. Great content thanks for sharing this informative blog which provided me technical information keep posting.
    Selenium training in Chennai
    Selenium training in Bangalore
    Selenium training in Pune
    Selenium Online training

    ReplyDelete
  7. Thanks for one marvelous posting! I enjoyed reading it; you are a great author. I will make sure to bookmark your blog and may come back someday. I want to encourage that you continue your great posts, have a nice weekend!keep it up!!!

    android training in chennai

    android online training in chennai

    android training in bangalore

    android training in hyderabad

    android Training in coimbatore

    android training

    android online training

    ReplyDelete