In PostgreSQL 12, oid columns in system tables will loose their “special” nature, and the optional clause WITH OIDS will disapppear from CREATE TABLE. As a concrete consequence, oid will now be visible when running select * from the catalogs that have OIDs, as well as when querying information_schema.columns, or with \d inside psql. Until now they were hidden, as are all system columns such as xmin or xmax.

The commit message in the source repository mentions this reason for the change:

author Andres Freund <andres (at) anarazel (dot) de>
Wed, 21 Nov 2018 01:36:57 +0200 (15:36 -0800)
[…]
Remove WITH OIDS support, change oid catalog column visibility.
[…]
The fact that the oid column was not an ordinary column necessitated a significant amount of special case code to support oid columns. That already was painful for the existing, but upcoming work aiming to make table storage pluggable, would have required expanding and duplicating that “specialness” significantly.

Pluggable storage is a step towards the much expected zheap, as well as other formats in the future.

Looking back years ago, this can be seen in the perspective of previous changes, which also go in the direction of obsoleting OIDs:

  • 7.2 (Feb 2002), the oid column becomes optional.
  • 8.0 (Jan 2005), the default_with_oids parameter is created.
  • 8.1 (Nov 2005), default_with_oids is now false by default.

But why was the OID as a special column invented in the first place? Originally, as the name “Object ID” suggests, the OID is related to object orientation.

A bit of history: object orientation in Postgres

In the mid-80’s, the object orientation concept was surfacing, with OO languages like C++ in their early stages of design. In the database world, there was also this idea that maybe in the future, people will want to look at their data primarily through OO lenses.

This can explain why in early versions of Postgres, when developed as a research project at the University of Berkeley, object orientation was a significant component.

In programming languages, the OO concept worked out quite well, for instance with the very successful C++ or Java. But in databases, the concept did not truly take off, and ended up limited to niche products.

When the community of developers took over Postgres in the mid-90’s with the goal of making it the powerful SQL engine that it became, they inherited a number of features clearly influenced by the object oriented paradigm, such as:

  • tables are classes.
  • rows in table are class instances.
  • a table may inherit its structure from a parent.
  • polymorphism in functions (through overloading) and operators.

As most of the other database communities, the Postgres developers did not find much interest in pursuing the OO vision after the mid-90’s. Instead they focused on other goals, such as improving performance, robustness and conformance to the still-evolving SQL standard.

Anyway, the idea that a row is considered to be a class instance pretty much implies the existence of a unique identifier beyond the user-defined columns, to differentiate one instance from another. If we compare with a programming language, where classes are instantiated in memory, a class instance can be distinguished from a copy by having a different address in memory, even if all the rest is identical. In a way, the OID in Postgres is like the address of the class instance, serialized into a form meant to go to disk.

The old documentation still online explains the “row as a class instance” point of view:

Concepts in PostgreSQL 6.4 (1998):

The fundamental notion in Postgres is that of a class, which is a named collection of object instances. Each instance has the same collection of named attributes, and each attribute is of a specific type. Furthermore, each instance has a permanent object identifier (OID) that is unique throughout the installation. Because SQL syntax refers to tables, we will use the terms table and class interchangeably. Likewise, an SQL row is an instance and SQL columns are attributes.

An example was given in Populating a Class with Instances:

The insert statement is used to populate a class with instances:
INSERT INTO weather VALUES (‘San Francisco’, 46, 50, 0.25, ‘11/27/1994’)
[…]
As previously discussed, classes are grouped into databases, and a collection of databases managed by a single postmaster process constitutes an installation or site.

After 7.1, released in 2001, the reference to classes has disappeared, and the documentation has replaced the “Creating a New Class” OO-tainted expression by the more mundane “Creating a New Table”.

In the catalogs, there are still remnants from this past, like the table of tables being named pg_class (but there is a view called pg_tables).

OIDs in modern PostgreSQL

OIDs as normal columns are still used in the catalogs as surrogate keys where such keys are needed. In PostgreSQL 12, 39 tables have a field named oid and 278 columns are of oid type (versus 39 and 274 in version 11)

postgres=# SELECT
 count(*) filter (where attname = 'oid') as "OID as name",
 count(*) filter (where atttypid = 'oid'::regtype) as "OID as type"
FROM pg_attribute JOIN pg_class ON (attrelid=oid) WHERE relkind='r';

 OID as name | OID as type 
-------------+-------------
          39 |         278

Also, OIDs are essential in managing large objects, that store binary contents with transparent segmentation, since the API exposes this object exclusively through their OIDs. The only user-visible change in v12 should be that pg_largeobject_metadata.oid is now directly visible, but a user doesn’t even need to query this table when using the API.

OIDs below 16384 are still reserved for the base system, in v12 as before.

The generator of values for OIDs is a counter at the cluster level, so that the values are sequentially distributed as if they came from a sequence shared by all databases.

That gives, for instance:

postgres=# create database db1;
CREATE DATABASE

postgres=# \lo_import .bashrc
lo_import 16404

postgres=# \c db1
You are now connected to database "db1" as user "daniel".

db1=# \lo_import .bashrc
lo_import 16405

This behavior with no duplicate of OIDs across databases, despite the fact that large objects in different databases are independant from each other, looks like a remnant from the time where each OID was “unique throughout the installation” as quoted above, in the 6.4 documentation.

This global unicity constraint has disappeared a long time ago, but the generator has kept this anti-collision behavior, such that collisions may only start to happen after a cycle of more than 4 billion values within the cluster (the OID counter being an unsigned 32-bit integer that goes back at 16384 when it reaches 2^32).

If you’re curious about the internals, go read the comments along the GetNewOidWithIndex() function to see how the code enforces unicity post-cycling, as well as the pg_nextoid SQL function that is intentionally not mentioned in the documentation.