The problems with Natural Keys:
- Because they have a business meaning, if that meaning changes (eg. they change their surname), then that value NEEDS to change. Changing a value with data is a little hard - but a lot easier with Cascade Update.
- The main problem is that the key is large and combined and this needs to be used in all joins
The Problem with Acquired Surrogate Keys:
- A surrogate key has no meaning to a user
- It always requires a join when browsing a child table eg. The InvoiceDetail table
The Problem with Derived Surrogate
- The user needs to enter a unique value
- Because they have a business meaning, if that meaning changes (eg. they change their company name), then that value MAY NEED to change. Changing a value with data is a little hard - but a lot easier with Cascade Update
- More likely to have a problem with Merge Replication
The Problem with GUID key
We like GUID keys. However, GUID generation produces essentially random numbers which cannot realistically be used as primary keys since new rows are inserted into the table at random positions leading to extremely slow inserts as the table grows to a moderate size. Inserting into the middle of a table with a clustered index, rather than appending to the end can potentially cause the database to have to move large portions of the data to accommodate space for the insert. This can be very slow.
- We do not use Natural keys ever
- We use Acquired Surrogate for some tables
a combination of Acquired Surrogate and Derived Surrogate for other tables
- eg. Invoice table
- eg. Receipt table
- eg. Customer table
- eg. Employee table
- eg. Product table
When we say combination because if the user doesn't enter a value then we put a random value in (by a middle tier function, so it works with Access or SQL). eg. ClientID JSKDYF
The user can then change the value to anything else and we validate it is not used, and then perform a cascade update - or if it is more then 3 levels deep we execute a stored proc. Unfortunately, this is a complicated proc that cycles through all related tables and performs an UPDATE. Here is an example.
The Derived Surrogate has the benefit of being easy for people to remember and can be used in the interface or even the query string
Over the years experience has lead me to the opinion that the natural vs surrogate key argument comes down to a style issue. If a client or employer has a standard one way or another, fine use it. If not, use whichever you method you prefer, recognizing that there may be some annoyances you face down the road. But don't let somebody criticize you because your style doesn't fit his preconceived notions.