Rules To Better SQL Databases - Developers

​Here are some of the typical things that all SQL Server DBAs and Database developers should know. These rules are above and beyond the most basic textbook recommendations of:

  • ​Ensuring your databases are Normalized and in 3rd Normal Form 
  • Making sure you have primary keys, foreign keys and simple indexes to improve performance 
  • Making sure you Back up regularly 
  • ​Basic Naming conventions (see some of our object naming conventions) 
  • Minimizing resultset sizes and data over the wire​​​​

Hold on a second! How would you like to view this content?
Just the title! A brief blurb! Gimme everything!
  1. General - Do you know every object name should be owned by dbo?

    ​​The reason is that you avoid ownership chain problems. Where Mary owns an object, Fred can read the object and then he creates a proc and he gives permission to Tom to execute. But Tom cant because there is a product chain of ownership.​​​​

    CREATE PROCEDURE [Adam Cogan].[Sales by Year]

    @Beginning_Date DateTime,

    @Ending_Date DateTime AS

    SELECT Orders.ShippedDate



    ,DATENAME(yy,ShippedDate) AS Year

    FROM Orders

    INNER JOIN "vwOrderSubTotals"

    ON Orders.OrderID = "vwOrderSubTotals".OrderID

    WHERE Orders.ShippedDate Between @Beginning_Date And @Ending_Date

    Figure: Bad example​

    CREATE PROCEDURE [dbo].[Sales by Year]

    @Beginning_Date DateTime,

    @Ending_Date DateTime AS

    SELECT Orders.ShippedDate



    ,DATENAME(yy,ShippedDate) AS Year

    FROM Orders

    INNER JOIN "vwOrderSubTotals"

    ON Orders.OrderID = "vwOrderSubTotals".OrderID

    WHERE Orders.ShippedDate Between @Beginning_Date And @Ending_Date

    Figure: Good example​​​

  2. General - Do you know object name should follow your company Naming Conventions?

    1. SQL Server Object Naming Standard.aspx SSW's Standard for naming SQL Server Objects.
    2. SQL Server Stored Procedure Naming Standard SSW's Standard for naming Stored Procedures.
    3. SQL Server Indexes Naming Standard SSW's Standard for naming Indexes.
    4. SQL Server Relationship Naming Standard SSW's Standard for naming Relationships
    5. Use decreasing generality for table names ie. Client and ClientInvoice, then ClientInvoiceDetail.
    6. Don't use underscores, instead use upper and lower case ie. ClientInvoice is preferred over Client_Invoice.
    7. Table names should not use plurals ie. Client is preferred over Clients.
    8. Generally don't use abbreviations. But there are a few words that are so commonly used that they can be abbreviated. These are:
      • Quantity = Qty
      • Amount = Amt
      • Password = Pwd
    9. Prefix all Date fields with 'Date' ie. DateInvoiced. One extra use of this is you can have generic code that enables a Date control on this field.
    10. Suffix Percent fields with 'Pct' ie. SalesTaxPct.
    11. Only use alphabet characters. ie. don't use AustraliaListA$. Avoid the following characters in your object names in SQL Server. If you do not do this, you will need to constantly identify those ill-named objects with bracketed or quoted identifiers - otherwise, unintended bugs can arise. What characters and symbols should I avoid using when naming objects in Access and SQL Server databases
    12. Don't use reserved words on their own. ie. User, Count, Group, etc. They can be used if joined with other words. What are reserved words for SQL Server 2000?

  3. General - Do you know object name should not be a reserved word?

    SQL Server reserves certain keywords for its exclusive use. It is not legal to include the reserved keywords in a Transact-SQL statement in any location except that defined by SQL Server. No objects in the database should be given a name that matches a reserved keyword. If such a name exists, the object must always be referred to using delimited identifiers. Although this method does allow for objects whose names are reserved words, it is recommended that you do not name any database objects with a name that is the same as a reserved word. In addition, the SQL-92 standard implemented by Microsoft SQL Server defines a list of reserved keywords.​​​

    ​Avoid using SQL-92 reserved keywords for object names and identifiers, ie. User, Count, Group, etc. They can be used if joined with other words.
    What are reserved words for SQL Server 2000?
    Why avoid reserved words and spaces in object names?

  4. General - Do you know object name should not contain spaces?

    Spaces should be avoided. If an object name contains a space, it can cause problems later on for developers because the developer must remember to put the object name inside square brackets when referencing it.

    Why avoid reserved words and spaces in object names?

    ​We aim to never have to use square brackets in any of our databases.​​​​

  5. General - Do you know the naming convention for use on database server test and production?

    Generally, every client should have a dev and a test database, so the dev database need to have the postfix "Dev" and the test database need to have the postfix "Test"(E.g. SSWCRMDev, SSWCRMTest). However, you don't need any postfix for production database.​​​

    Figure: Database with bad names
    Figure: Database with standard names

  6. General - Do you know to not use "sp_rename" to rename objects?

    Do not use "sp_rename" to rename objects like stored procedures, views and triggers.​

    ​​​Object name should be the same as name used in the object's script (e.g. CREATE script for stored procedures, views and triggers). Inconsistency can happen when object is renamed with sp_rename, but its script is not updated.
  7. Middle Tier - Do you implement business logic in middle tier?

    Business logic/rules should be implemented in an object oriented language such as VB.NET and C#.  This dramatically increases the adaptability, extensibility and maintainability of the application.

    Implementing business logic in stored procedures have the disadvantage of being hard to test, debug and evolve, therefore, they should only implement basic data access logic.

    With the exception of some very heavy data oriented operations, it is excusable to use stored procedures to carry out some logic for performance reasons.

    Triggers are even more difficult as their behaviour is event based.  It is okay to use triggers for non-functional/infrastructural features such as logging changes, or maintain more complex relational integrity which cannot be enforced by a simple relationship.​​

  8. Middle Tier - Do you submit all dates to SQL Server in ISO format?

    All dates submitted to SQL Server must be in ISO format date. This ensures that language or database settings do not interfere with inserts and updates of data. You should NEVER need to change the default language of users or of the database in SQL Server. For example, any insert into a SQL Server database with Visual Basic should call Format(ctlStartDate,"yyyy-mm-dd") or VB.NET Ctype(ctlStartDate.Text,Date).ToString("yyyy-MM-dd") before attempting the insert or update. This will ensure consistency of treatment when dealing with dates in your SQL Server backend.​​


    print convert( datetime, '2003-07-01' )

    -- returns Jul 1 2003 12:00AM

    print convert( datetime, '01/07/2003' )

    -- returns Jan 7 2003 12:00AM

    print convert( datetime, '20030701' )

    -- returns Jul 1 2003 12:00AM


    print convert( datetime, '2003-07-01' )

    -- returns Jan 7 2003 12:00AM, opposite of above

    print convert( datetime, '01/07/2003' )

    -- returns Jul 1 2003 12:00AM, opposite of above

    print convert( datetime, '20030701' )

    -- returns Jul 1 2003 12:00AM, only one which is same as above

    ​Code - ISO format date is the best.​

    The extended format can still mix up month & day in some circumstances, whereas the basic format is the only one that always works correctly.

    To be even more pedantic, when you include the time as well as the date, the value isn't really an ISO value at all! The ISO representation of a date/time would be '20030701T0958', whereas for SQL you should send it as '20030701 09:58'. This isn't even the extended ISO format as it is missing the obligatory "T" character (ref. section 5.4.1 of the standard).

    (The standard does allow you to "be omitted in applications where there is no risk of confusing", but it doesn't allow you to add a space or mix basic date with extended time.)

    So, if you want to be absolutely correct then it may be best to remove the reference to ISO, so that your rule works for date/time as well as just dates.

    The technical term used in the SQL help is "Unseparated String Format" (easily searched for).

    The help specifies that this format is unaffected by the SET DATEFORMAT command (which depends on any locale settings for SQL Server or the computer it is installed on).

    "The SET DATEFORMAT session setting does not apply to all-numeric date entries (numeric entries without separators). Six- or eight-digit strings are always interpreted as ymd."

    ​What is ISO format date? ​

  9. Performance Tuning - Do you make sure to clear SQL server cache when performing benchmark tests?

    When you are tuning SQL statements you tend to play in SQL management studio for a while. During this time SQL caches your query's and execution plans.
    All well and good but when you are trying to speed up a existing query that is taking some time then you may not be making a difference even though your execution times are way down.

    You really need to clear SQL's cache (or buffer) every time you test the speed of a query. This prevents the data and/or execution plans from being cached, thus corrupting the next test.

    To clear SQL Server's cache, run DBCC DROPCLEANBUFFERS, which clears all data from the cache. Then run DBCC FREEPROCCACHE, which clears the stored procedure cache.​

    Figure: First call is after clearing the cache. The second one is without clearing the cache. (26 seconds vs 2 seconds)
  10. Relationships - Do you have FOREIGN KEY constraints on columns ending with ID?

    Columns ending with 'ID' should have FOREIGN KEY constraints​.
    Figure: Missing relationships​
  11. Relationships - Do you know to not use ON DELETE CASCADE clause when creating relationship?

    SQL Servers ON DELETE CASCADE functionality can be very dangerous. We recommend not using it. Imagine someone deletes customer and the orders are deleted. If you need to delete records in related tables, do it in code in the application as it gives you more control.​
  12. Relationships - Do you turn on referential integrity in relationships?

    Cascading referential integrity constraints allow you to define the actions SQL Server takes when a user attempts to delete or update a key to which existing foreign keys point. The REFERENCES clauses of the CREATE TABLE and ALTER TABLE statements support ON DELETE and ON UPDATE clauses:


    NO ACTION is the default if ON DELETE or ON UPDATE is not specified.​​

    ​Relationships should always have referential integrity turned on. If you turned it on after data has been added, you may have data in your database that violates your referential integrity rules.

    Figure: Recommended referential integrity constraints
  13. Relationships - Do you use NOT FOR REPLICATION clause when creating relationship?

    When NOT FOR REPLICATION is used with a Foreign Key relationship, the integrity of the relationship is not checked while the Replication Agent is logged in and performing replication operations. This allows changes to the data (such as cascading updates) be propagated correctly.
  14. Relationships - Do you use ON UPDATE CASCADE clause when creating relationship?

    The ON UPDATE CASCADE feature of SQL Server 2000 and above can save you time in writing application and stored procedure code. We recommend that you take advantage of it. It is also more efficient than using triggers to perform these updates.​​moved 

  15. Schema - Do you add zs prefix to table name?

    Any type of table in a database where that does not contain application data should be called zs. So when the other application (e.g. SSW SQL Deploy) or the programmer populates the table then it should be called zs (e.g. zsDate - the program populates it, zsVersion - the programmer populates it).​​

  16. Schema - Do you always have version tracking tables?

    ​We always use two tables for tracking versioning information:

    • _zsDataVersion tracks the schema changes, and which update script we are up to. This helps tremendously in determining which version of the scripts are still required between development, test, and production databases.
    • _zsVersionLatest tracks which version the front-end client should be. This allows us to give a warning to (or even deny) users who are connecting to the database while not using the right version of the front-end client.

    Please see "Is a Back-end structural change going to be a hassle?" on our Rules to Successful Projects.​

  17. Schema - Do you always use Varchar?

    ​​Use VARCHAR instead of CHAR, unless your data is almost always of a fixed length, or is very short. For example, a Social Security/Tax File number which is always 9 characters. These situations are rare. SQL Server fits a whole row on a single page, and will never try to save space by splitting a row across two pages. Running DBCC SHOWCONTIG against tables shows that a table with fixed length columns takes up less pages of storage space to store rows of data. General rule is that the shorter the row length, the more rows you will fit on a page, and the smaller a table will be. It allows you to save disk space and it means that any retrieval operation such as SELECT COUNT(*) FROM, runs much quicker against the smaller table.​​

  18. Schema - Do you avoid de-normalized fields with computed columns?

    We should always use computed columns (in SQL Server 2005 and later they can be persisted) to avoid these types of denormalized columns.​

    NormalizedFields_Bad.jpg style=
    Figure: Bad Example
    Figure: Good Example​

    Computed columns has some limitations - they cannot access fields in other tables, or other computed fields in the current table.

    We use user defined functions (UDF) to encapsulate our logic in reusable functions, this allows one computed column to use a function to call another function.

    Use the suffix Computed to clearly distinguish that this field is a computed field.

    ALTER FUNCTION [dbo].[udfEmpTime_TimeTotalComputed]


    @TimeStart as DateTime,

    @TimeEnd as DateTime





    -- This function returns the time difference in hours - decimal(8,6)

    RETURN (round(isnull(CONVERT([decimal](8,6),@TimeEnd - @TimeStart,(0))*(24),(0)),(2)))


    Figure: This is the user defined function
    Figure: Sett​ing up computed column in table designer
  19. Schema - Do you avoid using user-schema separation?

    ​​User-schema separation is a new feature introduced in SQL 2005.

    In SQL 2000:

    • All objects are owned by users
    • If a user is deleted, all these objects must be deleted or have the owner reassigned
    • In script the naming convention is databaseName.ownerName.objectName
    • You need to update all scripts when a user changes.

    User-schema separation solves this problem by adding another level of naming, and shifting ownership of database objects to the schema, not the user. So, is it worth doing? Unless you are working with a very large database (100+ tables), the answer is "no". Most smaller databases have all objects with owner "dbo", which is fine in most cases.​

    ​Figure: Bad Example - AdventureWorks using user schema - instead, keep it simple and avoid using user schema unnecessarily
    Figure: Good Example -​ Adventure works with user schema cleaned out (Good). Much simpler and more readable​​
  20. Schema - Do you create a consistent primary key column on your tables?

    Make sure you created a consistent primary key column named [TableName]+"ID" on your tables.​​

    ​ Employee.EmployeeID

    Figure: Good example

    Employee.ID, Employee.Employee_Code, Employee.Employee​

    Figure: Bad example​​
  21. Schema - Do you create clustered index on your tables?

    ​You're allowed one clustered index per table, so unless you are never going to query a table, you may as well choose a field to be part of a clustered index. Basically,
    1. Every table should have a clustered index;
    2. The clustered index should be a unique clustered index where possible;
    3. The clustered index should be on a single column where possible;​

    So how do you choose the right field? Depending on the usage pattern of a table, clustered indices should be created. If sets of related records are regularly retrieved from a table in an application, a clustered index could dramatically improve performance.

    For example, in an Order to OrderDetails relationship with OrderID as the joining key, items in an order are regularly retrieved in a bundle. A clustered index on the OrderID column in OrderDetails table will improve the performance of the application significantly.

    Another example, if a table is frequently used for reporting, and a date range is used to define the time scope of the report, a clustered index on the date column is suitable. In more technical terms, if queries such as...

    SELECT * FROM ReportTable WHERE ItemDate BETWEEN 1/1/2003 AND 1/2/2003 executed frequently, ItemDate is a good candidate column for a clustered index.​

  22. Schema - Do you create primary key on your tables?

    When you specify a PRIMARY KEY constraint for a table, SQL Server enforces data uniqueness by creating a unique index for the primary key columns. This index also permits fast access to data when the primary key is used in queries.

    Although, strictly speaking, the primary key is not essential you can update tables in SQL Enterprise Manager without it - we recommend all tables have a primary key (except tables that have a high volume of continuous transactions). Especially, when you have a client like Access, it would help you to avoid the problems.​​

  23. Schema - Do you have a timestamp column?

    ​The SQL Server timestamp data type has nothing to do with times or dates. SQL Server timestamps are binary numbers that indicate the relative sequence in which data modifications took place in a database.​

    ​​All tables should have a timestamp column to aid concurrency checking. A timestamp improves update performance because only one column needs to be checked when performing a concurrency check (instead of checking all columns in a table for changes).

    Be aware that when replicating with a SQL Server CE Pocket PC device using SQL server, a timestamp column is added automatically.​​​​

  24. Schema - Do you have standard Tables and Columns?

    ​1. All tables should have the following fields:
    FieldSQL Server Field Properties
    CreatedUtcdatetime2 Allow Nulls=False Default=GETUTCDATE()
    CreatedUserIdForeign Key to Users table, Allow Nulls=False
    ModifiedUtcdatetime2 Allow Nulls=False Default=GETUTCDATE()
    ModifiedUserIdForeign Key to Users table, Allow Nulls=False
    Concurrencyrowversion Allow Nulls=Falsev

    ​The first three are examples of bad table records. The last one is an example of how this table structure should be entered.

    Figure: 3 bad examples and 1 good example of Row auditing

    Note #1: Never set the CreatedUtc field - instead use a default GETUTCDATE()
    Note #2: These fields offer basic row auditing that will cover the majority of applications. When an application has specific auditing requirements, they should be analysed to see if this approach is sufficient.

    2. All databases should have a table with one record to store application Defaults. This table should be called 'Control'.
    If the settings are not application-wide, but just for that user then an XML (do not use an INI file) for simple stuff might be better. Examples are saving the 'User' for logon, 'Select Date Range' for a report, form positions, etc.

    .NET programs have an Application.Configuration which exports to XML file (app.config) automatically. It works very well, and deployment is very simple. It's integrated right into the Visual Studio.NET designer as well.

    3. All databases should have a version table to record structural changes to tables. See SSW Rules to Better Code​
    4. Lookup tables that have just two columns should be consistent and follow this convention: CategoryId (int) and CategoryName (varchar(100)).

    The benefit is that a generic lookup form can be used. You will just need the generic lookup form pass in the TableName and Column1 and Column2.

    Note #3: The problem with the naming is the primary keys don't match
    Note #4: The benefit with the character primary key columns is that queries and query strings have meaning Eg. from this URL I can guess that it is in the business category.

  25. Schema - Do you know the maximum row size for a table?

    ​A​ tables' maximum row size should be less than the size of a single SQL Server data page (8060 bytes). Otherwise, data entry forms can give errors is not validated correctly.

  26. Schema - Do you not use indexes on RowGuid column?

    ​RowGuids (uniqueidentifier) are large fields (16 bytes) and are basically going to ALWAYS​ be unique.​

    SQL Server adds a RowGUID column to all tables if you are using Merge Replication (but doesn't add an index).​​

    RowGuids in general slow things down. Some people may consider using a RowGuid as their primary key. This is a bad idea because the index is going to be quite slow.... you are searching a large field. It goes without saying, NEVER have clustered index on a RowGuid column.​​

    ​Another little annoyance with RowGuids is whe​​n you are searching for one. You can't use > or < on a RowGuid column.

    ​Note: There are not many cases where a RowGuid should have an index on it. (Exception SSW SQL Total Compare which is a tool that compares data is in sync via rowguids and this makes it lots faster).

    Be aware that SQL server adds this column when you perform merge replication. There are not many cases where this should have an index on it. An exception is if you are using our utility SQL Total Compare.​

  27. Schema - Do you only use Unicode datatypes (nchar, nvarchar and ntext) in special circumstances?

    Columns defined using the nchar, nvarchar and ntext datatypes can store any character defined by the Unicode Standard, which includes all of the characters defined in the various English and Non-English character sets. These datatypes take twice as much storage space per characters as non-Unicode data types.

    ​It is not the disk space costs that are the concern. It is the 8060 limit, please refer to Maximum Capacity Specifications for SQL Server​ for details.

    ​If your database stores only English characters, this is a waste of space. Don't use Unicode double-byte datatypes such as nchar, nvarchar and ntext unless you are doing multilingual applications.

  28. Schema - Do you use "smalldatetime datatype", where possible, instead of "datetime datatype"?

    ​Most applications do not require the range and precision offered by the DateTime data type. When was the last time you needed to enter an order past the year of 2079? So you end up with better data integrity. Most business applications never need dates outside the range of 1900-2079.​

    ​More Information:
    In addition (I don't really care about this) but I get a smaller database.
    DateTime type takes up 8 bytes. It can store dates ranging from January 1, 1753, to December 31, 9999, with time values rounded to increments of .000, .003, or .007 milliseconds.

    A SmallDateTime type takes up only 4 bytes, as a consequence, it can only store dates ranging from January 1, 1900, through June 6, 2079, with accuracy to the minute. With a million records each with two date fields, you could save 8MB of storage space. More space could actually be saved if you have indices on those columns.So that is about 1 cent worth today :-)

  29. Schema - Do you use Bit/Numeric data type correctly?

    ​1. Bit data type

    Bit data from 0 to 1 (2 values only). Storage size is 1 byte.

    Columns of type bit cannot have indexes on them.  Also, SQL Server 7 only allows True or False values in a bit column. SQL 2000 introduced the ability to store NULL as well. Applications built for SQL Server 7 often does not expect this behaviour, and may create subtle runtime errors.

    ​​Columns of type bit should be prefixed with "Is" or a "Should" ie. IsInvoiceSent (y/n) or ShouldInvoiceBeSent (y/n) you can tell easily which way the boolean is directed. [more information on naming conventions​]

    This being said, fields of this type should generally be avoided because often a field like this can contain a date i.e. DateInvoiceSent (Date/Time) is prefered over InvoiceSent (y/n). If a date is inappropriate then we still recommend an int field over a bit field anyway, because bits are a pain in the butt :-)

    2. ​Tinyint data type

    Integer data from 0 through 255. Storage size is 1 byte.

    3. Smallint data type

    Integer data from -2^15 (-32,768) through 2^15-1 (32,767). Storage size is 2 bytes.

    4. Int data type

    Integer (whole number) data from -2^31 (-2,147,483,648) through 2^31-1 (2,147,483,647). Storage size is 4 bytes. The SQL-92 synonym for int is integer.

    5. Bigint data type

    Integer (whole number) data from -2^63 (-9223372036854775808) through 2^63-1 (9223372036854775807). Storage size is 8 bytes.



    • Use smallint datatype instead of bit datatype - so it can be indexed;
    • Use int datatype, where possible, instead of bigint datatype - for saving disk space;
    • Use smallint datatype, where possible, instead of int datatype - for saving disk space;
    • Use tinyint datatype, where possible, instead of smallint datatype - for saving disk space;
  30. Schema - Do you use FillFactor of 90% for indexes and constraints?

    ​​Indexes should generally have a fillfactor of 90%. If the amount of data stored in the database does not prohibit rebuilding indexes, a fillfactor of 90% should be maintained to increase performance of inserts.​

    ​A table that expects a lot of insert operations could use a lower fillfactor.​​

  31. Schema - Do you use less than 24 characters for table names?

    ​​If a SQL Server table name is longer than 24 characters and is linked to an Access front-end, characters after the 24th will be truncated.​

  32. Schema - Do you use Natural or Surrogate primary keys?

    Now this is a controversial one. Which one do you use?

    1. A "Natural" (or "Intelligent") key is actual data
      • Surname, FirstName, DateOfBirth
    2. An "Acquired Surrogate" (or "Artifical" or "System Generated") key is NOT derived from data eg. Autonumber
      • eg. ClientID 1234
      • eg. ClientID JSKDYF
      • eg. ReceiptID 1234
    3. A "Derived Surrogate" (or "User Provided") key is indirectly derived from data eg. Autonumber
      • eg. ClientID SSW (for SSW)
      • eg. EmpID AJC (for Adam Jon Cogan)
      • eg. ProdID CA (for Code Auditor)
    4. A "GUID" key automatically generated by SQL Server​

    The problems with Natural Keys:

    • Because they have a business meaning, if that meaning changes (eg. they change their surname), then that value NEEDS to changed. Changing a value with data is a little hard - but a lot easier with Cascade Update.
    • The main problem is that the key is large and combined and this needs to be used in all joins

    The Problem with Acquired Surrogate Keys:

    • A surrogate key has no meaning to a user
    • It always requires a join when browsing a child table eg. The InvoiceDetail table

    The Problem with Derived Surrogate

    • The user needs to enter a unique value
    • Because they have a business meaning, if that meaning changes (eg. they change their company name), then that value MAY NEED to changed. Changing a value with data is a little hard - but a lot easier with Cascade Update
    • More likely to have a problem with Merge Replication

    The Problem with GUID key

    We like GUID keys. However, GUID generation produces essentially random numbers which cannot realistically be used as primary keys since new rows are inserted into the table at random positions leading to extremely slow inserts as the table grows to even a moderate size. Inserting into the middle of a table with clustered index, rather than appending to the end can potentially cause the database to have to move large portions of the data to accommodate space for the insert. This can be very slow.


    1. We dont use Natural keys ever
    2. We use Acquired Surrogate for some tables
      • eg. Invoice table
      • eg. Receipt table
    3. a combination of Acquired Surrogate and Derived Surrogate for other tables
      • eg. Customer table
      • eg. Employee table
      • eg. Product table
    When we say combination because if the user doesn't enter a value then we put a random value in (by a middle tier function, so it works with Access or SQL). eg. ClientID JSKDYF

    The user can then change the value to anything else and we validate it is not used, and then perform a cascade update - or if it is more then 3 levels deep we execute a stored proc. Unfortunately this is a complicated proc that cycles through all related tables and performs an UPDATE. Here is an example.

    The Derived Surrogate has the benefit being easy for people to remember and can be used in the interface or even the query string

    Over the years experience has lead me to the opinion that the natural vs surrogate key argument comes down to a style issue. If a client or employer has a standard one way or another, fine use it. If not, use whichever you method you prefer, recognizing that there may be some annoyances you face down the road. But don't let somebody criticize you because your style doesn't fit his preconceived notions.​

  33. Schema - Do you use separate lookup tables rather than one large lookup table for your lookup data?

    ​Advantage: Simplifies ORM Mapping​​​

    We prefer multiple lookup tables so they make more sense in ORM tools. E.g. you could have either:

    1. OrderType


    2. LookupTable

    But when you are obtaining the OrderType for an order, you would have


    Order.OrderType.OrderTypeID (Good)


    Order.LookupTable.Value (Not great as it is not clear what the nature of the lookup table is). If you have multiple lookups to the one table, you would need to do your mappings manually rather than using a tool.

    Advantage: Maintains Complete Referential Integrity without the need for triggers Advantage: Maintains Complete Referential Integrity without the need for triggers

    The other advantage of having separate lookup tables rather than one large one is that referential integrity is maintained.

    One issue with having one large table is that you can still enter invalid values in the Order.OrderTypeID column. E.g. if Order TypeIDs range from 1-3 and CustomerTypeIDs range from 4 to 10.

    If I put OrderTypeID = 10, then I will not get referential integrity errors (even though I should) because I have entered a value which exists in the lookup table (even though it is for the wrong type).

    If I want to enforce referential integrity so I can only enter the correct type for my lookup table, then I would need to resort to triggers or a (fallible) coded data tier.

    Advantage: You can add new columns specific to each lookup table

    For example, if a Lookup table (e.g. CustomerType) has an associated value (e.g. the field MaximumDebtAmount), we don't need to add a field that is irrelevant to all the other lookup tables. We can just add it to the individual lookup table.

    Disadvantage: Multiple tables make maintenance slightly more difficult, especially when making changes directly via Management Studio.

    It is simpler to Administer one table than multiple tables, but you can reduce this problem with a good Generic Administration Page UI.​

  34. Schema - Do you use triggers for denormalized fields?

    ​I believe that de-normalised fields are not a bad thing. When used properly and sparingly, they can actually improve your application's performance. As an example:

    • I have an Orders table containing one record per order
    • I also have an OrderItems table which contains line items linked to the main OrderID, as well as subtotals for each line item
    • In my front end I have a report showing the total for each order​

    To generate this report, I can either:

    1. Calculate the Order total by summing up every single line item for the selected Order every time the report is loaded, or
    2. Store the Order subtotal as a de-normalised field in the Orders table which gets updated using trigger.

    The second option will save me an expensive JOIN query each time because I can just tack the denormalised field onto the end of my SELECT query.

    ​1. Code: Alter Orders table​

    ALTER TABLE Orders
    ADD SumOfOrderItems money NULL

    2. Code: Insert trigger

    Alter Trigger tri_SumOfOrderItems
    On dbo.OrderItems
    For Insert
    DECLARE @OrderID varchar (5)
    SELECT @OrderID = OrderID from inserted
    UPDATE Orders
    SET Orders.SumOfOrderItems = Orders.SumOfOrderItems +
    (SELECT isnull(SUM(ItemValue),0) FROM inserted WHERE inserted.OrderID = Orders.OrderID)
    WHERE Orders.OrderID = @OrderID

    3. Code: Update trigger

    Alter Trigger tru_SumOfOrderItems
    On dbo.OrderItems
    For Update
    DECLARE @OrderID varchar (5)
    SELECT @OrderID = OrderID from deleted
    --Could have used inserted table
    UPDATE Orders
    SET Orders.SumOfOrderItems = Orders.SumOfOrderItems
    + (SELECT isnull(SUM(ItemValue),0) FROM inserted WHERE inserted.OrderID = Orders.OrderID)
    - (SELECT isnull(SUM(ItemValue),0) FROM deleted WHERE deleted.OrderID = Orders.OrderID)
    WHERE Orders.OrderID = @OrderID

    4. Code: Delete trigger

    Alter Trigger trd_SumOfOrderItems
    On dbo.OrderItems
    For Delete
    DECLARE @OrderID varchar (5)
    SELECT @OrderID = OrderID FROM deleted
    UPDATE Orders
    SET Orders.SumOfOrderItems = Orders.SumOfOrderItems -
    (SELECT isnull(SUM(ItemValue),0) FROM deleted WHERE deleted.OrderID = Orders.OrderID)
    WHERE Orders.OrderID = @OrderID

    5. Code: Maintenance stored procedure

    --Stored Procedure for Maintenance
    Alter Procedure dt_Maintenance_SumOfItemValue
    UPDATE Orders
    SET Orders.SumOfOrderItems = Isnull((SELECT SUM (ItemValue) FROM OrderItems WHERE OrderItems.OrderID = Orders.OrderID),0)

  35. Schema - Do you validate each "Denormalized Field" with procValidate?

    ​90% of the databases that SSW works with make use of denormalized fields. We believe this is with good reason. However, several precautions should be taken to ensure that the data held within these fields is reliable. This is particularly the case several applications are updating your denormalized data. To illustrate, let's say that we want to show all Customers with a calculated field totalling their order amount (ie Customer.OrderTotal).

    With this example in mind, the main reasons we use denormalized fields are:


    Reducing development complexity

    A denormalized field can mean that all SELECT queries in the database are simpler. Power users find it easier to use for reporting purposes - without the need for a cube. In our example, we would not need a large view to retrieve the data (as below).

    Customer.CustomerID, SUM (SalesOrderDetail.OrderQty * (SalesOrderDetail.UnitPrice - SalesOrderDetail.UnitPriceDiscount) ) AS DetailTotal, Customer.SalesPersonID, Customer.TerritoryID, Customer.AccountNumber, Customer.CustomerType, Customer.ModifiedDate, Customer.rowguid FROM Customer
    INNER JOIN SalesOrderHeader ON Customer.CustomerID = SalesOrderHeader.CustomerID INNER JOIN SalesOrderDetail ON SalesOrderHeader.SalesOrderID = SalesOrderDetail.SalesOrderID GROUP BY Customer.CustomerID, Customer.SalesPersonID, Customer.TerritoryID, Customer.AccountNumber, Customer.CustomerType, Customer.ModifiedDate,Customer.rowguid ORDER BY Customer.CustomerID
    Figure: A view to get customer totals when no denormalized fields are used
    If we had a denormalized field, the user or developer would simply have run the following query:
    Customer.OrderTotal AS DetailTotal FROM Customer ORDER BY Customer.CustomerID
    Figure: Queries are much simpler with denormalized fields

    Note that this is not a particularly complicated example. However, you can see why it can simplify development greatly when working with a large number of tables

    Performance is better for read-intensive reports
    Particularly when reporting on data with a cube.

    When there a multiple tables in a SQL Server view They cannot be updated in one hit - they must be updated one table at a time.  

    It is a built-in validation device For example, if records are accidentally deleted directly in the database, there is still a validation check for the correct totals. The value of this is mitigated when there is a full audit log on the database

    However, there are reasons against using denormalized fields

    They have to be maintained and can potentially get out of synch

    This can makes them unreliable - particularly if several applications are incorrectly updating the denormalized fields. UPDATE, INSERT, DELETEs are more complicated as they have to update the denormalized fields

    They can be seen as an unnecessary waste of space

    All in all, we choose to still use denormalized fields because they can save development time. We do this with some provisos. In particular, they must be validated correctly to ensure the integrity of the data.

    Here is how we ensure that this data is validated:

    1. Change the description on any denormalized fields to include "Denormalized" in the description - "Denormalized: Sum(OrderTotal) FROM Orders" in description in Enterprise Manager
    2. Create a view that lists all the denormalized fields in the database - based on the description field.
      CREATE VIEW dbo.vwValidateDenormalizedFields
          SELECT OBJECT_NAME(id) AS TableName, 
              COL_NAME(id, smallid) AS ColumnName,
              CAST([value] AS VARCHAR(8000)) AS Description,
              'procValidate_' + OBJECT_NAME(id) + 
              '_' + COL_NAME(id, smallid) as
          FROM dbo.sysproperties
          WHERE (name = 'MS_Description') AND 
                       (CAST([value] AS VARCHAR(8000))
                        LIKE '%Denormalized:%')
      Figure: Standard view for validation of a denormalized field
    3. Create a stored procedure (based on the above view) that validates whether all denormalized fields have a stored procedure that validates the data within them
      CREATE PROCEDURE procValidateDenormalizedFieldValidators
              ValidationProcedureName AS
          FROM vwValidateDenormalizedFields
          WHERE ValidationProcedureName NOT IN
              SELECT ValidationProcedureName
              FROM vwValidateDenormalizedFields AS vw
              LEFT JOIN sysobjects 
                  vw.ValidationProcedureName = 
              WHERE id IS NOT NULL
      Figure: Standard stored procedure for validation of a denormalized field
    If you want to know how to implement denormalized fields, see our rules Do you use triggers for denormalized fields?
  36. Stored Procedures - Do not start user stored procedures with system prefix "sp_" or "dt_"?

    System stored procedures are created and stored in the master database and have the sp_ prefix. System stored procedures can be executed from any database without having to qualify the stored procedure name fully using the database name master. It is strongly recommended that you do not create any stored procedures using sp_ as a prefix. SQL Server always looks for a stored procedure beginning with sp_ in this order:

    1. The stored procedure in the master database.
    2. The stored procedure based on any qualifiers provided (database name or owner).
    3. The stored procedure using dbo as the owner, if one is not specified.

    Therefore, although the user-created stored procedure prefixed with sp_ may exist in the current database, the master database is always checked first, even if the stored procedure is qualified with the database name.

    Important: If any user-created stored procedure has the same name as a system stored procedure, the user-created stored procedure will never be executed.

  37. Stored Procedures - Do you avoid using SELECT * when inserting data?

    ​Using a statement like "INSERT tableName SELECT * FROM otherTable", makes your stored procedures vulnerable to failure. Once either of the two tables changs, your stored procedure won't work. Not only that, when the inserting table has an identity column, such a statement will cause an error - "An explicit value for the identity column in table ParaRight can only be specified when a column list is used and IDENTITY_INSERT is ON."

    ​USE [ParaGreg]
    /****** Object: StoredProcedure [dbo].[procMove] Script Date: 08/08/2008 12:18:33 ******/
    ALTER PROCEDURE [dbo].[procMove]
    @id AS Char,
    @direction AS INT
    IF @direction = 0
    INSERT INTO ParaRight
    SELECT * FROM ParaLeft
    WHERE ParaID = @id
    DELETE FROM ParaLeft
    WHERE ParaID = @id
    ELSE IF @direction = 1
    INSERT INTO ParaLeft
    SELECT * FROM ParaRight
    WHERE ParaID = @id
    DELETE FROM ParaRight
    WHERE ParaID = @id

    ​Bad example: Using SELECT * when inserting data. Besides, this stored procedure should have an Else section to raise error when no condition is satisfied

    USE [ParaGreg]
    /****** Object: StoredProcedure [dbo].[procMove] Script Date: 08/08/2008 12:18:33 ******/
    ALTER PROCEDURE [dbo].[procMove]
    @id AS Char,
    @direction AS INT
    IF @direction = 0
    INSERT INTO ParaRight
    SELECT Col1,Col2 FROM ParaLeft
    WHERE ParaID = @id
    DELETE FROM ParaLeft
    WHERE ParaID = @id
    ELSE IF @direction = 1
    INSERT INTO ParaLeft
    SELECT * FROM ParaRight
    WHERE ParaID = @id
    DELETE FROM ParaRight
    WHERE ParaID = @id
    ELSE BEGIN PRINT "Please use a correct direction"

    Good example: Using concrete columns instead of * and provide an Else section to raise errors​​
  38. Stored Procedures - Do you check the global variable @@ERROR after executing a data manipulation statement?

    Always check the global variable @@ERROR immediately after executing a data manipulation statement (like INSERT/UPDATE/DELETE), so that you can rollback the transaction in case of an error (@@ERROR will be greater than 0 in case of an error). This is important, because, by default, SQL Server will not rollback all the previous changes within a transaction if a particular statement fails. This behaviour can be changed by executing SET XACT_ABORT ON. The @@ROWCOUNT variable also plays an important role in determining how many rows were affected by a previous data manipulation (also, retrieval) statement, and based on that you could choose to commit or rollback a particular transaction.​​

  39. Stored Procedures - Do you keep your Stored Procedures simple?

    If you are using the .NET Framework, put validation and defaults in the middle tier. The backend should have the required fields (Allow Nulls = False), but no complicated constraints. The following are examples that work with the Products table (with an added timestamp field called Concurrency) from Northwind.

    ​1. Code: Select Procedure​​​

    ALTER PROCEDURE dbo.ProductSelect
    @ProductID int
    SELECT ProductName, SupplierID, CategoryID, QuantityPerUnit, UnitPrice, UnitsInStock,
    UnitsOnOrder, ReorderLevel, Discontinued, Concurrency
    FROM Products
    WHERE (ProductID= @ProductID)

    2. ​​Code: Insert Procedure​

    ALTER PROCEDURE dbo.ProductInsert
    @ProductName nvarchar(40),
    @SupplierID int,
    @CategoryID int,
    @QuantityPerUnit nvarchar(20),
    @UnitPrice money,
    @UnitsInStock smallint,
    @UnitsOnOrder smallint,
    @ReorderLevel smallint,
    @Discontinued bit
    INSERT INTO Products (ProductName, SupplierID, CategoryID, QuantityPerUnit, UnitPrice,
    UnitsInStock, UnitsOnOrder, ReorderLevel, Discontinued)
    VALUES (@ProductName, @SupplierID, @CategoryID, @QuantityPerUnit, @UnitPrice, @UnitsInStock,
    @UnitsOnOrder, @ReorderLevel, @Discontinued, 1)
    SELECT Scope_Identity() AS [SCOPE_IDENTITY] --If table has identity column
    --SELECT @@ROWCOUNT --If table doesn't have identity column
    -- Note: The middle tier must check the ROWCOUNT = 1

    3.Code: Update Procedure​​

    ALTER PROCEDURE dbo.ProductUpdate
    @ProductID int,
    @ProductName nvarchar(40),
    @SupplierID int,
    @CategoryID int,
    @QuantityPerUnit nvarchar(20),
    @UnitPrice money,
    @UnitsInStock smallint,
    @UnitsOnOrder smallint,
    @ReorderLevel smallint,
    @Discontinued bit,
    @Concurrency timestamp
    UPDATE Products
    SET ProductName = @ProductName,
    SupplierID = @SupplierID,
    CategoryID = @CategoryID,
    QuantityPerUnit = @QuantityPerUnit,
    UnitPrice = @UnitPrice,
    UnitsInStock = @UnitsInStock,
    UnitsOnOrder = @UnitsOnOrder,
    ReorderLevel = @ReorderLevel,
    Discontinued = @Discontinued
    WHERE (Concurrency = @Concurrency) AND (ProductID= @ProductID) --Note the double criteria to ensure concurrency
    -- Note: The middle tier must check the ROWCOUNT = 1

    4.Code: Delete Procedure​

    ALTER PROCEDURE dbo.ProductDelete
    @ProductID int,
    @Concurrency timestamp
    DELETE FROM Products
    WHERE (ProductID= @ProductID) AND (Concurrency = @Concurrency)
    --Note the double criteria to ensure concurrency
    --Note: The middle tier must check the ROWCOUNT = 1​​

  40. Stored Procedures - Do you know SQL stored procedure names should be prefixed with the owner?Do you know SQL stored procedure names should be prefixed with the owner?

    Always specify the schema prefix when creating stored procedures. This way you know that it will always be dbo.procedure_name no matter who is logged in when it is created.

    There are 2 other benefits to including the schema prefix on all object references:

    1. This prevents the database engine from checking for an object under the users schema first
    2. Also avoids the issue where multiple plans are cached for the exact same statement/batch just because they were executed by users with different default schemas​.

    Aaron Bertrand agrees with this rule - My stored procedure "be​st practices" checklist.

    CREATE PROCEDURE procCustomer_Update @CustomerID INT, ….. BEGIN​

    Figure: Bad example​​

    ​​​CREATE PROCEDURE dbo.procCustomer_Update @CustomerID INT, ….. BEGIN

    Figure: Good example​​​​
  41. Stored Procedures - Do you return a value indicating the status?

    ​Make sure your stored procedures always return a value indicating the status. All stored procedures should return the error number (if an error) or a 0 to indicate no errors (ie success).​​

  42. Stored Procedures - Do you SET NOCOUNT ON for production and NOCOUNT OFF off for development/debugging purposes?

    ​You should use SET NOCOUNT ON for production and NOCOUNT OFF off for development/debugging purposes (i.e. when you want the rowcounts to display as the messages from your T-SQL).​​

    According to SQL Server Books Online:
    "For stored procedures that contain several statements that do not return much actual data, this can provide a significant performance boost because network traffic is greatly reduced."

    Example: Procedure that returns a scalar value (ClientID generated by an insert statement) should use OUTPUT keyword (not RETURN) to pass back data. This is how you should return a generated ClientID from the procedure, and also return a status value

    CREATE PROCEDURE procClientInsert
    ' Copyright 2001 SSW
    ' All Rights Reserved.
    ' 1.0 DDK 17/12/2001
    'Calling example
    'DECLARE @pintClientID int
    'DECLARE @intReturnValue int
    'exec @intReturnValue = procClientInsert 'TEST Entry',
    @pintClientID OUTPUT
    'PRINT @pintClientID
    'PRINT @intReturnValue
    @pstrCoName varchar (254),
    @pintClientID int OUTPUT
    --Generate a random number
    SET @pintClientID = (SELECT CAST(RAND() * 100000000 AS int))
    INSERT INTO Client (ClientID, CoName) VALUES (@pintClientID ,
    IF @@ROWCOUNT = 1
    IF @@ERROR=0

    This procedure will display 0 or the error to indicate success or failure. You should base you actions on this return code.

    This separates return values from actual data so that other programmers know what to expect.

    Note: If you are using SQL Server stored procedures to edit or delete data using a SqlDataAdapter, make sure that you do not use SET NOCOUNT ON in the stored procedure definition. This causes the rows affected count returned to be zero, which the DataAdapter interprets as a concurrency conflict. In this event, a DBConcurrencyException will be thrown.​

  43. Stored Procedures - Do you standardize on the return values of stored procedures for success and failures?

    Standardize on the return values of stored procedures for success and failures.​​

  44. Stored Procedures - Do you use company standard description in your stored procedures?

    All stored procedures must have Company Description.
  45. Stored Procedures - Do you use OUTPUT parameters if you need to return the value of variables?

    The RETURN statement is meant for returning the execution status only, but not data. If you need to return value of variables, use OUTPUT parameters. There is a compelling reason for this - if you use return values rather than output values to return data, money values that you return will silently be truncated.​​

  46. Stored Procedures - Do you use SCOPE_IDENTITY() to get the most recent row identity?

    When inserting a row in a stored procedure, always use SCOPE_IDENTITY() if you want to get the ID of the row that was just inserted. A common error is to use @@IDENTITY, which returns the most recently created identity for your current connection, not necessarily the identity for the recently added row in a table. You could have a situation where there is a trigger that inserts a new record in a Logs Table, for example, when your Stored Procedure or INSERT SQL Statement inserts a record in the Orders Table. If you use @@IDENTITY to retrieve the identity of the new order, you will actually get the identity of the record added into the Log Table and not the Orders Table, which will create a nasty bug in your data access layer. To avoid the potential problems associated with someone adding a trigger later on, always use SCOPE_IDENTITY() to return the identity of the recently added row in your INSERT SQL Statement or Stored Procedure.

    ​Behold this example from SQL Server Books online.

    USE tempdb
    Z_id int IDENTITY(1,1)PRIMARY KEY,
    Z_name varchar(20) NOT NULL)
    VALUES ('Lisa')
    VALUES ('Mike')
    VALUES ('Carla')
    --Result set: This is how table TZ looks.
    Z_id Z_name
    1 Lisa
    2 Mike
    3 Carla
    Y_id int IDENTITY(100,5)PRIMARY KEY,
    Y_name varchar(20) NULL)
    INSERT TY (Y_name)
    VALUES ('boathouse')
    INSERT TY (Y_name)
    VALUES ('rocks')
    INSERT TY (Y_name)
    VALUES ('elevator')
    --Result set: This is how TY looks:
    Y_id Y_name
    100 boathouse
    105 rocks
    110 elevator
    /*Create the trigger that inserts a row in table TY
    when a row is inserted in table TZ*/
    ON TZ
    /*FIRE the trigger and determine what identity values you obtain
    with the @@IDENTITY and SCOPE_IDENTITY functions.*/
    INSERT TZ VALUES ('Rosalie')

    Notice the difference in the result sets. As you can see, it's crucial that you understand the difference between the 2 commands in order to get the correct ID of the row you just inserted.

    /*SCOPE_IDENTITY returned the last identity value in the same scope. This was the insert on table TZ.*/
    /*@@IDENTITY returned the last identity value inserted to TY by the trigger. This fired because of an earlier insert on TZ.*/​

  47. Stored Procedures - Do you use transactions for complicated stored procedures?

    ​A transaction means an atomic operation, it assures that all operations within the transaction are successful, if not, the transaction will cancel all operations and roll back to the original state of the database, that means no dirty data and mess exists in the database, so if a stored procedure has many steps, and each step has relation with other steps, it is strongly recommended that you encapsulate the procedure in a transaction.​

    ALTER PROCEDURE [dbo].[procInit]
    DELETE ParaLeft
    DELETE ParaRight
    INSERT INTO ParaLeft (ParaID)

    Bad example: No tran​saction here, if any of operations fail, the database will only partially update, resulting in an unwanted result

    ALTER PROCEDURE [dbo].[procInit]
    DELETE ParaLeft
    DELETE ParaRight
    INSERT INTO ParaLeft (ParaID)

    Good example: Using a transaction to assure that all operations within the transaction will be successful, otherwise, the database will roll back to original state​
  48. Views - Do you know to not have views as redundant objects?

    Don't have views as redundant objects. e.g. vwCustomers as SELECT * FROM Customers. This is unnecessary. Instead Views should be generally used for security.​​