Rules To Better SQL Databases - Developers

​Here are some of the typical things that all SQL Server DBAs and Database developers should know. These rules are above and beyond the most basic textbook recommendations of:

  • ​Ensuring your databases are Normalized and in 3rd Normal Form 
  • Making sure you have primary keys, foreign keys and simple indexes to improve performance 
  • Making sure you Back up regularly 
  • ​Basic Naming conventions (see some of our object naming conventions) 
  • Minimizing resultset sizes and data over the wire​​​​

  1. Data - Do you not allow Nulls in text fields?

    ​NULLs complicate your life. To avoid having to constantly differentiate between empty strings and NULLs, you should avoid storing NULLS if you can.

    Why? Well, what is wrong with this?​​​

    ​SELECT ContactName FROM Customer WHERE ContactName <> ''

    Figure: Selecting on empty string​

    Nothing if your data is perfect, but if you allow Nulls in your database, then statements like this will give you unexpected results. To get it working you would have to add the following to the last line:

    WHERE ContactName <> '' OR ContactName Is Null

    Figure: Allowing null strings makes queries more complex

    What about only allowing empty strings? Well, we choose to block Nulls because it is a lot easier to check off a check box in Enterprise Manager than it is to put a constraint on every field that disallows empty string ('').​

    SQLServerNullsAndEmptyStrings.gif
    Figure: Don't allow Nulls​
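    If you prefer to script the change rather than use the Enterprise Manager checkbox, a minimal sketch (assuming a Customer.ContactName column of varchar(100) - adjust names and types for your schema) looks like this:

    -- Replace any existing NULLs with empty strings first
    UPDATE Customer SET ContactName = '' WHERE ContactName IS NULL
    -- Default new rows to an empty string and block NULLs from now on
    ALTER TABLE Customer ADD CONSTRAINT DF_Customer_ContactName DEFAULT ('') FOR ContactName
    ALTER TABLE Customer ALTER COLUMN ContactName varchar(100) NOT NULL

    Figure: Blocking Nulls in a text column by script (sketch only)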

    However, you should always be aware that Nulls and empty strings are totally different, so if you absolutely have to have them, they should be used consistently. In the ANSI SQL-92 standard, an empty string ('') is never equated to Null, because empty string can be significant in certain applications. 

    Not allowing Nulls will give you the following benefits: 

    • You don't have to enforce every text field with a CHECK constraint such as ([ContactName]<>'').
    • Your queries are simpler - you don't have to check for both NULLs and empty strings in your WHERE clauses or add extra checking in stored procedures.
    • SQL Server performs better when nulls are not being used.
    • You don't have to deal with the pain in the middle tier of explicitly checking DBNull.Value - you can always use contactRow.ContactName == String.Empty. Database Nulls in the .NET Framework are represented as DBNull.Value, which cannot be implicitly cast to any other type, so if you allow NULLs in the ContactName field, the comparison above will raise an exception.
    • You avoid other nasty issues - a lot of controls in the .NET Framework have real problems binding to DBNull.Value, so you don't have to write custom controls to handle this.

    For example, you have Address1 and Address2 in your database, a Null value in Address2 means you don't know what the Address2 is, but an empty string means you know there is no data for Address2. You have to use a checkbox on the UI to explicitly distinguish Null value and empty string:​

    NullValueOnUI.jpg
    ​Figure: A check box is required if you want to allow user to use Null value on the UI

    Some people are not going to like this rule, but this is how it works in Oracle and Access:

    • In Oracle, empty strings are turned into Nulls (which is basically what this rule is doing). Empty strings per se are not supported in Oracle (This is not ANSI compliant).
    • And talking of legacy systems :-) be aware that using Access as a data editor is a "No-No". Access turns empty strings into a Null.

    Finally, always listen to the client. Nulls can carry meaning that an empty string does not - there are exceptions where you might use them - but they are rare.

    So follow this rule, block Nulls where possible, update your NULLs with proper information as soon as possible, and keep data consistent and queries simple.

  2. Data - Do you not allow NULLs in number fields if NULL has the same meaning as zero?

    ​NULLs create difficulty in the middle-tier because you need to add further handling. So avoid them where you can, eg. For a Discount field, make the default 0 and don't allow NULLs.

    This rule should not be applied when a NULL value is valid data. Oftentimes data such as a percentage earnings rate on a super fund is nullable because it may not be supplied or relevant. This is very different from it being zero, and without NULLs you have no way to distinguish real zero values from data that was never supplied. The hit of doing the extra work in code is often offset in this case by the validity of query results.

    As a general rule of thumb, don't use NULL if you cannot distinguish it from another value.

    Q: What is the difference between NULL and 0 in discount field?
    A: No difference, so don't allow Nulls.

    Q: What is the difference between NULL and 0 in Tumor size?
    A: Null means unknown and 0 means no tumor, so allow Nulls.

    Note: Nulls are evil, but don't go crazy removing nulls. Never invent your own constant eg. -999 to represent a Null.
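    As a sketch (assuming an Orders table with a Discount column of type money - adjust for your schema), blocking NULLs where NULL would mean the same as zero looks like this:

    -- Replace existing NULL discounts with zero, then block NULLs and default to 0
    UPDATE Orders SET Discount = 0 WHERE Discount IS NULL
    ALTER TABLE Orders ADD CONSTRAINT DF_Orders_Discount DEFAULT (0) FOR Discount
    ALTER TABLE Orders ALTER COLUMN Discount money NOT NULL
    -- A column like TumorSize stays nullable - NULL (unknown) and 0 (no tumor) mean different things

    Figure: Default Discount to 0 and disallow NULLs, but keep NULLs where they carry meaning (sketch)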

  3. Data - Do you not start data in character columns with an empty line?

    Character columns (char, varchar, text, nchar, nvarchar, ntext) can end up storing a carriage return (<Ctrl>+<Enter>) as the first character, so the real data only starts on the second line.

    ​Note: If the front-end is Microsoft Access, then the data in the second line is not shown.​

  4. Data - Do you not start data in character columns with spaces?

    Text in character columns (char, varchar, text, nchar, nvarchar, ntext) can start with spaces, which is usually a data entry error.
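    A minimal validation query for the two rules above (assuming a Customer.ContactName column - adjust names for your schema):

    -- Find character data that starts with whitespace or an empty line
    SELECT CustomerID, ContactName
    FROM Customer
    WHERE ContactName LIKE ' %'            -- leading space
       OR ContactName LIKE CHAR(13) + '%'  -- leading carriage return
       OR ContactName LIKE CHAR(10) + '%'  -- leading line feed

    Figure: Finding text columns that start with spaces or an empty line (sketch)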

  5. Data - Do you use Identities in SQL Server (but don't use Autonumbers in Access)?

    This one is going to be a controversial one. But the bottom line is that every now and then you want to do something, and then you curse and wish your database didn't have identities. So why use them? Let's look at the problems first:

    ​Con​s:
    • You can't manually change a Primary Key and let the Cascade Update do its work, eg. an InvoiceID
    • Hassles when importing data into related tables where you want to control the Primary Key eg. Order and Order Details
    • With replication you will get conflicts

    ​In Microsoft Access you have autonumbers and there is no way around them so never use them.
    But in SQL Server you have identities and we have these procs:

    • DBCC CHECKIDENT - Checks the current identity value for the specified table and, if needed, corrects the identity value
    • SET IDENTITY_INSERT { table } { ON | OFF } - Allows explicit values to be inserted into the identity column of a table
    Pros:​​
    • Less programming - letting the database take care of it
    • Replication - identities are supported by SQL Server with ranges, so when you want replication there is no extra coding
    • Avoiding concurrency errors on high-INSERT systems - again, no extra coding

    So the only Con left is the importing of data but we can use one of the above procs to get around it. See grey box.

    The best way to import messy data into SQL Server (with Identities)​

    Eg. inserting data to the Orders and Orders Details table:

    1. Use an .adp to copy the first record to Excel
    2. Get the data into the same column order
    3. SET IDENTITY_INSERT Orders ON --this will allow manual identity INSERTs
    4. Copy and Paste Append the Orders
    5. SET IDENTITY_INSERT Orders OFF --as it can only be ON for one table at a time
    6. SET IDENTITY_INSERT [Order Details] ON --this will allow manual identity INSERTs
    7. Copy and Paste Append the [Order Details]
    8. SET IDENTITY_INSERT [Order Details] OFF
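    If the data is already in staging tables, a purely scripted version of the same idea is sketched below (staging table and column names are illustrative - adjust for your schema):

    SET IDENTITY_INSERT Orders ON --allow explicit values in the identity column
    INSERT INTO Orders (OrderID, CustomerID, OrderDate)
    SELECT OrderID, CustomerID, OrderDate FROM ImportOrders
    SET IDENTITY_INSERT Orders OFF --only one table can have it ON at a time

    --Child rows simply reuse the OrderID values that were just inserted
    INSERT INTO [Order Details] (OrderID, ProductID, UnitPrice, Quantity)
    SELECT OrderID, ProductID, UnitPrice, Quantity FROM ImportOrderDetails

    Figure: Scripted import that keeps control of the identity values (sketch)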

    Automatic Identity Range Handling​

    The simplest way of handling identity ranges across replicas is to allow SQL Server 2000 to manage identity range handling for you. To use automatic identity range handling, you must first enable the feature at the time the publication is created, assign a set of initial Publisher and Subscriber identity range values, and then assign a threshold value that determines when a new identity range is created.
    For example, you might assign an identity range from 1000 through 2000 to the Publisher, a range from 2001 through 3000 to the first Subscriber, a range from 3001 through 4000 to the next Subscriber, and so on.

  6. Data - Do you not delete records - Just flag them as Inactive?

    ​​When users are deleting a lot of records as part of normal operations - they can and do make mistakes. Instead of the painful process of having to go to a backup to get these records, why not simply flag the records as Inactive?​​

    Advant​ages
    • You do not have to delete all related records e.g. Customers, Orders, Order Details. Instead, you can just flag the parent record as deleted with an "IsDeleted" bit field.
    • You do not lose historical data e.g. how many products one of your previous clients purchased
    • You can actually see who deleted the record, as your standard audit columns (e.g. DateUpdated, UserUpdated) are still there. The record does not just vanish
    • It is simple to implement - particularly when using a code generator. For example - our code generator produces views and stored procedures, and all data access layer code. With all data access done through the data layer views, we simply had to add a filter to all views ("WHERE IsDeleted = 0"). Our autogenerated delete stored procedures simply set the "IsDeleted" column to true (see the sketch after this list).
    Disa​dvantages
    • Depending on your interface design, you may have to join to parent tables to ensure that deleted child records do not appear. Typically, the interface would be designed in such a way that you would not be able to create new records based on deleted items (e.g. you cannot create a new order record for a customer that is deleted). Performance of queries can potentially suffer if you have to do these joins.
    • While storage space is very cheap, you are not removing records from your database. You may need to archive records if the number of deleted records becomes large.
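    A minimal sketch of this approach (assuming a Customer table - adjust names and types for your schema):

    -- Add the flag, defaulting existing and new rows to "not deleted"
    ALTER TABLE Customer ADD IsDeleted bit NOT NULL CONSTRAINT DF_Customer_IsDeleted DEFAULT (0)
    GO
    -- All data access goes through a view that hides the flagged rows
    CREATE VIEW dbo.vwCustomer
    AS
        SELECT CustomerID, CompanyName, IsDeleted
        FROM dbo.Customer
        WHERE IsDeleted = 0
    GO
    -- The "delete" procedure only flags the row
    CREATE PROCEDURE dbo.procCustomerDelete
        @CustomerID int
    AS
        UPDATE dbo.Customer SET IsDeleted = 1 WHERE CustomerID = @CustomerID
    GO

    Figure: Flagging records as deleted instead of removing them (sketch only)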

    Also see Using A​udit To​ols for alternatives to this approach using 3rd party auditing tools.​

  7. Data - Dates - Do you make sure you have valid date data in your database?

    ​SQL Server dates can range from the year 1900 up to the year 9999. However, certain date data in your database just wouldn't make any sense in the context of your business. For example, if your company started trading in 2015 you should not have any dates in your database before 2015 (unless you are tracking start dates of your clients, but this is an exception). An invoice date of 2013 wouldn't make sense at all. You should run validation queries to ensure no rubbish date data gets into your database.​
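    A minimal validation query for this (assuming an Invoices table and a 2015 trading start - adjust the table, column and cut-off for your business rules):

    SELECT InvoiceID, InvoiceDate
    FROM Invoices
    WHERE InvoiceDate < '2015-01-01'
       OR InvoiceDate > GETDATE() --dates in the future are usually rubbish too

    Figure: Validation query to catch rubbish date data (sketch)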
  8. Data - Dates - Do you know DateTime fields must be converted to universal time?

    Any DateTime fields must be converted to universal time from the application to the stored procedures when storing data into the database.

    When retrieving data from the database it must be converted back to the local time of the user.
    That way you get an accurate representation of e.g. the time someone entered data into the database (i.e. the DateUpdated field).

    The exception to this rule, however, is for already existing databases that deal with DateTime as part of their queries.
    e.g. SSW Time PRO.NET is an application that allows employees to enter their timesheet. The table used for storing this information has an important field that has a DateTime data type.
    This cannot be converted to UTC in the database because that would mean:
     

    1. Converting every single entry since entries began being stored (in SSW's case since 1996) to keep information consistent;
    2. Other separate applications currently using the timesheet information in the database for reporting will also have to be entirely modified.

    Currently there will be an issue if, for example, someone from the US (Pacific time) enters data - there is a 19-hour difference between her local time and our servers.

    ​​​Example: Sally in the US enters a timesheet for the 21/04/05. (which will default to have a time of 12:00:00 AM since the time was not specified)
    Our servers would store it as 21/04/05 19:00:00, in other words 21/04/05 07:00:00 PM, because the .NET Framework automatically converts the time accordingly for our Web Service.
    Therefore our servers have to take the Date component of the DateTime and add the Time component as 12:00:00 AM so that it is stored in our local time format.

    [WebMethod]
    public double GetDateDifference(DateTime dateRemote)
    {
        DateTime dateLocal = dateRemote.Date;
        return (dateRemote.TimeOfDay.TotalHours - dateLocal.TimeOfDay.TotalHours);
    }

    Figure: When dateRemote is passed in from the remote machine, .Net Framework will have already converted it to the UTC equivalent for the local server (i.e. the necessary hours would have been added to cater for the local server time).

    In the above code snippet, the .Date property would cut off the Time portion of the DateTime variable and set the Time portion to "12:00:00 AM" as default.

    This is for applications we currently have that:

    1. Consider the DateTime component integral for the implementation of the application
    2. That will be used world wide.


  9. Data - Do you use 3rd party audit tools to audit data changes?

    ​In many cases, there are legal requirements to audit all updates to financial records. In other cases, you will want to be able to track and undo deletes to your database. Some solutions we have seen in the past are:

    1. Manually adding triggers on all database tables to log every table
    2. The business objects or stored procedures all write to 2 tables the main table such as Customer and CustomerAudit
    3. Using a logging utility to audit database changes

    Using a log-reading auditing utility means that you can devote your development time to areas other than auditing. Also, unlike utilities which use triggers (such as ApexSQL Audit), there is no performance overhead, because a log-reading tool relies upon the log files already created by SQL Server. If required, you can export the log information to SQL Server, so you can perform advanced queries on it. It even allows you to recover previously deleted tables.

  10. Data - Do you not use invalid characters in object identifiers?

    We believe it is bad practice to use invalid characters (mostly symbol characters, like ",;"\/(", etc.) in object identifiers. Although it can be legal, it is easily confused and will probably cause an error when running scripts against these objects.

  11. Data - Do you check for invalid characters in character data?

    Always avoid invalid characters in your data (mostly symbol characters, like ",;"\/\n\r", etc.). They usually get into your database by mistake - typically from copying and pasting out of Word.

    They can be costly. Here is an example of an error you can get:

    InvalidCharacterInData1.gif

    What could this be? Well in this case the html source of http://www.ssw.com.au/ssw/Download/Download.aspx?GroupCategoryID=5BUS had this:

    <script language="Javascript">
    document.write('SSW Smart Tags for Word ');
    ...
    </script>


    There is an unwanted [return char] at the end of [SSW Smart Tags for Word]. You then trace it back to a database record and find that the CategoryName field in that record contains "SSW Smart Tags for Word" followed by a stray return character. Once the error characters were removed, the bug was fixed - all very time consuming and all very costly.

  12. Data - Do you use a URL instead of an image in your database?

    We recommend that you store a URL instead of an image in your database. This will help you:

    • avoid your database growing too quickly (which can bring a series of problems - performance, log and disk space, etc.)
    • validate and change the image easily



  13. Schema - Do you only use Unicode datatypes (nchar, nvarchar and ntext) in special circumstances?

    Columns defined using the nchar, nvarchar and ntext datatypes can store any character defined by the Unicode Standard, which includes all of the characters defined in the various English and non-English character sets. These datatypes take twice as much storage space per character as non-Unicode data types.

    It is not the disk space cost that is the concern - it is the 8,060-byte row size limit. Please refer to Maximum Capacity Specifications for SQL Server for details.

    ​If your database stores only English characters, this is a waste of space. Don't use Unicode double-byte datatypes such as nchar, nvarchar and ntext unless you are doing multilingual applications.

  14. Schema - Do you always use Varchar?

    Use VARCHAR instead of CHAR, unless your data is almost always of a fixed length, or is very short - for example, a Social Security/Tax File Number which is always 9 characters. These situations are rare. SQL Server fits a whole row on a single page, and will never try to save space by splitting a row across two pages. Running DBCC SHOWCONTIG against tables shows that a table with variable-length columns typically takes up fewer pages of storage space to store the same rows of data. The general rule is that the shorter the row length, the more rows you will fit on a page, and the smaller the table will be. This saves disk space, and it means that any retrieval operation, such as SELECT COUNT(*) FROM, runs much quicker against the smaller table.



  15. Schema - Do you have standard Tables and Columns?

    ​1. All tables should have the following fields:
    Field            SQL Server Field Properties
    CreatedUtc       datetime2, Allow Nulls=False, Default=GETUTCDATE()
    CreatedUserId    Foreign Key to Users table, Allow Nulls=False
    ModifiedUtc      datetime2, Allow Nulls=False, Default=GETUTCDATE()
    ModifiedUserId   Foreign Key to Users table, Allow Nulls=False
    Concurrency      rowversion, Allow Nulls=False

    ​The first three are examples of bad table records. The last one is an example of how this table structure should be entered.

    imgGoodBadPracticesExampleSQLFields.png
    Figure: 3 bad examples and 1 good example of Row auditing

    Note #1: Never set the CreatedUtc field - instead use a default GETUTCDATE()
    Note #2: These fields offer basic row auditing that will cover the majority of applications. When an application has specific auditing requirements, they should be analysed to see if this approach is sufficient.
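    A sketch of a new table declared with these standard columns (assuming a Users table with a UserId primary key - adjust names for your schema):

    CREATE TABLE dbo.Product
    (
        ProductID      int IDENTITY(1,1) NOT NULL PRIMARY KEY,
        ProductName    varchar(100) NOT NULL,
        CreatedUtc     datetime2 NOT NULL CONSTRAINT DF_Product_CreatedUtc DEFAULT (GETUTCDATE()),
        CreatedUserId  int NOT NULL CONSTRAINT FK_Product_CreatedUser REFERENCES dbo.Users (UserId),
        ModifiedUtc    datetime2 NOT NULL CONSTRAINT DF_Product_ModifiedUtc DEFAULT (GETUTCDATE()),
        ModifiedUserId int NOT NULL CONSTRAINT FK_Product_ModifiedUser REFERENCES dbo.Users (UserId),
        Concurrency    rowversion NOT NULL
    )

    Figure: Standard audit columns on a new table (sketch only)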

    2. All databases should have a table with one record to store application Defaults. This table should be called 'Control'.
    If the settings are not application-wide, but just for that user, then an XML file (do not use an INI file) might be better for simple stuff. Examples are saving the 'User' for logon, the 'Select Date Range' for a report, form positions, etc.

    .NET programs have an Application.Configuration which exports to XML file (app.config) automatically. It works very well, and deployment is very simple. It's integrated right into the Visual Studio.NET designer as well.

    3. All databases should have a version table to record structural changes to tables. See SSW Rules to Better Code​
     
    4. Lookup tables that have just two columns should be consistent and follow this convention: CategoryId (int) and CategoryName (varchar(100)).

    The benefit is that a generic lookup form can be used - you just pass the TableName, Column1 and Column2 into the generic lookup form.

    Note #3: The problem with this naming is that the primary keys don't match
    Note #4: The benefit of character primary key columns is that queries and query strings have meaning. E.g. from the URL http://www.ssw.com.au/ssw/Download/Download.aspx?GroupCategoryID=5BUS I can guess that it is in the business category.

  16. Schema - Do you use Bit/Numeric data type correctly?

    ​1. Bit data type

    Bit data from 0 to 1 (2 values only). Storage size is 1 byte.

    Columns of type bit cannot have indexes on them.  Also, SQL Server 7 only allows True or False values in a bit column. SQL 2000 introduced the ability to store NULL as well. Applications built for SQL Server 7 often do not expect this behaviour and may create subtle runtime errors.

    ​​Columns of type bit should be prefixed with "Is" or a "Should" ie. IsInvoiceSent (y/n) or ShouldInvoiceBeSent (y/n) you can tell easily which way the boolean is directed. [more information on naming conve​ntions​]

    This being said, fields of this type should generally be avoided because often a field like this can contain a date i.e. DateInvoiceSent (Date/Time) is preferred over InvoiceSent (y/n). If a date is inappropriate then we still recommend an int field over a bit field anyway, because bits are a pain in the butt :-)

    2. ​Tinyint data type

    Integer data from 0 through 255. Storage size is 1 byte.

    3. Smallint data type

    Integer data from -2^15 (-32,768) through 2^15-1 (32,767). Storage size is 2 bytes.

    4. Int data type

    Integer (whole number) data from -2^31 (-2,147,483,648) through 2^31-1 (2,147,483,647). Storage size is 4 bytes. The SQL-92 synonym for int is integer.

    5. Bigint data type

    Integer (whole number) data from -2^63 (-9223372036854775808) through 2^63-1 (9223372036854775807). Storage size is 8 bytes.​​

    Recommended:

    • Use smallint datatype instead of bit datatype - so it can be indexed;
    • Use int datatype, where possible, instead of bigint datatype - for saving disk space;
    • Use smallint datatype, where possible, instead of int datatype - for saving disk space;
    • Use tinyint datatype, where possible, instead of smallint datatype - for saving disk space;
  17. Schema - Do you use Natural or Surrogate primary keys?

    Now this is a controversial one. Which one do you use?

    1. A "Natural" (or "Intelligent") key is actual data
      • Surname, FirstName, DateOfBirth
    2. An "Acquired Surrogate" (or "Artificial" or "System Generated") key is NOT derived from data eg. Autonumber
      • eg. ClientID 1234
      • eg. ClientID JSKDYF
      • eg. ReceiptID 1234
    3. A "Derived Surrogate" (or "User Provided") key is indirectly derived from data
      • eg. ClientID SSW (for SSW)
      • eg. EmpID AJC (for Adam Jon Cogan)
      • eg. ProdID CA (for Code Auditor)
    4. A "GUID" key automatically generated by SQL Server​

    The problems with Natural Keys:

    • Because they have a business meaning, if that meaning changes (eg. they change their surname), then that value NEEDS to change. Changing a value with data is a little hard - but a lot easier with Cascade Update.
    • The main problem is that the key is large and combined and this needs to be used in all joins

    The Problem with Acquired Surrogate Keys:

    • A surrogate key has no meaning to a user
    • It always requires a join when browsing a child table eg. The InvoiceDetail table

    The Problem with Derived Surrogate

    • The user needs to enter a unique value
    • Because they have a business meaning, if that meaning changes (eg. they change their company name), then that value MAY NEED to change. Changing a value with data is a little hard - but a lot easier with Cascade Update
    • More likely to have a problem with Merge Replication

    The Problem with GUID key

    We like GUID keys. However, GUID generation produces essentially random numbers which cannot realistically be used as primary keys since new rows are inserted into the table at random positions leading to extremely slow inserts as the table grows to a moderate size. Inserting into the middle of a table with a clustered index, rather than appending to the end can potentially cause the database to have to move large portions of the data to accommodate space for the insert. This can be very slow.

    Recommendations

    1. We do not use Natural keys ever
    2. We use Acquired Surrogate for some tables
      • eg. Invoice table
      • eg. Receipt table
    3. A combination of Acquired Surrogate and Derived Surrogate for other tables
      • eg. Customer table
      • eg. Employee table
      • eg. Product table
    We say "combination" because if the user doesn't enter a value, we put a random value in (via a middle-tier function, so it works with Access or SQL Server), eg. ClientID JSKDYF

    The user can then change the value to anything else and we validate it is not used, and then perform a cascade update - or if it is more than 3 levels deep we execute a stored proc. Unfortunately, this is a complicated proc that cycles through all related tables and performs an UPDATE. Here is an example.

    The Derived Surrogate has the benefit of being easy for people to remember and can be used in the interface or even the query string

    Over the years, experience has led me to the opinion that the natural vs surrogate key argument comes down to a style issue. If a client or employer has a standard one way or another, fine - use it. If not, use whichever method you prefer, recognizing that there may be some annoyances you face down the road. But don't let somebody criticize you because your style doesn't fit their preconceived notions.

  18. Schema - Do you know the maximum row size for a table?

    A table's maximum row size should be less than the size of a single SQL Server data page (8,060 bytes). Otherwise, data entry forms can give errors if not validated correctly.

  19. Schema - Do you create primary key on your tables?

    When you specify a PRIMARY KEY constraint for a table, SQL Server enforces data uniqueness by creating a unique index for the primary key columns. This index also permits fast access to data when the primary key is used in queries.

    Although, strictly speaking, a primary key is not essential (you can update tables in SQL Enterprise Manager without one), we recommend that all tables have a primary key (except tables that have a high volume of continuous transactions). It especially helps when you have a client like Access, as it avoids problems.

  20. Schema - Do you create clustered index on your tables?

    ​You're allowed one clustered index per table, so unless you are never going to query a table, you may as well choose a field to be part of a clustered index. Basically,
    1. Every table should have a clustered index;
    2. The clustered index should be a unique clustered index where possible;
    3. The clustered index should be on a single column where possible;​

    So how do you choose the right field? Depending on the usage pattern of a table, clustered indices should be created. If sets of related records are regularly retrieved from a table in an application, a clustered index could dramatically improve performance.

    For example, in an Order to OrderDetails relationship with OrderID as the joining key, items in an order are regularly retrieved in a bundle. A clustered index on the OrderID column in OrderDetails table will improve the performance of the application significantly.

    Another example, if a table is frequently used for reporting, and a date range is used to define the time scope of the report, a clustered index on the date column is suitable. In more technical terms, if queries such as...

    SELECT * FROM ReportTable WHERE ItemDate BETWEEN '1/1/2003' AND '1/2/2003'

    ...are executed frequently, ItemDate is a good candidate column for a clustered index.
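    A sketch of the two examples above (index and table names are illustrative):

    -- Order items are usually fetched per order
    CREATE CLUSTERED INDEX IX_OrderDetails_OrderID ON dbo.OrderDetails (OrderID)

    -- A reporting table that is always filtered by a date range
    CREATE CLUSTERED INDEX IX_ReportTable_ItemDate ON dbo.ReportTable (ItemDate)

    Figure: Choosing the clustered index to match the usage pattern (sketch)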

  21. Schema - Do you use "smalldatetime datatype", where possible, instead of "datetime datatype"?

    ​Most applications do not require the range and precision offered by the DateTime data type. When was the last time you needed to enter an order past the year of 2079? So you end up with better data integrity. Most business applications never need dates outside the range of 1900-2079.​

    ​More Information:
    In addition (though I don't really care about this), I get a smaller database.
    DateTime type takes up 8 bytes. It can store dates ranging from January 1, 1753, to December 31, 9999, with time values rounded to increments of .000, .003, or .007 milliseconds.

    A SmallDateTime type takes up only 4 bytes; as a consequence, it can only store dates ranging from January 1, 1900, through June 6, 2079, with accuracy to the minute. With a million records, each with two date fields, you could save 8MB of storage space. More space could actually be saved if you have indexes on those columns. So that is about 1 cent worth today :-)

  22. Schema - Do you not use indexes on RowGuid column?

    ​​RowGuids (uniqueidentifier) are large fields (16 bytes) and are basically going to ALWAYS​ be unique.​

    SQL Server adds a RowGUID column to all tables if you are using Merge Replication (but doesn't add an index).​​

    RowGuids in general slow things down. Some people may consider using a RowGuid as their primary key. This is a bad idea because the index is going to be quite slow.... you are searching a large field. It goes without saying, NEVER have clustered index on a RowGuid column.​​

    ​Another little annoyance with RowGuids is when you are searching for one. You can't use > or < on a RowGuid column.

    ​Note: There are not many cases where a RowGuid should have an index on it. 


  23. Schema - Do you have a rowversion column?

    ​SQL Server rowversions are binary numbers that indicate the relative sequence in which data modifications took place in a database.​

    ​​All tables should have a rowversion column to aid concurrency checking. A rowversion improves update performance because only one column needs to be checked when performing a concurrency check (instead of checking all columns in a table for changes).​
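    A sketch of adding one to an existing table (table name is illustrative):

    -- The rowversion value is maintained automatically by SQL Server
    ALTER TABLE dbo.Product ADD Concurrency rowversion NOT NULL

    Figure: Adding a rowversion column for concurrency checking (sketch)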

  24. Schema - Do you use FillFactor of 90% for indexes and constraints?

    ​​​Indexes should generally have a fillfactor of 90%. If the amount of data stored in the database does not prohibit rebuilding indexes, a fillfactor of 90% should be maintained to increase the performance of inserts.​

    ​A table that expects a lot of insert operations could use a lower fillfactor.​​

  25. Schema - Do you use less than 24 characters for table names?

    ​​If a SQL Server table name is longer than 24 characters and is linked to an Access front-end, characters after the 24th will be truncated.​

  26. Schema - Do you always have version tracking tables?

    ​We always use two tables for tracking versioning information:

    • _zsDataVersion tracks the schema changes, and which update script we are up to. This helps tremendously in determining which version of the scripts are still required between development, test, and production databases.
    • _zsVersionLatest tracks which version the front-end client should be. This allows us to give a warning to (or even deny) users who are connecting to the database while not using the right version of the front-end client.

    Please see "Is a Back-end structural change going to be a hassle?" on our Rules to Successful Projects.​

  27. Schema - Do you validate each "Denormalized Field" with procValidate?

    90% of the databases that SSW works with make use of denormalized fields. We believe this is with good reason. However, several precautions should be taken to ensure that the data held within these fields is reliable. This is particularly the case when several applications are updating your denormalized data. To illustrate, let's say that we want to show all Customers with a calculated field totalling their order amount (ie Customer.OrderTotal).

    With this example in mind, the main reasons we use denormalized fields are:

     

    Reducing development complexity

    A denormalized field can mean that all SELECT queries in the database are simpler. Power users find it easier to use for reporting purposes - without the need for a cube. In our example, we would not need a large view to retrieve the data (as below).

    SELECT Customer.CustomerID,
        SUM(SalesOrderDetail.OrderQty * (SalesOrderDetail.UnitPrice - SalesOrderDetail.UnitPriceDiscount)) AS DetailTotal,
        Customer.SalesPersonID, Customer.TerritoryID, Customer.AccountNumber,
        Customer.CustomerType, Customer.ModifiedDate, Customer.rowguid
    FROM Customer
    INNER JOIN SalesOrderHeader ON Customer.CustomerID = SalesOrderHeader.CustomerID
    INNER JOIN SalesOrderDetail ON SalesOrderHeader.SalesOrderID = SalesOrderDetail.SalesOrderID
    GROUP BY Customer.CustomerID, Customer.SalesPersonID, Customer.TerritoryID, Customer.AccountNumber,
        Customer.CustomerType, Customer.ModifiedDate, Customer.rowguid
    ORDER BY Customer.CustomerID
    Figure: A view to get customer totals when no denormalized fields are used
    If we had a denormalized field, the user or developer would simply have to run the following query:
    SELECT Customer.CustomerID,
        Customer.OrderTotal AS DetailTotal
    FROM Customer
    ORDER BY Customer.CustomerID
    Figure: Queries are much simpler with denormalized fields

    Note that this is not a particularly complicated example. However, you can see why it can simplify development greatly when working with a large number of tables.

    Performance is better for read-intensive reports
    Particularly when reporting on the data with a cube.

    When there are multiple tables in a SQL Server view, they cannot be updated in one hit - they must be updated one table at a time.

    It is a built-in validation device
    For example, if records are accidentally deleted directly in the database, there is still a validation check for the correct totals. The value of this is mitigated when there is a full audit log on the database.

    However, there are reasons against using denormalized fields

    They have to be maintained and can potentially get out of synch

    This can make them unreliable - particularly if several applications are incorrectly updating the denormalized fields. UPDATEs, INSERTs and DELETEs are also more complicated, because they have to update the denormalized fields as well.

    They can be seen as an unnecessary waste of space

    All in all, we choose to still use denormalized fields because they can save development time. We do this with some provisos. In particular, they must be validated correctly to ensure the integrity of the data.

    Here is how we ensure that this data is validated:

    1. Change the description on any denormalized fields to include "Denormalized" - e.g. "Denormalized: Sum(OrderTotal) FROM Orders" in the Description property in Enterprise Manager
    2. Create a view that lists all the denormalized fields in the database - based on the description field.
      CREATE VIEW dbo.vwValidateDenormalizedFields
      AS
          SELECT OBJECT_NAME(id) AS TableName, 
              COL_NAME(id, smallid) AS ColumnName,
              CAST([value] AS VARCHAR(8000)) AS Description,
              'procValidate_' + OBJECT_NAME(id) + 
              '_' + COL_NAME(id, smallid) as
              ValidationProcedureName
          FROM dbo.sysproperties
          WHERE (name = 'MS_Description') AND 
                       (CAST([value] AS VARCHAR(8000))
                        LIKE '%Denormalized:%')
      
      Figure: Standard view for validation of a denormalized field
    3. Create a stored procedure (based on the above view) that validates whether all denormalized fields have a stored procedure that validates the data within them
      CREATE PROCEDURE procValidateDenormalizedFieldValidators
      AS
          SELECT 
              ValidationProcedureName AS
              MissingValidationProcedureName 
          FROM vwValidateDenormalizedFields
          WHERE ValidationProcedureName NOT IN
          (
              SELECT ValidationProcedureName
              FROM vwValidateDenormalizedFields AS vw
              LEFT JOIN sysobjects 
              ON 
                  vw.ValidationProcedureName = 
                  OBJECT_NAME(sysobjects.id)
              WHERE id IS NOT NULL
          )
      
      Figure: Standard stored procedure for validation of a denormalized field
    If you want to know how to implement denormalized fields, see our rules Do you use triggers for denormalized fields?
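    A sketch of one such validation procedure for the Customer.OrderTotal example (table, column and naming choices are illustrative):

    CREATE PROCEDURE procValidate_Customer_OrderTotal
    AS
        -- Returns any customers whose denormalized total is out of sync with the detail rows
        SELECT Customer.CustomerID,
            Customer.OrderTotal AS DenormalizedTotal,
            ISNULL(SUM(Orders.OrderTotal), 0) AS CalculatedTotal
        FROM Customer
        LEFT JOIN Orders ON Orders.CustomerID = Customer.CustomerID
        GROUP BY Customer.CustomerID, Customer.OrderTotal
        HAVING Customer.OrderTotal <> ISNULL(SUM(Orders.OrderTotal), 0)

    Figure: A procValidate for a denormalized field returns the rows that are out of sync (sketch only)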
  28. Schema - Do you avoid using user-schema separation?

    ​​User-schema separation is a new feature introduced in SQL 2005.

    In SQL 2000:

    • All objects are owned by users
    • If a user is deleted, all these objects must be deleted or have the owner reassigned
    • In script the naming convention is databaseName.ownerName.objectName
    • You need to update all scripts when a user changes.

    User-schema separation solves this problem by adding another level of naming, and shifting ownership of database objects to the schema, not the user. So, is it worth doing? Unless you are working with a very large database (100+ tables), the answer is "no". Most smaller databases have all objects with owner "dbo", which is fine in most cases.​

    SQLDatabases_UserSchema_Bad.jpg
    ​Figure: Bad Example - AdventureWorks using user schema - instead, keep it simple and avoid using user schema unnecessarily
    SQLDatabases_UserSchema_Good.jpg
    Figure: Good Example -​ Adventure works with user schema cleaned out (Good). Much simpler and more readable​​
  29. Schema - Do you use triggers for denormalized fields?

    ​I believe that de-normalised fields are not a bad thing. When used properly and sparingly, they can actually improve your application's performance. As an example:

    • I have an Orders table containing one record per order
    • I also have an OrderItems table which contains line items linked to the main OrderID, as well as subtotals for each line item
    • In my front end I have a report showing the total for each order​

    To generate this report, I can either:

    1. Calculate the Order total by summing up every single line item for the selected Order every time the report is loaded, or
    2. Store the Order subtotal as a de-normalised field in the Orders table which gets updated using triggers.

    The second option will save me an expensive JOIN query each time because I can just tack the denormalised field onto the end of my SELECT query.

    ​1. Code: Alter Orders table​

    ALTER TABLE Orders
    ADD SumOfOrderItems money NULL


    2. Code: Insert trigger

    Alter Trigger tri_SumOfOrderItems
    On dbo.OrderItems
    For Insert
    AS
    --Add the inserted line item totals to their parent orders
    --(set-based, so multi-row inserts across several orders are handled)
    UPDATE Orders
    SET Orders.SumOfOrderItems = Orders.SumOfOrderItems +
    (SELECT isnull(SUM(ItemValue),0) FROM inserted WHERE inserted.OrderID = Orders.OrderID)
    WHERE Orders.OrderID IN (SELECT OrderID FROM inserted)


    3. Code: Update trigger

    Alter Trigger tru_SumOfOrderItems
    On dbo.OrderItems
    For Update
    AS
    --Add the new values and subtract the old values for every affected order
    UPDATE Orders
    SET Orders.SumOfOrderItems = Orders.SumOfOrderItems
    + (SELECT isnull(SUM(ItemValue),0) FROM inserted WHERE inserted.OrderID = Orders.OrderID)
    - (SELECT isnull(SUM(ItemValue),0) FROM deleted WHERE deleted.OrderID = Orders.OrderID)
    WHERE Orders.OrderID IN (SELECT OrderID FROM inserted UNION SELECT OrderID FROM deleted)

    4. Code: Delete trigger

    Alter Trigger trd_SumOfOrderItems
    On dbo.OrderItems
    For Delete
    AS
    --Subtract the deleted line item totals from their parent orders
    UPDATE Orders
    SET Orders.SumOfOrderItems = Orders.SumOfOrderItems -
    (SELECT isnull(SUM(ItemValue),0) FROM deleted WHERE deleted.OrderID = Orders.OrderID)
    WHERE Orders.OrderID IN (SELECT OrderID FROM deleted)

    5. Code: Maintenance stored procedure

    --Stored Procedure for Maintenance
    Alter Procedure dt_Maintenance_SumOfItemValue
    As
    UPDATE Orders
    SET Orders.SumOfOrderItems = Isnull((SELECT SUM (ItemValue) FROM OrderItems WHERE OrderItems.OrderID = Orders.OrderID),0)

  30. Schema - Do you create a consistent primary key column on your tables?

    Make sure you create a consistent primary key column named [TableName]+"ID" on your tables.

    ​ Employee.EmployeeID

    Figure: Good example

    Employee.ID, Employee.Employee_Code, Employee.Employee​

    Figure: Bad example​​
  31. Schema - Do you use separate lookup tables rather than one large lookup table for your lookup data?

    ​Advantage: Simplifies ORM Mapping​​​

    We prefer multiple lookup tables so they make more sense in ORM tools. E.g. you could have either:

    1. OrderType

    Or

    2. LookupTable

    But when you are obtaining the OrderType for an order, you would have

    Either

    Order.OrderType.OrderTypeID (Good)

    Or

    Order.LookupTable.Value (Not great as it is not clear what the nature of the lookup table is). If you have multiple lookups to the one table, you would need to do your mappings manually rather than using a tool.

    Advantage: Maintains Complete Referential Integrity without the need for triggers

    The other advantage of having separate lookup tables rather than one large one is that referential integrity is maintained.

    One issue with having one large table is that you can still enter invalid values in the Order.OrderTypeID column. E.g. if OrderTypeIDs range from 1 to 3 and CustomerTypeIDs range from 4 to 10.

    If I put OrderTypeID = 10, then I will not get referential integrity errors (even though I should) because I have entered a value which exists in the lookup table (even though it is for the wrong type).

    If I want to enforce referential integrity so I can only enter the correct type for my lookup table, then I would need to resort to triggers or a (fallible) coded data tier.

    Advantage: You can add new columns specific to each lookup table

    For example, if a Lookup table (e.g. CustomerType) has an associated value (e.g. the field MaximumDebtAmount), we don't need to add a field that is irrelevant to all the other lookup tables. We can just add it to the individual lookup table.

    Disadvantage: Multiple tables make maintenance slightly more difficult, especially when making changes directly via Management Studio.

    It is simpler to Administer one table than multiple tables, but you can reduce this problem with a good Generic Administration Page UI.​
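    A sketch of the separate lookup table approach (assuming an Orders table with an OrderTypeID column - names follow the convention above):

    CREATE TABLE dbo.OrderType
    (
        OrderTypeID   int NOT NULL PRIMARY KEY,
        OrderTypeName varchar(100) NOT NULL
    )

    -- A real foreign key - no triggers or coded data tier needed to keep the data valid
    ALTER TABLE dbo.Orders
        ADD CONSTRAINT FK_Orders_OrderType
        FOREIGN KEY (OrderTypeID) REFERENCES dbo.OrderType (OrderTypeID)

    Figure: A separate lookup table gives full referential integrity (sketch only)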

  32. Schema - Do you avoid de-normalized fields with computed columns?

    We should always use computed columns (in SQL Server 2005 and later they can be persisted) to avoid these types of denormalized columns.​

    NormalizedFields_Bad.jpg
    Figure: Bad Example
    NormalizedFields_Good.jpg
    Figure: Good Example​

    Computed columns have some limitations - they cannot access fields in other tables, or other computed fields in the current table.

    We use user-defined functions (UDFs) to encapsulate our logic in reusable functions; this allows one computed column to call a function, which can in turn call another function.

    Use the suffix Computed to clearly distinguish that this field is a computed field.

    ALTER FUNCTION [dbo].[udfEmpTime_TimeTotalComputed]
    (
        @TimeStart as DateTime,
        @TimeEnd as DateTime
    )
    RETURNS DECIMAL(8,6)
    AS
    BEGIN
        -- This function returns the time difference in hours - decimal(8,6)
        RETURN (round(isnull(CONVERT([decimal](8,6),@TimeEnd - @TimeStart,(0))*(24),(0)),(2)))
    END

    Figure: This is the user defined function
    NormalizedFieldsDefine.jpg
    Figure: Sett​ing up computed column in table designer
  33. Schema - Do you add zs prefix to table name?

    Any table in a database that does not contain application data should be prefixed with zs. So when another application (e.g. SSW SQL Deploy) or the programmer populates the table, it should be prefixed with zs (e.g. zsDate - the program populates it; zsVersion - the programmer populates it).

  34. Views - Do you know to not have views as redundant objects?

    Don't have views as redundant objects, e.g. vwCustomers as SELECT * FROM Customers. This is unnecessary. Instead, views should generally be used for security.

  35. Stored Procedures - Do you return a value indicating the status?

    ​Make sure your stored procedures always return a value indicating the status. All stored procedures should return the error number (if an error) or a 0 to indicate no errors (ie success).​​

  36. Stored Procedures - Do you standardize on the return values of stored procedures for success and failures?

    Standardize on the return values of stored procedures for success and failures.​​

  37. Stored Procedures - Do you use OUTPUT parameters if you need to return the value of variables?

    The RETURN statement is meant for returning the execution status only, but not data. If you need to return value of variables, use OUTPUT parameters. There is a compelling reason for this - if you use return values rather than output values to return data, money values that you return will silently be truncated.​​

  38. Stored Procedures - Do you check the global variable @@ERROR after executing a data manipulation statement?

    Always check the global variable @@ERROR immediately after executing a data manipulation statement (like INSERT/UPDATE/DELETE), so that you can rollback the transaction in case of an error (@@ERROR will be greater than 0 in case of an error). This is important, because, by default, SQL Server will not rollback all the previous changes within a transaction if a particular statement fails. This behaviour can be changed by executing SET XACT_ABORT ON. The @@ROWCOUNT variable also plays an important role in determining how many rows were affected by a previous data manipulation (also, retrieval) statement, and based on that you could choose to commit or rollback a particular transaction.​​
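    A minimal sketch of this pattern inside a stored procedure (table and column names are illustrative):

    BEGIN TRANSACTION
    UPDATE Account SET Balance = Balance - 100 WHERE AccountID = 1
    IF @@ERROR <> 0
    BEGIN
        ROLLBACK TRANSACTION
        RETURN 1 -- FAILURE
    END
    UPDATE Account SET Balance = Balance + 100 WHERE AccountID = 2
    IF @@ERROR <> 0
    BEGIN
        ROLLBACK TRANSACTION
        RETURN 1 -- FAILURE
    END
    COMMIT TRANSACTION
    RETURN 0 -- SUCCESS

    Figure: Checking @@ERROR after each data manipulation statement and rolling back on failure (sketch)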

  39. Stored Procedures - Do you use SCOPE_IDENTITY() to get the most recent row identity?

    When inserting a row in a stored procedure, always use SCOPE_IDENTITY() if you want to get the ID of the row that was just inserted. A common error is to use @@IDENTITY, which returns the most recently created identity for your current connection, not necessarily the identity for the recently added row in a table. You could have a situation where there is a trigger that inserts a new record in a Logs Table, for example, when your Stored Procedure or INSERT SQL Statement inserts a record in the Orders Table. If you use @@IDENTITY to retrieve the identity of the new order, you will actually get the identity of the record added into the Log Table and not the Orders Table, which will create a nasty bug in your data access layer. To avoid the potential problems associated with someone adding a trigger later on, always use SCOPE_IDENTITY() to return the identity of the recently added row in your INSERT SQL Statement or Stored Procedure.

    ​Behold this example from SQL Server Books online.

    USE tempdb
    GO
    CREATE TABLE TZ (
    Z_id int IDENTITY(1,1)PRIMARY KEY,
    Z_name varchar(20) NOT NULL)
    INSERT TZ
    VALUES ('Lisa')
    INSERT TZ
    VALUES ('Mike')
    INSERT TZ
    VALUES ('Carla')
    SELECT * FROM TZ
    --Result set: This is how table TZ looks.
    Z_id Z_name
    -------------
    1 Lisa
    2 Mike
    3 Carla
    CREATE TABLE TY (
    Y_id int IDENTITY(100,5)PRIMARY KEY,
    Y_name varchar(20) NULL)
    INSERT TY (Y_name)
    VALUES ('boathouse')
    INSERT TY (Y_name)
    VALUES ('rocks')
    INSERT TY (Y_name)
    VALUES ('elevator')
    SELECT * FROM TY
    --Result set: This is how TY looks:
    Y_id Y_name
    ---------------
    100 boathouse
    105 rocks
    110 elevator
    /*Create the trigger that inserts a row in table TY
    when a row is inserted in table TZ*/
    CREATE TRIGGER Ztrig
    ON TZ
    FOR INSERT AS
    BEGIN
    INSERT TY VALUES ('')
    END
    /*FIRE the trigger and determine what identity values you obtain
    with the @@IDENTITY and SCOPE_IDENTITY functions.*/
    INSERT TZ VALUES ('Rosalie')
    SELECT SCOPE_IDENTITY() AS [SCOPE_IDENTITY]
    GO
    SELECT @@IDENTITY AS [@@IDENTITY]
    GO


    Notice the difference in the result sets. As you can see, it's crucial that you understand the difference between the 2 commands in order to get the correct ID of the row you just inserted.

    SCOPE_IDENTITY
    4
    /*SCOPE_IDENTITY returned the last identity value in the same scope. This was the insert on table TZ.*/
    @@IDENTITY
    115
    /*@@IDENTITY returned the last identity value inserted to TY by the trigger. This fired because of an earlier insert on TZ.*/​

  40. Stored Procedures - Do you SET NOCOUNT ON for production and NOCOUNT OFF for development/debugging purposes?

    You should use SET NOCOUNT ON for production and NOCOUNT OFF for development/debugging purposes (i.e. when you want the rowcounts to display as the messages from your T-SQL).

    According to SQL Server Books Online:
    "For stored procedures that contain several statements that do not return much actual data, this can provide a significant performance boost because network traffic is greatly reduced."

    Example: A procedure that returns a scalar value (a ClientID generated by an insert statement) should use the OUTPUT keyword (not RETURN) to pass back data. This is how you should return a generated ClientID from the procedure, and also return a status value:

    CREATE PROCEDURE procClientInsert
    /*
    '----------------------------------------------
    ' Copyright 2001 SSW
    ' www.ssw.com.au All Rights Reserved.
    ' VERSION AUTHOR DATE COMMENT
    ' 1.0 DDK 17/12/2001
    '
    'Calling example
    'DECLARE @pintClientID int
    'DECLARE @intReturnValue int
    'exec @intReturnValue = procClientInsert 'TEST Entry',
    '    @pintClientID OUTPUT
    'PRINT @pintClientID
    'PRINT @intReturnValue
    '----------------------------------------------
    */
    @pstrCoName varchar (254),
    @pintClientID int OUTPUT
    AS
    --IF ONE THING FAILS, ROLLBACK
    SET XACT_ABORT ON
    --THE COUNT WILL NOT NORMALLY DISPLAY IN AN APPLICATION IN PRODUCTION.
    --GET RID OF IT BECAUSE IT IS EXTRA TRAFFIC, AND CAN CAUSE
    --PROBLEMS WITH SOME CLIENTS
    SET NOCOUNT ON
    --Generate a random number
    SET @pintClientID = (SELECT CAST(RAND() * 100000000 AS int))
    INSERT INTO Client (ClientID, CoName)
    VALUES (@pintClientID, @pstrCoName)
    --Capture these immediately - the next statement (even a SET) resets them
    DECLARE @intError int, @intRowCount int
    SELECT @intError = @@ERROR, @intRowCount = @@ROWCOUNT
    SET XACT_ABORT OFF
    IF @intRowCount = 1
    RETURN 0 -- SUCCESS
    ELSE
    BEGIN
    IF @intError = 0
    RETURN 1 -- FAILURE
    ELSE
    RETURN @intError -- FAILURE
    END
    SET NOCOUNT OFF


    This procedure will display 0 or the error number to indicate success or failure. You should base your actions on this return code.

    This separates return values from actual data so that other programmers know what to expect.

    Note: If you are using SQL Server stored procedures to edit or delete data using a SqlDataAdapter, make sure that you do not use SET NOCOUNT ON in the stored procedure definition. This causes the rows affected count returned to be zero, which the DataAdapter interprets as a concurrency conflict. In this event, a DBConcurrencyException will be thrown.​

  41. Stored Procedures - Do you keep your Stored Procedures simple?

    If you are using the .NET Framework, put validation and defaults in the middle tier. The backend should have the required fields (Allow Nulls = False), but no complicated constraints. The following are examples that work with the Products table (with an added timestamp field called Concurrency) from Northwind.

    ​1. Code: Select Procedure​​​

    ALTER PROCEDURE dbo.ProductSelect
    @ProductID int
    AS
    SELECT ProductName, SupplierID, CategoryID, QuantityPerUnit, UnitPrice, UnitsInStock,
    UnitsOnOrder, ReorderLevel, Discontinued, Concurrency
    FROM Products
    WHERE (ProductID= @ProductID)

    2. ​​Code: Insert Procedure​

    ALTER PROCEDURE dbo.ProductInsert
    @ProductName nvarchar(40),
    @SupplierID int,
    @CategoryID int,
    @QuantityPerUnit nvarchar(20),
    @UnitPrice money,
    @UnitsInStock smallint,
    @UnitsOnOrder smallint,
    @ReorderLevel smallint,
    @Discontinued bit
    AS
    INSERT INTO Products (ProductName, SupplierID, CategoryID, QuantityPerUnit, UnitPrice,
    UnitsInStock, UnitsOnOrder, ReorderLevel, Discontinued)
    VALUES (@ProductName, @SupplierID, @CategoryID, @QuantityPerUnit, @UnitPrice, @UnitsInStock,
    @UnitsOnOrder, @ReorderLevel, @Discontinued)
    SELECT Scope_Identity() AS [SCOPE_IDENTITY] --If table has identity column
    --SELECT @@ROWCOUNT --If table doesn't have identity column
    -- Note: The middle tier must check the ROWCOUNT = 1

    3.Code: Update Procedure​​

    ALTER PROCEDURE dbo.ProductUpdate
    @ProductID int,
    @ProductName nvarchar(40),
    @SupplierID int,
    @CategoryID int,
    @QuantityPerUnit nvarchar(20),
    @UnitPrice money,
    @UnitsInStock smallint,
    @UnitsOnOrder smallint,
    @ReorderLevel smallint,
    @Discontinued bit,
    @Concurrency timestamp
    AS
    UPDATE Products
    SET ProductName = @ProductName,
    SupplierID = @SupplierID,
    CategoryID = @CategoryID,
    QuantityPerUnit = @QuantityPerUnit,
    UnitPrice = @UnitPrice,
    UnitsInStock = @UnitsInStock,
    UnitsOnOrder = @UnitsOnOrder,
    ReorderLevel = @ReorderLevel,
    Discontinued = @Discontinued
    WHERE (Concurrency = @Concurrency) AND (ProductID= @ProductID) --Note the double criteria to ensure concurrency
    SELECT @@ROWCOUNT
    -- Note: The middle tier must check the ROWCOUNT = 1

    4.Code: Delete Procedure​

    ALTER PROCEDURE dbo.ProductDelete
    @ProductID int,
    @Concurrency timestamp
    AS
    DELETE FROM Products
    WHERE (ProductID= @ProductID) AND (Concurrency = @Concurrency)
    --Note the double criteria to ensure concurrency
    SELECT @@ROWCOUNT
    --Note: The middle tier must check the ROWCOUNT = 1​​

  42. Stored Procedures - Do not start user stored procedures with system prefix "sp_" or "dt_"?

    System stored procedures are created and stored in the master database and have the sp_ prefix. System stored procedures can be executed from any database without having to qualify the stored procedure name fully using the database name master. It is strongly recommended that you do not create any stored procedures using sp_ as a prefix. SQL Server always looks for a stored procedure beginning with sp_ in this order:

    1. The stored procedure in the master database.
    2. The stored procedure based on any qualifiers provided (database name or owner).
    3. The stored procedure using dbo as the owner, if one is not specified.

    Therefore, although the user-created stored procedure prefixed with sp_ may exist in the current database, the master database is always checked first, even if the stored procedure is qualified with the database name.

    Important: If any user-created stored procedure has the same name as a system stored procedure, the user-created stored procedure will never be executed.

  43. Stored Procedures - Do you use company standard description in your stored procedures?

    All stored procedures must have the company standard description.
  44. Stored Procedures - Do you avoid using SELECT * when inserting data?

    Using a statement like "INSERT tableName SELECT * FROM otherTable" makes your stored procedures vulnerable to failure. Once either of the two tables changes, your stored procedure won't work. Not only that - when the inserting table has an identity column, such a statement will cause an error: "An explicit value for the identity column in table ParaRight can only be specified when a column list is used and IDENTITY_INSERT is ON."

    ​USE [ParaGreg]
    GO
    /****** Object: StoredProcedure [dbo].[procMove] Script Date: 08/08/2008 12:18:33 ******/
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    ALTER PROCEDURE [dbo].[procMove]
    @id AS Char,
    @direction AS INT
    AS
    IF @direction = 0
    BEGIN
    INSERT INTO ParaRight
    SELECT * FROM ParaLeft
    WHERE ParaID = @id
    DELETE FROM ParaLeft
    WHERE ParaID = @id
    END
    ELSE IF @direction = 1
    BEGIN
    INSERT INTO ParaLeft
    SELECT * FROM ParaRight
    WHERE ParaID = @id
    DELETE FROM ParaRight
    WHERE ParaID = @id
    END

    Bad example: Using SELECT * when inserting data. This stored procedure should also have an ELSE section to raise an error when neither condition is satisfied


    USE [ParaGreg]
    GO
    /****** Object: StoredProcedure [dbo].[procMove] Script Date: 08/08/2008 12:18:33 ******/
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    ALTER PROCEDURE [dbo].[procMove]
    @id AS Char,
    @direction AS INT
    AS
    IF @direction = 0
    BEGIN
    INSERT INTO ParaRight
    SELECT Col1,Col2 FROM ParaLeft
    WHERE ParaID = @id
    DELETE FROM ParaLeft
    WHERE ParaID = @id
    END
    ELSE IF @direction = 1
    BEGIN
    INSERT INTO ParaLeft
    SELECT Col1,Col2 FROM ParaRight
    WHERE ParaID = @id
    DELETE FROM ParaRight
    WHERE ParaID = @id
    END
    ELSE BEGIN RAISERROR('Please use a correct direction', 16, 1)
    END

    Good example: Using explicit column lists instead of * and providing an ELSE section to raise an error
  45. Stored Procedures - Do you use transactions for complicated stored procedures?

    A transaction is an atomic operation: either every operation within the transaction succeeds, or the transaction cancels them all and rolls the database back to its original state, leaving no dirty or inconsistent data behind. So if a stored procedure has many steps that depend on each other, it is strongly recommended that you wrap them in a transaction.

    ALTER PROCEDURE [dbo].[procInit]
    AS
    DELETE ParaLeft
    DELETE ParaRight
    INSERT INTO ParaLeft (ParaID)
    SELECT ParaID FROM Para

    Bad example: No transaction here - if any of the operations fails, the database will be left only partially updated

    ALTER PROCEDURE [dbo].[procInit]
    AS
    BEGIN TRANSACTION
    DELETE ParaLeft
    DELETE ParaRight
    INSERT INTO ParaLeft (ParaID)
    SELECT ParaID FROM Para
    COMMIT

    Good example: Using a transaction so that either all operations succeed or the database rolls back to its original state
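    Note that a plain BEGIN TRANSACTION ... COMMIT will not roll back every kind of failure by itself. On SQL Server 2005 and later you can combine the transaction with TRY...CATCH so that any error explicitly rolls back all of the steps - a minimal sketch based on the procInit example above:

    ALTER PROCEDURE [dbo].[procInit]
    AS
    BEGIN TRY
        BEGIN TRANSACTION
        DELETE ParaLeft
        DELETE ParaRight
        INSERT INTO ParaLeft (ParaID)
        SELECT ParaID FROM Para
        COMMIT
    END TRY
    BEGIN CATCH
        -- Undo any work done before the failure, then report the error to the caller
        IF @@TRANCOUNT > 0
            ROLLBACK
        DECLARE @ErrorMessage NVARCHAR(4000)
        SET @ErrorMessage = ERROR_MESSAGE()
        RAISERROR(@ErrorMessage, 16, 1)
    END CATCH

    Figure: Wrapping the transaction in TRY...CATCH makes the rollback explicit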
  46. Stored Procedures - Do you know SQL stored procedure names should be prefixed with the owner?

    ​Always specify the schema prefix when creating stored procedures. This way you know that it will always be dbo.procedure_name no matter who is logged in when it is created.

    There are 2 other benefits to including the schema prefix on all object references:

    1. It stops the database engine from having to check for the object under the caller's default schema first.
    2. It avoids the issue where multiple query plans are cached for exactly the same statement/batch just because it was executed by users with different default schemas.

    Aaron Bertrand agrees with this rule - My stored procedure "best practices" checklist.

    CREATE PROCEDURE procCustomer_Update @CustomerID INT, ….. BEGIN​

    Figure: Bad example​​

    ​​​CREATE PROCEDURE dbo.procCustomer_Update @CustomerID INT, ….. BEGIN

    Figure: Good example​​​​
  47. Relationships - Do you turn on referential integrity in relationships?

    Cascading referential integrity constraints allow you to define the actions SQL Server takes when a user attempts to delete or update a key to which existing foreign keys point. The REFERENCES clauses of the CREATE TABLE and ALTER TABLE statements support ON DELETE and ON UPDATE clauses:

    • [ ON DELETE { CASCADE | NO ACTION } ]
    • [ ON UPDATE { CASCADE | NO ACTION } ]

    NO ACTION is the default if ON DELETE or ON UPDATE is not specified.​​

    Relationships should always have referential integrity turned on. If you only turn it on after data has been added, you may already have data in your database that violates your referential integrity rules.

    ReferentialIntegrityCheck.jpg
    Figure: Recommended referential integrity constraints
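    For example, you can add a constraint with WITH CHECK so that SQL Server validates the rows already in the table as well as new ones - a sketch using illustrative Northwind-style names:

    ALTER TABLE dbo.Orders WITH CHECK
        ADD CONSTRAINT FK_Orders_Customers
        FOREIGN KEY (CustomerID)
        REFERENCES dbo.Customers (CustomerID)
        -- ON DELETE and ON UPDATE are omitted, so both default to NO ACTION

    Figure: Adding a foreign key with WITH CHECK validates existing data as well as new data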
  48. Relationships - Do you use ON UPDATE CASCADE clause when creating relationship?

    The ON UPDATE CASCADE feature of SQL Server 2000 and above can save you time in writing application and stored procedure code. We recommend that you take advantage of it. It is also more efficient than using triggers to perform these updates.
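    For example, the same relationship as in the previous rule can cascade key updates automatically:

    ALTER TABLE dbo.Orders
        ADD CONSTRAINT FK_Orders_Customers
        FOREIGN KEY (CustomerID)
        REFERENCES dbo.Customers (CustomerID)
        ON UPDATE CASCADE

    Figure: Changing a CustomerID in Customers now updates the matching Orders rows automatically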

  49. Relationships - Do you know to not use ON DELETE CASCADE clause when creating relationship?

    SQL Server's ON DELETE CASCADE functionality can be very dangerous. We recommend not using it. Imagine someone deletes a customer and all of that customer's orders are silently deleted with them. If you need to delete records in related tables, do it in application code, which gives you more control.
  50. Relationships - Do you use NOT FOR REPLICATION clause when creating relationship?

    When NOT FOR REPLICATION is used on a foreign key relationship, the integrity of the relationship is not checked while the Replication Agent is logged in and performing replication operations. This allows changes to the data (such as cascading updates) to be propagated correctly.
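    The clause goes at the end of the REFERENCES definition - again a sketch with illustrative names:

    ALTER TABLE dbo.Orders
        ADD CONSTRAINT FK_Orders_Customers
        FOREIGN KEY (CustomerID)
        REFERENCES dbo.Customers (CustomerID)
        ON UPDATE CASCADE
        NOT FOR REPLICATION

    Figure: The constraint is enforced for normal users but skipped for the Replication Agent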
  51. Relationships - Do you have FOREIGN KEY constraints on columns ending with ID?

    Columns ending with 'ID' should have FOREIGN KEY constraints​.
    NorthwindRelationships.jpg
    Figure: Missing relationships​
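    To find candidate columns that are missing a relationship, you can run a rough check against the INFORMATION_SCHEMA views. Treat the output as a starting point only - it will also list primary key columns such as Products.ProductID, which naturally have no foreign key of their own:

    SELECT c.TABLE_NAME, c.COLUMN_NAME
    FROM INFORMATION_SCHEMA.COLUMNS c
    INNER JOIN INFORMATION_SCHEMA.TABLES t
        ON t.TABLE_SCHEMA = c.TABLE_SCHEMA
        AND t.TABLE_NAME = c.TABLE_NAME
        AND t.TABLE_TYPE = 'BASE TABLE'
    WHERE c.COLUMN_NAME LIKE '%ID'
    AND NOT EXISTS
    (
        SELECT 1
        FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
        INNER JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE kcu
            ON kcu.CONSTRAINT_SCHEMA = tc.CONSTRAINT_SCHEMA
            AND kcu.CONSTRAINT_NAME = tc.CONSTRAINT_NAME
        WHERE tc.CONSTRAINT_TYPE = 'FOREIGN KEY'
        AND kcu.TABLE_NAME = c.TABLE_NAME
        AND kcu.COLUMN_NAME = c.COLUMN_NAME
    )
    ORDER BY c.TABLE_NAME, c.COLUMN_NAME

    Figure: A quick query to list columns ending in 'ID' that are not part of any FOREIGN KEY constraint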
  52. General - Do you know object name should not be a reserved word?

    SQL Server reserves certain keywords for its exclusive use. It is not legal to include the reserved keywords in a Transact-SQL statement in any location except that defined by SQL Server. No objects in the database should be given a name that matches a reserved keyword. If such a name exists, the object must always be referred to using delimited identifiers. Although this method does allow for objects whose names are reserved words, it is recommended that you do not name any database objects with a name that is the same as a reserved word. In addition, the SQL-92 standard implemented by Microsoft SQL Server defines a list of reserved keywords.​​​

    ​Avoid using SQL-92 reserved keywords for object names and identifiers, ie. User, Count, Group, etc. They can be used if joined with other words.
    What are reserved words for SQL Server 2000?
    Why avoid reserved words and spaces in object names?

  53. General - Do you know object name should not contain spaces?

    Spaces should be avoided. If an object name contains a space, it can cause problems later on for developers because the developer must remember to put the object name inside square brackets when referencing it.

    Why avoid reserved words and spaces in object names?

    ​We aim to never have to use square brackets in any of our databases.​​​​

  54. General - Do you know to not use "sp_rename" to rename objects?

    Do not use "sp_rename" to rename objects like stored procedures, views and triggers.​

    The object's name should always match the name used in the object's script (e.g. the CREATE script for stored procedures, views and triggers). The two get out of sync when an object is renamed with sp_rename, because sp_rename changes the object's name but not its stored script.
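    Instead, drop and re-create the object (or script it out with the new name and run ALTER) so the stored definition matches the new name. A sketch with illustrative names:

    -- Bad: the stored definition still says CREATE PROCEDURE procCustomer_Updte
    EXEC sp_rename 'procCustomer_Updte', 'procCustomer_Update'

    -- Good: re-create the procedure so the script and the object name agree
    DROP PROCEDURE dbo.procCustomer_Updte
    GO
    CREATE PROCEDURE dbo.procCustomer_Update
        @CustomerID INT,
        …..
    GO

    Figure: Re-creating the procedure keeps its name and its script consistent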
  55. General - Do you know object name should follow your company Naming Conventions?

    1. SQL Server Object Naming Standard - SSW's Standard for naming SQL Server Objects.
    2. SQL Server Stored Procedure Naming Standard - SSW's Standard for naming Stored Procedures.
    3. SQL Server Indexes Naming Standard - SSW's Standard for naming Indexes.
    4. SQL Server Relationship Naming Standard - SSW's Standard for naming Relationships.
    5. Use decreasing generality for table names ie. Client and ClientInvoice, then ClientInvoiceDetail.
    6. Don't use underscores, instead use upper and lower case ie. ClientInvoice is preferred over Client_Invoice.
    7. Table names should not use plurals ie. Client is preferred over Clients.
    8. Generally don't use abbreviations. But there are a few words that are so commonly used that they can be abbreviated. These are:
      • Quantity = Qty
      • Amount = Amt
      • Password = Pwd
    9. Prefix all Date fields with 'Date' ie. DateInvoiced. One extra use of this is you can have generic code that enables a Date control on this field.
    10. Suffix Percent fields with 'Pct' ie. SalesTaxPct.
    11. Only use alphabet characters, ie. don't use AustraliaListA$. If you use special characters or symbols in object names, you will need to constantly reference those ill-named objects with bracketed or quoted identifiers - otherwise, unintended bugs can arise. What characters and symbols should I avoid using when naming objects in Access and SQL Server databases
    12. Don't use reserved words on their own. ie. User, Count, Group, etc. They can be used if joined with other words. What are reserved words for SQL Server 2000?
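    As an illustration, a table that follows these conventions might look like the sketch below (the column list is made up, and it assumes a Client table already exists):

    CREATE TABLE dbo.ClientInvoice
    (
        ClientInvoiceID INT IDENTITY(1,1) NOT NULL
            CONSTRAINT PK_ClientInvoice PRIMARY KEY,
        ClientID INT NOT NULL
            CONSTRAINT FK_ClientInvoice_Client
            REFERENCES dbo.Client (ClientID),
        DateInvoiced DATETIME NOT NULL,
        SalesTaxPct DECIMAL(5, 2) NOT NULL,
        Qty INT NOT NULL,
        Amt MONEY NOT NULL
    )

    Figure: Table and column names that follow the conventions above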

  56. General - Do you know every object name should be owned by dbo?

    The reason is that you avoid broken ownership chain problems. Say Mary owns an object, Fred has permission to read it, Fred then creates a stored procedure over it and grants Tom permission to execute that procedure. Tom still can't use it, because the ownership chain between Fred's procedure and Mary's object is broken, so SQL Server also checks Tom's permissions on the underlying object.

    CREATE PROCEDURE [Adam Cogan].[Sales by Year]
    @Beginning_Date DateTime,
    @Ending_Date DateTime AS
    SELECT Orders.ShippedDate
    ,Orders.OrderID
    ,"vwOrderSubTotals".Subtotal
    ,DATENAME(yy,ShippedDate) AS Year
    FROM Orders
    INNER JOIN "vwOrderSubTotals"
    ON Orders.OrderID = "vwOrderSubTotals".OrderID
    WHERE Orders.ShippedDate Between @Beginning_Date And @Ending_Date

    Figure: Bad example

    CREATE PROCEDURE [dbo].[Sales by Year]
    @Beginning_Date DateTime,
    @Ending_Date DateTime AS
    SELECT Orders.ShippedDate
    ,Orders.OrderID
    ,"vwOrderSubTotals".Subtotal
    ,DATENAME(yy,ShippedDate) AS Year
    FROM Orders
    INNER JOIN "vwOrderSubTotals"
    ON Orders.OrderID = "vwOrderSubTotals".OrderID
    WHERE Orders.ShippedDate Between @Beginning_Date And @Ending_Date

    Figure: Good example


  57. General - Do you know the naming convention for use on database server test and production?

    Generally, every client should have a dev and a test database, so the dev database needs the postfix "Dev" and the test database needs the postfix "Test" (e.g. SSWCRMDev, SSWCRMTest). The production database does not need any postfix.

    BadDBName.gif
    Figure: Database with bad names
    GoodDBName.gif
    Figure: Database with standard names

  58. Middle Tier - Do you submit all dates to SQL Server in ISO format?

    All dates submitted to SQL Server must be in ISO format. This ensures that language or database settings do not interfere with inserts and updates of data. You should NEVER need to change the default language of users or of the database in SQL Server. For example, any insert into a SQL Server database from Visual Basic should call Format(ctlStartDate,"yyyy-mm-dd"), or in VB.NET CType(ctlStartDate.Text,Date).ToString("yyyy-MM-dd"), before attempting the insert or update. This ensures consistent treatment of dates in your SQL Server backend.

    SET DATEFORMAT mdy
    print convert( datetime, '2003-07-01' )
    -- returns Jul 1 2003 12:00AM
    print convert( datetime, '01/07/2003' )
    -- returns Jan 7 2003 12:00AM
    print convert( datetime, '20030701' )
    -- returns Jul 1 2003 12:00AM

    SET DATEFORMAT dmy
    print convert( datetime, '2003-07-01' )
    -- returns Jan 7 2003 12:00AM, opposite of above
    print convert( datetime, '01/07/2003' )
    -- returns Jul 1 2003 12:00AM, opposite of above
    print convert( datetime, '20030701' )
    -- returns Jul 1 2003 12:00AM, the only one that is the same as above

    Code - the unseparated format '20030701' is the only one interpreted consistently regardless of the DATEFORMAT setting


    The extended format can still mix up month & day in some circumstances, whereas the basic format is the only one that always works correctly.

    To be even more pedantic, when you include the time as well as the date, the value isn't really an ISO value at all! The ISO representation of a date/time would be '20030701T0958', whereas for SQL you should send it as '20030701 09:58'. This isn't even the extended ISO format as it is missing the obligatory "T" character (ref. section 5.4.1 of the standard).

    (The standard does allow the "T" to be omitted in applications where there is no risk of confusion, but it doesn't allow you to add a space or mix the basic date format with the extended time format.)

    So, if you want to be absolutely correct then it may be best to remove the reference to ISO, so that your rule works for date/time as well as just dates.

    The technical term used in the SQL help is "Unseparated String Format" (easily searched for).

    The help specifies that this format is unaffected by the SET DATEFORMAT command (which depends on any locale settings for SQL Server or the computer it is installed on).

    "The SET DATEFORMAT session setting does not apply to all-numeric date entries (numeric entries without separators). Six- or eight-digit strings are always interpreted as ymd."

    ​What is ISO format date? ​

  59. Middle Tier - Do you implement business logic in middle tier?

    Business logic/rules should be implemented in an object oriented language such as C# or VB.NET. This dramatically increases the adaptability, extensibility and maintainability of the application.

    Implementing business logic in stored procedures has the disadvantage of being hard to test, debug and evolve; stored procedures should therefore only implement basic data access logic.

    The exception is very heavy data-oriented operations, where it is excusable to carry out some of the logic in stored procedures for performance reasons.

    Triggers are even more difficult, as their behaviour is event based. It is okay to use triggers for non-functional/infrastructural features, such as logging changes or maintaining more complex relational integrity that cannot be enforced by a simple relationship.