Rules To Better SQL Databases - Developers

Here are some of the typical things that all SQL Server DBAs and database developers should know. These rules are above and beyond the most basic textbook recommendations of:

    • Ensuring your databases are normalized and in 3rd Normal Form
    • Making sure you have primary keys, foreign keys and simple indexes to improve performance
    • Making sure you back up regularly
    • Basic naming conventions (see some of our object naming conventions)
    • Minimizing result set sizes and data over the wire
            ​​​
            1. Data - Do you not allow Nulls in text fields?

              NULLs complicate your life. To avoid having to constantly differentiate between empty strings and NULLs, you should avoid storing NULLs if you can.

              Why? Well, what is wrong with this?​​​

              ​SELECT ContactName FROM Customer WHERE ContactName <> ''

              Figure: Selecting on empty string​

              Nothing, if your data is perfect. But if you allow Nulls in your database, then statements like this will give you unexpected results. To get it working, you would have to change the WHERE clause to:

              WHERE ContactName <> '' OR ContactName Is Null

              Figure: Allowing null strings makes queries more complex

              What about only allowing empty strings? Well, we choose to block Nulls because it is a lot easier to tick a checkbox in SQL Server Management Studio than it is to put a constraint on every field that disallows empty strings ('').


              Figure: Don't allow Nulls​

              However, you should always be aware that Nulls and empty strings are totally different, so if you absolutely have to have them, they should be used consistently. In the ANSI SQL-92 standard, an empty string ('') is never equated to Null, because empty string can be significant in certain applications. 

              Not allowing Nulls will give you the following benefits:

              • You don't have to enforce every text field with a CHECK constraint such as ([ContactName]<>'').
              • Your queries are simpler - no extra checking for NULLs and empty strings in your WHERE clauses and stored procedures.
              • SQL Server performs better when NULLs are not being used.
              • You don't have to deal with the pain in the middle tier of explicitly checking DBNull.Value - you can always use contactRow.ContactName == String.Empty. Database NULLs in the .NET Framework are represented as DBNull.Value, which cannot be implicitly cast to ANY other type, so if you allow NULLs in the ContactName field, that comparison will throw an exception. (See the sketch below.)
              • You avoid other nasty issues - a lot of controls in the .NET Framework have real problems binding to DBNull.Value, so you don't have to write custom controls to handle this small thing.
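
              For example, with NULLs allowed, middle-tier code reading ContactName needs a DBNull guard everywhere. Here is a minimal sketch (assuming an untyped ADO.NET DataRow named contactRow; the names are illustrative):

              // With NULLs allowed, every read needs a DBNull.Value guard
              string contactName = contactRow["ContactName"] == DBNull.Value
                  ? String.Empty
                  : (string)contactRow["ContactName"];

              // With NULLs blocked in the database, the cast is always safe
              string contactName2 = (string)contactRow["ContactName"];

              Figure: Blocking NULLs removes the DBNull.Value guards from the middle tier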

              For example, say you have Address1 and Address2 fields in your database. A NULL value in Address2 means you don't know what Address2 is, while an empty string means you know there is no data for Address2. You would have to use a checkbox on the UI to explicitly distinguish a NULL value from an empty string:

              ​Figure: A check box is required if you want to allow user to use Null value on the UI

              Some people are not going to like this rule, but this is how it works in Oracle and Access:

              • In Oracle, empty strings are turned into Nulls (which is basically what this rule is doing). Empty strings per se are not supported in Oracle (This is not ANSI compliant).
              • And talking of legacy systems :-) be aware that using Access as a data editor is a "No-No". Access turns empty strings into a Null.

              Finally, always listen to the client, Nulls have meaning over an empty string - there are exceptions where you might use them - but they are rare.

              So follow this rule, block Nulls where possible, update your NULLs with proper information as soon as possible, and keep data consistent and queries simple.

            2. Data - Do you not allow NULLs in number fields if they have the same meaning as zero?

              NULLs create difficulty in the middle tier because you need to add further handling. So avoid them where you can, e.g. for a Discount field, make the default 0 and don't allow NULLs.

              This rule should not be applied when a NULL value is valid data. Often, data such as a percent earnings rate on a super fund is nullable, because it may simply not be supplied or relevant. That is very different from it being zero: if you defaulted it to zero, you would have no way to distinguish real zero values from data that was never supplied. In such cases, the hit of doing the extra work in code is offset by the validity of query results.

              As a general rule of thumb, don't use NULL if you cannot distinguish it from another value.

              Q: What is the difference between NULL and 0 in the field "Discount"?
              A: No difference, so don't allow NULL.

              Q: What is the difference between NULL and 0 in the field "DailySugarIntake"?
              A: NULL ​means unknown and 0 means no daily sugar intake, so allow NULL.

              Note: NULLs are evil, but don't go crazy removing them. Never invent your own constant, e.g. -999, to represent a NULL.
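
              For the Discount example, a minimal sketch of blocking NULLs and defaulting to 0 (the OrderDetails table and constraint name are illustrative):

              -- Clean up any existing NULLs first
              UPDATE OrderDetails SET Discount = 0 WHERE Discount IS NULL

              -- Block NULLs, and default new rows to 0
              ALTER TABLE OrderDetails ALTER COLUMN Discount money NOT NULL

              ALTER TABLE OrderDetails ADD CONSTRAINT DF_OrderDetails_Discount DEFAULT 0 FOR Discount

              Figure: Discount defaults to 0 and never allows NULL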

            3. Data – Do you avoid spaces and empty lines at the start of character columns?

              Text in character columns (char, varchar, text, nchar, nvarchar, ntext) can start with spaces or empty lines, which is usually a data entry error.

              The best way to avoid this issue is to handle whitespace in the middle-tier before it reaches the database.​​

              Here’s an example of removing whitespace and carriage returns in the middle-tier using Regex:​

              using System.Text.RegularExpressions;

              static string Trim(string inputText)
              {
                // Match the first non-whitespace character
                // (\s covers tabs, spaces, line feeds and carriage returns)
                Match m = Regex.Match(inputText, @"[^\s]");

                // Return the text from that character onwards, removing leading whitespace
                return m.Success ? inputText.Substring(m.Index) : inputText;
              }

              ​​Figure: C# Removing whitespace and carriage returns in middle-tier

              The code above:

              • Uses Regular Expressions (Regex) to match the first non-whitespace character (includes tabs, spaces, line feeds and carriage returns).
              • Retrieves the index of the character
              • Returns the text from the character onwards, thus removing the whitespace at the start
              This code could be triggered in the middle-tier before inserting into the database.​

            4. Data - Do you use Identities in SQL Server?

              ​​​​​This one is going to be a controversial one. But the bottom line is every now and then you want to do something and then you curse and wish your database didn't have identities. So why use them? Let's look at the problems first:​​​

              Cons:
              • You can't manually change a Primary Key and let Cascade Update do its work, e.g. an InvoiceID
              • Hassles when importing data into related tables where you want to control the Primary Key, e.g. Order and Order Details
              • With replication you will get conflicts

              In Microsoft Access you have AutoNumbers and there is no way around these problems, so never use them.
              But in SQL Server you have identities and we have these procs:

              • DBCC CHECKIDENT - Checks the current identity value for the specified table and, if needed, corrects the identity value
              • SET IDENTITY_INSERT { table } { ON | OFF } - Allows explicit values to be inserted into the identity column of a table
              Pros:
              • Less programming - letting the database take care of it
              • Replication (identities are supported by SQL Server with ranges, so when you want replication, no coding)
              • Avoiding concurrency errors on high-INSERT systems - again, no coding

              So the only Con left is the importing of data but we can use one of the above procs to get around it. See grey box.

              The best way to import data into SQL Server (with Identities)​

              Using SQL Management Studio

              1. Right-Click your database to open the menu
              2. Navigate to Tasks | Import Data…  to open the wizard
              3. When selecting Source Tables and Views click on Edit Mappings…

              ​Figure: SQL Import Wizard - Edit Mappings

                 4. Ensure the Enable identity insert is checked

              Figure: SQL Import Wizard – Ensure Enable identity insert is checked.

              Alternatively, you can also enable and disable the identity insert through SQL with the following commands

              ​SET IDENTITY_INSERT Shippers ON --this will allow manual identity INSERTS on the requested table
               
              -- Modify the table here
               
              SET IDENTITY_INSERT Shippers OFF --as it can only be on for one table at a time​

              ​​More information on ​IDENTITY_INSERT
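
              For example, a complete explicit-identity insert against Northwind's Shippers table might look like this (the ShipperID value is illustrative; note the explicit column list, which is mandatory while IDENTITY_INSERT is ON):

              SET IDENTITY_INSERT dbo.Shippers ON

              INSERT INTO dbo.Shippers (ShipperID, CompanyName, Phone)
              VALUES (100, 'Speedy Express Interstate', '(503) 555-9831')

              SET IDENTITY_INSERT dbo.Shippers OFF

              Figure: Inserting a row with an explicit identity value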

              Automatic Identity Range Handling​

              The simplest way of handling identity ranges across replicas is to allow SQL Server ​to manage identity range handling for you. To use automatic identity range handling, you must first enable the feature at the time the publication is created, assign a set of initial Publisher and Subscriber identity range values, and then assign a threshold value that determines when a new identity range is created.
              For example, you might assign an identity range from 1000 through 2000 to the Publisher, a range from 2001 through 3000 to the initial Subscriber, a range from 3001 through 4000 to the next Subscriber, and so on.

            5. Data – Do you avoid deleting records by flagging them as IsDeleted (aka Soft Delete)?

              ​​​​When users are deleting a lot of records as part of normal operations - they can and do make mistakes. Instead of the painful process of having to go to a backup to get these records, why not simply flag the records as IsDeleted?​​

              Advant​ages
              • You do not have to delete all related records, e.g. Customers, Orders, Order Details. Instead, you can just flag the parent record as deleted with an "IsDeleted" bit field.
              • You do not lose historical data, e.g. how many products one of your previous clients purchased.
              • You can actually see who deleted the record, as your standard audit columns (e.g. DateUpdated, UserUpdated) are still there. The record does not just vanish.
              • It is simple to implement - particularly when using a code generator. For example, our code generator produces views and stored procedures, and all data access layer code. With all data access done through the data layer views, we simply had to add a filter to all views ("WHERE IsDeleted = 0"). Our autogenerated delete stored procedures simply set the "IsDeleted" column to true (see the sketch below).
              Disa​dvantages
              • Depending on your interface design, you may have to join to parent tables to ensure that deleted child records do not appear. Typically, the interface would be designed in such a way that you would not be able to create new records based on the deleted items (e.g. you cannot create a new order for a customer that is deleted). Performance of queries can potentially suffer if you have to do these joins.
              • While storage space is very cheap, you are not removing records from your database. You may need to archive records if the number of deleted records becomes large.
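
              Here is a minimal sketch of the pattern described above (Northwind-style names assumed):

              -- Add the flag
              ALTER TABLE dbo.Customers ADD IsDeleted bit NOT NULL DEFAULT 0
              GO

              -- All reads go through a view that hides "deleted" rows
              CREATE VIEW dbo.vwCustomers
              AS
                  SELECT * FROM dbo.Customers WHERE IsDeleted = 0
              GO

              -- The "delete" stored procedure only flags the record
              CREATE PROCEDURE dbo.CustomerDelete @CustomerID nchar(5)
              AS
                  UPDATE dbo.Customers SET IsDeleted = 1 WHERE CustomerID = @CustomerID
              GO

              Figure: A soft delete implementation - the row never actually leaves the table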

              Also see Using A​udit To​ols for alternatives to this approach using 3rd party auditing tools.​

              Watch William Liebenberg's SpendOps talk for more details about why soft deletes are advantageous in Azure:

            6. Data - Dates - Do you make sure you have valid date data in your database?

              SQL Server dates can range from the year 1753 (or 0001 for datetime2) up to the year 9999. However, certain date data in your database just wouldn't make any sense in the context of your business. For example, if your company started trading in 2015, you should not have any dates in your database before 2015 (unless you are tracking start dates of your clients, but that is an exception). An invoice date of 2013 wouldn't make sense at all.

              There are two methods to avoid this:

              • Using Validation Queries

              You can run validation queries to ensure no rubbish date data gets into your database.

              • Using Constraints

              Alternatively, you can use constraints to limit the date range from your own earliest specified date.

              Here's an example of implementing a date range constraint:

              CONSTRAINT chk_INVOICE_DATE CHECK (INVOICE_DATE > '2015-01-01')
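
              For the validation query approach, here is a sketch that flags out-of-range rows (assuming an Invoices table; any rows returned are suspect):

              SELECT InvoiceID, InvoiceDate
              FROM Invoices
              WHERE InvoiceDate < '2015-01-01'
                 OR InvoiceDate > GETDATE()

              Figure: A validation query to find rubbish date data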


            7. Data - Dates - Do you know DateTime fields must be converted to universal time?

              ​​​​Any DateTime fields must be converted to universal time from the application to the stored procedures when storing data into the database.

              We can simplify dealing with datetime conversions by using a date and time API such as Noda Time.

              Noda Time uses the concept of an Instant, representing a global point in time, which is first converted to UTC time and then to the user's local time when required for display.
              An Instant is the number of nanoseconds since January 1st, 1970. Using an Instant gives more granularity than DateTime because it uses nanoseconds rather than ticks (100 nanoseconds).

              //------ .NET DateTime Examples
              int year, month, day;
              int hour, minute, second;

              long nowInTicks = DateTime.Now.Ticks;             //      637158251390332189
              DateTime now = DateTime.Now;                    
              DateTime nowUtc = DateTime.UtcNow;
              DateTime date = new DateTime(2020, 1, 2);        //      2/01/2020 12:00:00 AM
              TimeSpan time = new TimeSpan(16, 20, 0);        //      16:20:00
              DateTime dateTime = date + time;                     //      2/01/2020 4:20:00 PM

              date = dateTime.Date;
              time = dateTime.TimeOfDay;
              year = date.Year;
              month = date.Month;
              day = date.Day;
              hour = time.Hours;
              minute = time.Minutes;
              second = time.Seconds;

              int startDate = (int)date.DayOfWeek;
              int target = (int)DayOfWeek.Wednesday;
              if (target <= startDate)
                  target += 7;
              DateTime nextWednesday = date.AddDays(target - startDate);     //     8/01/2020 12:00:00 AM

              startDate = (int)date.DayOfWeek;
              target = (int)DayOfWeek.Friday;
              if (target > startDate)
                  target -= 7;
              DateTime lastFriday = date.AddDays(-(startDate - target));         //     27/12/2019 12:00:00 AM

              TimeSpan t1 = TimeSpan.FromDays(1.0);
              TimeSpan t2 = TimeSpan.FromHours(1.0);

              int timespanCheck = TimeSpan.Compare(t1, t2);
              TimeSpan longestSpan;
              TimeSpan shortestSpan;
              if (timespanCheck > 0)
              {
                  longestSpan = t1;
                  shortestSpan = t2;
              }
              else if (timespanCheck < 0)
              {
                  shortestSpan = t1;
                  longestSpan = t2;
              }

              Figure: Bad Example - ​​Using .Net DateTime to manipulate dates and times.

              //------    Noda Time Examples
              int year, month, day;
              int hour, minute, second;

              Instant nowAsInstant = SystemClock.Instance.GetCurrentInstant(); //   2020-01-28T05:18:26Z

              DateTimeZone zone = DateTimeZoneProviders.Tzdb["Australia/Melbourne"];
              ZonedClock utcClock = SystemClock.Instance.InUtc();
              ZonedClock localClock = SystemClock.Instance.InZone(zone);
              LocalDate ntDate = new LocalDate(2020, 1, 2);   //      Thursday, 2 January 2020
              LocalTime ntTime = new LocalTime(16, 20);       //      4:20:00 PM
              LocalDateTime ntdateTime = ntDate.At(ntTime);   //      2/01/2020 4:20:00 PM

              ntdateTime.Deconstruct(out ntDate, out ntTime);
              ntDate.Deconstruct(out year, out month, out day);
              ntTime.Deconstruct(out hour, out minute, out second);

              LocalDate ntNextWednesday = ntDate.Next(IsoDayOfWeek.Wednesday); //    Wednesday, 8 January 2020
              LocalDate ntLastFriday = ntDate.Previous(IsoDayOfWeek.Friday);   //    Friday, 27 December 2019

              Duration d1 = Duration.FromDays(1);
              Duration d2 = Duration.FromHours(1);
              Duration longestDuration = Duration.Max(d1, d2);
              Duration shortestDuration = Duration.Min(d1, d2); 

              Figure: Good Example - Using Noda Time to manipulate dates and times.

              When retrieving data from the database it must be converted back to the local time of the user.

              That way you get an accurate representation of the time someone entered data into the database (i.e. the DateUpdated field).
              The exception to this rule, however, is for already existing databases that deal with DateTime as part of their queries.
              e.g. SSW Time PRO.NET is an application that allows employees to enter their timesheet. The table used for storing this information has an important field that has a DateTime data type.

              This cannot be converted to UTC in the database because that would mean:

              1. Converting every single entry since entries began being stored (in SSW's case since 1996) to keep information consistent;
              2. Other separate applications currently using the timesheet information in the database for reporting will also have to be entirely modified.

              Currently, there will be an issue if, for example, someone from the US (Pacific time) has a 19-hour difference between her local time and our servers.

              Example: Sally in the US enters a timesheet for 21/04/05 (which will default to a time of 12:00:00 AM since the time was not specified).
              Our servers will store it as 21/04/05 19:00:00, in other words 21/04/05 07:00:00 PM, because the .NET Framework will automatically convert the time accordingly for our Web Service.
              Therefore our servers have to take the Date component of the DateTime and add the Time component as 12:00:00 AM so it is stored in our local time format.

              [WebMethod]
              public double GetDateDifference(DateTime dateRemote)
              {
                  // .Date strips the time portion, so dateLocal.TimeOfDay is always zero;
                  // the difference is therefore the time-of-day of dateRemote, in hours
                  DateTime dateLocal = dateRemote.Date;
                  return (dateRemote.TimeOfDay.TotalHours - dateLocal.TimeOfDay.TotalHours);
              }

              Figure: When dateRemote is passed in from the remote machine, .Net Framework will have already converted it to the UTC equivalent for the local server (i.e. the necessary hours would have been added to cater for the local server time).

              In the above code snippet, the .Date property would cut off the Time portion of the DateTime variable and set the Time portion to "12:00:00 AM" as default.

              This is for applications we currently have that:

              1. Consider the DateTime component integral for the implementation of the application.
              2. That will be used world-wide.


            8. Data - Do you use temporal tables to audit data changes?

              ​​​In many cases, there are le​gal requirements to audit all updates to financial records. In other cases, you will want to be able to track and undo deletes to your database. With the use of Temporal tables, this becomes much easier to manage.

              ​Temporal tables​ were introduced in SQL Server 2016 and enhanced with increased features in SQL Server 2017.
              They offer the ability to record all the entity changes to a history table allowing the querying of the entity at a point in time.​

              ​​Pros:

              • You can query values of a specific entity at a particular point in time or time range over its lifetime.
              • Restore accidentally deleted records by retrieving them from the history table.
              • A retention period can be set on the history table; it can be as short as 1 day.
               

              Co​​ns:

              • History tables can grow very quickly in size.
              • Storing blob datatypes (nvarchar(max), varbinary(max), ntext and image) can increase storage costs and decrease performance.
              • You cannot TRUNCATE the table.
              • Temporal and history tables cannot be FILETABLEs.
              • Direct modification of the data in the history table is not permitted.
               

              How do I create a Temporal table?

               It's actually quite simple - here is a code snippet converting a table from the Northwind schema into a temporal table.

              CREATE TABLE dbo.Shippers
              (
                  [ShipperID] int IDENTITY(1, 1) NOT NULL,
                  [CompanyName] nvarchar(40) NOT NULL,
                  [Phone] nvarchar(24) NULL,
                  [SysStartTime] datetime2 GENERATED ALWAYS AS ROW START NOT NULL,
                  [SysEndTime] datetime2 GENERATED ALWAYS AS ROW END NOT NULL,
                  PERIOD FOR SYSTEM_TIME (SysStartTime, SysEndTime),
                  CONSTRAINT PK_Shippers PRIMARY KEY CLUSTERED
                  (
                      [ShipperID]
                  )
              )
              WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.ShippersHistory));

              Figure: Shippers table from the Northwind schema converted to a temporal table.​

              Figure: New temporal table shown in SQL Management Studio.

              -- Update the tables history data retention
              ALTER TABLE dbo.Shippers
              SET (SYSTEM_VERSIONING = ON (HISTORY_RETENTION_PERIOD = 7 YEARS));​

              Figure: Code snippet for updating data retention.
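
              Querying the temporal table is where the feature pays off. Here is a sketch of point-in-time queries against the Shippers table above (the dates and IDs are illustrative):

              -- The row as it was at a point in time
              SELECT * FROM dbo.Shippers
                  FOR SYSTEM_TIME AS OF '2020-01-01T00:00:00'
              WHERE ShipperID = 1

              -- Every version the row has been through
              SELECT * FROM dbo.Shippers
                  FOR SYSTEM_TIME ALL
              WHERE ShipperID = 1
              ORDER BY SysStartTime

              Figure: Querying an entity at a point in time, and over its lifetime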
              Some alternative solutions are:
              1. Manually adding triggers on all database tables to log every change
              2. Having the business objects or stored procedures write to two tables: the main table (e.g. Customer) and an audit table (e.g. CustomerAudit)
              3. Using a logging utility to audit database changes

              This means that you can devote your development time to areas other than auditing. Also, unlike utilities which use triggers (such as ApexSQL Audit), there is no performance overhead, because this approach relies upon log files already created by SQL Server. If required, you can export the log information to SQL Server so you can perform advanced queries on it. It even allows you to recover previously deleted tables.

            9. Data - Do you avoid invalid characters in object identifiers?

              We believe it is not good to use invalid characters (mostly symbol characters, like ",;"\/(", etc.) in object identifiers. Though it is legal, such names are easily confused and will probably cause errors when running scripts against these objects.

            10. Do you have a general Contact Detail table?

              It is common to have a Contact Detail table to store your contact information such as phone numbers. Below is an example of a Contact Detail table and its related tables. This is bad because the PartyPhone table is too specific to phone numbers - you would have to add a new table to store emails or any other contact information needed in the future.

              Figure: Bad Example - A too specific Contact Detail table

              We normally have a general Contact Detail table that includes all the different categories of phone numbers (whether shared or primary) plus emails, all in the same table.

              Figure: Good Example - A general Contact Detail table

              We use a Contact Detail Category table to store these categories.

              Figure: Good Example - Details of Contact Detail Category table
            11. Data - Do you use a URL instead of an image in your database?

              We recommend that you store a URL instead of an image in your database. This will:

              • Avoid the size of your database increasing too quickly (which may bring a series of problems for performance, logs, disk space, etc.)
              • Make it easy to validate and change the image.



            12. Schema - Do you only use Unicode datatypes (nchar, nvarchar and ntext) in special circumstances?

              Columns defined using the nchar and nvarchar datatypes can store any character defined by the Unicode Standard, which includes all of the characters defined in the various English and non-English character sets. These datatypes take twice as much storage space per character as non-Unicode datatypes.

              It is not the disk space cost that is the concern - it is the 8,060-byte row size limit. Please refer to Maximum Capacity Specifications for SQL Server for details.

              If your database stores only English characters, this is a waste of space. Don't use Unicode double-byte datatypes such as nchar and nvarchar unless you are building multilingual applications.

              If you need to store more than 8,000 characters (4,000 for Unicode), use varchar(max) or nvarchar(max).

            13. Schema - Do you always use Varchar?

              ​​​Use VARCHAR instead of CHAR, unless your data is almost always of a fixed length, or is very short. For example, a Social Security/Tax File number which is always 9 characters. These situations are rare. 

              SQL Server fits a whole row on a single page, and will never try to save space by splitting a row across two pages. 

              Running DBCC SHOWCONTIG against tables shows that a table with variable-length columns takes up fewer pages of storage space to store rows of data.

              The general rule is: the shorter the row length, the more rows you will fit on a page, and the smaller the table will be.

              This saves disk space, and it means that any retrieval operation, such as SELECT COUNT(*) FROM..., runs much quicker against the smaller table.

            14. Schema - Do you have standard Tables and Columns?

              1. All tables should have the following fields:

              • CreatedUtc: datetime2, Allow Nulls=False, Default=GETUTCDATE()
              • CreatedUserId: Foreign Key to Users table, Allow Nulls=False
              • ModifiedUtc: datetime2, Allow Nulls=False, Default=GETUTCDATE()
              • ModifiedUserId: Foreign Key to Users table, Allow Nulls=False
              • Concurrency: rowversion, Allow Nulls=False

              ​The first three are examples of bad table records. The last one is an example of how this table structure should be entered.

              Figure: 3 bad examples and 1 good example of Row auditing

              Note #1: Never set the CreatedUtc field - instead use a default GETUTCDATE()
              Note #2: These fields offer basic row auditing that will cover the majority of applications. When an application has specific auditing requirements, they should be analysed to see if this approach is sufficient.
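
              Putting it together, here is a minimal sketch of a table with these standard columns (it assumes a Users table with a UserId primary key):

              CREATE TABLE dbo.Product
              (
                  ProductId int IDENTITY(1, 1) NOT NULL PRIMARY KEY,
                  ProductName nvarchar(100) NOT NULL,
                  CreatedUtc datetime2 NOT NULL CONSTRAINT DF_Product_CreatedUtc DEFAULT GETUTCDATE(),
                  CreatedUserId int NOT NULL CONSTRAINT FK_Product_CreatedUserId REFERENCES dbo.Users (UserId),
                  ModifiedUtc datetime2 NOT NULL CONSTRAINT DF_Product_ModifiedUtc DEFAULT GETUTCDATE(),
                  ModifiedUserId int NOT NULL CONSTRAINT FK_Product_ModifiedUserId REFERENCES dbo.Users (UserId),
                  Concurrency rowversion NOT NULL
              )

              Figure: A table with the standard audit columns in place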

              2. All databases should have a table with one record to store application Defaults. This table should be called 'Control'.
              If the settings are not application-wide but just for that user, then an XML file (do not use an INI file) for simple stuff might be better. Examples are saving the 'User' for logon, the 'Select Date Range' for a report, form positions, etc.

              .NET programs have an Application.Configuration which exports to XML file (app.config) automatically. It works very well, and deployment is very simple. It's integrated right into the Visual Studio.NET designer as well.

              3. All databases should have a version table to record structural changes to tables. See SSW Rules to Better Code​
               
              4. Lookup tables that have just two columns should be consistent and follow this convention: CategoryId (int) and CategoryName (varchar(100)).

              The benefit is that a generic lookup form can be used - you just need to pass the generic lookup form the TableName, Column1 and Column2.

              Note #3: The problem with this naming is that the primary keys don't match.
              Note #4: The benefit of character primary key columns is that queries and query strings have meaning, e.g. from the URL http://www.ssw.com.au/ssw/Download/Download.aspx?GroupCategoryID=5BUS you can guess that it is in the business category.

            15. Schema - Do you use Bit/Numeric data type correctly?

              ​​​​1. Bit data type

              Bit data from 0 to 1 (2 values only). Storage size is 1 byte.

              Columns of type bit cannot have indexes on them.​

              Columns of type bit should be prefixed with "Is" or "Should", i.e. IsInvoiceSent (y/n) or ShouldInvoiceBeSent (y/n), so you can easily tell which way the boolean is directed. [more information on naming conventions]

              That being said, fields of this type should generally be avoided, because often a field like this can contain a date instead, i.e. DateInvoiceSent (Date/Time) is preferred over InvoiceSent (y/n). If a date is inappropriate, then we still recommend an int field over a bit field anyway, because bits are a pain in the butt :-)

              2. ​Tinyint data type

              Integer data from 0 through 255. Storage size is 1 byte.

              3. Smallint data type

              Integer data from -2^15 (-32,768) through 2^15-1 (32,767). Storage size is 2 bytes.

              4. Int data type

              Integer (whole number) data from -2^31 (-2,147,483,648) through 2^31-1 (2,147,483,647). Storage size is 4 bytes. The SQL-92 synonym for int is integer.

              5. Bigint data type

              Integer (whole number) data from -2^63 (-9223372036854775808) through 2^63-1 (9223372036854775807). Storage size is 8 bytes.​​

              Recommended:

              • Use smallint datatype instead of bit datatype - so it can be indexed;
              • Use int datatype, where possible, instead of bigint datatype - for saving disk space;
              • Use smallint datatype, where possible, instead of int datatype - for saving disk space;
              • Use tinyint datatype, where possible, instead of smallint datatype - for saving disk space;
            16. Schema - Do you use Natural or Surrogate primary keys?

              Now, this is a controversial one. Which one do you use?

              1. A "Natural" (or "Intelligent") key is actual data
                • Surname, FirstName, DateOfBirth
              2. An "Acquired Surrogate" (or "Artificial​" or "System Generated") key is NOT derived from data eg. Autonumber
                • eg. ClientID 1234
                • eg. ClientID JSKDYF
                • eg. ReceiptID 1234
              3. A "Derived Surrogate" (or "User Provided") key is indirectly derived from data eg. Autonumber
                • eg. ClientID SSW (for SSW)
                • eg. EmpID AJC (for Adam Jon Cogan)
                • eg. ProdID CA (for Code Auditor)
              4. A "GUID" key automatically generated by SQL Server​

              The problems with Natural Keys:

              • Because they have a business meaning, if that meaning changes (eg. they change their surname), then that value NEEDS to change. Changing a value with data is a little hard - but a lot easier with Cascade Update.
              • The main problem is that the key is large and composite, and it needs to be used in all joins

              The Problem with Acquired Surrogate Keys:

              • A surrogate key has no meaning to a user
              • It always requires a join when browsing a child table eg. The InvoiceDetail table

              The Problem with Derived Surrogate

              • The user needs to enter a unique value
              • Because they have a business meaning, if that meaning changes (eg. they change their company name), then that value MAY NEED to change. Changing a value with data is a little hard - but a lot easier with Cascade Update
              • More likely to have a problem with Merge Replication

              The Problem with GUID key

              We like GUID keys. However, GUID generation produces essentially random numbers which cannot realistically be used as primary keys since new rows are inserted into the table at random positions leading to extremely slow inserts as the table grows to a moderate size. Inserting into the middle of a table with a clustered index, rather than appending to the end can potentially cause the database to have to move large portions of the data to accommodate space for the insert. This can be very slow.

              Recommendations

              1. We do not use Natural keys ever
              2. We use Acquired Surrogate for some tables
                • eg. Invoice table
                • eg. Receipt table
              3. A combination of Acquired Surrogate and Derived Surrogate for other tables
                • eg. Customer table
                • eg. Employee table
                • eg. Product table
              We say "combination" because if the user doesn't enter a value, then we put a random value in (generated by a middle-tier function, so it works with Access or SQL), e.g. ClientID JSKDYF.

              The user can then change the value to anything else and we validate it is not used, and then perform a cascade update - or if it is more than 3 levels deep, we execute a stored proc. Unfortunately, this is a complicated proc that cycles through all related tables and performs an UPDATE. Here is an example.

              The Derived Surrogate has the benefit of being easy for people to remember and can be used in the interface or even the query string

              Over the years, experience has led me to the opinion that the natural vs surrogate key argument comes down to a style issue. If a client or employer has a standard one way or another, fine - use it. If not, use whichever method you prefer, recognizing that there may be some annoyances you face down the road. But don't let somebody criticize you because your style doesn't fit his preconceived notions.

            17. Schema - Do you create primary key on your tables?

              ​​When you specify a PRIMARY KEY constraint for a table, SQL Server enforces data uniqueness by creating a unique index for the primary key columns. This index also permits fast access to data when the primary key is used in queries.

              Although, strictly speaking, the primary key is not essential - we recommend all tables have a primary key (except tables that have a high volume of continuous transactions). 

              Figure: Bad Example - Table missing primary key
              Figure: Good Example - Table with primary key

              Legacy:

              Especially when you have a client like Access, a primary key helps you avoid problems.
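
              Adding a primary key is a one-liner. Here is a minimal sketch (using the Shippers table from earlier, assuming it was created without one):

              ALTER TABLE dbo.Shippers
              ADD CONSTRAINT PK_Shippers PRIMARY KEY CLUSTERED (ShipperID)

              Figure: Adding a primary key to an existing table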

            18. Schema - Do you create clustered index on your tables?

              ​​You're allowed one clustered index per table, so unless you are never going to query a table, you may as well choose a field to be part of a clustered index. Basically,
              1. Every table should have a clustered index;
              2. The clustered index should be a unique clustered index where possible;
              3. The clustered index should be on a single column where possible;​

              So how do you choose the right field? Depending on the usage pattern of a table, clustered indices should be created. If sets of related records are regularly retrieved from a table in an application, a clustered index could dramatically improve performance.

              For example, in an Order to OrderDetails relationship with OrderID as the joining key, items in an order are regularly retrieved in a bundle. A clustered index on the OrderID column in the OrderDetails table will improve the performance of the application significantly.
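
              Here is a sketch of that index (assuming the primary key of OrderDetails is not already the clustered index, since a table can only have one):

              CREATE CLUSTERED INDEX IX_OrderDetails_OrderID
              ON dbo.OrderDetails (OrderID)

              Figure: A clustered index on the joining key of the child table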

              Another example: if a table is frequently used for reporting, and a date range is used to define the time scope of the report, a clustered index on the date column is suitable. In more technical terms, if queries such as...

              SELECT * FROM ReportTable WHERE ItemDate BETWEEN '2003-01-01' AND '2003-02-01'

              ...are executed frequently, ItemDate is a good candidate column for a clustered index.

            19. Schema – Do you avoid using indexes on RowGuid column?

              ​​RowGuids (uniqueidentifier) are large fields (16 bytes) and are basically going to ALWAYS​ be unique.​

              SQL Server adds a RowGUID column to all tables if you are using Merge Replication (but doesn't add an index).​​

              RowGuids in general slow things down. Some people may consider using a RowGuid as their primary key. This is a bad idea because the index is going to be quite slow - you are searching a large field. It goes without saying: NEVER have a clustered index on a RowGuid column.

              ​Another little annoyance with RowGuids is when you are searching for one. You can't use > or < on a RowGuid column.

              ​Note: There are not many cases where a RowGuid should have an index on it. 


            20. Schema - Do you have a rowversion column?

              The SQL Server rowversion data type holds binary numbers that indicate the relative sequence in which data modifications took place in a database. See the MSDN article on rowversions here: rowversion (Transact-SQL)

              ​​All tables should have a rowversion column called "RecordVersion" to aid concurrency checking. A rowversion improves update performance because only one column needs to be checked when performing a concurrency check (instead of checking all columns in a table for changes).​​

              Figure: Bad Example - No rowversion available in this table​


              ​CREATE TABLE MyTest (myKey int PRIMARY KEY 
                  ,myValue int, RecordVersion rowversion); 
              GO
               
              INSERT INTO MyTest (myKey, myValue) VALUES (1, 0);  
              INSERT INTO MyTest (myKey, myValue) VALUES (2, 0); 
              INSERT INTO MyTest (myKey, myValue) VALUES (3, 0); 
              UPDATE MyTest SET myValue = 1 WHERE myKey = 2
               
              SELECT * FROM MyTest ORDER BY RecordVersion DESC

              ​​Figure: Good Example - A create statement which builds a table with a rowversion


              ​Figure: Good Example - A set of records with a rowversion available
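
              Here is a sketch of the concurrency check itself, using the MyTest table above:

              DECLARE @RecordVersion binary(8)

              -- 1. Read the row (the client holds on to RecordVersion)
              SELECT @RecordVersion = RecordVersion FROM MyTest WHERE myKey = 2

              -- 2. Update only if nobody has changed the row since the read
              UPDATE MyTest
              SET myValue = 2
              WHERE myKey = 2 AND RecordVersion = @RecordVersion

              IF @@ROWCOUNT = 0
                  RAISERROR('Concurrency conflict - the record was modified by another user.', 16, 1)

              Figure: Only one column needs to be compared to detect a conflicting update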


            21. Schema - Do you use FillFactor of 90% for indexes and constraints?

              ​​​Indexes should generally have a fillfactor of 90%. If the amount of data stored in the database does not prohibit rebuilding indexes, a fillfactor of 90% should be maintained to increase the performance of inserts.​

              ​A table that expects a lot of insert operations could use a lower fillfactor.​​
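
              Here is a sketch of applying this when rebuilding an index (the index and table names are illustrative):

              ALTER INDEX PK_Shippers ON dbo.Shippers
              REBUILD WITH (FILLFACTOR = 90)

              Figure: Rebuilding an index with a 90% fillfactor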

            22. Schema - Do you always have version tracking tables?

              ​We always use two tables for tracking versioning information:

              • _zsDataVersion tracks the schema changes, and which update script we are up to. This helps tremendously in determining which version of the scripts are still required between development, test, and production databases.
              • _zsVersionLatest tracks which version the front-end client should be. This allows us to give a warning to (or even deny) users who are connecting to the database while not using the right version of the front-end client.

              Please see "Is a Back-end structural change going to be a hassle?" on our Rules to Successful Projects.​

            23. Schema - Do you use computed columns rather than denormalized fields?

              When you have a denormalized field, use a computed column.  In SQL Server ​they can be persisted.

              Use the suffix "Computed" to clearly distinguish that this field is a computed field.


              Figure: Bad Example - This field was manually​ updated from code in the middle tier.
              Figure: Good Example​ - There was no code in the middle tier to calculate this (and it has the correct name)

              Computed columns have some limitations - they cannot access fields in other tables, or other computed fields in the current table.

              You can put the code in a reusable user-defined function (UDF); this allows one computed column to use a function which in turn calls another function. Here is an example:

              ALTER FUNCTION [dbo].[udfEmpTime_TimeTotalComputed]
              (
                  @TimeStart as DateTime,
                  @TimeEnd as DateTime
              )
              RETURNS DECIMAL(8,6)
              AS
              BEGIN
                  -- This function returns the time difference in hours - decimal(8,6)
                  RETURN (round(isnull(CONVERT([decimal](8,6), @TimeEnd - @TimeStart, (0)) * (24), (0)), (2)))
              END

              Figure: This is the user defined function
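
              The computed column can then simply call the function. Here is a sketch (assuming an EmpTime table with TimeStart and TimeEnd columns; add PERSISTED only if the function is deterministic and schema-bound):

              ALTER TABLE dbo.EmpTime
              ADD TimeTotalComputed AS dbo.udfEmpTime_TimeTotalComputed(TimeStart, TimeEnd)

              Figure: A computed column defined via the UDF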
              Figure: Sett​ing up a computed column in the table designer​


            24. Schema - Do you use triggers for denormalized fields?

              Ideally you should be using computed columns as per https://rules.ssw.com.au/use-computed-columns-rather-than-denormalized-fields​

              You can also have a denormalized field that is manually updated. This should be the exception and not the rule. When used properly and sparingly, manually-updated denormalized fields can actually improve your application's performance. As an example:

              • You have an Orders table containing one record per order
              • You also have an OrderItems table which contains line items linked to the main OrderID, as well as subtotals for each line item
              • In your front end,​ you have a report showing the total for each order​

              To generate this report, you can either:

              1. Calculate the Order total by summing up every single line item for the selected Order every time the report is loaded, or
              2. Store the Order subtotal as a denormalized field in the Orders table which gets updated using triggers.

              The second option saves you an expensive JOIN query each time, because you can just tack the denormalized field onto the end of your SELECT query.

              ​1. Code: Alter Orders table​

              ALTER TABLE Orders
              ADD SumOfOrderItems money NULL


              2. Code: Insert trigger

              Alter Trigger tri_SumOfOrderItems
              On dbo.OrderItems
              For Insert
              AS
              DECLARE @OrderID varchar (5)
              SELECT @OrderID = OrderID FROM inserted -- Note: assumes the INSERT affects a single order
              UPDATE Orders
              SET Orders.SumOfOrderItems = Orders.SumOfOrderItems +
              (SELECT isnull(SUM(ItemValue),0) FROM inserted WHERE inserted.OrderID = Orders.OrderID)
              WHERE Orders.OrderID = @OrderID


              3. Code: Update trigger

              Alter Trigger tru_SumOfOrderItems
              On dbo.OrderItems
              For Update
              AS
              DECLARE @OrderID varchar (5)
              SELECT @OrderID = OrderID from deleted
              --Could have used inserted table
              UPDATE Orders
              SET Orders.SumOfOrderItems = Orders.SumOfOrderItems
              + (SELECT isnull(SUM(ItemValue),0) FROM inserted WHERE inserted.OrderID = Orders.OrderID)
              - (SELECT isnull(SUM(ItemValue),0) FROM deleted WHERE deleted.OrderID = Orders.OrderID)
              WHERE Orders.OrderID = @OrderID

              4. Code: Delete trigger

              Alter Trigger trd_SumOfOrderItems
              On dbo.OrderItems
              For Delete
              AS
              DECLARE @OrderID varchar (5)
              SELECT @OrderID = OrderID FROM deleted
              UPDATE Orders
              SET Orders.SumOfOrderItems = Orders.SumOfOrderItems -
              (SELECT isnull(SUM(ItemValue),0) FROM deleted WHERE deleted.OrderID = Orders.OrderID)
              WHERE Orders.OrderID = @OrderID

              5. Code: Maintenance stored procedure

              --Stored Procedure for Maintenance
              Alter Procedure dt_Maintenance_SumOfItemValue
              As
              UPDATE Orders
              SET Orders.SumOfOrderItems = Isnull((SELECT SUM (ItemValue) FROM OrderItems WHERE OrderItems.OrderID = Orders.OrderID),0)

            25. Schema - Do you validate each "Denormalized Field" with procValidate?

              Ideally you should be using computed columns as per https://rules.ssw.com.au/use-computed-columns-rather-than-denormalized-fields​​​

              Many of the databases that SSW works with make use of denormalized fields. We believe this is with good reason. However, several precautions should be taken to ensure that the data held within these fields is reliable. This is particularly the case when several applications are updating your denormalized data. To illustrate, let's say that we want to show all Customers with a calculated field totalling their order amounts (i.e. Customer.OrderTotal).

              With this example in mind, the main reasons we use denormalized fields are:

               

              Reducing development complexity

              A denormalized field can mean that all SELECT queries in the database are simpler. Power users find it easier to use for reporting purposes - without the need for a cube. In our example, we would not need a large view to retrieve the data (as below).

              SELECT Customer.CustomerID,
                  SUM(SalesOrderDetail.OrderQty * (SalesOrderDetail.UnitPrice - SalesOrderDetail.UnitPriceDiscount)) AS DetailTotal,
                  Customer.SalesPersonID, Customer.TerritoryID, Customer.AccountNumber,
                  Customer.CustomerType, Customer.ModifiedDate, Customer.rowguid
              FROM Customer
              INNER JOIN SalesOrderHeader ON Customer.CustomerID = SalesOrderHeader.CustomerID
              INNER JOIN SalesOrderDetail ON SalesOrderHeader.SalesOrderID = SalesOrderDetail.SalesOrderID
              GROUP BY Customer.CustomerID, Customer.SalesPersonID, Customer.TerritoryID, Customer.AccountNumber,
                  Customer.CustomerType, Customer.ModifiedDate, Customer.rowguid
              ORDER BY Customer.CustomerID

              Figure: A view to get customer totals when no denormalized fields are used
              If we had a denormalized field, the user or developer would simply run the following query:

              SELECT Customer.CustomerID,
                  Customer.OrderTotal AS DetailTotal
              FROM Customer
              ORDER BY Customer.CustomerID

              Figure: Queries are much simpler with denormalized fields

              Note that this is not a particularly complicated example. However, you can see why it can simplify development greatly when working with a large number of tables

              Performance is better for read-intensive reports

              Particularly when reporting on data with a cube.

              When there are multiple tables in a SQL Server view, they cannot be updated in one hit - they must be updated one table at a time.

              It is a built-in validation device

              For example, if records are accidentally deleted directly in the database, there is still a validation check for the correct totals. The value of this is mitigated when there is a full audit log on the database.

              However, there are reasons against using denormalized fields:

              They have to be maintained and can potentially get out of synch

              This can make them unreliable - particularly if several applications are incorrectly updating the denormalized fields. UPDATEs, INSERTs and DELETEs are also more complicated, as they have to update the denormalized fields.

              They can be seen as an unnecessary waste of space

              All in all, we choose to still use denormalized fields because they can save development time. We do this with some provisos. In particular, they must be validated correctly to ensure the integrity of the data.

              Here is how we ensure that this data is validated:

              1. Change the description on any denormalized fields to include "Denormalized" in the description, e.g. "Denormalized: Sum(OrderTotal) FROM Orders", in SQL Server Management Studio.
              2. Create a view that lists all the denormalized fields in the database - based on the description field.
                CREATE VIEW dbo.vwValidateDenormalizedFields
                AS
                    SELECT OBJECT_NAME(id) AS TableName, 
                        COL_NAME(id, smallid) AS ColumnName,
                        CAST([value] AS VARCHAR(8000)) AS Description,
                        'procValidate_' + OBJECT_NAME(id) + 
                        '_' + COL_NAME(id, smallid) as
                        ValidationProcedureName
                    FROM dbo.sysproperties
                    WHERE (name = 'MS_Description') AND 
                                 (CAST([value] AS VARCHAR(8000))
                                  LIKE '%Denormalized:%')
                
                Figure: Standard view for validation of a denormalized field
              3. Create a stored procedure (based on the above view) that validates whether all denormalized fields have a stored procedure that validates the data within them
                CREATE PROCEDURE procValidateDenormalizedFieldValidators
                AS
                    SELECT 
                        ValidationProcedureName AS
                        MissingValidationProcedureName 
                    FROM vwValidateDenormalizedFields
                    WHERE ValidationProcedureName NOT IN
                    (
                        SELECT ValidationProcedureName
                        FROM vwValidateDenormalizedFields AS vw
                        LEFT JOIN sysobjects 
                        ON 
                            vw.ValidationProcedureName = 
                            OBJECT_NAME(sysobjects.id)
                        WHERE id IS NOT NULL
                    )
                
                Figure: Standard stored procedure for validation of a denormalized field
              If you want to know how to implement denormalized fields, see our rule Do you use triggers for denormalized fields?
            26. Schema - Do you avoid using user-schema separation?

              ​​User-schema separation allows more flexibility​ by adding another level of naming and shifting ownership of database objects to the schema, not the user. So, is it worth doing? Unless you are working with a very large database (100+ tables), the answer is "no". Most smaller databases have all objects with owner "dbo", which is fine in most cases.​

              ​Figure: Bad Example - AdventureWorks using user schema - instead, keep it simple and avoid using user schema unnecessarily
              Figure: Good Example -​ Adventure works with user schema cleaned out. Much simpler and more readable​​
            27. Schema - Do you create a consistent primary key column on your tables?

              Make sure you create a consistent primary key column named Id on your tables.

              Employee.ID, Employee.EmployeeId, Employee.EmployeeID, Employee.Employee_Code, Employee.Employee

              Figure: Bad Example​​​

              ​ Employee.Id

              Figure: Good Example

              ​​​Why?

              • We shouldn't capitalise ID (identifier) as it is an abbreviation, not an acronym.
              • Using the approach [TableName]Id, e.g. EmployeeId, is redundant as we already know the context of the Id.

            28. Schema - Do you use separate lookup tables rather than one large lookup table for your lookup data?

              Advantage: Simplifies ORM Mapping
              We prefer multiple lookup tables so they make more sense in ORM tools. If you have multiple lookups to the one table, you would need to do your mappings manually rather than using a tool. E.g. you could have either: LookupTable or OrderType​

              When you are obtaining the OrderType for an order, you would have either:

              • ​Order.OrderType.OrderTypeID 

              Good as it is clear what is being retrieved from the lookup table.

              • Order.LookupTable.Value 

              Not great as it is not clear what the nature of the lookup table is.


              Advantage: Maintains Complete Referential Integrity without the need for triggers
              ​The other advantage of having separate lookup tables rather than one large one is that referential integrity is maintained.
              One issue with having one large table is that you can still enter invalid values in the Order.OrderTypeID column, e.g. if OrderTypeIDs range from 1 to 3 and CustomerTypeIDs range from 4 to 10.

              If I put OrderTypeID = 10, then I will not get referential integrity errors (even though I should) because I have entered a value which exists in the lookup table (even though it is for the wrong type).

              If I want to enforce referential integrity so I can only enter the correct type for my lookup table, then I would need to resort to triggers or a (fallible) coded data tier.
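
              Here is a sketch of the separate lookup table with a real foreign key (the names are assumed):

              CREATE TABLE dbo.OrderType
              (
                  OrderTypeId int NOT NULL PRIMARY KEY,
                  OrderTypeName varchar(100) NOT NULL
              )

              ALTER TABLE dbo.Orders
              ADD CONSTRAINT FK_Orders_OrderType
              FOREIGN KEY (OrderTypeId) REFERENCES dbo.OrderType (OrderTypeId)

              Figure: The database now rejects an OrderTypeId that belongs to a different lookup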

              ​​Advantage: You can add new columns specific to each lookup table

              ​​​For example, if a Lookup table (e.g. CustomerType) has an associated value (e.g. the field MaximumDebtAmount), we don't need to add a field that is irrelevant to all the other lookup tables. We can just add it to the individual lookup table.​


              ​​Disadvantage: Multiple tables make maintenance slightly more difficult, especially when making changes directly via Management Studio

              ​It is simpler to Administer one table than multiple tables, but you can reduce this problem with a good Generic Administration Page UI.






            29. Schema - Do you know how to provide the best database schema document?

              You should not provide a database schema via several screen captures - they carry little information about the details. A well-formatted Word document may provide more detailed information, but it is not easy to keep such a document up-to-date. The best way is to automatically generate your documentation with a tool.

              We recommend and use Red-Gate SQL Doc to produce chm help files or html pages of the database schema. SQL Doc also allows you to run via the command line so you can include the generation in your build process to be automatically created.

              We have also used other available tools in the past, such as ApexSQL Doc.

              Alternatively, you can use SQL Management Studio to generate a Database diagram.
              1. Connect to your database using SQL Server Management Studio
              2. Create a new Database Diagram, by right-clicking Database Diagrams

              Figure: New Database Diagram

              3. A popup will appear. Shift-Click to select all the tables then click Add

              Figure: Selecting tables for diagram

4. You will see tables populate behind the dialogue box; once complete, click Close

              Figure: Tables populated

5. Click off the tables in the diagram and press Ctrl+A to select all
              6. Right-Click one of the tables in the diagram and perform the following

              a. ​Select Table View | Standard from the menu

              b. ​​Select Autosize Selected Tables from the menu 

              Figure: Changing the database table diagram to Standard View and Autosize​

              7. Right-click the diagram background and select Show Relationship Labels

              Figure: Show Relationship Labels

              8. Move the tables around so that the Relationship Labels are clearly visible.

              Note: You will need to screenshot the diagram as using the copy to clipboard function removes the “Allow Nulls” checkmarks.

              Figure: Northwind Database Schema





            30. Schema - Do you add zs prefix to system tables?

Any table that does not contain application data should be prefixed with zs. So when another application (e.g. SSW SQL Deploy) or the programmer populates the table, it should be prefixed with zs (e.g. zsDate - the program populates it, zsVersion - the programmer populates it).
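As a minimal sketch, a hypothetical programmer-populated version table would look like this:

CREATE TABLE dbo.zsVersion (
VersionID int IDENTITY(1,1) PRIMARY KEY,
VersionNumber varchar(20) NOT NULL, -- populated by the programmer with each schema change
DateUpdated datetime NOT NULL DEFAULT GETDATE()
)

Figure: A system table carrying the zs prefix so it sorts away from application data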

            31. Views – Do you avoid having views as redundant objects?

Don't have views as redundant objects, e.g. vwCustomers as SELECT * FROM Customers. This is unnecessary. Instead, views should generally be used for security.
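For example, a view can expose only the columns a reporting role is allowed to see (a minimal sketch; the Customers columns and the ReportingUsers role are hypothetical):

CREATE VIEW dbo.vwCustomersPublic
AS
SELECT CustomerID, CompanyName, City, Country -- no phone numbers or credit details
FROM dbo.Customers
GO

-- Grant access to the view without granting access to the underlying table
GRANT SELECT ON dbo.vwCustomersPublic TO ReportingUsers

Figure: A view that exists for security, not as a redundant copy of the table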

            32. Stored Procedures - Do you keep your Stored Procedures simple?

              If you are using the .NET Framework, put validation and defaults in the middle tier. The backend should have the required fields (Allow Nulls = False), but no complicated constraints. The following are examples that work with the Products table (with an added timestamp field called Concurrency) from Northwind.

              ​1. Code: Select Procedure​​​

              ALTER PROCEDURE dbo.ProductSelect
              @ProductID int
              AS
              SELECT ProductName, SupplierID, CategoryID, QuantityPerUnit, UnitPrice, UnitsInStock,
              UnitsOnOrder, ReorderLevel, Discontinued, Concurrency
              FROM Products
              WHERE (ProductID= @ProductID)

              2. ​​Code: Insert Procedure​

              ALTER PROCEDURE dbo.ProductInsert
              @ProductName nvarchar(40),
              @SupplierID int,
              @CategoryID int,
              @QuantityPerUnit nvarchar(20),
              @UnitPrice money,
              @UnitsInStock smallint,
              @UnitsOnOrder smallint,
              @ReorderLevel smallint,
              @Discontinued bit
              AS
INSERT INTO Products (ProductName, SupplierID, CategoryID, QuantityPerUnit, UnitPrice,
UnitsInStock, UnitsOnOrder, ReorderLevel, Discontinued)
VALUES (@ProductName, @SupplierID, @CategoryID, @QuantityPerUnit, @UnitPrice, @UnitsInStock,
@UnitsOnOrder, @ReorderLevel, @Discontinued)
              SELECT Scope_Identity() AS [SCOPE_IDENTITY] --If table has identity column
              --SELECT @@ROWCOUNT --If table doesn't have identity column
              -- Note: The middle tier must check the ROWCOUNT = 1

3. Code: Update Procedure

              ALTER PROCEDURE dbo.ProductUpdate
              @ProductID int,
              @ProductName nvarchar(40),
              @SupplierID int,
              @CategoryID int,
              @QuantityPerUnit nvarchar(20),
              @UnitPrice money,
              @UnitsInStock smallint,
              @UnitsOnOrder smallint,
              @ReorderLevel smallint,
              @Discontinued bit,
@Concurrency timestamp
AS
UPDATE Products
              SET ProductName = @ProductName,
              SupplierID = @SupplierID,
              CategoryID = @CategoryID,
              QuantityPerUnit = @QuantityPerUnit,
              UnitPrice = @UnitPrice,
              UnitsInStock = @UnitsInStock,
              UnitsOnOrder = @UnitsOnOrder,
              ReorderLevel = @ReorderLevel,
              Discontinued = @Discontinued
              WHERE (Concurrency = @Concurrency) AND (ProductID= @ProductID) --Note the double criteria to ensure concurrency
              SELECT @@ROWCOUNT
              -- Note: The middle tier must check the ROWCOUNT = 1

4. Code: Delete Procedure

              ALTER PROCEDURE dbo.ProductDelete
              @ProductID int,
              @Concurrency timestamp
              AS
              DELETE FROM Products
              WHERE (ProductID= @ProductID) AND (Concurrency = @Concurrency)
              --Note the double criteria to ensure concurrency
              SELECT @@ROWCOUNT
              --Note: The middle tier must check the ROWCOUNT = 1​​

            33. Stored Procedures - Do you return a value indicating the status?

​Make sure your stored procedures always return a value indicating the status. All stored procedures should return the error number (if an error occurred) or 0 to indicate success.
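A minimal sketch of the pattern (the dbo.ClientDelete procedure and Clients table are hypothetical):

ALTER PROCEDURE dbo.ClientDelete
@ClientID int
AS
DECLARE @Error int
DELETE FROM Clients
WHERE ClientID = @ClientID
SET @Error = @@ERROR -- capture immediately; @@ERROR resets after every statement
IF @Error <> 0
RETURN @Error -- failure: return the error number
RETURN 0 -- success

Figure: Returning 0 for success or the error number on failure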

            34. Stored Procedures - Do you standardize the return values of stored procedures for success and failures?

              Standardize on the return values of stored procedures for success and failures.​​

            35. Stored Procedures - Do you use OUTPUT parameters if you need to return the value of variables?

              ​The RETURN statement is meant for returning the execution status only, but not data. If you need to return the value of variables, use OUTPUT parameters. There is a compelling reason for this - if you use return values rather than output values to return data, money values that you return will silently be truncated.​​
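For example, RETURN can only carry an int, so passing a money value back through it silently drops the cents, whereas an OUTPUT parameter preserves the full type. A minimal sketch (the procedure and rate value are hypothetical):

CREATE PROCEDURE dbo.procClientRateGet
@curRate money OUTPUT
AS
SET @curRate = 150.75 -- full money precision is preserved in the OUTPUT parameter
-- Bad alternative: RETURN @curRate would be converted to int, losing the .75
RETURN 0 -- RETURN carries only the execution status
GO

DECLARE @rate money, @status int
EXEC @status = dbo.procClientRateGet @curRate = @rate OUTPUT
SELECT @rate AS Rate, @status AS Status -- 150.75 and 0

Figure: Returning data through an OUTPUT parameter while RETURN carries only the status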

            36. Stored Procedures - Do you check the global variable @@ERROR after executing a data manipulation statement?

Always check the global variable @@ERROR immediately after executing a data manipulation statement (like INSERT/UPDATE/DELETE), so that you can roll back the transaction in case of an error (@@ERROR will be greater than 0 in case of an error). This is important because, by default, SQL Server will not roll back all the previous changes within a transaction if a particular statement fails. This behaviour can be changed by executing SET XACT_ABORT ON. The @@ROWCOUNT variable also plays an important role: it tells you how many rows were affected by a previous data manipulation (or retrieval) statement, and based on that you can choose to commit or roll back a particular transaction.
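A minimal sketch of the pattern (the procedure and tables are hypothetical):

CREATE PROCEDURE dbo.OrderInsertWithDetail
@CustomerID int,
@ProductID int
AS
DECLARE @Error int
BEGIN TRANSACTION
INSERT INTO Orders (CustomerID) VALUES (@CustomerID)
SET @Error = @@ERROR -- check immediately after each data manipulation statement
IF @Error <> 0
BEGIN
ROLLBACK TRANSACTION
RETURN @Error
END
INSERT INTO OrderDetails (OrderID, ProductID) VALUES (SCOPE_IDENTITY(), @ProductID)
SET @Error = @@ERROR
IF @Error <> 0
BEGIN
ROLLBACK TRANSACTION
RETURN @Error
END
COMMIT TRANSACTION
RETURN 0

Figure: Checking @@ERROR after each statement so a failure rolls back the whole transaction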

            37. Stored Procedures - Do you use SCOPE_IDENTITY() to get the most recent row identity?

              When inserting a row in a stored procedure, always use SCOPE_IDENTITY() if you want to get the ID of the row that was just inserted. A common error is to use @@IDENTITY, which returns the most recently created identity for your current connection, not necessarily the identity for the recently added row in a table. You could have a situation where there is a trigger that inserts a new record in a Logs Table, for example, when your Stored Procedure or INSERT SQL Statement inserts a record in the Orders Table. If you use @@IDENTITY to retrieve the identity of the new order, you will actually get the identity of the record added into the Log Table and not the Orders Table, which will create a nasty bug in your data access layer. To avoid the potential problems associated with someone adding a trigger later on, always use SCOPE_IDENTITY() to return the identity of the recently added row in your INSERT SQL Statement or Stored Procedure.

              ​Behold this example from SQL Server Books online.

              USE tempdb
              GO
CREATE TABLE TZ (
Z_id int IDENTITY(1,1) PRIMARY KEY,
Z_name varchar(20) NOT NULL)
              INSERT TZ
              VALUES ('Lisa')
              INSERT TZ
              VALUES ('Mike')
              INSERT TZ
              VALUES ('Carla')
              SELECT * FROM TZ
              --Result set: This is how table TZ looks.
              Z_id Z_name
              -------------
              1 Lisa
              2 Mike
              3 Carla
CREATE TABLE TY (
Y_id int IDENTITY(100,5) PRIMARY KEY,
Y_name varchar(20) NULL)
              INSERT TY (Y_name)
              VALUES ('boathouse')
              INSERT TY (Y_name)
              VALUES ('rocks')
              INSERT TY (Y_name)
              VALUES ('elevator')
              SELECT * FROM TY
              --Result set: This is how TY looks:
              Y_id Y_name
              ---------------
              100 boathouse
              105 rocks
              110 elevator
              /*Create the trigger that inserts a row in table TY
              when a row is inserted in table TZ*/
              CREATE TRIGGER Ztrig
              ON TZ
              FOR INSERT AS
              BEGIN
              INSERT TY VALUES ('')
              END
              /*FIRE the trigger and determine what identity values you obtain
              with the @@IDENTITY and SCOPE_IDENTITY functions.*/
              INSERT TZ VALUES ('Rosalie')
              SELECT SCOPE_IDENTITY() AS [SCOPE_IDENTITY]
              GO
              SELECT @@IDENTITY AS [@@IDENTITY]
              GO


              Notice the difference in the result sets. As you can see, it's crucial that you understand the difference between the 2 commands in order to get the correct ID of the row you just inserted.

              SCOPE_IDENTITY
              4
              /*SCOPE_IDENTITY returned the last identity value in the same scope. This was the insert on table TZ.*/
              @@IDENTITY
              115
              /*@@IDENTITY returned the last identity value inserted to TY by the trigger. This fired because of an earlier insert on TZ.*/​

38. Stored Procedures - Do you SET NOCOUNT ON for production and SET NOCOUNT OFF for development/debugging purposes?

​You should use SET NOCOUNT ON for production and SET NOCOUNT OFF for development/debugging purposes (i.e. when you want the rowcounts to display as the messages from your T-SQL).

              According to SQL Server Books Online:
              "For stored procedures that contain several statements that do not return much actual data, this can provide a significant performance boost because network traffic is greatly reduced."

Example: a procedure that returns a scalar value (a ClientID generated by an insert statement) should use the OUTPUT keyword (not RETURN) to pass back data. This is how you should return a generated ClientID from the procedure, and also return a status value:

              CREATE PROCEDURE procClientInsert
              /*
              '----------------------------------------------
              ' Copyright 2001 SSW
              ' www.ssw.com.au All Rights Reserved.
              ' VERSION AUTHOR DATE COMMENT
              ' 1.0 DDK 17/12/2001
              '
              'Calling example
              'DECLARE @pintClientID int
              'DECLARE @intReturnValue int
              'exec @intReturnValue = procClientInsert 'TEST Entry',
              @pintClientID OUTPUT
              'PRINT @pintClientID
              'PRINT @intReturnValue
              '----------------------------------------------
              */
              @pstrCoName varchar (254),
              @pintClientID int OUTPUT
              AS
              --IF ONE THING FAILS, ROLLBACK
              SET XACT_ABORT ON
--THE COUNT WILL NOT NORMALLY DISPLAY IN AN APPLICATION IN PRODUCTION.
--GET RID OF IT BECAUSE IT IS EXTRA TRAFFIC, AND CAN CAUSE
--PROBLEMS WITH SOME CLIENTS
SET NOCOUNT ON
              --Generate a random number
              SET @pintClientID = (SELECT CAST(RAND() * 100000000 AS int))
              INSERT INTO Client (ClientID, CoName) VALUES (@pintClientID ,
              @pstrCoName)
              SET XACT_ABORT OFF
              IF @@ROWCOUNT = 1
              RETURN 0 -- SUCCESS
              ELSE
              BEGIN
              IF @@ERROR=0
              RETURN 1 -- FAILURE
              ELSE
              RETURN @@ERROR -- FAILURE
              END


              This procedure will display 0 or the error to indicate success or failure. You should base your actions on this return code.

              This separates return values from actual data so that other programmers know what to expect.

              Note: If you are using SQL Server stored procedures to edit or delete data using a SqlDataAdapter, make sure that you do not use SET NOCOUNT ON in the stored procedure definition. This causes the rows affected count returned to be zero, which the DataAdapter interprets as a concurrency conflict. In this event, a DBConcurrencyException will be thrown.​

            39. Stored Procedures - Do you avoid starting user stored procedures with system prefix "sp_" or "dt_"?

              System stored procedures are created and stored in the master database and have the sp_ prefix. System stored procedures can be executed from any database without having to qualify the stored procedure name fully using the database name master. It is strongly recommended that you do not create any stored procedures using sp_ as a prefix. SQL Server always looks for a stored procedure beginning with sp_ in this order:

              1. The stored procedure in the master database.
              2. The stored procedure based on any qualifiers provided (database name or owner).
              3. The stored procedure using dbo as the owner, if one is not specified.

              Therefore, although the user-created stored procedure prefixed with sp_ may exist in the current database, the master database is always checked first, even if the stored procedure is qualified with the database name.

              Important: If any user-created stored procedure has the same name as a system stored procedure, the user-created stored procedure will never be executed.

            40. Stored Procedures - Do you avoid using SELECT * when inserting data?

​Using a statement like "INSERT tableName SELECT * FROM otherTable" makes your stored procedures vulnerable to failure. Once either of the two tables changes, your stored procedure won't work. Not only that, when the inserting table has an identity column, such a statement will cause an error - "An explicit value for the identity column in table ParaRight can only be specified when a column list is used and IDENTITY_INSERT is ON."

              ​USE [ParaGreg]
              GO
              /****** Object: StoredProcedure [dbo].[procMove] Script Date: 08/08/2008 12:18:33 ******/
              SET ANSI_NULLS ON
              GO
              SET QUOTED_IDENTIFIER ON
              GO
              ALTER PROCEDURE [dbo].[procMove]
              @id AS Char,
              @direction AS INT
              AS
              IF @direction = 0
              BEGIN
              INSERT INTO ParaRight
              SELECT * FROM ParaLeft
              WHERE ParaID = @id
              DELETE FROM ParaLeft
              WHERE ParaID = @id
              END
              ELSE IF @direction = 1
              BEGIN
              INSERT INTO ParaLeft
              SELECT * FROM ParaRight
              WHERE ParaID = @id
              DELETE FROM ParaRight
              WHERE ParaID = @id
              END

​Figure: Bad Example - Using SELECT * when inserting data. This stored procedure should also have an Else section to raise an error when no condition is satisfied


              USE [ParaGreg]
              GO
              /****** Object: StoredProcedure [dbo].[procMove] Script Date: 08/08/2008 12:18:33 ******/
              SET ANSI_NULLS ON
              GO
              SET QUOTED_IDENTIFIER ON
              GO
              ALTER PROCEDURE [dbo].[procMove]
              @id AS Char,
              @direction AS INT
              AS
              IF @direction = 0
              BEGIN
              INSERT INTO ParaRight
              SELECT Col1,Col2 FROM ParaLeft
              WHERE ParaID = @id
              DELETE FROM ParaLeft
              WHERE ParaID = @id
              END
              ELSE IF @direction = 1
              BEGIN
              INSERT INTO ParaLeft
SELECT Col1,Col2 FROM ParaRight
              WHERE ParaID = @id
              DELETE FROM ParaRight
              WHERE ParaID = @id
              END
ELSE BEGIN RAISERROR('Please use a correct direction', 16, 1)
              END

Figure: Good Example - Using concrete columns instead of * and providing an Else section to raise an error
            41. Stored Procedures - Do you use transactions for complicated stored procedures?

​A transaction is an atomic operation: it ensures that either every operation within the transaction succeeds, or the transaction cancels them all and rolls the database back to its original state, leaving no dirty or partial data behind. So if a stored procedure has many steps, and each step depends on the others, it is strongly recommended that you wrap the procedure in a transaction.

              ALTER PROCEDURE [dbo].[procInit]
              AS
              DELETE ParaLeft
              DELETE ParaRight
              INSERT INTO ParaLeft (ParaID)
              SELECT ParaID FROM Para

Figure: Bad Example - No transaction here; if any operation fails, the database will be only partially updated, producing an unwanted result.

              ALTER PROCEDURE [dbo].[procInit]
              AS
              BEGIN TRANSACTION
              DELETE ParaLeft
              DELETE ParaRight
              INSERT INTO ParaLeft (ParaID)
              SELECT ParaID FROM Para
              COMMIT

Figure: Good Example - Using a transaction to ensure that all operations within it succeed; otherwise, the database rolls back to its original state.
            42. Stored Procedures - Do you use error handling in your Stored Procedures?

You should always include error handling in your stored procedures; it allows you to catch errors and either log them or attempt to correct them.

              THROW (Transact-SQL) lets you generate your own custom error messages, which can be more detailed in describing the problem and assist in debugging.​​​

              ​​​​Here’s an example of the syntax used when implementing THROW.

              ​​-- Syntax
              THROW error_number, message, state;​

              ​​Figure: Example of the THROW syntax

              There are 3 main arguments:​​​

              • error_number (int) - Must be greater than or equal to 50000 and less than or equal to 2147483647.
              • message (nvarchar) - Maximum of 2048 characters.​
              • state (tinyint) - Must be between 0 and 255​
              The state argument can be used to help pinpoint where the error occurred by using a different value without changing the error_number or message. This is useful if you have multiple steps in a process that may throw identical error descriptions.

              -- Example
              THROW 51000, 'The record does not exist.', 1;

              ​​Figure: Example of using THROW

              Implementing Error Handling using THROW​

Here we are generating a divide-by-zero error to easily raise a SQL exception; it is used as a placeholder for the logic that we would have in our stored procedure.

              ​DECLARE @inputNumber AS INT = 0;
               
              -- Generate a divide-by-zero error
              SELECT 1 / @inputNumber AS Error;

              ​​Figure: Bad Example - No error handling.

              ​Below we have wrapped our stored procedure logic in a TRY block and added a CATCH block to handle the error. More information can be found here TRY...CATCH (Transact-SQL).

              We know this divide-by-zero is going to cause an exception and the error number for this specific SQL exception is 8134. See (MSSQL Errors) for more error numbers.

In our CATCH block, we check the error to ensure it's the one that we want to handle; otherwise, we re-throw the original exception.
Finally, when we catch the error we are looking for, we can log some information about it and attempt to run our stored procedure logic again with different parameters.

              ​DECLARE @errorCode AS INT;
              DECLARE @inputNumber AS INT;
               
              BEGIN TRY
                     -- Generate a divide-by-zero error
                     SET @inputNumber = 0;
                     SELECT 1 / @inputNumber AS Error;
              END TRY
              BEGIN CATCH
                     SET @errorCode = (SELECT ERROR_NUMBER());
                     IF @errorCode = 8134 -- Divide by zero error encountered.
                            BEGIN
                                  PRINT 'Divide by zero error encountered. Attempting to correct'
                                   SET @inputNumber = 1;
                                  SELECT 1 / @inputNumber AS Error;
                           END
                     ELSE
                            THROW;
              END CATCH;

              ​​​Figure: Good Example - Using error handling to catch an error and attempt to resolve it.

              ​The example below shows how you can catch an error and retrieve all the details about it.
              This is very useful if you want to save these errors to another table or trigger a stored procedure.​

              ​BEGIN TRY
                     -- Generate a divide-by-zero error. 
                     SELECT 1 / 0 AS Error;
              END TRY
              BEGIN CATCH
                     SELECT
                           ERROR_NUMBER() AS ErrorNumber,
                           ERROR_STATE() AS ErrorState,
                           ERROR_SEVERITY() AS ErrorSeverity,
                           ERROR_PROCEDURE() AS ErrorProcedure,
                           ERROR_LINE() AS ErrorLine,
                           ERROR_MESSAGE() AS ErrorMessage;
               
                           -- Insert logic for persisting log information​ (Log to table or log to file)
               
                           THROW;
              END CATCH;​

              ​​Figure: Good Example - Using error handling to catch an error and retrieving its details, allowing it to be logged.
            43. Stored Procedures - Do you know SQL stored procedure names should be prefixed with the owner?

              ​​Always specify the schema prefix when creating stored procedures. This way you know that it will always be dbo.procedure_name no matter who is logged in when it is created.

              There are 2 other benefits to including the schema prefix on all object references:

1. It prevents the database engine from checking for an object under the user's schema first
2. It also avoids the issue where multiple plans are cached for the exact same statement/batch just because they were executed by users with different default schemas.

              Aaron Bertrand agrees with this rule - My stored procedure "be​st practices" checklist.

              CREATE PROCEDURE procCustomer_Update @CustomerID INT, ….. BEGIN​

              Figure: Bad example​​

              ​​​CREATE PROCEDURE dbo.procCustomer_Update @CustomerID INT, ….. BEGIN

              Figure: Good example​​​​
            44. Relationships - Do you turn on referential integrity in relationships?

              ​Cascading referential integrity constraints allow you to define the actions SQL Server takes when a user attempts to delete or update a key to which existing foreign keys point. The REFERENCES clauses of the CREATE TABLE and ALTER TABLE statements support ON DELETE and ON UPDATE clauses:

              • [ ON DELETE { CASCADE | NO ACTION } ]
              • [ ON UPDATE { CASCADE | NO ACTION } ]

              NO ACTION is the default if ON DELETE or ON UPDATE is not specified.​​

​Relationships should always have referential integrity turned on. If you turn it on only after data has been added, you may already have data in your database that violates your referential integrity rules.

              ReferentialIntegrityCheck.jpg
              Figure: Recommended referential integrity constraints
            45. Relationships - Do you use Update Cascade when creating a relationship?

​The ON UPDATE CASCADE feature of SQL Server 2000 and above can save you time in writing application and stored procedure code. We recommend that you take advantage of it. It is also more efficient than using triggers to perform these updates.
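A minimal sketch of turning it on for a hypothetical Customers-Orders relationship:

ALTER TABLE dbo.Orders
ADD CONSTRAINT FK_Customers_Orders
FOREIGN KEY (CustomerID) REFERENCES dbo.Customers (CustomerID)
ON DELETE NO ACTION -- see the next rule: avoid Cascade Delete
ON UPDATE CASCADE -- a changed Customers.CustomerID flows to Orders automatically

Figure: Relationship with Update Cascade on and Cascade Delete off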

            46. Relationships - Do you avoid using Cascade Delete?

​SQL Server's ON DELETE CASCADE functionality can be very dangerous. We recommend not using it. Imagine someone deletes a customer and all of that customer's orders are silently deleted with them. If you need to delete records in related tables, do it in code in the application, as it gives you more control.
            47. Relationships - Do you set Not For Replication when creating a relationship?

When NOT FOR REPLICATION is used with a Foreign Key relationship, the integrity of the relationship is not checked while the Replication Agent is logged in and performing replication operations. This allows changes to the data (such as cascading updates) to be propagated correctly.
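A minimal sketch (the Orders-OrderDetails relationship is hypothetical):

ALTER TABLE dbo.OrderDetails
ADD CONSTRAINT FK_Orders_OrderDetails
FOREIGN KEY (OrderID) REFERENCES dbo.Orders (OrderID)
NOT FOR REPLICATION -- the constraint is not checked for the Replication Agent

Figure: Relationship created with Not For Replication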
            48. Relationships - Do you have FOREIGN KEY constraints on columns ending with ID?

              Columns ending with 'ID' should have FOREIGN KEY constraints​.
              NorthwindRelationships.jpg
              Figure: Missing relationships​
            49. General - Do you know object name should not be a reserved word?

              SQL Server reserves certain keywords for its exclusive use. It is not legal to include the reserved keywords in a Transact-SQL statement in any location except that defined by SQL Server. No objects in the database should be given a name that matches a reserved keyword. If such a name exists, the object must always be referred to using delimited identifiers. Although this method does allow for objects whose names are reserved words, it is recommended that you do not name any database objects with a name that is the same as a reserved word. In addition, the SQL-92 standard implemented by Microsoft SQL Server defines a list of reserved keywords.​​​

              ​Avoid using SQL-92 reserved keywords for object names and identifiers, ie. User, Count, Group, etc. They can be used if joined with other words.
              What are reserved words for SQL Server 2000?
              Why avoid reserved words and spaces in object names?
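For example, a table named with the reserved word User can only ever be referenced with delimited identifiers (a hypothetical sketch):

SELECT * FROM User -- Bad: fails to parse, User is a reserved keyword
SELECT * FROM [User] -- works, but every reference now needs the brackets
SELECT * FROM UserAccount -- Good: joining the reserved word with another word avoids the problem

Figure: A reserved word forces delimited identifiers everywhere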

            50. General - Do you know object name should not contain spaces?

Spaces should be avoided. If an object name contains a space, it causes problems later on, because developers must remember to put the object name inside square brackets when referencing it.

              Why avoid reserved words and spaces in object names?

              ​We aim to never have to use square brackets in any of our databases.​​​​

            51. General - Do you know to not use "sp_rename" to rename objects?

              ​Do not use "sp_rename" to rename objects like stored procedures, views and triggers.​

An object's name should be the same as the name used in the object's script (e.g. the CREATE script for stored procedures, views and triggers). Inconsistency arises when an object is renamed with sp_rename but its script is not updated.
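A minimal sketch of the inconsistency (the procedure names are hypothetical):

EXEC sp_rename 'dbo.procClientSelect', 'procCustomerSelect'

-- The object now has the new name, but its stored script still says procClientSelect
SELECT OBJECT_DEFINITION(OBJECT_ID('dbo.procCustomerSelect'))
-- Returns: CREATE PROCEDURE dbo.procClientSelect ...

Figure: After sp_rename, the object's name no longer matches its CREATE script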
            52. General - Do you know object name should follow your company Naming Conventions?

              1. ​​SQL Server Object Naming Standard SSW's Standard for naming SQL Server Objects.
              2. SQL Server Stored Procedure Naming Standard SSW's Standard for naming Stored Procedures.
              3. SQL Server Indexes Naming S​tandard SSW's Standard for naming Indexes.
              4. SQL Server Relationship Naming Standard SSW's Standard for naming Relationships
              5. Use decreasing generality for table names ie. Client and ClientInvoice, then ClientInvoiceDetail.
              6. Don't use underscores, instead use upper and lower case ie. ClientInvoice is preferred over Client_Invoice.
              7. Table names should use pluralisation ie. Clients is preferred over Client.
              8. Generally, do not use abbreviations. But there are a few words that are so commonly used that they can be abbreviated. These are:
                • Quantity = Qty
                • Amount = Amt
                • Password = Pwd
9. Prefix all Date fields with 'Date' ie. DateInvoiced. One extra benefit is that you can have generic code that enables a Date control on these fields.
              10. Suffix Percent fields with 'Pct' ie. SalesTaxPct.
11. Only use alphabetic characters, ie. don't use AustraliaListA$. Avoid special characters in your object names in SQL Server. If you do not do this, you will need to constantly identify those ill-named objects with bracketed or quoted identifiers - otherwise, unintended bugs can arise.
              12. Don't use reserved words on their own. ie. User, Count, Group, etc. They can be used if joined with other words. Reserved Keywords (Transact-SQL)

            53. General - Do you use a SQL Server Object Naming Standard?

              ​​This standard outlines the standard on naming objects within SQL Server. Use these standards when naming any object or if you find an older object that doesn't follow these standards within SSW.

              ​​All feedback is welcome. Thanks to all who gave feedback. A special thanks to Andy Baron.​


Object                    Prefix       Example
Table                     --           Clients
Column (PK)               --           Id
Column (FK)               --           ClientId
Temporary Table           _zt          _ztClients
System Table              _zs          _zsDataVersion, _zsVersionLatest
View                      vw, gy_      vwClientsWithNoPhoneW, gy_ClientsWithNoPhoneW
Stored Procedure          proc, gp_    procSelectClientsClientID, gp_SelectClientsClientID
Trigger                   trg          trgOrderIU
Default*                  dft*         dftToday*
Rule                      rul          rulCheckZIP
User-Defined Datatype     udt          udtPhone
User-Defined Function     udf          udfDueDates

              * We never use defaults as objects, this is really an old thing that is just there for backwards compatibility. Much better to use a default constraint.

              ​​​​

              Other Links

              SQL Server Coding Standards - Part 1​
              By Steve Jones on SQL Server Central

            54. General - Do you use a SQL Server Stored Procedure Naming Standard?

              ​This standard outlines the standard on naming Stored Procedures within SQL Server. Use these standards when creating new Stored Procedures or if you find an older Stored Procedure that doesn't follow these standards within SSW.​

Note: Stored Procedures will run fractionally slower if they start with the prefix sp_, because SQL Server will look for a system stored procedure first. Therefore we never recommend starting stored procedures with the prefix sp_.
              Do you agree with them all? Are we missing some? Let us know what you think.

              Syntax

              Stored Procedure names are to have this syntax:
              [proc] [MainTableName] By [FieldName(optional)] [Action]
              [  1  ] [         2          ]     [       3                  ] [   4    ]
              [1] All stored procedures must have the prefix of 'proc'. All internal SQL Server stored procedures are prefixed with "sp_", and it is recommended not to prefix stored procedures with this as it is a little slower.
              [2] The name of the table that the Stored Procedure accesses.
[3] (optional) The name of the field(s) in the WHERE clause, ie. procClientByCoNameSelect, procClientByClientIDSelect
              [4] Lastly the action which this Stored Procedure performs.

If a Stored Procedure returns a recordset, the suffix is 'Select'.
If a Stored Procedure inserts data, the suffix is 'Insert'.
If a Stored Procedure updates data, the suffix is 'Update'.
If a Stored Procedure inserts and updates, the suffix is 'Save'.
If a Stored Procedure deletes data, the suffix is 'Delete'.
If a Stored Procedure refreshes a table (ie. drops and recreates it), the suffix is 'Create'.
If a Stored Procedure returns an output parameter and nothing else, the suffix is 'Output'.

              ​ALTER PROCEDURE procClientRateOutput

                       @pstrClientID VARCHAR(6) = 'CABLE',
                       @pstrCategoryID VARCHAR(6) = '<All>',
                       @pstrEmpID VARCHAR(6)='AC',
                       @pdteDate datetime = '1996/1/1',
                       @curRate MONEY OUTPUT

              AS

              -- Description: Get the $Rate for this client and this employee
              --         and this category from Table ClientRate

              SET @curRate = (
                              SELECT TOP 1 Rate
                              FROM ClientRate
                              WHERE ClientID=@pstrClientID
                              AND EmpID=@pstrEmpID
                              AND CategoryID=@pstrCategoryID
                              AND DateEnd > @pdteDate
                              ORDER BY DateEnd
                             )

              IF @curRate IS NULL

                       SET @curRate =
              (
                              SELECT TOP 1 Rate
                              FROM ClientRate
                              WHERE ClientID=@pstrClientID
                              AND EmpID=@pstrEmpID
                              AND CategoryID='<ALL>'
                              AND DateEnd > @pdteDate
                              ORDER BY DateEnd
                             )

              RETURN

              ​Figure: Good Example - stored proc that returns only an output parameter

              ​​

              ​​​Select 'procGetRate' or 'sp_GetRate'
              Insert 'procEmailMergeAdd'

              ​​​Figure: Bad Example

              'procClientRateSelect'
              'procEmailMergeInsert'

              ​​​​​Figure: Good Example
            55. General - Do you use a SQL Server Indexes Naming Standard?

              ​​​This standard outlines the procedure on naming Indexes at SSW for SQL Server. Use this standard when creating new Indexes or if you find an older Index that doesn't follow that standard.

              ​Note: There is not a lot of use naming Indexes - we only do it when we are printing out documentation or using the 'Index Tuning Wizard' - then it becomes really handy.
              Do you agree with them all? Are we missing some? Let us know what you think.

              Index names are to have this syntax:
              [pkc_] [TableName] by [FieldName]
              [   1  ] [    2        ]      [    3       ]
              [1] All indexes must have a corresponding prefix.

Prefix     Type
pkc_       Primary Key, Clustered
pknc_      Primary Key, Non Clustered
ncu_       Non Clustered, Unique
cu_        Clustered, Unique
nc_        Non Clustered (Most Common)



​Make the index name unique if possible, ie. ProductName.
              [2] The name of the table that the Index refers to.
              [3] The name of the column(s) that the Index refers to.

              ​​Index 'BillingID'
              Primary Key 'aaaaaClient_PK'

              ​​Figure: Bad Example

              ​​​​'nc_ClientDiary_BillingID'
              'pknc_ClientDiary_ClientID'

              ​Figure: Good Example
            56. General - Do you use a SQL Server Relationship Naming Standard?

              ​This standard outlines the procedure on naming Relationships at SSW for SQL Server. Use this standard when creating new Relationships or if you find an older Relationship that doesn't follow that standard.

              ​​Do you agree with them all? Are we missing some? Let us know what you think.

              Syntax

              Relationship names are to have this syntax:
              [PrimaryTable] - [ForeignTable]
              [        1       ] - [        2       ]

              [1] The table whose columns are referenced by other tables in a one-to-one or one-to-many relationship.
Use a descriptive name rather than accepting the default value (ie. ClientAccount_FK01) that is given from upsizing.

              Figure: Bad Example - using the default relationship name

              ​We recommend using Prod-ClientAccount.

              Figure: Good Example - using a more descriptive relationship name


              ​The good thing is when you look at the relationship from the other side it is there as well.

              Figure: Relationship name shown on the other table

              We also believe in using Cascade Updates - but never cascade deletes.

            57. General - Do you know every object name should be owned by dbo?

The reason is to avoid ownership chain problems: Mary owns an object, Fred can read the object and creates a proc on it, and he gives Tom permission to execute that proc. But Tom cannot, because the chain of ownership is broken.

CREATE PROCEDURE [Adam Cogan].[Sales by Year]
@Beginning_Date DateTime,
@Ending_Date DateTime
AS
SELECT Orders.ShippedDate
,Orders.OrderID
,"vwOrderSubTotals".Subtotal
,DATENAME(yy,ShippedDate) AS Year
FROM Orders
INNER JOIN "vwOrderSubTotals"
ON Orders.OrderID = "vwOrderSubTotals".OrderID
WHERE Orders.ShippedDate BETWEEN @Beginning_Date AND @Ending_Date

              Figure: Bad Example​

CREATE PROCEDURE [dbo].[Sales by Year]
@Beginning_Date DateTime,
@Ending_Date DateTime
AS
SELECT Orders.ShippedDate
,Orders.OrderID
,"vwOrderSubTotals".Subtotal
,DATENAME(yy,ShippedDate) AS Year
FROM Orders
INNER JOIN "vwOrderSubTotals"
ON Orders.OrderID = "vwOrderSubTotals".OrderID
WHERE Orders.ShippedDate BETWEEN @Beginning_Date AND @Ending_Date

Figure: Good Example


58. General - Do you know the naming conventions for databases on test and production servers?

Generally, every client should have a dev and a test database, so the dev database needs the suffix "Dev" and the test database needs the suffix "Test" (e.g. SSWCRMDev, SSWCRMTest). However, you don't need any suffix for the production database.

              BadDBName.gif
              Figure: Bad Example - Database with bad names
              GoodDBName.gif
              Figure: Good Example - ​Database with standard names

            59. Middle Tier - Do you implement business logic in middle tier?

​Business logic/rules should be implemented in an object oriented language such as VB.NET or C#. This dramatically increases the adaptability, extensibility and maintainability of the application.

Implementing business logic in stored procedures has the disadvantage of being hard to test, debug and evolve; therefore, stored procedures should only implement basic data access logic.

For some very heavy data-oriented operations, it is excusable to use stored procedures to carry out logic for performance reasons.

Triggers are even more difficult, as their behaviour is event based. It is okay to use triggers for non-functional/infrastructural features such as logging changes, or maintaining more complex relational integrity which cannot be enforced by a simple relationship.

            60. Do you parameterize all input to your database?

It is important to parameterize all input to your database, and it's easy to implement.
Doing so will also save you a lot of headaches down the track.

               ParameterizeSqlInputsXKCD.png
              Figure: What can happen if you don’t parameterize your inputs
              Source: xkcd.com​

              ​​

              ​Advantages
              • ​Prevents SQL injection attacks
              • Preserves types being sent to the database
              • Increased performance by reducing the number of query plans
              • Makes your code more readable

              ​SELECT Id, CompanyName, ContactName, ContactTitle
              FROM dbo.Customers
              WHERE CompanyName = 'NorthWind';

​Figure: Bad Example - Embedding the value directly in the SQL string

              SELECT Id, CompanyName, ContactName, ContactTitle
              FROM dbo.Customers
              WHERE CompanyName = @companyName;

              ​Figure: Good Example - Using a parameterized query​
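The same principle applies inside T-SQL itself: if you must build dynamic SQL, pass the values through sp_executesql parameters rather than concatenating them into the string (a minimal sketch against the same hypothetical Customers table):

DECLARE @sql nvarchar(500) = N'SELECT Id, CompanyName, ContactName, ContactTitle
FROM dbo.Customers
WHERE CompanyName = @companyName';

EXEC sp_executesql @sql,
N'@companyName nvarchar(40)',
@companyName = N'NorthWind';

Figure: Good Example - Parameterized dynamic SQL using sp_executesql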

              ​​Should I use Parameters.AddWithValue()?

Using Parameters.AddWithValue() can be a bit of a shortcut, as you don't need to specify a type. However, shortcuts often have their dangers, and this one certainly does.

In most cases Parameters.AddWithValue() will guess correctly, but sometimes it doesn't, which can lead to the value being misinterpreted when sent to the database. This can be avoided by using Parameters.Add() and specifying the SqlDbType, which ensures the data reaches the database in the correct form.

When using dates, strings, varchar and nvarchar values, it is strongly recommended to use Parameters.Add(), as there is a possibility that Parameters.AddWithValue() will incorrectly guess the type.

              ​​​Implementing parameterized queries using Parameters.Add()

              cmd.Parameters.Add("@varcharValue", System.Data.SqlDbType.Varchar, 20).Value = “Text”;

              ​​Figure: Good Example – Using VarChar SqlDbType and specifying a max of 20 characters (-1 for MAX)

              cmd.Parameters.Add("@decimalValue", System.Data.SqlDbType.Decimal, 11, 4).Value = decimalValue;

              ​Figure: Good Example – Using decimal(11,4) SQL Parameter

              ​​cmd.Parameters.Add("@dateTimeValue", System.Data.SqlDbType.DateTime2).Value = DateTime.UtcNow;

              ​Figure: Good Example - C#, VB .NET SQL DateTime Parameter

              ​$SqlCmd.Parameters.Add("@dateTimeValue", [System.Data.SqlDbType]::DateTime2).Value = $dateTime2Value

              ​Figure: Good Example - PowerShell SQL DateTime Parameter

            61. Views - Do you use SQL Views?

A view is a virtual table produced by executing a stored query on the database. Views don't store any data; they retrieve their result set through a query. Users can interact with a view much as they would with a normal table when retrieving data; however, limitations exist when writing back to the result set. Views can be used to simplify access to database result sets and to give administrators more security options when granting access. More information can be found at CREATE VIEW (Transact-SQL)

              ​​

              ​Advantages:​

              1. Simplicity

              • Multiple tables can be joined and simplified into a single virtual table.
              • Complex calculations, groupings and filters can be hidden from the user so that the results appear as a simple dataset.
• Views can be used to transparently partition data, such as having Sales2019 and Sales2020 views that both read from the underlying Sales table.
              • Duplication can be reduced in procedures and queries by using a common view.
               
              2. Security
              • Views can be used to provide a specific data set to a user while protecting the underlying tables.
              • Permission management of tables cannot be limited to a row or a column, but it can be implemented simply through views.
               
              3. Flexibility
• Once the view structure is determined, it can shield users from the impact of changes to the underlying table structure.
              • Modifying the column name of the source table can be solved by modifying the view, without impacting end users.
              • Aliases can be used on column names to make them more readable and descriptive.
              • Adding columns to the source table has no impact on the view.
• The ability to soft-delete records by filtering them with an IsDeleted column (see the sketch after this list).
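For example, the soft-delete pattern mentioned above might look like this (a minimal sketch, assuming a hypothetical Clients table with an IsDeleted bit column):

CREATE VIEW dbo.vwActiveClients
AS
SELECT ClientID, CompanyName
FROM dbo.Clients
WHERE IsDeleted = 0 -- soft-deleted rows are hidden from consumers of the view

Figure: A view that filters out soft-deleted records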

              Disadvantages:
              1. Performance
              • Views can take longer to query than tables as they may contain complex functions or multi-table queries.​

2. Dependencies
• Views depend on the underlying tables for their data, so if the structure of those tables changes, it may break the view.
3. Update Restrictions
• Depending on the complexity of the view, you may not be able to modify the data through the view to the underlying base tables.

              Modifying tables through views

In some cases, you can update the base tables through a SQL view, depending on its complexity.
An update through a view can only modify columns from a single base table; otherwise SQL Server cannot determine which base table to update.

              ​​INSERT INTO vwProductsNorthwind VALUES (@ItemName, @ItemCost);

              UPDATE vwProductsNorthwind SET Cost = @ItemCost WHERE Id = @ItemId;

              DELETE vwProductsNorthwind WHERE Id = @ItemId;

              Figure: Example of an updatable view using a single base table

More complex views, such as a multi-table view, can instead be used in the WHERE clause of an update statement against the base table.

-- Create the products by category view
CREATE VIEW vwProductsByCategory
AS
SELECT p.Id, p.Name, p.Cost, p.OnSale, p.CategoryId, c.Name AS CategoryName
FROM Products p
JOIN Categories c
ON p.CategoryId = c.Id
GO

-- Set all products from a particular category to be on sale
UPDATE Products
SET OnSale = @OnSale
WHERE Id IN ( SELECT Id FROM vwProductsByCategory WHERE CategoryName = @CategoryName )

​Figure: Using a multi-table view in the WHERE clause of an update statement

              ​Example Scenario

So your business has an employees table, as shown below, that holds detailed information: name, birthdate, home phone, address and photo. This information is suitable for the payroll department, but what if you want to display employees' names and photos on the website for public viewing? Or what if you want contact information such as extension number and country to be available on the company intranet?

              ViewsSqlEmployeesTable.png
              Figure: Northwind traders employees table

​You could create separate tables for each department, only supplying the required fields for each.
This would also need an additional system to sync the tables to ensure the information was kept up to date.

              ViewsSqlTables.png
              Figure: Bad Example – Using tables and duplicating data

              ​​CREATE VIEW  vwIntranetEmployees AS  
              SELECT EmployeeID, LastName, FirstName, Title, TitleOfCourtesy, Country, Extension, Photo, PhotoPath   
              FROM Employees;  

              CREATE VIEW  vwWebsiteProfiles AS  
              SELECT EmployeeID, LastName, FirstName, Title, TitleOfCourtesy, Photo, PhotoPath
              FROM Employees; 

              ​Figure: Good Example – Using views from the base table containing the source data

​Creating views of the employees table allows you to update the data in one source location, such as payroll, and all other departments will see the changes. It prevents the problem of stale data and allows more control over access to the data.