Wednesday, November 30, 2011

Using Views in Microsoft SQL Server




Views are a very useful feature of relational technology in general, and Microsoft SQL Server specifically. They are wonderful tools that ease data access and system development when used prudently. Furthermore, views are simple to create and implement. But unfortunately most users do not adhere to a systematic and logical approach to view creation. This causes the advantages of views to become muddled and misunderstood. As you read this article you will find that views are very useful when implemented wisely, but can be an administrative burden if implemented without planning.

View Overview
Before discussing how best to implement views, let's review the basics of views. All operations on a SQL Server table result in another table. This is a requirement of the relational model and is referred to as relational closure.

A view is basically the relational model's way of turning a SELECT statement into a "table" that is accessible using SQL. Therefore, a view can be considered a logical table. No physical structure is required of a view; it is a representation of data that is stored in other tables. The data "in the view" is not stored anywhere and only physically exists in the underlying tables. Views can also be based on other views. For clarification, refer to Figure 1.

Figure 1. View Composition

 
Views are very flexible. They can consist of any combination of the following:

  • Rows from tables, including: a subset of rows from a single table, all rows from a single table, a subset of rows from multiple tables, or all rows from multiple tables.
  • Rows from views, including the same combinations as listed above for tables.
  • Columns from tables, including: a subset of columns from a single table, all columns from a single table, a subset of columns from multiple tables, or all columns from multiple tables.
  • Columns from views including the same combinations as listed above for tables.
Views are defined using SQL and are represented internally to SQL Server by a SELECT statement, not by stored data. The SQL comprising the view is executed only when the view is accessed and views can be accessed by SQL in the same way that tables are — by SQL.

When modifying data through a view (that is, using INSERT or UPDATE statements) certain limitations exist depending upon the type of view. Views that access multiple tables can only modify one of the tables in the view. Views that use functions, specify DISTINCT, or utilize the GROUP BY clause may not be updated. Additionally, inserting data is prohibited for the following types of views:

  • views having columns with derived (i.e., computed) data in the SELECT-list
  • views that do not contain all columns defined as NOT NULL from the tables from which they were defined
It is also possible to insert or update data through a view such that the data is no longer accessible via that view, unless the WITH CHECK OPTION has been specified.

Almost any SQL that can be issued natively can be coded into a view; there are exceptions, however. For example, the UNION operator can not be used in a view and you cannot create a trigger on a view.

All of the basic DDL statements can be used to create and manage views including CREATE, DROP, and ALTER. The text of any view can be retrieved from the SQL Server system catalog using the system procedure sp_helptext (unless the view was created specifying WITH ENCRYPTION). For example, this statement:

sp_helptext Sample_view

Might return the following output:
 
TextCREATE VIEW Sample_View
AS SELECT title, au_fname, au_lname
FROM titles, titleauthor, authors
WHERE titles.title_id=titleauthor.title_id
AND authors.author_id=titleauthor.author_id


It is also possible to rename a view using the system procedure sp_rename.

View Implementation Rules
Understanding the basic features of views will provide a framework for you to develop rules governing view usage. This section provides the rules of thumb for when it is wise to create views. Guidelines such as these should be instituted within your organization to reduce the amount of time and effort required to implement effective views.

The following rules will ensure that views are created in a responsible and useful manner at your shop. These rules were developed over a number of years as a result of implementing large relational databases in many different environments. There may be more uses for views than are presented here, so do not needlessly fret if you do not see your favorite use for views covered in this article—unless you blindly use base table views. There is no adequate rationale for enforcing a strict rule of one—view per base table for SQL Server application systems. In fact, the evidence supports not using views in this manner.

There are three basic view implementation rules:

  • The View Usage Rule
  • The Proliferation Avoidance Rule
  • The View Synchronization Rule
These rules define the parameters for efficient and useful view creation. Following them will result in a shop implementing views that are effective, minimize resource consumption, and have a stated, long-lasting purpose.

The View Usage Rule
The first rule is the view usage rule. Simply stated, your view creation strategy should be goal-oriented. Views should be created only when they achieve a specific, reasonable goal. Each view should have a specific application or business requirement that it fulfills before it is created. That requirement should be documented somewhere, preferably in a data dictionary or repository.

Although this rule seems obvious, views are implemented at some shops without much thought as to how they will be used. This can cause the number of views that must be supported and maintained to continually expand until so many views exist that it is impossible to categorize their uses.

There are seven primary uses for which views excel. These are:

  1. to provide row and column level security
  2. to ensure efficient access paths
  3. to mask complexity from the user
  4. to ensure proper data derivation
  5. to provide domain support
  6. to rename columns, and
  7. to provide solutions which can not be accomplished without views
Let's examine each of these uses.

Security
One of the most beneficial purposes served by views is to extend the data security features of SQL Server. Views can be created that provide a subset of rows, a subset of columns, or a subset of both rows and columns from the base table.

How do views help provide row and column level security? Consider an EMPLOYEE table that contains all of the pertinent information regarding an enterprise's employees. Typically, name, address, position, birth date, hire date, and salary information would be contained in such a table. However, not every user will require access to all of this information. Specifically, it may become necessary to shield the salary information from most users. This can be done by creating a view that does not contain the salary column and granting most users the ability to access the view, instead of the base table.

Similarly, row level security may be necessary. Consider a table that contains project information. Typically this would include project name, purpose, start date, and who is responsible for the project. Perhaps the security requirements of the projects within your organization deem that only the employee who is responsible for the project can access their project data. By storing the login id of the responsible employee in the PROJECT table, a view can be created using the USER special register such as the one shown below:
 

CREATE VIEW MY_PROJECTS
 (PROJ_NO, PROJ_NAME, DEPT_NO,
PROJ_STAFF, PROJ_START_DATE,
PROJ_END_DATE)
ASSELECT PROJNO, PROJNAME, DEPTNO,
PRSTAFF, PR_STDATE, PR_ENDATE
FROMPROJECT_TABLE
WHERERESPONSIBLE_EMP=USER_NAME()

The USER_NAME() function returns the login id of the user initiating the request. So, if user Sammy issues a SELECT statement against the MY_PROJECTS view, only rows where the RESPONSIBLE_EMP is equal to "Sammy" will be returned. This is a fast and effective way of instituting row level security.

By eliminating restricted columns from the SELECT list and providing the proper predicates in the WHERE clause, views can be created to allow access to only those portions of a table that each user is permitted to access.

Efficient Access
Views can also be used to ensure optimal access paths. By coding efficient predicates in the view definition SQL, efficient access to the underlying base tables can be guaranteed. The use of proper join criteria and predicates on indexed columns can be coded into the view. For example, consider the following view:
 

CREATE VIEW EMP_DEPTS
 (EMP_NO, EMP-_FIRST_NAME, EMP_MID_INIT,
EMP_LAST_NAME, DEPT_NO, DEPT_NAME)
ASSELECT E.EMPNO, E.FIRSTNME, E.MIDINIT,
E.LASTNAME, D.DEPTNO, D.DEPTNAME
FROM EMP_TABLEE,
DEPT_TABLED


WHERE D.DEPTNO=E.WORKDEPT
By coding the appropriate join criteria into the view definition SQL you can ensure that the correct join predicate will always be utilized.

Complexity
Somewhat akin to coding appropriate access into views, complex SQL can be coded into views to mask the complexity from the user. This can be extremely useful when your shop employs novice SQL users (whether those users are programmers, analysts, managers, or typical end users).

Consider a database that tracks projects and each individual activity for the project. The following rather complex SQL implements relational division:
   
SELECT DISTINCT PROJNO
FROMPROJACT P1
WHERENOT EXISTS
(SELECT ACTNO
FROMACT A
WHERENOT EXISTS  
(SELECTPROJNO
FROMPROJACT   P2
WHEREP1.PROJNO=P2.PROJNO
ANDA.ACTNO=P2.ACTNO);
  
This query uses correlated subselects to return a list of all projects in the PROJACT table that require every activity listed in the ACT table. By coding this SQL into a view called, say ALL_ACTIVITY_PROJ, then the end user need only issue the following simple SELECT statement instead of the more complicated query:
 
SELECT PROJNO
FROM ALL_ACTIVTY_PROJ
  
Now isn't that a lot simpler?
Derived Data
Another valid usage of views is to ensure consistent derived data by creating new columns for views that are based upon arithmetic formulae. For example, creating a view that contains a column named TOTAL_COMPENSATION which is defined by selecting SALARY + COMMISSION + BONUS is a good example of using derived data in a view.

Domain Support
It is a sad fact of life that there are no commercial relational database management systems that support domains; and SQL Server is no exception. Domains are an instrumental component of the relational model and, in fact, were in the original relational model published in 1970 — almost 30 years ago! Although the purpose of this article is not to explain the concept of domains, a quick explanation is in order. A domain basically identifies the valid range of values that a column can contain.
1

Some of the functionality of domains can be implemented using views and the WITH CHECK OPTION clause. The WITH CHECK OPTION clause ensures the update integrity of SQL Server views. This will guarantee that all data inserted or updated using the view will adhere to the view specification. For example, consider the following view:
 

CREATE VIEW EMPLOYEE
 (EMP_NO, EMP_FIRST_NAME, EMP_MID_INIT,
EMP_LAST_NAME, DEPT, JOB, SEX, SALARY)
AS SELECT EMPNO, FIRSTNME, MIDINIT, LASTNAME,
WORKDEPT, JOB, SEX, SALARY
FROMEMP
WHERESEX IN ('M', 'F')
WITH CHECK OPTION;
   
The WITH CHECK OPTION clause, in this case, ensures that all updates made to this view can specify only the values 'M' or 'F' in the SEX column. Although this is a simplistic example, it is easy to extrapolate from this example. Your organization might create views with predicates that specify code ranges using BETWEEN, patterns using LIKE, and a subselect against another table to identify the domain of a column.

A word of caution however: when inserts or updates are done using these types of views, SQL Server will evaluate the predicates to ensure that the data modification conforms to the predicates in the view. Be sure to perform adequate testing prior to implementing domains in this manner to be safeguard against possible performance degradation.

Column Renaming
As you can tell from looking at the sample views shown in the other sections, you can rename columns in views. This is particularly useful if a table contains arcane or complicated column names.

Consider the following view:
 
CREATE VIEW ACCOUNT
 (ACCT_NUMBER, CUST_NUMBER, ACCT_TYPE,
ACCT_BALANCE, ACCT_STATUS, ACCT_OPEN_DATE)
AS
 SELECT ACNO, CUNO, ACTP, ACBL,
ACST, ACOD
FROMACCT

Not only have we renamed the entity from ACCT to the more easily understood name, ACCOUNT, but we have also renamed each of the columns. Isn't it much easier to understand ACCT_TYPE than ACTP, or ACCT_OPEN_DATE than ACOD? Of course, this is a contrived example with very awkward column names, but I have seen worse in actual production systems. Whenever tables exist with clumsy table and/or column names, views can provide an elegant solution to renaming without having to drop and recreate anything.

Single Solution Views
The final view usage situation might actually be the most practical usage for views—when views are the only solution! Sometimes, a complex data access request may be encountered that can not be coded using SQL alone. But, sometimes a view can be created to implement a portion of the access. Then, the view can be queried to satisfy the remainder.

Consider the scenario where you want to report on detail information and summary information from a single table. For instance, what if you would need to report on account balance information from an ACCOUNT table similar to the one discussed previously? For each account, provide all column details, and on each row, also report the maximum, minimum, and average balance for that customer. Additionally, report the difference between the average balance and each individual balance. Try doing that in one SQL statement!

Instead, you could create a view to solve the dilemma. Consider the following view, BALANCE that creates the maximum, minimum, and average balances by customer:
 
CREATE VIEW BALANCE
 (CUST_NUMBER, MAX_BALANCE,
MIN_BALANCE, AVG_BALANCE)
AS
 SELECT CUST_NUMBER, MAX(BALANCE),
MIN(BALANCE), AVG(BALANCE)
FROMACCOUNT
GROUP BYCUST_NUMBER
  
After the view is created, the following SELECT statement can be issued joining the view to the base table, thereby providing both detail and aggregate information on each report row:
SELECT A.CUST_NUMBER, A.ACCT_NUMBER, A.BALANCE,
MAX_BALANCE, MIN_BALANCE, AVG_BALANCE,
A.BALANCE - AVG_BALANCE
FROMACCOUNT A,
BALANCE B
WHEREA.ACCT_NUMBER = B.ACCOUNT_NUMBER
ORDER BY1, 3
  
Situations such as these are a great opportunity for using views to make data access a much simpler proposition.

The Proliferation Avoidance Rule
The second overall rule is the proliferation avoidance rule. It is simple to state and directly to the point: do not needlessly create SQL Server objects that are not necessary.

Whenever a SQL Server object is created additional entries are placed in the system catalog tables. Creating needless views (and, indeed any object), causes what I call catalog clutter — entries in the catalog for objects which are not needed or not used.

In terms of views, for every unnecessary view that is created SQL Server will insert rows into the following system catalog tables: syscolumns, syscomments, sysdepends, sysobjects, sysprocedures, and sysprotects. If uncontrolled view creation is permitted, disk usage will increase, I/O problems can occur, and inefficient catalog organization may result.

The proliferation avoidance rule is based on common sense. Why create something that is not needed? It just takes up space that could be used for something that is needed.

The View Synchronization Rule
The third, and final view implementation rule is the view synchronization rule. The basic intention of this rule is to ensure that views are kept in sync with the base tables upon which they are based.

Whenever a change is made to a base table, all views that are dependent upon that base table should be analyzed to determine if the change would impact them. All views should remain logically pure. The view was created for a specific reason (see the View Usage Rule above). The view should therefore remain useful for that specific reason. This can be accomplished only by ensuring that all subsequent changes that are pertinent to a specified usage are made to all views that satisfy that usage.

For example, say a view was created to satisfy an access usage, such as the EMP_DEPTS view discussed earlier. The view was created to provide information about employees and their departments. If a column is added to the EMP table specifying the employee's social security number, it should also be added to the EMP_DEPT view if it is pertinent to that view's specific use. Of course, the column can be added to the table immediately and to the view at the earliest convenience of the development team.

The synchronization rule requires that strict change impact analysis procedures be in place. Every change to a base table should trigger the usage of these procedures. Simple SQL queries can be created to assist in the change impact analysis. These queries should pinpoint ad hoc queries, application programs, and analytical queries that could be using views affected by the specific changes to be implemented.

View synchronization is needed to support the view usage rule. By keeping views in sync with table changes the original purpose of the view is maintained.

View Naming Conventions
Views also instigate another area of conflict within the world of SQL Server—that being how to name views. Remember, a view is a logical table. It consists of rows and columns, exactly the same as any other table. A SQL Server view can (syntactically) be used in SQL SELECT, UPDATE, DELETE, and INSERT statements in the same way that a table can. Furthermore, a view can be used functionally the same as a table (with certain limitations on updating as outlined in earlier). Therefore, it stands to reason that views should utilize the same naming conventions as are used for tables.

End users querying views need not know whether they are accessing a view or a table. That is the whole purpose of views. Why then, enforce an arbitrary naming standard, such as putting a V in the first or last position of a view name, on views? DBAs and technical analysts, those individuals who have a need to differentiate between tables and views, can utilize the system catalog or system procedures to determine which objects are views and which objects are tables.

Most users do not care whether they are using a table or a view. They simply want to access the data. And, in a relational database, tables and views all logically appear to be identical to the end user: collections of rows and columns. Although there are certain operations that can not be performed on certain types of views, users who need to know this will generally be sophisticated users. For example, very few shops allow end users to update any table they want using a query tool (like Forest & Trees or Crystal Reports). Updates, deletions, and insertions (the operations that are not available to some views) are generally coded into application programs to be executed. Most end users need to query tables dynamically. Now you tell me, which name will your typical end user remember more readily when he needs to access his marketing contacts: MKT_CONTACT or VMKTCT01?

Do Not Create One View Per Base Table
Often times the dubious recommendation is made to create one view for each base table in an application. This is what I call The Big View Myth. The reasoning behind The Big View Myth revolves around the desire to insulate application programs from database changes. Some "experts" believe this insulation can be achieved by mandating that all programs are written to access views instead of base tables. When a change is made to the base table, the programs do not need to be modified because they access a view—not the base table.

Although this sounds like a good idea in principle, indiscriminate view creation should be avoided. The implementation of database changes requires scrupulous analysis regardless of whether views or base tables are used by your applications. Consider the simplest type of database change—adding a column to a table. If you do not add the column to the view, no programs can access that column unless another view is created that contains that column. But if you create a new view every time you add a new column it will not take long for your environment to be swamped with views. Even more troublesome is which view should be used by which program? Similar arguments can be made for removing columns, renaming tables and columns, combining tables, and splitting tables.

In general, if you follow good SQL Server programming practices, you will usually not encounter situations where the usage of views initially would have helped program/data isolation anyway. By dispelling The Big View Myth you will decrease the administrative burden of creating and maintaining an avalanche of base table views.

Always Specify Column Names
When creating views SQL Server provides the option of specifying new column names for the view or defaulting to the same column names as the underlying base table(s). It is always advisable to explicitly specify view column names instead of allowing them to default, even if using the same names as the underlying base tables. This will provide for more accurate documentation and easier view maintenance.

Code SQL Statements in Block Style
All SQL within each view definition should be coded in block style. As an aside, this standard should apply not only to views but also to all SQL statements. Follow these guidelines for coding the SELECT component of your views: 
  • Code keywords such as SELECT, WHERE, FROM, and ORDER BY such that they stand off and always begin at the far left of a new line.
  • Use parentheses where appropriate to clarify the intent of the SQL statement.
  • Use indentation to show the different levels within the WHERE clause.
Do Not Encrypt Views
The WITH ENCRYPTION clause can be specified on SQL Server views to encrypt the actual text of the view. When this clause is specified, you can not retrieve the actual SQL used for the view from the system catalog. This is true whether you use sp_helptext or issue a SELECT statement against syscomments.

For this reason, unless there is a very compelling security reason, avoid using the WITH ENCRYPTION clause. Encrypted views are very difficult to administer, maintain, and modify.

Synopsis
Views are practical and helpful when implemented in a systematic and thoughtful manner. Hopefully this article has provided you with some food for thought pertaining to how views are implemented at your shop. And if you follow the guidelines contained in this article, in the end, all that will remain is a beautiful view!


No comments: