17. Wrap-Up: Service-Oriented Database Applications

SQL SERVER 2005 is more than just a new version of SQL Server. It is the integration of relational data and XML data in a way that makes it easier to build secure, reliable, and scalable applications that are easily maintained. Features are geared to support both scale-up and scale-out, and work without application servers or in conjunction with them. In addition, the new features more tightly integrate the data to the kinds of service-oriented distributed applications we are building today and will be building tomorrow.

Lots of New Features: How to Use Them

SQL Server 2005 contains an amazing number of new features. We’ve covered most of them in previous chapters but haven’t even touched on the new features for administration and replication, in Reporting Services, in Analysis Services, and in SQL Server Integration Services (SSIS), including .NET Framework–based libraries for everything. We’ve concentrated on developer topics.

SQL Server core functionality is enhanced with each new release. There are a great number of enhancements based on making SQL Server more secure and reliable. These enhancements include database user and schema separation, improved SQL Server user password handling, new encryption functions and secret storage, system metadata view security, and an in-database managed-code programming model. All these enhancements have been implemented while retaining a high level of reliability and security. This functionality melds well with the declarative model of SQL, implemented with procedural enhancements through Transact-SQL (T-SQL).

An entire new concurrency model that uses versioning instead of locking has been introduced. For those developers who have had difficulty porting or converting their applications to SQL Server because of the differences between locking-based and versioning-based concurrency, the new model goes a long way toward removing this obstacle.

T-SQL and SQL have also been improved, not only with new system-defined functions and new SQL extensions, but also by adding T-SQL exception handling. Whether you use T-SQL or in-database .NET Framework procedures; client-side ADO.NET and the SqlClient data provider; a COM-based API like OLE DB or ADO; or ODBC, the SQL-92 standard database API, you should always be using statements written in SQL to invoke set-based, data-related functionality. SQL is still your way of processing set-based data, but beyond the set, there is a different story.

Data Models, Programming, and SQL Server

The new functionality in SQL Server 2005 gives us many choices for storing, processing, programming, performing calculations on, and sharing data. In large part because of the ease of sharing data over a well-known protocol, HTTP, in a few well-known formats, the concept of the remote application server—historically used only in monolithic systems—has become a “commodity” way of programming. The application server (and middle tier in the ANSI-SPARC three-tier architecture) has been popularized as the Web server. In addition, over time, three programming and data-related trends have become entrenched:

• Data is stored in databases based on the relational model in most cases.

• XML is used to eliminate architectural dependencies on data representation. It has its own data model for storing text-heavy data (where intrinsic ordering matters) and document replicas.

• Application programmers design and program using object-oriented concepts.

SQL Server 2005 and the libraries in .NET Framework 2.0 give programmers a lot of choices for which paradigm can be used and how it can be programmed. Here are a few examples, for purists of each paradigm.

Relational data can be stored in SQL Server; it’s SQL Server’s native format. It can be accessed through an object-oriented layer based loosely on the SQL-CLI (call-level interface) known as ADO.NET, through OLE DB, or through ODBC (the most well-known implementation of SQL-CLI); all of these communicate with the server through a proprietary protocol, Tabular Data Stream (TDS).

XML data can be stored in SQL Server directly. It can be validated on the server because XML schemas are stored in the server. It can be indexed to make XQuery language queries against it perform efficiently, or it can be searched based on its text content with full-text search. XML and relational data can be joined in a single query. Relational data can also be exposed as XML through T-SQL extensions, and XML messages can be marshaled using HTTP and Web Services for interoperability or using SQL Server Service Broker for the asynchrony and reliability that HTTP lacks.
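As a sketch of how these pieces fit together, the following T-SQL stores XML alongside relational columns and joins the two in a single query. Table and column names (Invoices, Customers, InvoiceDoc) are illustrative, not from the text:

```sql
-- Hypothetical schema: an XML document stored next to relational columns
CREATE TABLE Invoices (
  InvoiceId   int PRIMARY KEY,
  CustomerId  int NOT NULL,
  InvoiceDoc  xml NOT NULL   -- could also be typed against a schema collection
);

-- Join relational and XML data in one query, extracting a value
-- from the XML column with the xml data type's value() method
SELECT i.InvoiceId,
       c.Name,
       i.InvoiceDoc.value('(/Invoice/Total)[1]', 'decimal(10,2)') AS Total
FROM   Invoices i
JOIN   Customers c ON c.CustomerId = i.CustomerId;
```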

Any Functionality at Any Tier

The most interesting thing (and the scariest, for those of us who want homogeneous, cut-and-paste solutions) is that this functionality can be deployed across any tier. Using SQL Server and .NET Framework 2.0 library code names, here are some of the ways you can deploy if you divide the world into client, middle tier, and database. Big caveat: These methods are technically possible, but their use, as always, is dictated by the needs of the system.

Here’s what you can accomplish on the database alone:

• You can store XML data in SQL Server using the XML data type.

• You can query relational data in SQL Server as XML using SELECT ... FOR XML and store it using the xml data type’s nodes method, OPENXML, or XML Bulk Insert in T-SQL.

• You can write .NET Framework programs that use System.Xml inside SQL Server to query XML or convert relational data to XML.

• You can produce XML in a stored procedure and transport it using HTTP instead of TDS using HTTP ENDPOINTs.

• You can transport binary, text, or XML through asynchronous messaging using Service Broker.

• You can extend SQL Server’s scalar type system with user-defined types (UDTs).

• You can use ADO.NET using System.Data.SqlClient classes inside .NET Framework programs that run inside SQL Server.

• If you don’t use the managed environment, you can still use extended stored procedures written in any unmanaged (unsafe) language and access data using ODBC or OLE DB.

• You can manipulate the data entirely with T-SQL stored procedures and user-defined functions.
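For example, the round trip between relational rows and XML might look like the following sketch (table and element names are illustrative):

```sql
-- Relational rows out as XML
SELECT CustomerId, Name
FROM   Customers
FOR XML PATH('Customer'), ROOT('Customers');

-- XML back in as rows, shredded with the xml data type's nodes() method
DECLARE @doc xml;
SET @doc = N'<Customers>
  <Customer><CustomerId>1</CustomerId><Name>Contoso</Name></Customer>
</Customers>';

SELECT n.c.value('(CustomerId)[1]', 'int')          AS CustomerId,
       n.c.value('(Name)[1]',       'nvarchar(50)') AS Name
FROM   @doc.nodes('/Customers/Customer') AS n(c);
```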

Here’s what you can accomplish on the middle tier or client:

• You can fetch and program against any type of data from SQL Server with ADO.NET (SqlClient), OLE DB and ADO, or ODBC.

• You can retrieve data through HTTP or TDS.

• You can work with extended scalar types from SQL Server as though they are .NET Framework types.

• You can bind data directly in relational form, as collections of objects or as XML trees, to graphical user interface components for either Web-based, interoperable consumption (ASP.NET, middle tier) or rich window user-interface consumption (Windows Forms, client).

• You can deploy, through SQL Server Express, an instance of SQL Server containing Service Broker on every client workstation for the ultimate in asynchronous, reliable transaction processing.

SQL Server 2005 and the .NET Framework base class library APIs have accomplished the integration of different models and data.

What Are the Best Practices?

With so many choices, the question that always arises is “What are best practices when the new technological functionality is factored into the equation?” We’ve always had a hard time with the concept of best practices because of the underlying implication that domain problems can fit into neat categories. They can’t. It’s like saying to an architect, “It’s only a building; just build one like the one you built last year.” Although certain pieces can be prefabricated, the overall structure and look-and-feel depend on the landscape and aesthetics. We can’t start to count the number of hours we’ve spent in meetings listening to software developers argue over “elegant” versus “ugly” designs or technological politics. Any software project must balance the concepts of ease of construction, time constraints, maintainability, and user responsiveness. Our favorite software development story is about a manager who writes the following on the whiteboard at a user/programmer architecture meeting:

Choose Any Two:

• Good

• Fast

• Cheap

During the years when SQL Server 2005 was being developed and applications were being developed and/or enhanced to use the feature set (a good example of this is the “Project REAL” information on the Microsoft Web site at http://www.microsoft.com/sql/solutions/bi/projectreal.mspx), a set of best uses for the new features became clear. Here is an outline of some uses and best practices for each feature, roughly following the book topic order. If the reasons for the suggestions here aren’t intuitively obvious from the compact descriptions, refer to the book topic for more details.

SQLCLR safety

• Avoid UNSAFE assemblies if at all possible. If you consider them, understand the repercussions of using UNSAFE.

• Use only base class libraries on the approved list.

SQL and procedural code (T-SQL and SQLCLR)

• Use single SQL statements if at all possible, as opposed to procedural code.

• Use T-SQL for data access.

• Use SQLCLR for:

– computation.

– business logic.

– base class library built-ins such as regular expressions.

– string manipulation.

– user-defined functions.

– anything you would use an extended stored procedure for in SQL Server 2000.

• If your procedure contains both data access and business logic, benchmark and test a SQLCLR alternative to your existing T-SQL procedure. The results may surprise you.

SQLCLR versus client-side and middle-tier programming

• Consider using the middle tier as a scale-out strategy. Procedures implemented within the SQLCLR can easily be moved to the middle or client tier.

• Keep the amount of data that must be shipped over the network in mind.

Using SqlTypes or .NET Framework primitive types in SQLCLR code

• UDFs should return null on null input.

• Stored procedures should use SqlTypes if nullability is a concern.

• SqlTypes are especially useful for DECIMAL parameters.

Using Visual Studio 2005 automatic deployment

• Use auto-deploy as a convenience during testing if your company standards permit it.

• Don’t forget the limitation of auto-deploy: no ALTER ASSEMBLY.

Custom SQLCLR attributes

• Don’t use DataAccessKind.Read/SystemDataAccessKind.Read unless you intend data access.

• Use SqlFacet for UDT method return values.

• Your IsPrecise/IsDeterministic must be accurate.

In-process data access

• Do not replicate cursors in CLR.

• Use pipelining in table-valued UDFs (user-defined functions) or returned resultsets if results don’t come from database data.

• Keep the amount of code executed during impersonation small.

UDTs

• Use these only for custom scalars.

• One possible use is data types used for conversion (for example, time duration).

• Remember to mark mutator methods as such using the IsMutator property of the SqlMethod attribute.

• Remember the 8,000-byte limit on UDT state.

• Be aware of issues with:

– the UDT IsNull method.

– SQL Server Management Studio display of NULL UDTs.

• Use Format.Native if at all possible.

• Use regular expressions in your Parse method to make it robust.

• Implement binary input checking.

User-defined aggregates

• Use these for convenience aggregates used in many places in your application.

• Remember the 8,000-byte limit.

• Custom specific functions written in T-SQL usually are faster than a generalized user-defined aggregate.

• There is no guarantee of ordering in this release of SQL Server; don’t use IsInvariantToOrder=false.

Security

• Turn on features only when using them.

• Don’t use DBO schema if possible.

• Schema should be owned by roles.

• Use EXECUTE AS for temporary privilege shift.

• Grant as few permissions as possible.

• Use VIEW DEFINITION permission sparingly.

• Use encryption only when needed.

• DecryptByKeyAutoCert simplifies key opening and closing.

• Use ORIGINAL_LOGIN or an event for auditing.
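A minimal sketch of several of these security practices together. All role, schema, and procedure names here are hypothetical, and the table referenced in the procedure is assumed to exist:

```sql
-- Schema owned by a role rather than by DBO
CREATE ROLE SalesOwnerRole;
GO
CREATE SCHEMA Sales AUTHORIZATION SalesOwnerRole;
GO
-- Temporary privilege shift: the body runs with the owner's
-- permissions, not the caller's
CREATE PROCEDURE Sales.ArchiveOrders
WITH EXECUTE AS OWNER
AS
BEGIN
  DELETE FROM Sales.Orders
  WHERE OrderDate < DATEADD(year, -2, GETDATE());
END;
GO
-- Grant as few permissions as possible: EXECUTE on one procedure,
-- rather than DELETE on the underlying table
CREATE ROLE SalesAppRole;
GRANT EXECUTE ON Sales.ArchiveOrders TO SalesAppRole;
```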

Engine

• Prefer the new MAX data types to the deprecated TEXT, NTEXT, and IMAGE.

• Do not use VARCHAR(MAX) for every VARCHAR.

• Remember that VARCHAR(MAX) and NVARCHAR(MAX) are different data types from VARCHAR(n) and NVARCHAR(n).

• Don’t assume that all built-in character functions support more than 8,000 bytes.

• Don’t turn on Snapshot Isolation unless you intend to use it.

• Use DDL TRIGGERs to allow cancellation of the triggered operation.

• Use EVENT NOTIFICATIONs if you need:

– asynchronous triggers.

– business logic in triggered code.

– events not supported by DDL TRIGGERs.

• Windows Management Instrumentation (WMI) is a bit faster than Event Notifications for trace events but not as configurable.

• Keep statement-level recompiles in mind as you design your stored procedures; some design limitations in previous versions may be lifted.

• Use query hints only as a last resort.

• Use plan guides unless hints are one-offs.

• Capture plans of existing queries in XML format for plan forcing if things change.
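As an example of the DDL TRIGGER guideline above, a DDL trigger runs synchronously and can veto the triggering statement by rolling it back (the trigger name is illustrative):

```sql
-- DDL triggers run synchronously, so they can cancel the operation;
-- event notifications, by contrast, run after the fact
CREATE TRIGGER NoTableDrops
ON DATABASE
FOR DROP_TABLE
AS
BEGIN
  RAISERROR('Dropping tables is not allowed in this database.', 16, 1);
  ROLLBACK;  -- cancels the DROP TABLE itself
END;
```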

T-SQL

• Prefer TRY/CATCH over @@ERROR.

• Prefer CTEs when using a resultset more than once in a single SQL statement.

• Rewrite custom hierarchical code to recursive CTEs.

• Prefer EXCEPT and INTERSECT to equivalent T-SQL code if database transparency is an issue.

• Use TOP with UPDATE/DELETE instead of SET ROWCOUNT.

• Prefer OUTPUT when DML produces more than one output variable that you want to return to the client.

• Use the ROW_NUMBER function for paging a large rowset.
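A few of these T-SQL guidelines in one sketch; table and column names (Orders, OrderArchive) are hypothetical:

```sql
-- TRY/CATCH instead of checking @@ERROR after each statement
BEGIN TRY
  -- ROW_NUMBER for paging a large rowset: rows 21-40,
  -- with the numbered rowset named once in a CTE
  ;WITH NumberedOrders AS (
    SELECT OrderId, OrderDate,
           ROW_NUMBER() OVER (ORDER BY OrderDate DESC) AS RowNum
    FROM   Orders
  )
  SELECT OrderId, OrderDate
  FROM   NumberedOrders
  WHERE  RowNum BETWEEN 21 AND 40;

  -- TOP with DELETE instead of SET ROWCOUNT
  DELETE TOP (1000) FROM OrderArchive
  WHERE OrderDate < '20000101';
END TRY
BEGIN CATCH
  SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
END CATCH;
```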

XML

• Use the correct data type choice.

• Relational—most often, for traditional applications.

• XML

– if order matters.

– to save replicas of service requests.

– with ragged hierarchies.

– with semistructured data.

– with markup data.

• Prefer storing many small XML documents to one big document for best performance.

• Use MAX data types instead of the XML data type:

– to maintain full-fidelity XML schemas.

– to maintain lexical integrity of XML if, for example, you are storing data to use in an XML editor application.

• In hybrid XML-relational designs:

– hoist often-used properties to computed columns.

– use decomposition/composition between XML and relational.

• Use XML schemas only if you need validation; there is extra overhead.

• Use consistent workarounds for XML schema limitations:

– change processContents from lax to skip in xsd:any wildcards.

• Use the BULK rowset provider (OPENROWSET) to get XML from a file.

– SINGLE_BLOB is preferred over SINGLE_CLOB/NCLOB unless encoding matters.

XQuery

• Use indexing unless you don’t often query your XML.

– The primary XML index is almost a requirement if doing lots of querying.

– Use secondary XML indexes if needed for your use cases, à la relational index strategies.

– DROP/REBUILD indexes for mass inserts.

• Remember repercussions of strong type checking.

• Schemas ease singleton issues and type checking but don’t provide much performance gain for queries.

• Use sql:variable to make up for lack of XQuery variables.

• Use T-SQL loops to make up for one-node limit in XML modify method.

• Remember that XML modify is a mutator; using it with NULLs causes an error.
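For instance, sql:variable passes a T-SQL variable into an XQuery expression, and a T-SQL WHILE loop works around modify’s one-node-at-a-time limit. Names and data here are illustrative:

```sql
-- sql:variable in place of an XQuery variable
DECLARE @minTotal decimal(10,2);
SET @minTotal = 100.00;

SELECT InvoiceDoc.query(
  '/Invoice[Total >= sql:variable("@minTotal")]')
FROM Invoices;

-- modify() changes one node per call; loop to update multiple nodes
DECLARE @doc xml;
SET @doc = N'<Items><Item flag="0"/><Item flag="0"/></Items>';

WHILE @doc.exist('/Items/Item[@flag="0"]') = 1
  SET @doc.modify(
    'replace value of (/Items/Item[@flag="0"]/@flag)[1] with "1"');
```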

Service Broker

• Service Broker is

– a replacement for transactional MSMQ applications, not all MSMQ applications.

– a scale-out facilitator.

– a more robust Web Services implementation.

– a replacement for tables used as queues.

• Service Broker is not

– platform independent.

– WS-* compliant.

– a built-in workflow engine; Service Broker CONTRACTs are not workflow.

• Use Service Broker on a SQL Express instance on clients for robust broker client input.

• Turn off XML schema validation of messages if not required.

• Use specific routes rather than rely on wildcard routes.

• Use activation procedures even on initiators; they can receive EndDialog and Error messages.

• Consider XML messages to avoid CAST on RECEIVE.

• Consider XML messages even for text to avoid encoding issues.
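The shape of a minimal Service Broker setup, following these guidelines: all message type, contract, queue, and service names are hypothetical, and validation is turned off where not required:

```sql
-- XML validation only where it earns its overhead
CREATE MESSAGE TYPE [//Sample/Request] VALIDATION = WELL_FORMED_XML;
CREATE MESSAGE TYPE [//Sample/Reply]   VALIDATION = NONE;

-- A CONTRACT constrains who may send which message types;
-- it is not a workflow definition
CREATE CONTRACT [//Sample/Contract] (
  [//Sample/Request] SENT BY INITIATOR,
  [//Sample/Reply]   SENT BY TARGET
);

CREATE QUEUE RequestQueue;
CREATE QUEUE ReplyQueue;

CREATE SERVICE TargetService    ON QUEUE RequestQueue ([//Sample/Contract]);
CREATE SERVICE InitiatorService ON QUEUE ReplyQueue   ([//Sample/Contract]);
```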

Web Services

• Use them in a SQL Server instance with data to communicate with legacy systems.

• Use them in a SQL Server instance without data to pipeline input from outside the firewall to a Service Broker application.

• Use a custom WSDL generator procedure for strong typing of Web Service output.

Clients

• SQL Native Client

– Use the new providers in SQL Native Client after testing existing applications.

– The new providers are needed only for new SQL Server 2005 features.

– SQL Native Client must be deployed to every client where it is used.

– SQL Native Client is not supported on Windows 9x clients.

– Not all netlibs are supported.

– Use version 8.0 data types for backward compatibility.

• ADO.NET 2.0

– is backward compatible with older versions of ADO.NET.

– is needed only for new SQL Server 2005 features.

– requires .NET Framework 2.0 on every client or middle tier.

• UDTs and clients

– UDTs should be used as .NET Framework objects if using ADO.NET, but beware that some Visual Studio 2005 components, such as the typed DataSet generator, don’t support them.

– UDTs should be used as string or XML format if using OLE DB, ODBC, or JDBC.

• XML

– Use as string or as SqlXml class if using ADO.NET.

– Use as string or stream (OLE DB only) if not using ADO.NET.

– In XML processing, client versus server:

– Validation always happens on the server.

– Validating on the client may save database round trips.

– Use XQuery on the server to cut down network traffic.

– XML should be in the database for management robustness.

• Large data on file system versus database:

– Putting large data on the database may allow management robustness.

– Data APIs allow streamed output but not streamed input of large data.

New SQL Server client features

• Don’t use MARS for cursor behavior.

• Remember that serialized multiple-rowset access is not guaranteed with MARS.

• Query Notifications

– Don’t use for dynamic data.

– Don’t use for many subscribers.

– Use parameterized queries.

– Realize that you can get notifications for reasons other than “rowset changed.”

– Wrap in cache sync for ASP.NET.

– Use cache initialization on multiple SQL Express instances for Web Farms.

• System.Transactions

– Remember that the default isolation level is serializable.

– Promotion happens with a second instance of “same” connection.

– Transactions using System.Transactions are not composed with an “outer transaction” with in-process ADO.NET.

ADO.NET 2.0

• Use connection strings in configuration files.

• Use the “using” language construct for Connections and Commands.

• Remember that databases aren’t generic, even though data access is in ADO.NET 2.0.

• Generic access is needed only if supporting more than one database or changing databases is anticipated.

• In asynchronous commands:

– Only one statement executes at a time per connection.

– Gratuitous multithreading slows things.

– The asynchronous command should be used to implement a data access progress bar.

– Asynchronous commands are useful for accessing multiple data sources in parallel.

– Asynchronous commands are not useful for Windows Forms; use the BackgroundWorker class instead.

– Don’t specify the asynchronous keyword in the connection string if you don’t use asynchronous commands.

• Prefer BULK INSERT on host over BulkCopy (or BCP).

• Prefer the SqlBulkCopy class for nonfile bulk insert.
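The BULK INSERT guideline in T-SQL form; the file path, table name, and delimiters are illustrative:

```sql
-- Server-side bulk load from a file on the host;
-- for nonfile sources, SqlBulkCopy on the client is the analogue
BULK INSERT dbo.OrderStaging
FROM 'C:\data\orders.dat'
WITH (
  FIELDTERMINATOR = ',',
  ROWTERMINATOR   = '\n',
  TABLOCK          -- coarse lock in exchange for a faster load
);
```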

SMO

• Get only the properties you need; the libraries support this through delayed property retrieval.

• Prefer SMO to ADO.NET schemas for SQL Server (only).

• Use singletons (or XPath) rather than collections for metadata access.

SQLNS as an application development framework

• Use if you have a large number of events and many consumers.

• Use SQLAgent or SQLMail if you don’t have large numbers of events and many consumers.

• Prefer SQLNS over writing your own notification application.

• Use SQLNS in an existing database if you have a likely candidate database.

• Prototype your applications using SQL statements and the file provider before adding providers.

• Use multicast if all notifications are the same.

• Use Reporting Services for rich notification formatting.

• Don’t use user filters if you have a discrete set of related event types.

Toward a Service-Oriented Database Architecture

Since the time that we went from monolithic mainframe computers to distributed systems, some of the main challenges of software programmers have revolved around locality of reference and speed of transport. That is, if all the software is not in one place, and we don’t have a direct wire to the user, it matters not only how fast things go inside the computer, but also how long it takes to get the data from here to there and how many round trips we make. If we have the world’s fastest supercomputer that can do calculations at near light speed, but if it takes five calls to access it and one second per call, why use it if our personal computer can calculate the same result in two seconds?

Many of the choices we’ve shown are locality-of-reference choices. Whether it’s useful to transform columns and rows to objects on the server or middle tier depends on how many round trips are made and how long they take—or how powerful the server hardware is on the machine that’s running the database. Whether they should be transformed to objects at all is based on the premise that object-oriented design saves money in development and maintainability.

Another big consideration is interoperability. It’s more a consideration for applications used for sales and marketing that must have the widest possible reach. It certainly makes it easier to add new business partners if your company is not large enough to dictate the terms of intercompany communication. XML is the technology used today to ease interoperability. Because of its structured nature and strong typing, it’s a great improvement over the comma-separated text files of the past. The same features make it a choice for marshaling data to unlike platforms.

As an overly simplistic generalization, for a line-of-business application, we’d start with the following:

• Data stored in SQL Server in relational format

• XML used for database replicas, as a transport format, and for interoperability

• Object-oriented programming concepts used on the client

It’s actually better and more scalable to divide the data into categories and determine the best configuration based on data usage, concurrency requirements, caching/replication, and the requirement to share the data with business partners (the outside world). But that’s beyond our current focus. This applies to line-of-business online transaction processing (OLTP) applications; applications that store, index, and query documents would be designed differently.

Relational databases have been around for more than 20 years. They are a known quantity. Most programmers whom we’ve run into know SQL; the rest at least know what it is. There are domains in which user-defined types are very useful, such as time sequence, spatial data, and multimedia data. In addition, we may achieve synergies in terms of code reuse and programmer productivity by storing objects—the same objects we use on the middle and client tiers—in the database. In the current implementation of UDTs, however, unless the UDT is binary ordered, any SELECT type of operation must fetch the object’s binary representation from disk, materialize it, and in turn operate on it. That’s the equivalent of a query without an optimizable search argument, a non-SARGable query.

XML data storage and query are in their infancy. It is interesting that except for full-text applications, the preferred/fastest way to store XML documents is to decompose the data. It sounds rather relational—data decomposition, that is. XQuery has a lot of promise, just as SQL had a lot of promise (but was slow) in 1980. It will be interesting to see what happens when vendors have a few years to optimize XQuery processors.

Inside the database, using a managed environment for procedures that extend the capabilities of T-SQL in domain-specific ways is a big improvement over having every programmer use a low-level, type-unsafe language. The idea of .NET Framework stored procedures replacing extended stored procedures—for those who can’t wait for Microsoft to produce everything they want as extensions to T-SQL or as system-defined functions—is an appealing one. Likewise, until T-SQL stored procedures can be compiled, processor-intensive calculations that should be done in the server should use compiled .NET Framework code. Data-intensive routines probably should still be done in T-SQL until access through the .NET Framework data provider in the database catches up, but the fact that there is an in-database provider makes it a more difficult decision.

Object-oriented programming concepts are here to stay. Whether you like a thin veneer of object orientation over a traditional data API, à la ADO.NET, or want to treat your data as objects, ignoring the underlying persistence format, you’ll be programming with objects in your middle tier and client tier. If you’re a fan of relating objects to databases, a new language integrated query model (LINQ) is in development at Microsoft at this writing. It features an extensible common metaquery language that can be implemented over relational databases (DLINQ) or XML data (XLINQ). Direct data binding, both to the DataSet and to the XPathDocument, makes using ADO.NET or System.Xml without having to change your data into domain-specific classes more viable. After all, the two most ubiquitous user interface paradigms are the grid (columns and rows) and the tree (data-centric XML). If you have a lot of domain-specific logic, it makes sense to use a domain-specific object model. Object-oriented programming used on the middle tier (many times, just a Web server), combined with its optimizations and possibly used in conjunction with asynchronous client notifications where real-time cache coherency is required, is a viable middle-tier model. XML-relational mapping, combined with an XML-data-centric, object-oriented, .NET Framework programming language, is a possibility looming on the horizon. Such an XML-data-centric language would make storing XML in its native format more appealing as well.

Interoperability among computer hardware and software vendors has been a problem since the second computer vendor appeared. Each vendor used slightly different hardware, and software was built to take advantage of the byte-order and floating-point-number format for speed. In addition, these differences sifted up into network protocols. To transmit fewer bytes in vendor-specific format using proprietary protocols is a laudable goal. As computers and networks have gotten faster, the search for the universal format and protocol (at the expense of speed and packet size) has quickened. Distributed systems, including current distributed object systems and remote procedure protocols, have always used binary. DCOM/RPC and IIOP are both binary based. Common XML-based formats are supplanting legacy comma-separated-text data exchange formats and TCP/IP network protocols. For maximum interoperability, you can marshal data between unlike platforms in XML format. Even when XML is used as a format and protocol, however, there are still cases of “multiple standards” that are chances for mismatch. Two of the current examples are the competing ebXML and SOAP protocols and document-literal versus RPC-encoded Web Service message format.

For the best chance at interoperability today, you can expose your data through Web Services. Whether the XML packets are served out of the database directly (making the database a combined storage system and application server) or from a separate application server (such as IIS or COM+) depends on your network configuration. If you know that all your users will have the SQL Server client libraries installed (they have been installed on every Windows-family OS machine for a while), TDS is still your best choice. All the APIs, such as ADO.NET, OLE DB, and ODBC, use TDS.

The Database As Part of the Platform

With SQL Server 2005, some new features integrate the database into the application development platform. SQL Server Notification Services, in addition to being an application framework, enables easier integration between the “home office” and mobile devices. SQL Server Service Broker brings asynchronous messaging—including reliable in-order delivery of complex domain-specific multimessage exchanges, client notifications, and messaging in XML format—into the database server. In addition, direct storage of XML and user-defined types means that processing can be moved closer to the server.

HTTP endpoints make SQL Server a SOAP server for database data and HTTP-based Web Services as well. Reporting Services allow you to store and generate reports directly inside the database. Support of user-defined types and the possible support of a stream-based/file-based data type in the future enable SQL Server to become a multimedia server as well, and stream support means that SQL Server is becoming more integrated with the Windows file system. Look for more integration of file system, database storage, and query technologies in the future.

The Windows platform consists of operating systems for server, user, and mobile devices, with integrated graphical user interfaces; tools for consumers, office workers, and programmers; and an underlying development environment to permit building applications. Software vendors seem to be moving toward high-level, safe, intermediate language and execution engine–based APIs to allow their software architectures to adapt quickly to different hardware environments, like the plethora of mobile devices. In addition, there is a trend to ubiquitous information, available anywhere, either through a central repository or sent to the appropriate user at her desk or mobile device, but secure, administered, backed up, and internally consistent at the same time. SQL Server 2005 is a central adapter and facilitator as the platform evolves.
