Learning T-SQL (Transact-SQL) - Unit 1

Transact-SQL

Microsoft database systems such as SQL Server, Azure SQL Database, Azure Synapse Analytics, and others use a dialect of SQL called Transact-SQL, or T-SQL. T-SQL includes language extensions for writing stored procedures and functions, which are application code that is stored in the database, and managing user accounts.

SQL is a declarative language :

Programming languages can be categorized as procedural or declarative. Procedural languages enable you to define a sequence of instructions that the computer follows to perform a task. Declarative languages enable you to describe the output you want, and leave the details of the steps required to produce the output to the execution engine.

  • SQL supports some procedural syntax, but querying data with SQL usually follows declarative semantics. 
  • Use SQL to describe the results you want, and the database engine's query processor develops a query plan to retrieve it. The query processor uses statistics about the data in the database and indexes that are defined on the tables to come up with a good query plan.
A database is a container that holds tables and other SQL structures related to those tables. 
The information inside the database is organized into tables.
A table is the structure inside your database that contains data, organized in columns and rows.
A column is a piece of data stored by your table. A row is a single set of columns that describe attributes of a single thing. 
In other words each table is an entityThe attributes of these entities are called as the columns and each row(tuple) in the table represents an instance of the entity type.




In SQL Server database systems, tables are defined within schemas to create logical namespaces in the database.




  • Data Manipulation Language (DML) is the set of SQL statements that focuses on querying and modifying data. DML statements include SELECT, the primary focus of this training, and modification statements such as INSERT, UPDATE, and DELETE.
  • Data Definition Language (DDL) is the set of SQL statements that handles the definition and life cycle of database objects, such as tables, views, and procedures. DDL includes statements such as CREATE, ALTER, and DROP.
  • Data Control Language (DCL) is the set of SQL statements used to manage security permissions for users and objects. DCL includes statements such as GRANT, REVOKE, and DENY.


Consider the following query -

SELECT OrderDate, COUNT(OrderID) AS Orders
FROM Sales.SalesOrder
WHERE Status = 'Shipped'
GROUP BY OrderDate
HAVING COUNT(OrderID) > 1
ORDER BY OrderDate DESC;

Order of Execution :
  1. The FROM clause is evaluated first, to provide the source rows for the rest of the statement. A virtual table is created and passed to the next step.
  2. The WHERE clause is next to be evaluated, filtering those rows from the source table that match a predicate. The filtered virtual table is passed to the next step.
  3. GROUP BY is next, organizing the rows in the virtual table according to unique values found in the GROUP BY list. A new virtual table is created, containing the list of groups, and is passed to the next step. From this point in the flow of operations, only columns in the GROUP BY list or aggregate functions may be referenced by other elements.
  4. The HAVING clause is evaluated next, filtering out entire groups based on its predicate. The virtual table created in step 3 is filtered and passed to the next step.
  5. The SELECT clause finally executes, determining which columns will appear in the query results. Because the SELECT clause is evaluated after the other steps, any column aliases (in our example, Orders) created there cannot be used in the GROUP BY or HAVING clause.
  6. The ORDER BY clause is the last to execute, sorting the rows as determined by its column list.
Why Select * should not be used in production -
  • Changes to the table that add or rearrange columns will be reflected in the query results, which may result in unexpected output for applications or reports that use the query.
  • Returning data that is not needed can slow down your queries and cause performance issues if the source table contains a large number of rows.
Instead go for  Selecting specific columns -

An explicit column list allows you to have control over exactly which columns are returned and in which order. Each column in the result will have the name of the column as the header.

Example -

SELECT ProductID, Name, ListPrice, StandardCost
‎FROM Production.Product;

A SELECT clause can perform calculations and manipulations, which use operators to combine columns and values or multiple columns. The result of the calculation or manipulation must be a single-valued (scalar) result that will appear in the result as a separate column.

Example -

SELECT ProductID,
      Name + '(' + ProductNumber + ')',
  ListPrice - StandardCost
FROM Production.Product;

Specifying column aliases for each column returned by the SELECT query, either as an alternative to the source column name or to assign a name to the output of an expression.

Example -

SELECT ProductID AS ID,
      Name + '(' + ProductNumber + ')' AS ProductName,
  ListPrice - StandardCost AS Markup
FROM Production.Product;

Formatting queries

  • Capitalize T-SQL keywords, like SELECT, FROM, AS, and so on. Capitalizing keywords is a commonly used convention that makes it easier to find each clause of a complex statement.
  • Start a new line for each major clause of a statement.
  • If the SELECT list contains more than a few columns, expressions, or aliases, consider listing each column on its own line.
  • Indent lines containing subclauses or columns to make it clear which code belongs to each major clause.
The following table shows common data types supported in a SQL Server database.


Data Type conversion
  • Compatible data type values can be implicitly converted as required
  • Some cases require to explicitly convert values from one data type to another - for example, trying to use + to concatenate a varchar value and a decimal value will result in an error, thus it is required to first convert the numeric value to a compatible string data type.
  • T-SQL includes functions to help you explicitly convert between data types
  • CAST and TRY_CAST
  • The CAST function converts a value to a specified data type if the value is compatible with the target data type. An error will be returned if incompatible.
        Example -
            SELECT CAST(ProductID AS varchar(4)) + ': ' + Name AS ProductName
            FROM Production.Product;

            Here query uses CAST to convert the integer values in the ProductID column to varchar values (with a                    maximum of 4 characters) in order to concatenate them with another character-based value


          Similar to this is the CONVERT function in T-SQL
     SELECT CONVERT(varchar(4), ProductID) + ': ' + Name AS ProductName
     FROM Production.Product;

  • TRY_CAST takes the value passed to it and tries to convert it to the specified data_type. If the cast succeeds, TRY_CAST returns the value as the specified data_type; if an error occurs, null is returned. However if you request a conversion that is explicitly not permitted, then TRY_CAST fails with an error.
            Example -
              SELECT TRY_CAST(Size AS integer) As NumericSize
                FROM Production.Product; 

             Similar to this is the TRY_CONVERT function in T-SQL
          
            SELECT SellStartDate,
               CONVERT(varchar(20), SellStartDate) AS StartDate,
               CONVERT(varchar(10), SellStartDate, 101) AS FormattedStartDate 
               FROM SalesLT.Product;
  • The PARSE function is designed to convert formatted strings that represent numeric or date/time values. For example, consider the following query (which uses literal values rather than values from columns in a table) 
  • Example -
      SELECT PARSE('01/01/2021' AS date) AS DateValue,
    PARSE('$199.99' AS money) AS MoneyValue;

  DateValue - 2021-01-01T00:00:00.0000000 , MoneyValue - 199.99

  • The STR function converts a numeric value to a varchar.
            SELECT ProductID,  '$' + STR(ListPrice) AS Price
            FROM Production.Product;


NULLS - A NULL value means no value or unknown. It does not mean zero or blank, or even an empty string

ISNULL  - The ISNULL function takes two arguments. The first is an expression we are testing. If the value of that first argument is NULL, the function returns the second argument. If the first expression is not null, it is returned unchanged.

Example -

SELECT FirstName,
ISNULL(MiddleName, 'None') AS MiddleName, LastName
FROM Sales.Customer;


COALESCE

The ISNULL function is not ANSI standard, so COALESCE function can be used. COALESCE is a little more flexible is that it can take a variable number of arguments, each of which is an expression. It will return the first expression in the list that is not NULL. 


SELECT EmployeeID,
COALESCE(HourlyRate * 40,
WeeklySalary,
Commission * SalesQty) AS WeeklyEarnings
FROM HR.Wages;


NULLIF

The NULLIF function allows you to return NULL under certain conditions. NULLIF takes two arguments and returns NULL if they're equivalent. If they aren't equal, NULLIF returns the first argument.

In this example, NULLIF replaces a discount of 0 with a NULL. It returns the discount value if it is not 0:


SELECT SalesOrderID,ProductID, UnitPrice,
NULLIF(UnitPriceDiscount, 0) AS Discount
FROM Sales.SalesOrderDetail;



ORDER BY clause 
SELECT<select_list>
FROM <table_source>
ORDER BY <order_by_list> [ASC|DESC];

ORDER BY can take several types of elements in its list:
Columns by name
Column aliases
Columns by ordinal position in the SELECT list
Columns not included in the SELECT list, but available from tables listed in the FROM clause - If the query uses a DISTINCT option, any columns in the ORDER BY list must be included in the SELECT list.

ASC for ascending (A-Z, 0-9) or DESC for descending (Z-A, 9-0). Ascending sorts are the default.


Limit the sorted results -  Using the TOP clause
Example -

SELECT TOP 10 Name, ListPrice
FROM Production.Product
ORDER BY ListPrice DESC;

In addition to specifying a fixed number of rows to be returned, the TOP keyword also accepts the WITH TIES option, which will retrieve any rows with values that might be found in the selected top N rows.

SELECT TOP 10 WITH TIES Name, ListPrice
FROM Production.Product
ORDER BY ListPrice DESC;

To return a percentage of the eligible rows, use the PERCENT option with TOP instead of a fixed number. This can be used with 'WITH TIES' option as mentioned above.

SELECT TOP 10 PERCENT Name, ListPrice
FROM SalesLT.Product
ORDER BY ListPrice DESC;

An extension to the ORDER BY clause called OFFSET-FETCH enables you to return only a range of the rows selected by your query.

SELECT ProductID, ProductName, ListPrice
FROM Production.Product
ORDER BY ListPrice DESC 
OFFSET 0 ROWS --Skip zero rows
FETCH NEXT 10 ROWS ONLY; --Get the next 10

SELECT ProductID, ProductName, ListPrice
FROM Production.Product
ORDER BY ListPrice DESC 
OFFSET 10 ROWS --Skip 10 rows
FETCH NEXT 10 ROWS ONLY; --Get the next 10


The DISTINCT keyword removes duplicates if any -

SELECT DISTINCT City, CountryRegion
FROM Production.Supplier
ORDER BY CountryRegion, City;

The structure of the WHERE clause

The WHERE clause is made up of one or more search conditions, each of which must evaluate to TRUE, FALSE, or 'unknown' for each row of the table. Rows will only be returned when the WHERE clause evaluates as TRUE. The individual conditions act as filters on the data, and are referred to as 'predicates'. Each predicate includes a condition that is being tested, usually using the basic comparison operators:

  • = (equals)
  • <> (not equals)
  • > (greater than)
  • >= (greater than or equal to)
  • < (less than)
  • <= (less than or equal to)

IS NULL / IS NOT NULL

Filter to allow or exclude the 'unknown' or NULL values using IS NULL or IS NOT NULL.


SELECT ProductCategoryID AS Category, ProductName
FROM Production.Product
WHERE ProductName IS NOT NULL;


The IN operator is a shortcut for multiple equality conditions for the same column connected with OR.

SELECT ProductCategoryID AS Category, ProductName
FROM Production.Product
WHERE ProductCategoryID IN (2, 3, 4);

BETWEEN is another shortcut that can be used when filtering for an upper and lower bound for the value instead of using two conditions with the AND operator. The BETWEEN operator uses inclusive boundary values.

SELECT ProductCategoryID AS Category, ProductName
FROM Production.Product
WHERE ListPrice BETWEEN 1.00 AND 10.00;

LIKEallows to use wildcard characters and regular expression patterns. 

SELECT Name, ListPrice
FROM SalesLT.Product
WHERE Name LIKE '%mountain%';

SELECT Name, ListPriceFROM SalesLT.ProductWHERE ProductNumber LIKE 'FR-_[0-9][0-9]_-[0-9][0-9]';


List price of products whose product number begins BK- followed by any character other than R, and ends with a - followed by any two numerals

SELECT ProductNumber, Name, ListPriceFROM SalesLT.ProductWHERE ProductNumber LIKE 'BK-[^R]%-[0-9][0-9]';

Comments