SELECT Statement

In this section:

FROM Clause

Outer Joins

Correlation Names

WHERE Clause

GROUP BY Clause

HAVING Clause

AS Clause

Aggregate Functions

Subqueries

Correlated Subqueries

Generating a Result Set

ORDER BY

Example:

Selecting and Manipulating Values From the Employee Table

Among the most important features of SQL is its ability to query relational data sources and to provide an answer set based on defined criteria. A query can be a:

The SELECT statement is the most powerful and complex SQL statement. The SELECT statement can be used along with INSERT, UPDATE, and DELETE to manipulate data and query the tables in a database. SELECT can be used with a number of clauses and qualifiers, such as FROM, WHERE, DISTINCT, or ALL to define how information will be presented and what information will be retrieved.

DISTINCT
Instructs the system to remove duplicates.

ALL
Instructs the system to ignore duplicates when and if they occur.

query-spec       ::=  SELECT [<sel-qualifier>] <selection> <table-exp>
sel-qualifier ::= ALL | DISTINCT
selection ::= <select-list> | *
select-list ::= <scalar-exp> [, <scalar-exp>]*
table-exp ::= <from-clause> [<where-clause>] [<group-by-clause>]
[<having-clause>]
from-clause ::= FROM <from-ref> [, <from-ref>]*
where-clause ::= WHERE <search-condition>
group-by-clause ::= GROUP BY <column-name> [, <column-name>]*
having-clause ::= HAVING <search-condition>
subquery ::= (SELECT [ALL | DISTINCT] <result-spec> <table-expression>)
result-spec ::= <scalar-exp> | *

Top of page

Example: Selecting and Manipulating Values From the Employee Table

SELECT DEP_NBR, MAX(SALARY)
FROM EMPLOYEE
GROUP BY DEP_NBR
HAVING MAX(SALARY) > 100000

Top of page

FROM Clause

A FROM clause establishes-conceptually-a candidate set, S, from which output tuples are eventually drawn. The FROM clause is defined as:

from-ref         ::= <table-definition> [<range-variable>]
table-definition ::= { <table-ref> | <joined-table> | ( <joined-table>) }
joined-table ::= <table-definition><join-spec><table-definition>
<on-condition>
join-spec ::= { INNER JOIN | LEFT [OUTER] JOIN | RIGHT [OUTER] JOIN }
on-condition ::= ON <search-condition>

If the FROM clause consists of only table T, then the candidate set S is T. If the clause is

FROM T1, T2, ..., Tn

then S is defined to be the Cartesian product T1 x T2 x...x Tn of the tables. The Cartesian product of T1...Tn is the set of all possible tuples formed by concatenating, or pasting together, a row from T1 with a row from T2 with a row from T3, ..., and a row from Tn.


Top of page

Outer Joins

Example:

Defining a Candidate Set as the Join of Component Tables

Describing Join Table Results as a Component of Another Join

A joined table specifies an intermediate result table that is the result of either an inner join or an outer join. The table is derived by applying one of the join operators: INNER, LEFT OUTER, RIGHT OUTER, or FULL OUTER to its operands.

Inner joins can be thought of as the cross product of the tables (combine each row of the left table with every row of the right table), keeping only the rows where the join-condition is true. The result table may be missing rows from either or both of the joined tables. Outer joins include the inner join and preserve these missing rows. There are three types of outer joins:


Top of page

Example: Defining a Candidate Set as the Join of Component Tables

The FROM clause may also define the candidate set as the "join" of component tables, as shown in the following example:

FROM CUSTOMER
FROM CUSTOMER, ORDER, INVENTORY
FROM T1 LEFT OUTER JOIN T2 ON T1.X=T2.Y

For this example, the candidate set S is formed as follows. First, the Cartesian product of the two tables (T1 and T2) is formed. Next, the on condition is used to remove those tuples or rows that do not satisfy the condition. If an INNER JOIN was requested, we would have S. Since, in actuality, LEFT OUTER JOIN was requested, we check if all rows of table T1 (the left table) display. For any rows that do not appear, we add a tuple consisting of a row from T1 with a row consisting of null values from T2. If this a RIGHT OUTER JOIN, we would add tuples to ensure that every row of T2 was represented.

If tables T1 and T2 contained the following data:

T1:     column X            T2:       column Y 
1 3
2 4
3 5

Top of page

Example: Describing Join Table Results as a Component of Another Join

This example shows that the result of a joined table may in turn be a component of another join.

FROM (T1 INNER JOIN T2 ON T1.X=T2.Y)
LEFT OUTER JOIN T3 ON T1.X=T3.Z

Top of page

Correlation Names

An identifier that is a correlation name is associated with a table within a particular scope. The scope of a correlation name is either a select statement: single row, subquery, query specification, or is a trigger definition. Scopes may be nested. In different scopes, the same correlation name may be associated with different tables or with the same table.


Top of page

Example: Defining Correlation Names

FROM clauses are also used to define correlation names. A correlation name, R, can assume the value of a row taken from a single specified table, T (R is said to range over T). Consider the following example:

FROM CUSTOMER C, PERSONNEL P

In this example, C ranges over CUSTOMER and P over PERSONNEL. Whenever a correlation name is introduced in the FROM clause of an SQL subselect, it must be used throughout the request. Thus, in the above example, C should be used in lieu of CUSTOMER within the confines of the select statement containing "FROM CUSTOMER C." Correlation names can be used as abbreviations for long table names. But correlation names offer more than notational convenience.


Top of page

Example: Defining Correlation Names by Joining the Employee Table to Itself

SELECT A.EMP_NBR, A.NAME
FROM EMPLOYEE A, EMPLOYEE B
WHERE A.SALARY > B.SALARY AND A.REPORTS_TO = B.EMP_NBR

In English, this query might read, "Get the names of all employees earning more than their managers." Here, producing the correct response entails joining the employee table to itself. The correlation names, A and B, are required to distinguish one conceptual copy of the employee data source from another.

There are a number of rules governing the use of correlation names in SQL statements:

Note: Catalog and Schema identifiers are permitted, but are ignored by the server Universal Translator. That is, the system treats the following requests as if they were exactly the same because they each contain the same Table name (PERSON).

SELECT FIRST_NAME, LAST_NAME
FROM LONDON.SMITH.PERSON
WHERE SEX = 'F'

SELECT FIRST_NAME, LAST_NAME
FROM BLOGGS.PERSON
WHERE SEX = 'F'

SELECT FIRST_NAME, LAST_NAME
FROM PERSON
WHERE SEX = 'F'

Top of page

WHERE Clause

How to:

Specify a Search Condition

Example:

Specifying a Search Condition

The WHERE clause is part of the SELECT statement that specifies a search criteria. A WHERE clause is a search condition preceded by the keyword WHERE.

where-clause ::= WHERE <search-condition>

The WHERE clause can be qualified when combined with:

WHERE clauses are governed by two rules:


Top of page

Syntax: How to Specify a Search Condition

where-clause  ::=  WHERE <search-condition>

Top of page

Example: Specifying a Search Condition

SELECT DEP_NBR, MAX(SALARY)
FROM EMPLOYEE
GROUP BY DEP_NBR
WHERE MAX(SALARY) = 70000

Top of page

GROUP BY Clause

How to:

Group Table Elements

Example:

Producing a Single Row for Each Department Number in the Employee Table

The GROUP BY clause, which follows the WHERE clause in an SQL query, is used to group the elements of a table into sets. This clause is also used to produce a single row of results for each group of rows that have the same values.

Each column reference in the GROUP BY list must belong to a subset and must refer unambiguously to a column specified in the SELECT list. Columns within views whose values are derived by means of expressions, functions, or constants must never display in a GROUP BY list.

GROUP BY partitions a set into subsets, S1 .. Sn. Each subset corresponds to a fixed set of values assumed by the grouping columns. (Null values are considered to be equal for the purpose of forming groups.)

When GROUP BY is present, the elements in the select list must be single-valued by group. Select list elements can be aggregate functions, they can be columns referenced in the group by list (which by definition are single-valued by group), or they can be expressions. They must not assume multiple values for any group to which they belong.

The GROUP BY clause can be used with:


Top of page

Syntax: How to Group Table Elements

group-by-clause  ::=  GROUP BY <column-name> [, <column-name>]*

where:

GROUP-BY
Groups the elements of a subset together to form a new intermediate result.


Top of page

Example: Producing a Single Row for Each Department Number in the Employee Table

SELECT DEP_NBR, MAX(SALARY)
FROM EMPLOYEE
GROUP BY DEP_NBR

Top of page

HAVING Clause

Example:

Specifying a Qualifying Condition

The HAVING clause is similar to the WHERE clause. WHERE eliminates rows, and the HAVING clause eliminates groups. The HAVING clause and WHERE clause have similar syntax. They both consist of a keyword followed by a search condition.

The HAVING clause applies qualifying conditions to groups after they have been formed. The HAVING clause complements the GROUP BY clause. The HAVING clause allows you to include column functions such as, COUNT, AVG, MAX, MIN, or SUM in the search condition. Each HAVING compares one column or column function expression of the group with another column function expression of the group or with a constant.

HAVING can be used with the logical operators ADD, OR, or NOT to combine conditions.

Expressions in the HAVING clause must be single-valued per group. Each column name in the search condition must reference a grouping column or be a correlated reference.

having-clause ::= HAVING <search-condition>

Top of page

Example: Specifying a Qualifying Condition

SELECT DEP_NBR, MAX(SALARY)
FROM EMPLOYEE
GROUP BY DEP_NBR
HAVING MAX(SALARY) > 100000

Top of page

AS Clause

The AS clause makes it possible to rename existing column titles in your reports.

as-clause ::= AS <column-name>

Top of page

Syntax: How to Change Default Titles

SELECT field1 AS title1, field2 AS title2 

where:

field
Can be a sort field, display field, column total, or row total.

title
Is the new column title lines.


Top of page

Aggregate Functions

Example:

Producing a Count From the Condition "CITY=Denver"

Reference:

Function Names

The SQL Translator provides five aggregate functions: COUNT, SUM, AVG, MAX, and MIN. These aggregate functions can be used with a:

Each is represented by a name followed by a parenthetical expression containing a single argument. Arguments designate result set columns and can be preceded by the keyword DISTINCT to remove duplicates.

Aggregate functions must be coded as follows:

function-ref    ::= COUNT(*) | <distinct-fn-ref> | <all-fn-ref>
distinct-fn-ref ::= [<fn-name> | COUNT] (DISTINCT <column-ref>)
all-fn-ref ::= <fn-name> ( [ALL] <scalar-exp> )
fn-name ::= AVG | MAX | MIN | SUM

Note that COUNT(*) is treated as a special case. COUNT(*) counts the number of rows in a table without eliminating duplicates.

The keyword DISTINCT can be used with any aggregate except COUNT(*).


Top of page

Example: Producing a Count From the Condition "CITY=Denver"

SELECT COUNT(*) FROM CLIENT WHERE CITY = 'Denver'

This would generate a single output element (a table consisting of one row and one column). The result would be produced by counting every row that satisfies the CITY = 'Denver' predicate. It is significant that each row participates in the count. Among the SQL functions, COUNT(*) is unique in that it does not reject an operand because it is null, or because it contains a null value.


Top of page

Reference: Function Names

The following functions operate on the set of non-null scalar values found within one column of a result table. All of these produce single-valued output. A list of function names and their meanings follow:

Function

Description

COUNT([DISTINCT])

The number of (distinct) non-null elements in the column.

SUM([DISTINCT])

The sum of the (distinct) elements in the column. This function can only be used with numeric columns.

AVG([DISTINCT])

The average of the (distinct) elements in the column. This function can only be used with numeric columns.

COUNT

The number of non-null elements in the column.

SUM

The sum of the non-null elements in the column. This function can only be used with numeric columns.

AVG

The average of the non-null elements in the column. This function can only be used with numeric columns.

MAX

The maximum value in the column.

MIN

The minimum value in the column.


Top of page

Subqueries

Subqueries, which access single-column tables from the database, are used within comparison predicates, in predicates, all-or-any predicates, and existence tests. A subquery is a SELECT statement that nests:

subquery    ::= (SELECT [ALL | DISTINCT] <result-spec> <table-expression>)
result-spec ::= <scalar-exp> | *

Top of page

Example: Using Subqueries

SELECT ITEM FROM INVENTORY WHERE UNIT_COST <
(SELECT AVG(UNIT_COST) FROM INVENTORY)

SELECT ITEM FROM INVENTORY WHERE UNIT_COST < ALL
(SELECT UNIT_COST FROM INVENTORY
WHERE ITEM_CODE LIKE 'X%')

SELECT ITEM, DESCRIPTION FROM ORDERS
WHERE ITEM_CODE NOT IN
(SELECT ITEM_CODE FROM INVENTORY)

SELECT A.ITEM, A.DESCRIPTION FROM ORDERS A
WHERE EXISTS
(SELECT * FROM INVENTORY B
WHERE A.ITEM_CODE = B.ITEM_CODE
AND B.QTY_ON_HAND > 1000)

Subselects used within comparison predicates are required to return a single row of output. Subselects used within quantified predicates must return a single column of values, S. If S is empty or if the given comparison is true for all values of S, then x <relop> ALL S is true; it is false otherwise. If x <relop> S is true for at least one value of S, x <relop> ANY S is true.


Top of page

Correlated Subqueries

A correlated subquery is a subquery expressed in terms of a value produced by an outer query. Consider, for example, the fourth SELECT statement in the preceding Using Subqueries example. The subselect in the EXISTS test contains the following comparison:

A.ITEM_CODE = B.ITEM_CODE

A.ITEM comes from the outer select; B.ITEM comes from the inner select. To develop an answer set for this query, the SQL Translator must-conceptually-evaluate the inner query once for every value of A.ITEM produced by the outer select.


Top of page

Generating a Result Set

A join is a query that:


Top of page

Example: Establishing a Join Condition of FACULTY and STUDENT

SELECT F.NAME, S.NAME
FROM FACULTY F, STUDENTS S
WHERE S.ADVISOR = F.EMP_NBR AND F.DEPT = 'Mathematics'
ORDER BY F.NAME

Conceptually, the SQL Translator evaluates this query in a series of steps:

A production quality DBMS would never proceed in this manner because the overhead caused by the size of S1 would preclude that. Still, it is instructive to think of the query execution as taking place in this fashion.

The predicate, S.ADVISOR = F.EMP_NBR, is known as a join condition. Although, in practice, most join operations are based on equal comparisons (and are therefore called equijoins), the SQL Translator supports joins based on other valid SQL relational operators.


Top of page

ORDER BY

The ORDER BY clause is used to sort the answer set by the values contained in one or more columns.

order-by-clause  ::= ORDER BY <ord-spec> [, <ord-spec>]*
ord-spec ::= {<integer> | <column-ref>} [{ASC | DSC}]

Top of page

Example: Sorting a Result Set From Employee Table

SELECT DEP_NBR, SALARY
FROM EMPLOYEE
ORDER BY DEP_NBR, SALARY

iWay Software