Queries using the exists function. Using the EXISTS Operator

SQL allows you to nest queries within each other. Typically, a subquery returns a single value, which is tested to see if the predicate is true.

Types of search conditions:
. Comparison with subquery result (=, >=)
. Checking if the results of a subquery belong (IN)
. Existence test (EXISTS)
. Multiple (quantitative) comparison (ANY, ALL)

Notes on nested queries:
. A subquery must select only one column (except for a subquery with an EXISTS predicate), and its result data type must match the data type of the value specified in the predicate.
. In some cases, you can use the DISTINCT keyword to ensure that you get a single value.
. You cannot include an ORDER BY and UNION clause in a subquery.
. A subquery can appear either to the left or to the right of the search term.
. Subqueries can use aggregation functions without a GROUP BY clause that automatically return a special value for any number of rows, a special IN predicate, and expressions based on columns.
. Whenever possible, JOIN table joins should be used instead of subqueries.

Examples for nested queries:

SELECT * FROM Orders WHERE SNum=(SELECT SNum FROM SalesPeople WHERE SName='Motika')
SELECT * FROM Orders WHERE SNum IN (SELECT SNum FROM SalesPeople WHERE City='London')
SELECT * FROM Orders WHERE SNum=(SELECT DISTINCT SNum FROM Orders WHERE CNum=2001)
SELECT * FROM Orders WHERE Amt>(SELECT AVG(Amt) FROM Orders WHERE Odate=10/04/1990)
SELECT * FROM Customer WHERE CNum=(SELECT SNum+1000 FROM SalesPeople WHERE SName='Serres')

2) Related subqueries

In SQL, you can create subqueries with a table reference from an outer query. In this case, the subquery is executed multiple times, once for each table row from the outer query. So it's important that the subquery uses an index. The subquery can refer to the same table as the outer one. If the outer query returns a relatively small number of rows, then the associated subquery will perform faster than the unrelated one. If the subquery returns a small number of rows, then the associated query will run slower than the unrelated one.

Examples for related subqueries:

SELECT * FROM SalesPeople Main WHERE 1(SELECT AVG(Amt) FROM Orders O2 WHERE O2.CNum=O1.CNum) //returns all orders greater than the average order value for this customer

3) EXISTS predicate

Syntax form: EXISTS ()

The predicate takes the subquery as an argument and evaluates to true if the subquery has an output, otherwise it evaluates to false. The subquery is executed once and may contain multiple columns, since their values are not checked, but the result of the presence of rows is simply fixed.

Notes on the EXISTS predicate:
. EXISTS is a predicate that returns TRUE or FALSE and can be used alone or with other boolean expressions.
. EXISTS cannot use aggregate functions in its subquery.
. In correlated (related, dependent - Correlated) subqueries, the EXISTS predicate is executed for each row of the outer table.
. You can combine the EXISTS predicate with table joins.

Examples for the EXISTS predicate:

SELECT * FROM Customer WHERE EXISTS(SELECT * FROM Customer WHERE City='San Jose') - returns all customers if any of them live in San Jose.
SELECT DISTINCT SNum FROM Customer First WHERE NOT EXISTS (SELECT * FROM Customer Send WHERE Send.SNum=First.SNum AND Send.CNumFirst.CNum) - Returns the numbers of salespeople who served only one customer.
SELECT DISTINCT F.SNum, SName, F.City FROM SalesPeople F, Customer S WHERE EXISTS (SELECT * FROM Customer T WHERE S.SNum=T.SNum AND S.CNumT.CNum AND F.SNum=S.SNum) - returns numbers, names and cities of residence of all sellers who served several buyers.
SELECT * FROM SalesPeople Frst WHERE EXISTS (SELECT * FROM Customer Send WHERE Frst.SNum=Send.SNum AND 1

4) Quantitative Comparison Predicates

Syntax form: (=|>|=|) ANY|ALL ()

These predicates use a subquery as an argument, however, compared to the EXISTS predicate, they are used in conjunction with relational predicates (=,>=). In this sense, they are similar to the IN predicate, but only apply to subqueries. The standard allows using the SOME keyword instead of ANY, but not all DBMSs support it.

Notes on comparison predicates:
. The ALL predicate evaluates to TRUE if each value selected during the execution of the subquery satisfies the condition specified in the outer query predicate. Most often it is used with inequalities.
. The ANY predicate evaluates to TRUE if at least one value selected during the execution of the subquery satisfies the condition specified in the outer query predicate. Most often it is used with inequalities.
. If the subquery does not return rows, then ALL automatically takes the value TRUE (it is considered that the comparison condition is met), and for ANY - FALSE.
. If the comparison is not TRUE for any row and there are one or more null rows, then ANY returns UNKNOWN.
. If the comparison is not FALSE for any row and there are one or more null rows, then ALL returns UNKNOWN.

Examples for the quantitative comparison predicate:

SELECT * FROM SalesPeople WHERE City=ANY(SELECT City FROM Customer)
SELECT * FROM Orders WHERE Amt ALL(SELECT Rating FROM Customer WHERE City='Rome')

5) Uniqueness predicate

UNIQUE|DISTINCT ()

The predicate is used to check the uniqueness (absence of duplicates) in the subquery output. Moreover, in the UNIQUT predicate, rows with NULL values are considered unique, and in the DISTINCT predicate, two null values are considered equal to each other.

6) Match Predicate

MATCH ()

The MATCH predicate tests whether the value of a query string will match the value of any string returned from the subquery. This subquery differs from the IN and ANY predicates in that it allows you to process "partial" (PARTIAL) matches that may occur among rows that have part of NULL values.

7) Requests in the FROM section

In fact, it is acceptable to use a subquery wherever a table reference is allowed.

SELECT CName, Tot_Amt FROM Customer, (SELECT CNum, SUM(Amt) AS Tot_Amt FROM Orders GROUP BY CNum) WHERE City='London' AND Customer.CNum=Orders.CNum
//subquery returns the total amount of orders placed by each buyer from London.

8) Recursive Queries

WITH RECURSIVE
Q1 AS SELECT … FROM … WHERE …
Q2 AS SELECT … FROM … WHERE …

In addition, you can apply EXISTS operator. This operator is often used with correlated subqueries to test whether the value retrieved by the outer query exists in the result set retrieved by the inner query. If the subquery returns at least one row, the operator returns TRUE . If the value does not exist, the operator returns FALSE . Accordingly, using the NOT EXISTS operator, it is checked that the value retrieved by the outer query does not exist in the result set retrieved by the inner query.

The EXISTS operator checks for the existence of rows in the result set of a subquery.

If the subquery string value is found:

the search does not continue in the inner query;

the condition is marked as TRUE .

If the subquery string value is not found:

the condition is marked as FALSE ;

the search continues in the inner query.

EXISTS operator

All logical operators work with nested SELECT statements. Alternatively, you can use the EXISTS operator. This operator is often used with correlated subqueries to test whether the value retrieved by the outer query exists in the result set retrieved by the inner query. If the subquery returns at least one row, the operator returns TRUE . If the value does not exist, the operator returns FALSE . Accordingly, using the NOT EXISTS operator, it is checked that the value retrieved by the outer query does not exist in the result set retrieved by the inner query.

Novosibirsk State Academy of Economics and Management

LABORATORY WORKSHOP ON THE DISCIPLINE

"DATABASE"

Laboratory work N 7

"SQL database language: data manipulation commands»

NOVOSIBIRSK 2000

SQL is the short name for Structured Query Language. From the name of the language it is clear that its main purpose is to generate requests for information from the database. Commands for data selection form the basis of the data manipulation language DML - an integral part of the SQL language. However, DML consists of more than just commands to fetch data from a database. There are also commands for data modification, data management, and others.

This lab covers the basics of the DML language. We will adhere to the SQL2 standard throughout the lab.

Due to the fact that SQL is a voluminous language, we will consider only the main commands. Various specific SQL features are covered in subsequent labs.

To perform laboratory work, you need to know the basics of the relational data model, the basics of relational algebra and relational calculus, and the principles of working with the MS SQL Server DBMS.

As a result of the laboratory work, you will learn how to manipulate data using SQL commands, consider the dialect of the language implemented in the MS SQL Server DBMS.

INTRODUCTION

SQL contains a wide range of data manipulation capabilities, both for creating queries and for updating the database. These capabilities rely only on the logical structure of the database and not on its physical structure, which is consistent with the requirements of the relational model.

Initially, the syntax structure of SQL was based (or at least appeared to be based) on Codd's relational calculus. Union was the only supported relational algebra operation.

In SQL2, in addition to the similar relational calculus syntax developed in the previous standard, the operations union, intersection, difference, and join are directly implemented. The selection, projection, and product operations were (and continue to be) supported almost directly, while the division and assignment operations are supported in a more cumbersome form.

We will first describe the SQL query language and then its data entry and modification operations. Data change operations will be described last, as their structure relies to some extent on the structure of the query language.

Simple Queries

For us simple request there will be a query that accesses only one database table. Simple queries will help us illustrate the basic structure of SQL.

Simple request. A query that accesses only one database table.

Inquiry: Who works as plasterers?

WHERE SKILL_TYPE = "Plasterer"

Result:

G. Rickover

This query illustrates the three most common phrases SQL: SELECT, FROM and WHERE. Although we put them on different lines in our example, they can all be on the same line. They can also be indented differently, and words within phrases can be separated by an arbitrary number of spaces. Consider the characteristics of each phrase.

Select. The SELECT clause lists the columns that should be included in the resulting table. These are always columns of some relational table. In our example, the resulting table consists of one column (NAME), but in general it can contain several columns; it can also contain computed values or constants. We will give examples of each of these options. If the resulting table should contain more than one column, then all required columns are listed after the SELECT command, separated by commas. For example, the phrase SELECT WORKER_ID, NAME will result in a table consisting of the WORKER_ID and NAME columns.

SELECT phrase. Specifies the columns of the resulting table.

From. The FROM clause specifies one or more tables to be accessed by the query. All columns listed in the SELECT and WHERE clauses must exist in one of the tables listed in the FROM clause. In SQL2, these tables can be directly defined in the schema as base tables or data views, or they can themselves be unnamed tables resulting from SQL queries. In the latter case, the query is explicitly given in the FROM command.

Phrase FROM. Specifies the existing tables that the query refers to.

Where. The WHERE clause contains a condition. on the basis of which the rows of the table (tables) are selected. In our example, the condition is that the SKILL_TYPE column must contain the constant "Plasterer" enclosed in apostrophes, as is always done with text constants in SQL. The WHERE clause is the most volatile SQL command; it can contain many different conditions. Much of our presentation will be devoted to illustrating the various constructs allowed in the WHERE clause.

WHERE clause. Specifies the condition based on which rows are selected from the specified tables.

The above SQL query is processed by the system in the following order: FROM, WHERE, SELECT. That is, the rows of the table specified in the FROM command are placed in the work area for processing. The WHERE clause is then applied to each row in sequence. All rows that do not satisfy the WHERE clause are excluded from consideration. Then those rows that satisfy the WHERE clause are processed by the SELECT command. In our example, NAME is selected from each such row, and all selected values are displayed as query results.

Inquiry: Provide all data on office buildings.

WHERE TYPE = "Office"

Result:

BLDG IDADDRESSTYPEQLTY LEVELSTATUS

312 Vyazov St., 123 Office 2 2

210 Berezovaya st. 1011 Office Z 1

111 Osinovaya st. 1213 Office 4 1

An asterisk (*) in a SELECT command means "the whole line". This is a handy shorthand that we will use frequently.

Inquiry: What is the weekly salary of each electrician?

SELECT NAME, "Weekly salary = ", 40 * HRLY_RATE

WHERE SKILL_TYPE = "Electrician"

Result:

M. Faraday Weekly salary = 500.00

H. Columbus Weekly salary = 620.00

This query illustrates the use of both character constants (in our example, "Weekly salary = ") and calculations in a SELECT command. Within a SELECT command, you can perform calculations that use numeric columns and numeric constants, as well as standard arithmetic operators (+, -, *, /), grouped as needed with parentheses. We've also included a new ORDER BY command that sorts the query result in ascending alphanumeric order by the specified column. If you want to sort the results in descending order, then you need to add DESC to the command. The ORDER BY clause can sort the results by multiple columns, some in ascending order, others in descending order. The sort primary key column is specified first.

character constant. A constant consisting of letters, numbers, and "special" characters.

Inquiry: Who has an hourly rate of $10 to $12?

WHERE HRLY_RATE >= 10 AND HRLY_RATE< - 12

Result:

WORKER ID NAME HRLY_RATE SKILL_TYPE SUPV_ID

This query illustrates some of the advanced features of the WHERE command: comparison operators and the Boolean AND operation. The six comparison operators (=,<>(not equal),<, >, <=, >=). Boolean operators AND (AND), OR (OR) and NOT (HE) can be used to create compound conditions or to negate a condition. Parentheses can be used to group conditions, as is common in programming languages.

Comparison operators =,<>, <, >, <=, >=.

Boolean operations AND (AND), OR (OR) and NOT (HE) .

You could also use the BETWEEN (between) operator to formulate this query:

WHERE HRLY_RATE BETWEEN 10 AND 12

BETWEEN can be used to compare some value with two other values, the first of which is less than the second, if the compared value can be equal to each of the given values or any value in between.

Query: List plasterers, roofers and electricians.

WHERE SKILL_TYPE IN ("Plasterer", "Roofer", "Electrician")

Result:

WORKER_ID NAME HRLY_RATE SKILL_TYPE SUPV_ID

1412 C. Nemo 13.75 Plasterer 1520

2920 R. Garrett 10.00 Roofer 2920

1520 G. Rickover 11.75 Plasterer 1520

This query explains the use of the IN (B) comparison operator. The WHERE condition is true if the row's specialty type is within the bracketed set, that is, if the specialty type is plasterer, roofer, or electrician. We'll see the IN operator again in subqueries.

Let's assume that we can't exactly remember the spelling of the specialty: "electrician" or "electronic engineer" or something else. Wildcard characters that substitute for undefined character strings make it easier to find inaccurate spellings in a query.

Template symbols. Characters that replace undefined character strings.

Inquiry: List workers whose job type begins with Elek.

WHERE SKILL_TYPE LIKE ("Elek%")

Result:

WORKER ID NAME HRLY_RATE SKILL_TYPE SUPV_ID

1235 M.Faraday 12.50 Electrician 1311

1311 H. Columbus 15.50 Electrician 1311

There are two wildcard characters in SQL: % (percentage) and _ (underscore). The underscore replaces exactly one undefined character. The percentage replaces an arbitrary number of characters, starting from zero. When wildcard characters are used, the LIKE operator (like) is required to compare character variables with constants. Other examples:

NAME LIKE "__Columbus"

NAME LIKE "__K%"

The condition in the first example is true if NAME consists of two characters followed by "Columbus". In the WORKER table, all names start with a first initial and a dot. Thus, with this condition, we find all workers with the surname "Columbus". The condition of the second example allows you to find all employees whose last names begin with the letter "K".

Inquiry: Find all jobs that start within the next two weeks.

WHERE START _DATE BETWEEN CURRENT_DATE AND

Result:(Assume current date is CURRENT DATE = 10.10)

WORKER_ID BLDG_ID START_DATE NUM_DAYS

1235 312 10.10 5

1235 515 17.10 22

3231 111 10.10 8

1412 435 15.10 15

3231 312 24.10 20

1311 460 23.10 24

This query illustrates the use of the BETWEEN (between) operator with values of type date (date) and interval (interval). CURRENT_DATE is a function that always returns today's date. Expression

CURRENT_DATE + INTERVAL "14" DAY

adds a two-week span to the current date. Thus, ASSIGNMENT is selected (assuming today is 10/10) if its START_DATE column value is between 10/10 and 10/24. This shows that we can add interval values to date fields. Moreover, we can multiply gap values by integer values. For example, suppose we want to find out what the number of weeks will be in a given number of weeks (denoted by the NUM_WEEKS variable). We can do it like this:

CURRENT_DATE + INTERVAL "7" DAY * NUM_WEEKS

2. Multi-table queries

The ability to link data elements across the boundaries of the same table is important in any database language. In relational algebra, this function is performed by the join operation. Although much of SQL is based directly on the relational calculus, SQL links data from different tables in the same way that the relational algebra join operation does. We will now show how this is done. Consider the query:

Inquiry:

The data required for the response is in two tables: WORKER and ASSIGNMENT. The SQL solution requires listing both tables in the FROM clause and specifying a special type of WHERE clause:

SELECT SKILL_TYPE

FROM WORKER, ASSIGNMENT

WHERE WORKER.WORKER_ID = ASSIGNMENT.WORKER_ID

AND BLDG_ID = 435

What's going on here? We have to consider two stages in the processing of a given request by the system.

1. As usual, the FROM clause is processed first. However, in this case, since two tables are specified in the command, the system creates a Cartesian product of the rows of these tables. This means that one (logically) large table is created, consisting of the columns of both tables, in which each row of one table is paired with each row of the other table. In our example, since the WORKER table has five columns and the ASSIGNMENT table has four columns, the Cartesian product generated by the FROM command will have nine columns. The total number of rows of the Cartesian product is m * n, where m is the number of rows in the WORKER table; and n is the number of rows in the ASSIGNMENT table. Since there are 7 rows in the WORKER table and 19 rows in the ASSIGNMENT table, the Cartesian product will contain 7x19 or 133 rows. If more than two tables are listed in the FROM command, then the Cartesian product of all the tables specified in the command is created.

Cartesian product. The result of joining each row of one table with each row in another table.

2. After creating the giant relational table, the system still applies the WHERE clause as before. Each row of the table created by the FROM command. is checked against the WHERE condition. Rows that do not meet the condition are excluded from consideration. The SELECT clause is then applied to the remaining rows.

The WHERE clause in our query contains two conditions:

1. WORKER. WORKER_ID = ASSIGNMENT.WORKER_ID

2.BLDG_ID = 435

The first of these conditions is the join condition. Note that since both the WORKER and ASSIGNMENT tables contain a column named WORKER_ID, their Cartesian product will contain two columns with that name. To distinguish between them, we precede the column name with the name of the source table, separating it with a dot.

The first condition means that in any selected row, the value of the WORKER_ID column from the WORKER table must match the value of the WORKER_ID column from the ASSIGNMENT table. We are actually joining two tables by WORKER_ID. All rows in which the values of these two columns are not equal are excluded from the product table. Exactly the same thing happens when performing the natural join operation of a relational algebra. (However, there is still some difference from a natural join: SQL does not automatically remove the extra WORKER_ID column). The full join of these two tables with the additional condition BLDG_ID = 435 is shown in fig. 1. Applying the SELECT command will eventually give the following query result:

SKILL TYPE

Plasterer

Roofer

Electrician

Rice. 1. Joining the WORKER and ASSIGNMENT tables

Now we'll show you how to join a table to itself in SQL.

Inquiry: List the employees with the names of their managers.

SELECT A.WORKER_NAME, B.WORKER_NAME

FROM WORKER A, WORKER B

WHERE B.WORKER_ID = A.SUPV_ID

The FROM clause in this example creates two "copies" of the WORKER table, giving them the aliases A and B. The alias is an alternate name given to the table. Then the copies A and B of the WORKER table are connected with the WHERE command based on the condition that WORKER_ID in B and SUPV_ID in A are equal. Thus, each row from A is joined to row B containing information about row manager A (Fig. 2).

Rice. 2. Joining two copies of the WORKER table

Selecting two employee names from each line, we get the required list:

A.NAMEB.NAME

M.Faraday H.Columbus

C.Nemo G.Rickover R.Garrett R.Garrett

P. Mason P. Mason G. Rickover G. Rickover H. Columbus H. Columbus J. Barrister P. Mason

Alias. Alternative name given to the table.

A.WORKER_NAME represents a worker and B.WORKER_NAME represents a manager. Please note that some workers are their own managers, which follows from the equality WORKER_ID - SUPV_ID performed in their lines.

In SQL, you can link more than two tables at a time:

Inquiry

SELECT WORKER_NAME

FROM WORKER, ASSIGNMENT, BUILDING

WHERE WORKER.WORKER_ID = ASSIGNMENT.WORKER_ID AND ASSIGNMENT.BLDG_ID = BUILDING.BLDG_ID AND

TYPE = "Office"

Result:

M. Faraday

G. Rickover

J.Barrister

Note that if a column name (for example, WORKER_ID or BLDG_ID) occurs in more than one table, then to avoid ambiguity, we must precede the column name with the name of the source table. But if the column name occurs in only one table, like TYPE in our example, then there is no ambiguity, so the table name does not need to be specified.

The SQL commands in this query create one table from three relational database tables. The first two tables are joined by WORKER_ID, after which the third table is joined by BLDG_ID to the resulting table. Condition

TYPE = "Office"

the WHERE command results in the exclusion of all rows except for the rows related to office buildings. This matches the requirements of the request.

3. Subqueries

Subquery. Request within a request

A subquery can be placed in the WHERE clause of a query, thereby expanding the capabilities of the WHERE clause. Consider an example.

Inquiry: What are the specialties of the workers assigned to building 435?

SELECT SKTLL_TYPE

FROM WORKER WHERE WORKER_ID IN

(SELECT WORKER_ID

WHERE BLDG_ID = 435)

Subquery in this example

(SELECT WORKER_ID

WHERE BLDG_ID = 435)

A query that contains a subquery is called external request or main request. The subquery results in the following set of worker ID(s):

WORKER ID

external request. The main query, which contains all subqueries.

This set of IDs then takes the place of a subquery in the outer query. From this point on, the outer query is executed using the set created by the subquery. The outer query processes each row of the WORKER table according to the WHERE clause. If the WORKER_ID of the row is in the (IN) set created by the subquery, then the SKILL_TYPE of the row is selected and displayed in the resulting table:

SKILL TYPE

Plasterer

Roofer

Electrician

It is very important that the SELECT clause of the subquery contains the WORKER_ID and only the WORKER_ID. Otherwise, the WHERE clause of the outer query, meaning that the WORKER_ID is in the set of worker IDs, would not make sense.

Note that the subquery can logically execute before even one row is considered by the main query. In a sense, the subquery is independent of the main query. It can be executed as a complete query. We say that such a subquery is not correlated with the main query. As we will see shortly, subqueries can be correlated.

Uncorrelated subquery. A subquery whose value does not depend on any outer query.

Here is an example of a subquery inside a subquery.

Inquiry: List workers assigned to office buildings.

Again we consider the query with which we learned the connection.

SELECT WORKER_MAME

WHERE WORKER_ID IN

(SELECT WORKER_ID

WHERE BLDG_ID IN

WHERE TYPE = "Office"))

Result:

M. Faraday

G. Rickover

J.Barrister

Note that we don't need to precede the column names anywhere with the table names, since each subquery processes one and only one table, so there can be no ambiguity.

The query is executed in order from the inside out. That is, the innermost query (or "bottommost") is executed first, then the subquery containing it is executed, and then the outer query.

Correlated Subqueries. All of the subqueries discussed above were independent of the main queries in which they were used. By independence, we mean that subqueries can be executed on their own as complete queries. We now turn to a class of subqueries whose results may depend on the row considered by the main query. Such subqueries are called correlated subqueries.

Correlated subquery. A subquery whose result depends on the row considered by the main query.

Inquiry: List employees whose hourly rates are higher than those of their managers.

SELECT WORKER_NAME

WHERE A.HRLY_RATE >

(SELECT B.HRLY_RATE

WHERE B.WORKER_ID = A.SUPV_ID)

Result:

The logical steps for executing this query are:

1. The system creates two copies of the WORKER table: copy A and copy B. According to the way we defined them, A refers to the worker, B to the manager.

2. The system then looks at each row A. This row is selected if it satisfies the WHERE clause. This condition means that a row will be selected if its HRLY_RATE value is greater than the HRLY_RATE generated by the subquery.

3. The subquery selects an HRLY_RATE value from row B whose WORKER_ID is equal to the SUPV_ID of row A currently being considered by the main query. This is the manager's HRLY_RATE.

Note that since A.HRLY_RATE can only be compared against one value, the subquery must return only one value. This value varies depending on which row A is being considered. Thus, the subquery correlates with the main query. We'll see more examples of correlated subqueries later when we explore built-in functions.

EXISTS and NOT EXISTS operators

Suppose we want to identify workers who are not assigned to work on a building. On the surface, it seems that such a request can easily be made by simply negating the affirmative version of the request. Suppose, for example, that we are interested in a building with BLDG_ID 435. Consider the query:

SELECT WORKER_ID

WHERE BLDG_ID NOT 435

Unfortunately, this is the wrong wording of the solution. The query will simply give us the IDs of workers working on other buildings. Obviously, some of them may also be assigned to building 435.

The well-formulated solution uses the NOT EXISTS operator (does not exist):

SELECT WORKER_ID

WHERE NOT EXISTS

WHERE ASSIGNMENT.WORKER_ID = WORKER.WORKER_ID AND

Result:

WORKER_ID

The EXISTS and NOT EXISTS operators are always placed before the subquery. EXISTS evaluates to true if the set generated by the subquery is not empty. If the set generated by the subquery is empty, then EXISTS is false. The NOT EXISTS operator, of course, works exactly the opposite. It is true if the result of the subquery is empty, and false otherwise.

EXISTS operator. Returns true if the result set is not empty.

NOT EXISTS operator. Returns true if the result set is empty.

In this example, we have used the NOT EXISTS operator. The subquery selects all rows of the ASSIGNMENT table where the WORKER_ID is the same as the row considered by the main query and BLDG_ID is 435. If this set is empty, then the row of the worker considered by the main query is selected, because this means that this worker does not work at building 435.

In our solution, using a correlated subquery. If we use the IN operator instead of NOT EXISTS, we can get by with an uncorrelated subquery:

SELECT WORKER_ID

WHERE WORKER_ID NOT IN

(SELECT WORKER_ID

WHERE BLDG_ID = 435)

This solution is simpler than the solution with the NOT EXISTS operator. A natural question arises why we need EXISTS and NOT EXISTS at all. The answer is that NOT EXISTS is the only way to resolve queries that contain the word "each" in the condition. Such queries are solved in relational algebra using the division operation, and in relational calculus - using the universal quantifier. Here is an example of a query that contains the word "each" in the condition:

Inquiry: List the workers assigned to each building.

This question can be implemented in SQL using double negation. We will reformulate the query to include the double negative:

Inquiry: List the employees for whom not there is a building to which they are not assigned.

We have highlighted the double negative. It is clear that this query is logically equivalent to the previous one.

Now we want to formulate a solution in SQL. In order to make it easier to understand the final solution, we will first give a solution to a preliminary problem: the problem of identifying all the buildings for which the hypothetical worker, "1234" not assigned.

(I) SELECT BLDG_ID

WHERE NOT EXISTS

ASSIGNMENT.WORKER_ID = 1234)

We've tagged this query with an (I) as we'll refer to it later. If there is no building that satisfies this request, then worker 1234 is assigned to each building and therefore satisfies the conditions of the original request. In order to solve the original query, we must generalize the query (I) from a specific worker 1234 to the WORKER_ID variable and turn this modified query into a subquery of the larger query. Here's a solution:

(II) SELECT WORKER_ID

WHERE NOT EXISTS

WHERE ASSIGNMENT.BLDG_ID = BUILDING.BLDG_ID AND

ASSIGNMENT.WORKER_ID = WORKER.WORKER_ID)

Result:

WORKER ID

Note that the subquery starting at the fourth query line (II) is identical to the query (I) in which "1234" is replaced by WORKER.WORKER_ID. Query (II) can be read as follows:

Select WORKER_ID from WORKER if there is no building to which WORKER_ID is not assigned.

This matches the conditions of the original request.

We see that the NOT EXISTS operator can be used to formulate those queries that required the operation of division in relational algebra, and the universal quantifier in relational calculus. In terms of ease of use, the NOT EXISTS operator does not provide any special advantages, that is, SQL queries that use NOT EXISTS twice are no easier to understand than relational algebra solutions with a division operation or relational calculus solutions with a universal quantifier. More research is needed to create language constructs that allow for a more natural way to solve such queries.

Built-in Functions

Consider these types of questions:

What are the maximum and minimum hourly rates? What is the average number of days employees work in building 435? What is the total number of days for plastering work on Building 312? How many different specialties are there?

Answering these questions requires aggregate functions that look at multiple table rows and produce a single value. SQL has five such functions, called built-in functions or set functions. These are the SUM (sum), AVG (average), COUNT (count), MAX (maximum) and MIN (minimum) functions.

Inline function (set function). Statistical function that operates on multiple rows: SUM (sum), AVG (average), COUNT (count), MAX (maximum), MIN (minimum).

Inquiry: What are the maximum and minimum hourly rates?

SELECT MAX(HRLY_RATE), MIN(HRLY_RATE)

Result: 17.40, 8.20

The MAX and MIN functions operate on a single table column. They select the maximum or minimum value, respectively, from that column. Our query formulation does not contain a WHERE clause. For most requests, this may not be the case, as our next example shows.

Inquiry: What is the average number of days employees work in building 435?

SELECT AVG(NUM_DAYS)

WHERE BLDG_ID=435

Result: 12.33

Inquiry: What is the total number of days for plastering work on Building 312?

SELECT SUM(NUM_DAYS)

FROM ASSIGNMENT, WORKER

WHERE WORKER.WORKER_ID = ASSIGNMENT.WORKER_ID AND

SKILL_TYPE = "Plasterer" AND

Result: 27

The solution uses a join between the ASSIGNMENT and WORKER tables. This is required because SKILL_TYPE is in the WORKER table and BLDG_ID is in the ASSIGNMENT table.

Inquiry: How many different specialties are there?

SELECT COUNT (DISTINCT SKILL_TYPE)

Result: 4

Since the same specialty can be repeated in several different rows, the DISTINCT (different) keyword must be used in this query to prevent the system from counting the same specialty type more than once. The DISTINCT operator can be used with any of the built-in functions, although it is, of course, redundant with the MAX and MIN functions.

DISTINCT. An operator that excludes duplicate lines.

The SUM and AVG functions should only be used with numeric columns. Other functions can be used with both numeric and character data. All functions except COUNT can be used with calculated expressions. For example:

Inquiry: What is the average weekly salary?

SELECT AVG(40*HRLY_RATE)

Result: 509.14

COUNT can refer to the entire row, not just a single column :

Inquiry: How many buildings have quality level 3?

SELECT COUNT (*)

FROM BUILDING WHERE

Result: 3

As all these examples show, if there is a built-in function in the SELECT command, then nothing else can be in this SELECT command. The only exception to this rule is with the GROUP BY clause, which we'll look at in a moment.

GROUP BY and HAVING phrases

In management, statistical information about each group in a plurality of groups is often required. For example, consider the following query:

Inquiry: For each manager, find out the maximum hourly rate among his subordinates.

In order to solve this problem, we must divide workers into groups according to their managers. We will then determine the maximum bet within each group. In SQL it is done like this:

GROUP BY SUPV_ID

Result:

SUPV_IDMAX(HRLY RATE)

When processing this query, the system first divides the rows of the WORKER table into groups according to the following rule. Rows are placed in the same group if and only if they have the same SUPV_ID. The SELECT clause is then applied to each group. Since there is only one SUPV_ID value in this group, there is no SUPV_ID ambiguity in the group. For each group, the SELECT statement outputs the SUPV_ID and also calculates and outputs the MAX(HRLY_RATE) value. The result is presented above.

In a SELECT statement with built-in functions, only those columns that are included in the GROUP BY clause can occur. Note that SUPV_ID can be used in the SELECT statement because it is included in the GROUP BY clause.

Phrase GROUP BY. Indicates that rows should be split into groups with common values of the specified column(s).

The GROUP BY clause allows you to perform certain complex calculations. For example, we may want to find out the average of these maximum bids. However, calculations with built-in functions are restricted in the sense that the use of built-in functions inside other built-in functions is not allowed. So an expression like

AVG(MAX(HRLY_RATE))

forbidden. The implementation of such a request will consist of two stages. First we have to put the maximum bids in a new table, and in the second step calculate their average.

With the GROUP BY command, you can use the WHERE command:

Inquiry: For each type of building, find out the average level of quality among buildings of status 1.

SELECT TYPE, AVG(QLTY_LEVEL)

WHERE STATUS = 1

Result:

TYPEAVG(QLTY_LEVEL)

Shop 1

Residential building 3

The WHERE clause is executed before the GROUP BY command. Therefore, no group can contain a row that has a status other than 1. Status 1 rows are grouped by TYPE, and then a SELECT clause is applied to each group.

Phrase HAVING. Imposes conditions on groups.

We can also apply conditions to groups created by the GROUP BY clause. This is done with the HAVING clause. Suppose, for example, that we decide to flesh out one of the previous queries:

Inquiry: For each manager who has more than one subordinate, find out the maximum hourly rate among his subordinates.

We can reflect this condition with the appropriate HAVING command:

SELECT SUPV_ID, MAX(HRLY_RATE)

FROM WORKER GROUP BY SUPV_ID

HAVING COUNT(*) > 1

Result:

SUPV_ID MAX(HRLY_RATE)

The difference between WHERE and HAVING clauses is that WHERE applies to rows while HAVING applies to groups.

A query can contain both a WHERE clause and a HAVING clause. In this case, the WHERE clause is evaluated first, because it is evaluated before grouping. For example, consider the following modification of the earlier query:

Inquiry: For each type of building, find out the average quality level among buildings of status 1. Consider only those types of buildings whose maximum quality level does not exceed 3.

SELECT TYPE, AVG(QLTY_JLEVEL)

WHERE STATUS = 1

HAVING MAX(QLTY_LEVEL)<= 3

Result:

TYPEAVG(QLTY_LEVEL)

Shop 1

Residential building 3

Note that starting with the FROM clause, the phrases are executed in order, and then the SELECT clause is applied. For example, the WHERE clause is applied to the BUILDING table, and all rows in which STATUS is not 1 are deleted. The remaining rows are grouped by TYPE; all rows with the same TYPE value end up in the same group. This creates multiple groups, one for each TYPE value. The HAVING clause is then applied to each group, and those groups whose maximum quality level value is greater than 3 are removed. Finally, the SELECT clause is applied to the remaining groups.

7. Built-in functions and subqueries

Built-in functions can only be used in a SELECT clause or in a HAVING statement. However, a SELECT clause containing an inline function can be part of a subquery. Consider an example of such a subquery:

Inquiry: Which employee has an hourly rate above average?

SELECT WORKER_NAME

WHERE HRLY_RATE >

(SELECT AVG(HRLY_RATE)

Result:

H. Columbus

Note that the subquery does not correlate with the main query. The subquery returns exactly one value - the average hourly rate. The main query selects a worker only if its bid is greater than the computed average.

Correlated queries can also use built-in functions:

Query: Which of the employees has an hourly rate higher than the average hourly rate among the subordinates of the same manager?

In this case, instead of calculating one average hourly rate for all employees, we must calculate the average hourly rate for each group of employees reporting to the same manager. Moreover, our calculation must be done again for each worker considered by the main query:

SELECT A. WORKER_NAME

In this tutorial, you will learn how to use SQL EXIST condition with syntax and examples.

Description

The SQL EXISTS clause is used in conjunction with a subquery and is considered satisfied if the subquery returns at least one row. It can be used in a SELECT, INSERT, UPDATE, or DELETE statement.

Syntax

The syntax for the EXISTS condition in SQL is:

Options or Arguments

subquery subquery is a SELECT statement. If the subquery returns at least one record in its result set, the EXISTS clause evaluates to true and the EXISTS condition will be met. If the subquery does not return any records, the EXISTS clause evaluates to false and the EXISTS condition will fail.

Note

SQL statements using the EXISTS clause are very inefficient because the subquery is re-run for EVERY row in the outer query table. There are more efficient ways to write most queries that don't use the EXISTS clause.

Example - Using the EXISTS Condition with a SELECT Statement

Let's start with an example that shows how to use the EXISTS condition with a SELECT statement.

In this example, we have a customers table with the following data:

Now let's find all records from the customers table where there is at least one record in the orders table with the same customer_id . Run the following SELECT query:

4 entries will be selected. Here are the results you should get:

customer_id	first_name	last_name	favorite_website
4000	Justin	Bieber	google.com
5000	Selena	Gomez	bing.com
7000	Tom	Cruise	oracle.com
8000	Johnny	Depp	NULL

In this example, there are 4 records in the customers table that have a customer_id value in the orders table.

Example - Using the EXISTS Condition with an UPDATE Statement

Let's look at an example that uses the EXISTS clause in an UPDATE statement.
In this example, we have a products table with the following data:

Now let's update the summary_data table with the values from the products table. Enter the following SQL statement:

PgSQL

5 records will be updated. Select the data from the summary_data table again:

PgSQL

SELECT * FROM summary_data;

This example will update the current_category field in the summary_data table with the category_id from the products table where the product_id values match. The first 5 records in the summary_data table have been updated.

prompt: If we didn't include the EXISTS clause, the UPDATE query would update the current_category field to NULL in row 6 of the summary_data table (because the products table doesn't have an entry where product_id = 8).

Example - Using the EXISTS Condition with a DELETE Statement

Let's look at an example that uses the EXISTS condition in a DELETE statement.

In this example, we have a customer table with the following data:

Enter the following DELETE statement:

PgSQL

1 entry will be deleted. Select the data from the orders table again:

PgSQL

SELECT * FROM orders;

SELECT *FROM orders;

Here are the results you should get.