This topic describes factors that affect report results when a host row has no corresponding cross-referenced row. The discussion applies to both dynamic and embedded joins. |
In this section: Controlling Outer Join Optimization Missing Rows in Unique Descendants |
Normally, when a row from the host table or view is retrieved, a corresponding row from the cross-referenced table can also be retrieved. If a host row lacks a corresponding cross-referenced row, the retrieval path is called a short path. When there are short paths, the processing of the host row and the report results depend on:
Note:
Even if you set OPTIMIZATION ON, the data adapter may disable it. Use the SQLAGGR trace component to determine whether optimization is enabled for a particular request (Tracing Data Adapter Processing, describes trace facilities).
You can include short paths in a report with the FOCUS SET ALL command
SET ALL = {OFF|ON}
where:
Note:
With the SET SQLJOIN OUTER command you can control when the data adapter optimizes outer joins without affecting the optimization of other operations. This parameter provides backward compatibility with prior releases of the data adapter and enables you to fine-tune your applications. |
How to: Control Outer Join Optimization Example: Enabling Left Outer Join Optimization Reference: Effects of Combinations of Settings on Outer Join Optimization |
When join optimization is in effect, the data adapter generates one SQL SELECT statement that includes every table involved in the join. The RDBMS can then process the join. When join optimization is disabled, the data adapter generates a separate SQL SELECT statement for each table, and FOCUS processes the join.
You can use the SQLJOIN OUTER setting to disable outer join optimization while leaving other optimization enhancements in effect.
SQL sqlengine SET SQLJOIN OUTER {ON|OFF}
where:
Note:
(FOC1420) OPTIMIZATION OF ALL=ON AS LEFT JOIN - : OFF
The following table describes how different combinations of OPTIMIZATION and SQLJOIN OUTER settings affect data adapter behavior. It assumes that SET ALL = ON:
Settings |
Results |
||
OPTIMIZATION |
SQLJOIN OUTER |
Outer Join Optimized? |
Other Optimization Features |
ON |
ON |
Yes |
Enabled |
ON |
OFF |
No |
Enabled |
OFF |
N/A |
No |
Disabled |
SQL |
ON |
Yes, in all possible cases |
Enabled |
SQL |
OFF |
No |
Enabled |
FOCUS |
ON |
Yes if results are equivalent to FOCUS-managed request |
Enabled |
FOCUS |
OFF |
No |
Enabled |
The following request specifies a left outer join between the EMPINFO and FUNDTRAN tables. The SQLJOIN OUTER setting specifies that the left outer join should be optimized, and the SQLAGGR and STMTRACE trace components are activated:
SET TRACEUSER = ON
SET TRACEOFF = ALL
SET TRACEON = SQLAGGR//CLIENT
SET TRACEON = STMTRACE//CLIENT
SET ALL = ON
SQL DB2 SET OPTIMIZATION ON
SQL DB2 SET SQLJOIN OUTER ON
JOIN EMP_ID IN EMPINFO TO ALL WHO IN FUNDTRAN AS J1
TABLE FILE EMPINFO
SUM AVE.CURRENT_SALARY ED_HRS BY WHO BY LAST_NAME
IF DEPARTMENT EQ 'MIS'
END
The following trace is generated. One SELECT statement is generated. The LEFT OUTER JOIN phrase in the FROM predicate specifies the left outer join:
AGGREGATION DONE ...
SELECT T2."EID",T1."LN", AVG(T1."CSAL"), SUM(T1."OJT") FROM (
USER1."EMPINFO" T1 LEFT OUTER JOIN "USER1"."FUNDTRAN" T2 ON
T2."EID" = T1."EID") WHERE (T1."DPT" = 'MIS') GROUP BY
T2."EID",T1."LN" ORDER BY T2."EID",T1."LN" FOR FETCH ONLY;
In a unique join, the engine that handles the join controls the output:
|
Example: Reference: |
In this case the SQLJOIN OUTER OFF setting requires that FOCUS process the left outer join even though other optimization features remain in effect.
Note:
When FOCUS handles join processing, if you describe a join as unique when the cross-referenced table actually contains more than one matching record for a row in the host table, FOCUS displays only the first matching cross-referenced row on the report. Do not specify a unique join to FOCUS when the data structure is non-unique.
Since the RDBMS has no concept of a unique join, its behavior is identical whether the join is unique or non-unique. See Missing Rows in Non-unique Descendants for a complete discussion.
The following topics discuss each of these settings.
FOCUS handles the join when either of the following conditions is true:
FOCUS substitutes default values for any missing cross-referenced data (because it does not consider them to be missing in a unique join). FOCUS recognizes that you have defined the join to be unique and:
The Data Adapter Optimizer, discusses factors that determine whether optimization is disabled. To find out if and why the data adapter disabled OPTIMIZATION for a report request, use the SQLAGGR trace component (see Tracing Data Adapter Processing).
In the following example, a unique join connects the EMPINFO table, containing employee information, to the FUNDTRAN table, containing direct deposit account information for the employees. (File Descriptions and Tables, provides the Master and Access Files.)
join emp_id in empinfo tag file1 to who in fundtran tag file2 as j1
>
sql db2 set optimization off
>
table file empinfo
print bank_name bank_acct
by department by emp_id
end
NUMBER OF RECORDS IN TABLE= 14 LINES= 14
Since the join is unique and the SET OPTIMIZATION OFF command disables optimization, FOCUS displays one row per employee on the report. If there is no data for the cross-referenced table (FUNDTRAN), FOCUS appropriately substitutes zeros or blanks for its fields.
Two new employees, John Royce (333121200, no department) and Donna Lee (455670000, MIS) have no bank accounts. Six other employees also lack bank accounts:
DEPARTMENT EMP_ID BANK_NAME BANK_ACCT
---------- ------ --------- ---------
. 333121200 0
MIS 112847612 0
117593129 STATE 40950036
219984371 0
326179357 ASSOCIATED 122850108
455670000 0
543729165 0
818692173 BANK ASSOCIATION 163800144
PRODUCTION 071382660 0
119265415 0
119329144 BEST BANK 160633
123764317 ASSOCIATED 819000702
126724188 0
451123478 ASSOCIATED 136500120
Note: Employee 333121200 has not yet been assigned a department; its null value is indicated by the NODATA symbol. However, BANK_NAME and BANK_ACCT are not considered missing because the join is unique.
The RDBMS processes the join under the following conditions:
Since the RDBMS does not recognize the concept of a unique join, it performs an inner join just as it does if the join is non-unique. See Missing Rows in Non-unique Descendants for a complete discussion.
If SQLJOIN OUTER is ON, the RDBMS performs a left outer join just as it does if the join is non-unique. See Missing Rows in Non-unique Descendants for a complete discussion.
(If SQLJOIN OUTER is OFF, FOCUS processes the join. It substitutes default values for short path columns and displays only one matching record as described in How FOCUS Processes a Unique Join.)
The Data Adapter Optimizer, discusses factors that determine whether optimization is enabled.
In a non-unique join (JOIN...TO ALL) with missing cross-referenced rows, report results depend entirely on the SET ALL command because:
|
Example: Non-Unique Join Processing With SET ALL = OFF Non-Unique Join Processing With SET ALL = ON Reference: SET ALL=OFF With a Non-unique Join SET ALL=ON With a Non-unique Join |
With SET ALL=OFF, optimization may be enabled or disabled. In both cases, however, the report results are the same: an inner join. Host rows that lack corresponding cross-referenced rows are not included in the report; multiple matching rows are included (even if there is only one).
For each of the following examples, the same non-unique join is in effect. It connects the EMPINFO table, containing employee information, to the DEDUCT table, containing salary deduction information.
The examples also execute the same report request. OPTIMIZATION is set ON and only the SET ALL command changes.
Two employees, John Royce (333121200, no department) and Donna Lee (455670000, MIS) have not been paid yet; therefore, they have no deductions.
When SET ALL is OFF, a host row that lacks a corresponding cross-referenced row is rejected (regardless of whether FOCUS or the RDBMS processes the join):
join clear j1
join emp_id in empinfo tag file1 to all dedeid in deduct tag file2 as j2
set all=off
table file empinfo
print ded_code ded_amt
by department by emp_id by deddate
end
NUMBER OF RECORDS IN TABLE= 448 LINES= 448
The report displays multiple deduction records in the cross-referenced table for each of 12 long-time employees in the host table. John Royce (333121200, no department) and Donna Lee (455670000, MIS department) are not listed in the report, because their corresponding cross-referenced rows do not exist in the DEDUCT table:
DEPARTMENT EMP_ID DEDDATE DED_CODE DED_AMT
---------- ------ ------- -------- -------
MIS 112847612 82/01/29 CITY $1.43
SAVE $54.60
STAT $20.02
LIFE $13.65
HLTH $22.75
FICA $100.10
FED $121.55
82/02/26 STAT $20.02
SAVE $54.60
LIFE $13.65
.
.
.
With a non-unique join, SET ALL=ON produces a left outer join regardless of whether FOCUS or the RDBMS handles the join:
In both cases the report results are the same.
If there are no cross-referenced rows for a host row, the cross-referenced columns display the NODATA value.
The following example executes the same report request as in Non-Unique Join Processing With SET ALL = OFF, except that SET ALL is ON:
set all = on
table file empinfo
print ded_code ded_amt
by department by emp_id by deddate
end
NUMBER OF RECORDS IN TABLE= 450 LINES= 450
John Royce (333121200) is now on Page 1 and Donna Lee (455670000) on Page 8. The department column for John Royce (333121200) displays the NODATA value, since the field has MISSING=ON:
PAGE 1
DEPARTMENT EMP_ID DEDDATE DED_CODE DED_AMT
---------- ------ ------- -------- -------
. 333121200 . . .
MIS 112847612 82/01/29 CITY $1.43
FED $121.55
FICA $100.10
HLTH $22.75
LIFE $13.65
.
.
.
Donna Lee (455670000) works for MIS:
PAGE 8
DEPARTMENT EMP_ID DEDDATE DED_CODE DED_AMT
---------- ------ ------- -------- -------
MIS 326179357 82/07/30 STAT $88.93
82/08/31 CITY $6.35
FED $539.96
FICA $444.67
HLTH $45.37
LIFE $27.22
SAVE $108.90
STAT $88.93
455670000 . . .
543729165 82/04/30 CITY $.72
.
.
.
Note: The report that results from passing a request to the RDBMS with SET ALL=ON may be sorted in a slightly different order from the report produced if FOCUS processes the join; however, both results are equally correct.
When SET ALL=ON, screening conditions affect report results for a non-unique join. If a screening test specifies a field from the cross-referenced structure, host rows whose cross-referenced rows were screened out are not represented, regardless of the SET ALL command.
The following example includes a WHERE test to screen for health deduction records:
join emp_id in empinfo tag f1 to all dedeid in deduct tag f2 as join1
set all=on
table file empinfo
sum ded_amt
by department by last_name by first_name by ded_code
where ded_code eq 'hlth'
end
Royce and Lee are omitted as the result of the WHERE test, because they have no deduction code records:
DEPARTMENT LAST_NAME FIRST_NAME DED_CODE DED_AMT
---------- --------- ---------- -------- -------
MIS BLACKWOOD ROSEMARIE HLTH $226.87
CROSS BARBARA HLTH $330.21
JONES DIANE HLTH $121.16
MCCOY JOHN HLTH $16.50
SMITH MARY HLTH $181.99
PRODUCTION BANNING JOHN HLTH $41.25
IRVING JOAN HLTH $427.35
MCKNIGHT ROGER HLTH $122.43
ROMANS ANTHONY HLTH $105.60
STEVENS ALFRED HLTH $69.44
Even if SET ALL=OFF, you can apply the effect of the ON setting to specific tables. To do this, add the ALL. prefix to one of the host fields in the request. The ALL. prefix causes FOCUS to process host rows even if they have missing cross-referenced rows. Like SET ALL=ON, the report results depend on whether screening tests exist for the cross-referenced fields.
For example, when the ALL. prefix is applied to the field DEPARTMENT, the report results are the same as for SET ALL=ON. (The report is displayed in Non-Unique Join Processing With SET ALL = ON).
set all=off
table file empinfo
print ded_code ded_amt
by all.department by emp_id by deddate
end
Note: The ALL. prefix is only effective when the SET ALL value is OFF; SET ALL=ON overrides the ALL. prefix.
This summary chart lists possible report results for dynamic joins and multi-table Master Files (embedded joins):
|
RDBMS Behavior |
FOCUS Behavior |
||
|
SET ALL = |
SET ALL = |
||
Join Type |
ON (SQLJOIN OUTER ON) |
OFF |
ON |
OFF |
NON-UNIQUE
Dynamic |
Outer Join |
Inner Join |
Outer join (Also applies if Optimization Enabled and SQLJOIN OUTER OFF) |
Inner join |
Short paths |
Appear with NODATA (.) value for fields of all missing segments |
Do not appear |
Appear with NODATA (.) value for fields of all missing segments |
Do not appear |
More than one matching row |
All matching rows appear |
All matching rows appear |
All matching rows appear |
All matching rows appear |
UNIQUE
Dynamic |
Outer join |
Inner Join |
FOCUS Unique (Also applies if Optimization Enabled and SQLJOIN OUTER OFF) |
FOCUS Unique |
Short paths |
Appear with NODATA (.) value for fields of all missing segments |
Do not appear |
Appear with blank or zero for missing values |
Appear with blank or zero for missing values |
More than one matching row |
All matching rows appear |
All matching rows appear |
Only first matching row appears |
Only first matching row appears |
Note:
Information Builders |