What Are Datapoints?

A datapoint is similar to a field in a table, but with built-in dimensional linkages. It also has linkages back to sources and forward to measures, which provide a clear lineage from harvest point to presentation.

Procedure: How to Edit Loadable, Generated, or User Entered Datapoints

Note: Most of the information for Loadable, Generated, or User-Entered datapoints is read-only. The only thing you can change about them is their name or description.

In the Manage tab, click the Datapoints panel button.
Expand the Loaded Datapoints, Generated Datapoints, or User Entered Datapoints folder and select the datapoint you want to view.
The Edit Datapoint panel opens.
Edit the Name or Description of the datapoint.
Click Save.

Derived Datapoints

In this section:

Creating Calculated Measures With Derived Datapoints

Derived datapoints let you create calculations that include dimensional metadata. For example, you can create a series of derived datapoints that perform a series of calculations on Sales performance for your manufacturing company:

Cost of Supply (by Product, Location and Time)
Cost of Labor (by Product, Location and Time)
Cost of Warehousing/Storage (by Product, Location and Time)
Cost to Ship (by Product, Location and Time)

These datapoints can now be added up to become Total Cost (by Product, Location and Time).

You can then set Total Cost against your Sales (by Product, Location and Time) to calculate Profit.
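
The roll-up described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how PMF stores or computes datapoints: the dictionaries, the sample values, and the helper names `add_datapoints` and `subtract_datapoints` are all hypothetical. It also assumes that an intersection is calculated only when every input has a value there.

```python
# Hypothetical sample values, keyed by (product, location, period).
cost_of_supply = {("Widget", "East", "2024-Q1"): 400, ("Widget", "West", "2024-Q1"): 350}
cost_of_labor = {("Widget", "East", "2024-Q1"): 250, ("Widget", "West", "2024-Q1"): 200}
cost_of_warehousing = {("Widget", "East", "2024-Q1"): 50, ("Widget", "West", "2024-Q1"): 75}
cost_to_ship = {("Widget", "East", "2024-Q1"): 100}  # West has no value yet
sales = {("Widget", "East", "2024-Q1"): 1000, ("Widget", "West", "2024-Q1"): 900}


def add_datapoints(*datapoints):
    """Sum the inputs at every intersection that all of them share (assumption)."""
    shared = set(datapoints[0]).intersection(*datapoints[1:])
    return {key: sum(dp[key] for dp in datapoints) for key in shared}


def subtract_datapoints(left, right):
    """Subtract right from left at every intersection present in both."""
    shared = left.keys() & right.keys()
    return {key: left[key] - right[key] for key in shared}


total_cost = add_datapoints(
    cost_of_supply, cost_of_labor, cost_of_warehousing, cost_to_ship
)
profit = subtract_datapoints(sales, total_cost)
```

In this sketch the West intersection is left out of Total Cost and Profit because Cost to Ship has no value for it.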

You can also load precalculated Total Costs and Profit datapoints from an external Source, but there is no guarantee the data will be calculated in the proper order. If you use derived datapoints to calculate the values:

The data will be calculated in the correct order, by using the generations in the datapoint lineage. PMF will also recognize incomplete data and handle it accordingly.
You will be able to deconstruct the calculations performed for all derived datapoints using Lineage Chains.
Lineage Chains are currently available in the Lineage tabs on dimensions, sources, datapoints, and measures panels.
Derived datapoints allow deep models of recalculation, which let many measures share the same root calculated values.
Derived datapoints let you mix and match any data source in one contiguous data mart, and are much easier to set up than ETL jobs. This is because dimensional aggregation logic is included in the calculations, so you do not have to write complex dimensional logic in an ETL tool.

Procedure: How to Create a Derived Datapoint

To create a derived datapoint:

In the Manage tab, click the Datapoints panel button.
Click New.
The New Datapoint panel opens.
Name the new derived datapoint.
Drag the datapoints you need for your calculation into the canvas. Each datapoint must be separated by its operation, as shown in the following image.

Calculations can also include constants. To add a constant, drag the Constant object into position on the canvas, and type in the constant value inside the Constant object.

Separate datapoints for WebFOCUS functions are typically created during the source load, since these calculations are best captured in the first generation of the lineage, during harvesting.

For example, if you want to capture counts of a particular condition, you do not need to save all of the underlying attributes somewhere so that you can filter on them later. Instead, you can determine the When condition for the count, that is, which filters should be true. You can then pull that count into a loadable datapoint. Approaching data this way makes the calculations that come after the harvesting phase in the lineage simpler to manage.
Click Save. If the calculation is not complete, PMF recognizes this, and marks the derived datapoint as Incomplete. Incomplete derived datapoints do not participate in recalculation.
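
The rule that each datapoint must be separated by its operation can be illustrated with a small check. This is a sketch only: representing the canvas as a flat list of strings, the `OPERATORS` set, and the function `is_complete` are assumptions made for the example, and PMF's own test for an Incomplete calculation is not described here.

```python
# Hypothetical set of operations that can sit between two operands.
OPERATORS = {"+", "-", "*", "/"}


def is_complete(formula):
    """Return True when operands and operators strictly alternate.

    A complete formula starts and ends with an operand (a datapoint name
    or a constant), so it always has an odd number of items.
    """
    if len(formula) % 2 == 0:
        return False  # empty, or an operand or operator is missing
    for position, item in enumerate(formula):
        expects_operator = position % 2 == 1
        if (item in OPERATORS) != expects_operator:
            return False
    return True
```

For example, `["Sales", "-", "Total Cost"]` passes the check, while `["Sales", "Total Cost"]` and `["Sales", "-"]` do not.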

Procedure: How to Change Datapoints

To change a derived datapoint:

In the Manage tab, click the Datapoints panel button.
Select the derived datapoint you want to change. The Edit Datapoint panel opens.
Make your desired edits. You can change anything in a derived datapoint, including the name and its formula.
Note:
- Datapoints are included in formulas and linked to measures by reference, so renaming them changes their name through the entire system.
- Altering the formula for a derived datapoint automatically flags its data, and any later generations in the lineage for that datapoint, including child derived datapoints and measure values, for a one-time wipe. If the data is also scheduled for reload, PMF performs that load after wiping the data.

Procedure: How to Copy Derived Datapoints

You can make an exact copy of any existing derived datapoint. After making the copy, you can immediately alter it as needed. To copy a datapoint:

From the Manage tab, click the Datapoints panel button.
Select the derived datapoint you want to copy. The Edit Datapoint panel opens.
Click Save As. You will be prompted for a new name for the derived datapoint, as shown in the following image.
Click Save. PMF makes an exact copy of the derived datapoint and loads the copy for editing. You can edit and save your changes at any time, and click Save As again if you want to make more copies.

Procedure: How to Wipe Derived Datapoint Data

All loaded data from a derived datapoint can be wiped out or deleted in a single operation, because it is not attached to a source.

Note: Wiping data affects downstream datapoints for the datapoint you wipe. Every datapoint downstream is marked as having incomplete components. Incomplete components do not participate in recalculation.

From the Manage tab, click the Datapoints panel button.
Select the datapoint whose data needs to be wiped. The Edit Datapoint panel opens.
Click the Wipe Data button. PMF will ask you to confirm the data purge.
Click OK.

Note: It may take PMF a moment to purge all of the data.
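
The downstream effect of a wipe can be pictured as a walk along the lineage. The sketch below is an illustration under stated assumptions, not PMF's implementation: the `feeds` mapping, the datapoint names, and the function `downstream_of` are hypothetical.

```python
from collections import deque

# Hypothetical lineage: each datapoint maps to the datapoints it feeds.
feeds = {
    "Cost of Supply": ["Total Cost"],
    "Cost of Labor": ["Total Cost"],
    "Total Cost": ["Profit"],
    "Sales": ["Profit"],
    "Profit": [],
}


def downstream_of(wiped, feeds):
    """Collect every datapoint that directly or indirectly uses `wiped`."""
    incomplete = set()
    queue = deque(feeds.get(wiped, []))
    while queue:
        name = queue.popleft()
        if name in incomplete:
            continue  # already reached through another path
        incomplete.add(name)
        queue.extend(feeds.get(name, []))
    return incomplete
```

With this sample lineage, wiping Cost of Labor would leave both Total Cost and Profit with incomplete components, while wiping Total Cost would affect only Profit.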

Creating Calculated Measures With Derived Datapoints

PMF allows you to create an unlimited number of calculations for your measures using special datapoints that store and process calculations, known as derived datapoints. These calculations can be based on one or more existing datapoints, of any kind, including loadable, user-entered, generated, and other derived datapoints. Note the following:

If you create a derived datapoint that uses data from loadable, user-entered, or generated datapoints, PMF will recalculate the results every time the data for these are changed. The data goes through the lineage, through all of your steps of calculation, until it is copied to any measures linked to your datapoints.
If you create a derived datapoint that uses data from another derived datapoint, PMF knows that the “parent” derived datapoint must be calculated before calculating your new derived datapoint. Logic built into PMF understands that calculations must use generations in the datapoint lineage.
Recalculating a complex lineage chain through possibly hundreds of thousands or millions of row values can be an expensive operation, so you have full control over how much of this calculation is performed during normal processing hours.

Note: Because PMF is used less during scheduled load cycles (usually overnight), recalculation can always go through the entire lineage at those times.
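
The parent-before-child rule amounts to a topological ordering of the lineage. As a minimal sketch, assuming a hypothetical `parents` mapping from each derived datapoint to the datapoints its formula uses, Python's standard `graphlib` module can produce one valid order. How PMF actually orders its recalculation is not shown here.

```python
from graphlib import TopologicalSorter

# Hypothetical lineage: each derived datapoint maps to the datapoints
# its formula uses. Datapoints that appear only as inputs are first generation.
parents = {
    "Total Cost": {"Cost of Supply", "Cost of Labor"},
    "Profit": {"Sales", "Total Cost"},
}

# static_order() yields every datapoint after all of its inputs.
calculation_order = list(TopologicalSorter(parents).static_order())
```

In any order this produces, Total Cost appears before Profit, and each first-generation datapoint appears before the derived datapoints that use it.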

Reference: Previewing Derived Datapoints

You can preview the data that PMF will generate by clicking the Preview tab, as shown in the following image.

Preview tab

The Preview tab generally shows rows that are new, or will be updated or deleted.

Tips:

The Preview tab shows data to be handled before any operations occur. It divides this data up into the following sections, based on what will happen to the rows that are displayed:
- New rows to be created by generation.
- Rows to be updated by generation.
- Rows to be deleted by generation (depending on the Wipe Data setting on the Advanced tab).
- Rows that will be kept but whose values do not match the new data to be generated (depending on the Wipe Data setting on the Advanced tab).
Opening the Preview tab will force the navigation bar to close, in order to use as much screen width as possible. You can reopen it by clicking the expand button at the top of the navigation bar.
You can resort the preview contents in any order by clicking the column headings. Note that the Preview will show data based on the display limit set in the Load settings. To change this value, see Load Settings.
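
One way to picture the four Preview sections is as a comparison between the rows already stored and the rows about to be generated. The sketch below is an interpretation, not PMF's logic: it assumes that existing rows with no counterpart in the new data are deleted when Wipe Data is on and kept otherwise. The function `preview` and its arguments are hypothetical.

```python
def preview(existing, incoming, wipe_data):
    """Sort rows into the four preview sections.

    `existing` and `incoming` map a dimensional key to a value.
    """
    sections = {"new": {}, "updated": {}, "deleted": {}, "kept": {}}
    for key, value in incoming.items():
        if key not in existing:
            sections["new"][key] = value
        elif existing[key] != value:
            sections["updated"][key] = value
    for key, value in existing.items():
        if key not in incoming:
            # Assumption: the Wipe Data setting decides the fate of these rows.
            sections["deleted" if wipe_data else "kept"][key] = value
    return sections
```

Rows whose key and value are identical in both sets fall into no section, which matches the idea that the Preview tab shows only rows that are new or will change.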

Reference: Lineage and Recalculation With Derived Datapoints

Derived datapoints can have a complex, multi-part lineage, depending on their relationship to other derived datapoints.

In the lineage directionality of derived datapoints, the data in derived datapoints always progresses to the left, from first-generation datapoints (loadable, user-entered, and generated) toward measures.
PMF automatically handles figuring out the generation of each derived datapoint in the lineage, by analyzing the first point in the lineage where a derived datapoint sends its data onward.
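
As a rough illustration of generations, the sketch below numbers each datapoint by its distance from the first-generation datapoints. The rule used here (one more than the highest generation among a datapoint's inputs) is an assumption chosen for the example and may differ from the analysis PMF performs. The `lineage` mapping and the function `generation_of` are hypothetical.

```python
def generation_of(name, parents, cache=None):
    """Return 1 for a datapoint with no inputs, else 1 + its deepest input."""
    if cache is None:
        cache = {}
    if name not in cache:
        inputs = parents.get(name, ())
        if not inputs:
            cache[name] = 1  # loadable, user-entered, or generated
        else:
            cache[name] = 1 + max(
                generation_of(parent, parents, cache) for parent in inputs
            )
    return cache[name]


# Hypothetical lineage: derived datapoint -> the datapoints it uses.
lineage = {
    "Total Cost": ["Cost of Supply", "Cost of Labor"],
    "Profit": ["Sales", "Total Cost"],
}
```

With this sample lineage, Sales is generation 1, Total Cost is generation 2, and Profit is generation 3.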

Reference: Derived Datapoint Lineage Tab

You can view the full lineage for any derived datapoint. Lineage shows the progress of data through PMF, from the external data harvested into datapoints, through any derived datapoints, and finally to all terminal points in Measures. The Lineage tab displays the components in the generated source by default, as shown in the following image.

The Lineage tab automatically displays the entire lineage. You can click the Collapse All button to hide the entire lineage.

Reference: Derived Datapoint Load History

PMF keeps track of each load that is executed for each derived datapoint in the system, regardless of whether you loaded it manually or the load was called by the scheduler. This data is stored in a special logging section of the PMF data mart.

The History tab on each derived datapoint displays the history of all loads that have been logged.

The history of the derived datapoint shows:

The dates that the loads ran.
The count of rows that were retrieved, inserted, updated, and deleted.
The count of total mismatches that occurred between the source data and the PMF metrics mart. Mismatches are source data rows that do not match to any existing keys for one or more dimensions.
The count of gaps in data continuity, which indicate the sparsity of the data. This does not mean there are errors but, if paired with mismatches, can help you debug any unexpected data discontinuities.
Any messages returned from the load system. If there is an error, the exact error is displayed in the information shown in this tab.
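
To make the mismatch and gap counts more concrete, here is a small sketch. The row layout, the `dimension_keys` lookup, and the gap rule (missing period numbers between the first and last matched period) are all assumptions made for illustration. PMF's own counting rules are not spelled out in this section.

```python
# Hypothetical keys that already exist for each dimension.
dimension_keys = {
    "product": {"Widget", "Gadget"},
    "location": {"East", "West"},
}

# Hypothetical source rows; period is a simple sequence number.
source_rows = [
    {"product": "Widget", "location": "East", "period": 1, "value": 10},
    {"product": "Gizmo", "location": "East", "period": 2, "value": 7},    # unknown product
    {"product": "Widget", "location": "North", "period": 2, "value": 3},  # unknown location
    {"product": "Widget", "location": "East", "period": 4, "value": 12},
]


def is_mismatch(row):
    """A row mismatches when any of its dimension keys is unknown."""
    return any(row[dim] not in keys for dim, keys in dimension_keys.items())


mismatches = [row for row in source_rows if is_mismatch(row)]
matched = [row for row in source_rows if not is_mismatch(row)]

# Assumed gap rule: count the period numbers missing between matched rows.
periods = sorted({row["period"] for row in matched})
gaps = sum(later - earlier - 1 for earlier, later in zip(periods, periods[1:]))
```

In this sample, the two mismatched rows are also the reason periods 2 and 3 are missing from the matched data, which is the kind of pairing the History tab can help you spot.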

Loadable Datapoints

Loadable Sources manage Loadable datapoints. You can drill into any datapoint on a Loadable Source to view the specifics about the datapoint, or you can access Loadable datapoints from the separate panel button for them.

Generated Datapoints

In this section:

Promoting a Generated Datapoint

How to:

Create a Generated Datapoint

Reference:

Lineage and Recalculation With Generated Datapoints

Generated datapoints enable PMF to create sample data for your models. With generated datapoints, you can:

Tell PMF the maximum and minimum values to generate.
Specify which dimensional intersections should contain the generated data.
Use different sampling methods to generate the data.

Generated datapoints are designed for the following situations:

When you need to demonstrate metrics in dashboards, but have only a rough idea of what the data should look like.
When a sponsor can give you more specific guidelines as to the data they want to see, but you do not want to spend hours modeling the data in a tool.
When you are creating a new metrics model, and want to spend your time on it, rather than on the data.

Important: Generated data should never be treated as real performance data. PMF 5.3.2 does not yet mark generated data as “unreal,” so use generated datapoints only for non-production work.

Procedure: How to Create a Generated Datapoint

To create a generated datapoint:

In the Manage tab, click the Datapoints panel button.
Click New.
The New Datapoint panel opens.
Select Generated from the first drop-down menu.
Name the new datapoint.
Click the Dimensions tab and specify the dimensions and levels for which PMF will generate data, as shown in the following image.

Setting dimensions affects some options on the Rules tab, so if you know the dimensions you want to use for generating, set them first.
Click the Rules tab and specify the rules PMF should use to generate data. The following options are available:
Decimal Format
Specifies the decimal format of the data generated:
- The first character can be D (Decimal) or I (Integer).
- The next characters are numbers to specify the total length of the field.
- You can indicate a period and number of digits of decimal precision.
Examples of typical decimal formats are: D12.2, I8, D20.6, and I32.
Method
Controls how PMF will calculate the sample values:
- Normal (Bell Curve) Distribution. PMF generates a range of values that favors the center of the numeric range you type in under Lower/Upper Bounds.
- Uniform Random Distribution. PMF generates an even distribution of values that favors no point in the numeric range.
Lower/Upper Bounds

The lowest and highest number for the range of possible values PMF will generate. The numbers will be formatted using the mask you entered in the Decimal Format field.

Data Sparsity
Controls the amount of data PMF generates by letting you focus the data on dimensional choices:
- None. Generates a Cartesian cross-product of all possible dimension values.
- Dimensional Filters. You can specify filters for the dimension levels for the generated datapoint. To specify the filters, select this option and use the drop-down menus, as shown in the following image.
- Train. You can base the dimension level values along which PMF generates data on another datapoint. This lets you keep a limited amount of data together.
  You can train from any datapoint: a loadable datapoint, user-entered datapoint, derived datapoint, or another generated datapoint.
Recalculate all Derived Datapoints

This option should remain enabled, unless you have a very large data mart and want to reserve recalculation for overnight or other offline processing.

Note: This option is enabled by default. To disable it, see Load Settings.

Description

A description of the datapoint.
Click Save. If minimum necessary entries are not set up to generate data, PMF will mark the generated datapoint as incomplete. Incomplete components do not participate in recalculation.
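
The Decimal Format, Method, and Lower/Upper Bounds rules can be approximated outside PMF with a short sketch. Everything here is an assumption made for illustration: the regular expression, the function names, and the choice to center the bell curve on the midpoint of the range with a standard deviation of one sixth of its width. PMF's actual generator is not documented in this section.

```python
import random
import re

# Format mask: D or I, a total length, and optional decimal digits.
FORMAT_PATTERN = re.compile(r"^([DI])(\d+)(?:\.(\d+))?$")


def parse_format(mask):
    """Split a mask such as 'D12.2' into (kind, length, decimals)."""
    match = FORMAT_PATTERN.match(mask)
    if match is None:
        raise ValueError(f"Unrecognized decimal format: {mask!r}")
    kind, length, decimals = match.groups()
    return kind, int(length), int(decimals or 0)


def sample_value(method, lower, upper, mask, rng):
    """Draw one value between the bounds and round it to the mask."""
    kind, _length, decimals = parse_format(mask)
    if method == "uniform":
        value = rng.uniform(lower, upper)
    elif method == "normal":
        # Assumed parameters: favor the center, then clamp to the bounds.
        value = rng.gauss((lower + upper) / 2, (upper - lower) / 6)
        value = min(max(value, lower), upper)
    else:
        raise ValueError(f"Unknown method: {method!r}")
    return round(value) if kind == "I" else round(value, decimals)


rng = random.Random(0)  # seeded so the sketch is repeatable
normal_samples = [sample_value("normal", 0, 100, "D12.2", rng) for _ in range(1000)]
uniform_samples = [sample_value("uniform", 0, 100, "I8", rng) for _ in range(1000)]
```

The normal samples cluster around 50, while the uniform samples spread evenly across the range, which mirrors the difference between the two Method options.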

Tips:

When generating random data for generated datapoints, PMF will wipe all existing data before regenerating it using your new rules. Generally, with generated datapoints, the Preview tab is useful for new data you are planning to generate into an empty datapoint, or when changing rules. If your rules have not changed, and data is showing up on Preview as 100% added and deleted, you should not regenerate data.
If you are modeling a new metrics system without having any real data to work from, you will have no data loaded into PMF to train from. In this situation, set the Dimensional Filters option on your first generated datapoint, specify the dimensions, and have PMF generate the data. You can then train all of your other generated datapoints on that one datapoint.
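
The three Data Sparsity options differ in which dimensional intersections receive generated values. The sketch below illustrates that difference under stated assumptions: the `dimensions` dictionary, the `filters` and `train_from` arguments, and the function `intersections` are hypothetical and do not reflect PMF's internal logic.

```python
from itertools import product

# Hypothetical dimension levels and their values.
dimensions = {
    "product": ["Widget", "Gadget"],
    "location": ["East", "West"],
    "period": ["2024-Q1", "2024-Q2"],
}


def intersections(dimensions, filters=None, train_from=None):
    """Return the dimensional intersections that would receive data.

    - No arguments: the full Cartesian cross-product (the None option).
    - filters: keep only the listed values per dimension (Dimensional Filters).
    - train_from: reuse the intersections of another datapoint (Train).
    """
    if train_from is not None:
        return sorted(train_from)
    filters = filters or {}
    levels = [
        [value for value in values if value in filters.get(name, values)]
        for name, values in dimensions.items()
    ]
    return list(product(*levels))


everything = intersections(dimensions)
widgets_only = intersections(dimensions, filters={"product": ["Widget"]})

# A hypothetical first datapoint whose intersections others are trained on.
first_datapoint = {
    ("Widget", "East", "2024-Q1"): 42,
    ("Widget", "West", "2024-Q2"): 17,
}
trained = intersections(dimensions, train_from=first_datapoint)
```

Here the unfiltered option yields eight intersections, the product filter yields four, and training on the first datapoint yields only the two intersections it already contains.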

Reference: Lineage and Recalculation With Generated Datapoints

Generated Datapoints are primary sources for data in PMF. They are treated as first generation in any lineage, along with loadable datapoints and user-entered datapoints.

Promoting a Generated Datapoint

Generated datapoints are used when you lack real data to prove that your model works, or they are needed for a demonstration. Once you are ready to use a working model with real data, generated datapoints are no longer necessary to feed your metrics.

To promote the generated datapoint:

Add it to a loadable source so that its data can be harvested from an existing file, table, or view.
Add it to a user-entered source, so that the data can be collected from end users.