What Are Sources?

A source allows you to map inbound data to PMF elements, called datapoints, that will become elements in measures, such as Actual, Target, and so on.

There are currently three types of sources available: loadable, user-entered, and generated.

Note:


Procedure: How to Wipe Source Data

You can wipe all loaded data from a loadable, user-entered, or generated source in a single operation. This is useful when you have loaded data that is invalid and want to delete all of it at once.

  1. From the Manage page, click the Sources panel button.
  2. Select the source whose data you want to delete. The Edit Source panel opens.
  3. Click the Wipe Data button. PMF will ask you to confirm the data purge.
  4. Click OK.

Note: It may take PMF a moment to purge all of the data.



Reference: Lineage Tab

You can view the lineage for all datapoints for any loadable, user-entered, or generated source by selecting the Lineage tab in the New Source or Edit Source panel. Lineage shows the progress of data through PMF, starting from the external data harvested into datapoints, through any derived datapoints, and finally, all terminal points in measures.

The following image shows an example of the Lineage tab for a loadable source.

Lineage tab

The Lineage tab automatically displays the entire lineage. You can click the Collapse All button to hide the entire lineage.
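
One way to picture lineage is as a directed graph that runs from external data, through datapoints and derived datapoints, to measures. The following Python sketch only illustrates that idea and is not how PMF stores or computes lineage. The node names and the terminal_points helper are hypothetical.

```python
# Illustrative model of lineage as a directed graph; all names are hypothetical.
from collections import deque

# Each key feeds the nodes in its list:
# external data -> datapoints -> derived datapoints -> measures.
LINEAGE = {
    "ext.sales_table": ["dp.revenue", "dp.cost"],
    "dp.revenue": ["derived.margin", "measure.revenue_actual"],
    "dp.cost": ["derived.margin"],
    "derived.margin": ["measure.margin_actual"],
}


def terminal_points(graph, start):
    """Return, sorted, every node reachable from start that feeds nothing else."""
    seen, queue, terminals = {start}, deque([start]), set()
    while queue:
        node = queue.popleft()
        children = graph.get(node, [])
        if not children and node != start:
            terminals.add(node)
        for child in children:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(terminals)


print(terminal_points(LINEAGE, "ext.sales_table"))
# ['measure.margin_actual', 'measure.revenue_actual']
```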



Reference: Load History of Sources

PMF keeps track of each load that is executed for each source in the system, regardless of whether you loaded it manually or the load was called by the scheduler. This data is stored in a special logging section of the PMF data mart.

The History tab on each loadable, user-entered, or generated source displays the history of all loads that have been logged, as shown in the following image.

History tab

The history of the source shows:

  • The dates that the loads ran.
  • The count of rows that were retrieved, inserted, updated, and deleted.
  • The count of total mismatches that occurred between the source data and the PMF metrics mart. Mismatches are source data rows that do not match to any existing keys for one or more dimensions.

    The total count of mismatches for generated sources should always be zero.

  • The count of gaps in data continuity, which indicate the sparsity of the data. This does not mean that there are errors, but, if paired with mismatches, this can help you debug any unexpected data discontinuities.

    The total count of gaps in data continuity for generated sources should always be zero.

  • Any messages returned from the load system. If there is an error, the exact error is displayed in the information shown in this tab.

    Errors are unlikely for generated sources.
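
To make the mismatch and gap counts more concrete, here is a small Python sketch. It assumes that a mismatch is a row whose key is unknown for at least one dimension (as described above) and that a gap is an expected time period with no loaded row. The second definition is an assumption made for illustration; PMF may calculate gaps differently. All function and field names are hypothetical.

```python
# Hypothetical illustration of mismatch and gap counts; not PMF's actual logic.

def count_mismatches(rows, dimension_keys):
    """Count rows whose key is unknown for at least one dimension."""
    return sum(
        1
        for row in rows
        if any(row[dim] not in keys for dim, keys in dimension_keys.items())
    )


def count_gaps(rows, expected_periods):
    """Count expected time periods for which no row was loaded (assumed definition)."""
    loaded = {row["period"] for row in rows}
    return sum(1 for period in expected_periods if period not in loaded)


dimension_keys = {"region": {"EAST", "WEST"}, "product": {"A", "B"}}
rows = [
    {"period": "2024-01", "region": "EAST", "product": "A", "value": 10.0},
    {"period": "2024-03", "region": "NORTH", "product": "A", "value": 7.5},
]

print(count_mismatches(rows, dimension_keys))               # 1 (NORTH is not a known region)
print(count_gaps(rows, ["2024-01", "2024-02", "2024-03"]))  # 1 (2024-02 has no row)
```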



Working With Loadable Sources

Loadable sources in PMF:



Reference: Harvesting Loadable Sources

Data harvesting is the process of taking data from a source and processing it into information that is then loaded into one or more datapoints needed by the PMF source.

Note: The result of any data harvesting operation is always numeric.

Typical data harvesting actions are:

  • Summarizing aggregated values of an inbound numeric field.
  • Summarizing values of an inbound numeric field, but only if certain conditions are true. Otherwise, they are ignored.
  • Counting occurrences of a particular set of field conditions (Counting when).
  • Distinctly counting occurrences of a particular set of field conditions. Distinct counting is a method to prevent double-counting fixed assets, people, and so on.
  • Writing a custom load for a datapoint, where the logic is freeform. For example, it is possible to perform custom operations or calculations on the inbound data, or to implement more complex or nested filters. These use WebFOCUS programming logic, functions, and operations.

The PMF source loader allows all the data harvesting methods above.
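
As a rough illustration of the first four harvesting actions, the following Python sketch applies them to a handful of made-up rows. It is not PMF or WebFOCUS code, and the field names (asset_id, status, amount) are hypothetical. Note that every result is numeric.

```python
# Sketch of the four standard harvesting actions on hypothetical inbound rows.
rows = [
    {"asset_id": "T1", "status": "OPEN", "amount": 100.0},
    {"asset_id": "T1", "status": "OPEN", "amount": 40.0},
    {"asset_id": "T2", "status": "OPEN", "amount": 60.0},
    {"asset_id": "T3", "status": "CLOSED", "amount": 25.0},
]


def is_open(row):
    """The filter / When condition used in the examples below."""
    return row["status"] == "OPEN"


total = sum(r["amount"] for r in rows)                          # summarize
open_total = sum(r["amount"] for r in rows if is_open(r))       # summarize when condition is true
open_count = sum(1 for r in rows if is_open(r))                 # count when
open_assets = len({r["asset_id"] for r in rows if is_open(r)})  # distinct count

print(total, open_total, open_count, open_assets)  # 225.0 200.0 3 2
```

The distinct count is 2 rather than 3 because asset T1 appears twice among the open rows, which is the double-counting that distinct counting is meant to prevent.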

In PMF, all data harvesting is done across a potential Cartesian cross product of dimensional intersection. Nearly every load involves a fairly high degree of sparsity against this cross product, but in most cases, it also involves creating multiple records that follow a particular pattern against the source data.
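
The relationship between the potential cross product and the data that is actually loaded can be sketched in Python as follows. The dimension values and loaded intersections are made up for illustration, and the sparsity figure is simply the share of potential intersections that received no data.

```python
# Potential Cartesian cross product versus loaded intersections (hypothetical data).
from itertools import product

periods = ["2024-01", "2024-02", "2024-03"]
regions = ["EAST", "WEST"]
products = ["A", "B"]

# Every dimensional intersection a load could potentially populate.
potential = set(product(periods, regions, products))

# Intersections that actually received data: one record per period,
# following the same pattern (EAST, product A) each time.
loaded = {
    ("2024-01", "EAST", "A"),
    ("2024-02", "EAST", "A"),
    ("2024-03", "EAST", "A"),
}

sparsity = 1 - len(loaded & potential) / len(potential)
print(len(potential), sparsity)  # 12 0.75
```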

In many cases, depending on the degree of granularity of the inbound loadable data from the source, PMF requests the data to be aggregated, even when loaded at the lowest level of its dimensionality. This happens when the physical data table or view contains more detailed records that go to a lower logical level than the source requests. Some examples of this are:

  • The physical data of the source is stored at a daily or hourly level, but the source load needs it summarized at a monthly level.
  • There is a location dimension, and the source stores data at the geocode or ZIP code level, but the PMF source wants this data at the city or regional level.

In both examples, the PMF loader aggregates the inbound records from the source during the load process.
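
The first example, daily data summarized to a monthly level, can be sketched in Python like this. The rows and the aggregate_monthly helper are hypothetical, and the sketch assumes the values are summed; it does not show how the PMF loader is implemented.

```python
# Summing hypothetical daily rows into monthly buckets per region.
from collections import defaultdict
from datetime import date

daily_rows = [
    (date(2024, 1, 5), "EAST", 10.0),
    (date(2024, 1, 20), "EAST", 15.0),
    (date(2024, 2, 1), "EAST", 8.0),
    (date(2024, 1, 9), "WEST", 4.0),
]


def aggregate_monthly(rows):
    """Sum daily values into (year-month, region) buckets."""
    totals = defaultdict(float)
    for day, region, value in rows:
        totals[(day.strftime("%Y-%m"), region)] += value
    return dict(totals)


monthly = aggregate_monthly(daily_rows)
print(monthly[("2024-01", "EAST")])  # 25.0
```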



Reference: Lineage and Recalculation With Loadable Sources

Loadable sources are primary sources for data in PMF. PMF treats them as first generation in any lineage, along with user-entered datapoints and generated datapoints.

Loadable data is treated as updated on the date of load, but is effective as of the time dimension linkages for the data.



Procedure: How to Set Up Loadable Sources and Datapoints

To set up a new loadable source:

  1. From the Manage page, click the Sources panel button.
  2. Click New.

    The New Source panel opens.

  3. Name your new source and select Harvested from Data from the first drop-down menu, as shown in the following image. This option lets PMF know that you want this source to harvest data from an existing table, view, flat file, and so on.
    New Source panel
  4. Select the metadata file for the existing table from the second drop-down menu. If the file you want to use is not listed, expand the File Picker by selecting Show More... from the drop-down menu.

    The File Picker allows you to look at all of the available metadata files in your currently configured WebFOCUS app path. The icons to the left of each file name let you explore the contents of the file to make sure that it is the correct one. To select the file to use, click the name once and close the File Picker.

  5. Specify the datapoints to be harvested. For each line where you want to add another field from the source, you can do the following:
    • To harvest by summarizing the values of the field, select the field you want to harvest from the Source field name drop-down menu. You can use the default name assigned to the datapoint, or change the name at any time.
    • To harvest by summarizing the values of the field only when conditions are matched, select the field you want to harvest from the Source field name drop-down menu and create filters by clicking the Filters button to choose your filters.
    • To harvest by counting, select [Count] from the Source Field Name drop-down menu and specify the conditions to count by clicking the When button. You are required to provide a name for the datapoint.
    • To harvest by counting distinctly, select [Distinct Count] from the Source Field Name drop-down menu. Specify the conditions to count by clicking the When button, and specify the matching field for distinct counting. You are required to provide a name for the datapoint.
    • To create a custom formula for calculating the field during harvesting, select [User Defined]. You are required to provide a name for the datapoint. Specify filters to be used by clicking the Filters button and specify the WebFOCUS code-based formula for harvesting the field by clicking the Code button.

      PMF creates the following template code:

      SRC_DATA_COLnn/D20.2 MISSING ON = MISSING ;

      where:

      nn is the one-up (sequential) datapoint number within the source. Do not change the left side of the code. The right side must be syntactically correct FOCUS code.

      Note: If you are coding an advanced Global Filter, you must use the WHERE verb:
      WHERE (M_NAME EQ 'COGS')

    Your datapoints should look similar to the following image.

    Datapoints

    Note: You can save your work-in-progress at any time. PMF will not be able to actually harvest data into the datapoints that were set up until the needed harvesting details and dimensionality have been specified.

  6. Click the Dimension Links tab and specify Time, as well as other dimensional linkages, as shown in the following image.
    Dimension Links tab

    Note: The fields you select for dimensional linkages must contain values that match up to those you loaded for the dimension keys in the Dimension Loader. PMF will alert you if there is an issue.

  7. Once the dimensional linkages are set up, click the Preview tab to see the data that is being set up for harvesting.
    Note:
    • The Preview pane is very flexible and shows the row values that will be added, changed, or deleted from your source, in separate sections.
    • You can re-sort the contents of the Preview pane by clicking a column heading. The sort toggles between ascending and descending order.
    • You can scroll the Preview pane vertically and horizontally, and drag the navigation bar divider to make more room onscreen.
    • You can refresh the Preview pane at any time by clicking the Refresh button on the Preview tab.
    • You can switch back and forth between the Datapoints, Dimension Links, and Preview tabs to alter the patterns you are using for harvesting, adjust the linkage keys and dimensions that you are linking to, and refresh the preview, as needed.
  8. Click Save before concluding your session to ensure that the datapoints and source you have set up are correctly stored in the system. After saving the data, you can click Load to harvest initial data for your new source.


Procedure: How to Update a Loadable Source

To update a loadable source:

  1. From the Manage page, click the Sources panel button.
  2. Select the source you want to update. The Edit Source panel opens.
  3. Make the desired changes. You can rename the datapoints, change their field linkages, format, filters, when conditions, or code. You can also revise dimensional linkages and change Advanced properties of the source.
  4. Click Save when you are done making edits. PMF will perform the actions on the source and/or save the changes into each datapoint for the source.


Procedure: How to Change the File, Table, or View Used for a Loadable Source

If the name of the Master File or other physical connection information used by a loadable source changes, you can adapt those changes into the source in PMF.

  1. From the Manage page, click the Sources panel button.
  2. Select the source you want to edit. The Edit Source panel opens.
  3. Select the new Master File for the new data source you want to use from the drop-down menu. PMF will automatically clear the fields from the data source, so you can choose the correct new ones from your new source.
  4. Click Save to save your changes.


Working With User-Entered Sources

User-entered sources in PMF:

User-entered sources let you collect data from groups of end users.



Procedure: How to Create a User-Entered Source

  1. From the Manage page, click the Sources panel button.
  2. Click New.

    The New Source panel opens.

  3. Name the source, and from the first drop-down menu, select Collected from Users. This lets PMF know that the source will collect data entered by users.
  4. Define each user-entered datapoint for this source by entering a name and format for each datapoint. You can quickly enter this information by pressing the Tab key to move from field to field. You can create multiple user-entered sources, and each user-entered source can represent all the datapoints you would collect from a particular user population.

    For example, if you are collecting input from HR staff for HR metrics, you could create a user-entered source called HR Input, and in that source, you could define the datapoints to be collected from that group.

    For each datapoint you define, you can also define the numeric validation format. The PMF input facility will enforce that data format during collection, as shown in the following image.

    Managed Datapoints tab

    Note: You can save your work and leave the session at any time. If all the steps are not completed, the Source panel will mark this source as incomplete. End users will not be able to enter data until the source is complete, and incomplete components do not participate in recalculation.

  5. Click the Dimension Links tab and define the dimensionality to be collected for the user-entered datapoint. Select the dimensions and levels to link for each, as shown in the following image.
    Dimension Links tab
  6. Click Save.
  7. Click the Enter Data tab to view the data structure or input data, as shown in the following image.
    Enter Data tab

    If data entry is already displayed, click the Refresh button to refresh the contents in the Enter Data tab.

    In this tab, PMF displays the rows that are ready to receive data from input and allows you to enter data.

    • You can sort the contents of the Enter Data tab by clicking the column headings. The arrow next to the heading tells you the direction of the sort.
    • You can click on any row or column to highlight it and get a more detailed view.
    • You can resize the columns by dragging their borders in the headings.

    Click the small Save button within the Enter Data tab to save the data you entered into the individual datapoints.

Note:
  • If you leave any open boxes blank, PMF will not create an intersection for them. That intersection will be treated as a NULL (that is, MISSING).
  • Zeros entered will be treated as entered data, and an intersection will be created.
  • If you blank out an entered number, its intersection will be deleted when you click Save.
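
The three rules in this note can be sketched in Python as follows. This is only an illustration of the described behavior, not PMF code. The apply_entries function is hypothetical, and None stands in for a box that is blank when you click Save.

```python
# Sketch of how blanks and zeros affect intersections on Save (hypothetical model).

def apply_entries(stored, entered):
    """Return the saved intersections after a Save.

    stored:  {intersection_key: value} currently saved
    entered: {intersection_key: value or None}, where None means a blank box
    """
    result = dict(stored)
    for key, value in entered.items():
        if value is None:
            # Blank: no intersection is created, and an existing one is deleted.
            result.pop(key, None)
        else:
            # Any number, including zero, is entered data and creates an intersection.
            result[key] = value
    return result


stored = {("2024-01", "HR"): 12.0}
entered = {
    ("2024-01", "HR"): None,  # previously entered number blanked out
    ("2024-02", "HR"): 0.0,   # explicit zero
    ("2024-03", "HR"): None,  # left blank
}
print(apply_entries(stored, entered))  # {('2024-02', 'HR'): 0.0}
```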


Reference: Lineage and Recalculation With User-Entered Sources

User-entered data differs from standard loaded data in the following ways:

  • It is dependent on users supplying their data. If users fail to supply their data, no data will be available for the datapoints, or downstream, to be used in any calculations for derived datapoints, or for copying into measures.
  • The timing of the data entry is unpredictable. It is therefore difficult to predict when the automatic population of the data should be done.

    As a result, you need to schedule recalculation more frequently.

Note: User-entered data is treated as updated on the date of entry, and the downstream datapoints and measure copies are treated as loaded on the day they were scheduled to update.



Updating User-Entered Sources

You can update user-entered sources at any time. PMF adjusts existing user-entered data for the datapoints in a user-entered source as follows:

  • Changing the name for any datapoint causes the datapoint to be renamed, with no side effects.
  • Adding new datapoints causes them to be added to the source, and to take on the dimensional linkages already configured for that source.
  • Changing the format for any datapoint has no effect on the underlying data, because PMF stores all numbers internally in floating-point format. However, if you change the numeric format to remove or shorten the decimal portion (for example, changing the format from D12.2 to D12), PMF automatically truncates the decimal portion when the data is redisplayed for entry and then saved. This might affect the values shown for measures on various views and charts in PMF.
  • Adding dimensional linkages causes PMF to wipe the data. This is because PMF does not know how to reallocate data that has already been entered.
  • If the dimensional level is changed to a lower level for any linked dimension, it causes PMF to wipe the user-entered data for the datapoints. This is because PMF does not know how to reallocate the rolled-up data. The individual dimension-linked values will be lost.
  • If the dimensional level is changed to a higher level, PMF attempts to roll the data up. Once the rollup is completed, changing the level back to a lower one will cause PMF to wipe the data, for the same reason as above.
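
The following Python sketch shows why a roll-up to a higher level cannot be reversed. It assumes the roll-up sums the lower-level values, which the text above does not state, and it uses a hypothetical city-to-region mapping.

```python
# Rolling city-level values up to regions (assumes the roll-up is a sum).
from collections import defaultdict

# Hypothetical mapping from a lower level (city) to a higher level (region).
CITY_TO_REGION = {"Boston": "EAST", "New York": "EAST", "Seattle": "WEST"}


def roll_up(city_values):
    """Sum city-level values into their parent regions."""
    regional = defaultdict(float)
    for city, value in city_values.items():
        regional[CITY_TO_REGION[city]] += value
    return dict(regional)


city_values = {"Boston": 30.0, "New York": 50.0, "Seattle": 20.0}
print(roll_up(city_values))  # {'EAST': 80.0, 'WEST': 20.0}
```

The EAST total of 80.0 no longer records how much came from Boston and how much from New York, so there is no way to reallocate it if the level is later changed back to city.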


Working With Generated Sources

Generated sources in PMF enable you to:

Generated sources enable PMF to automatically create sample data for your models. They should be used in the following situations:

Note: Data that is generated should not be treated as real performance data. PMF cannot distinguish between generated and performance data, so use generated sources only for non-production work.



Procedure: How to Create a Generated Source

  1. From the Manage page, click the Sources panel button.
  2. Click New.

    The New Source panel opens.

  3. Name the source, and from the first drop-down menu, select Generated (sample data). This indicates to PMF that the source will be generated automatically.
  4. Define each datapoint that you want to create. As you enter each datapoint, PMF will set default rules to be used between the generated source and datapoint. You can quickly enter this information by pressing the Tab key to move from field to field.

    If you choose to specify the rules that PMF should use to generate data, the following options are available:

    Decimal Format

    Specifies the decimal format of the data generated:

    • The first character can be D (Decimal) or I (Integer).
    • The next characters are numbers to specify the total length of the field.
    • You can indicate a period and number of digits of decimal precision.

    Examples of typical decimal formats are: D12.2, I8, D20.6, and I32.

    Method

    Controls how PMF will calculate the sample values:

    • Normal (Bell Curve) Distribution. PMF generates a range of values that favors the center of the numeric range you type for Lower/Upper Bounds.
    • Uniform Random Distribution. PMF generates an even distribution of values that favors no point in the numeric range.
    Lower/Upper Bounds

    The lowest and highest number for the range of possible values PMF will generate. The numbers will be formatted using the mask you entered in the Decimal Format field.

    Note: You can save your work and leave the session at any time. If minimum necessary entries are not set up to generate data, PMF will mark the generated source as Incomplete. Incomplete sources and their datapoints do not participate in recalculation. You need to complete the source to allow its datapoints to participate, or you will not be able to get the data published using measures.

  5. Click the Dimension Links tab and define the dimensions and levels for which PMF will generate data, as shown in the following image. Because these choices affect some options in the Rules tab, perform this step next if you already know the dimensions you want to use for generating.

    Note: If you are setting up a trained generated source, you do not have to specify dimension links because they will be inherited from the source.

    Data Sparsity

    The following option is available:

    Data Sparsity

    Controls the amount of data PMF generates by letting you focus the data on dimensional choices:

    • None. Generates a Cartesian cross-product of all possible dimension values.
    • Dimensional Filters. You can specify filters for the dimension levels for the generated datapoint. To specify the filters, select this option and use the drop-down menus.
    • Train. You can base the dimension level values along which PMF generates data on another datapoint. This lets you keep a limited amount of data together.

      You can specify any datapoint to train from a loadable datapoint, user-entered datapoint, derived datapoint, or another generated datapoint.

  6. Click Save.

Tip: When generating random data for generated datapoints, PMF will wipe all existing data before regenerating it using your new rules. Generally, with generated datapoints, the Preview tab is useful for new data you are planning to generate into an empty datapoint, or when changing rules. If your rules have not changed, and data is showing up on Preview as 100% added and deleted, you should not regenerate data.
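
To show how the Decimal Format, Method, and Lower/Upper Bounds options in the procedure above fit together, here is a minimal Python sketch. It is not PMF's generator. The parse_format and generate functions are hypothetical, and the way the bell curve is centered on the midpoint and clamped to the bounds is an assumption made for illustration.

```python
# Hypothetical sample-data generator driven by a format, a method, and bounds.
import random
import re

# A type letter (D or I), a total length, and optional decimal digits.
FORMAT_RE = re.compile(r"^([DI])(\d+)(?:\.(\d+))?$")


def parse_format(fmt):
    """Split a format such as 'D12.2' into (type, length, decimals)."""
    match = FORMAT_RE.match(fmt)
    if not match:
        raise ValueError(f"unrecognized format: {fmt!r}")
    kind, length, decimals = match.groups()
    return kind, int(length), int(decimals or 0)


def generate(fmt, method, lower, upper, rng):
    """Generate one sample value between lower and upper."""
    kind, _length, decimals = parse_format(fmt)
    if method == "uniform":
        value = rng.uniform(lower, upper)
    elif method == "normal":
        # Assumption: center the curve on the midpoint and clamp to the bounds.
        mean = (lower + upper) / 2
        sigma = (upper - lower) / 6
        value = min(max(rng.gauss(mean, sigma), lower), upper)
    else:
        raise ValueError(f"unknown method: {method!r}")
    return int(round(value)) if kind == "I" else round(value, decimals)


rng = random.Random(0)  # seeded so that repeated runs give the same samples
print(parse_format("D12.2"))  # ('D', 12, 2)
print([generate("D12.2", "normal", 0, 100, rng) for _ in range(3)])
```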



Procedure: How to Update a Generated Source

To update a generated source:

  1. From the Manage page, click the Sources panel button.
  2. Select the source you want to update. The Edit Source panel opens.
  3. Make the desired changes. You can rename or delete datapoints, change the generation method or values for any datapoint, or delete the entire generated source.
    Note:
    • Datapoints are included in formulas and linked to measures by reference. Renaming them will percolate through the entire system.
    • Altering the generation method formula or the range values for a generated source automatically flags its data, and any later generations in the lineage for that Generated Source (including child-derived datapoints and measure values) for a one-time wipe. If the data is also scheduled for reload, PMF performs that load after wiping the data.
  4. Click Save when you are done making edits. PMF will perform the actions on the source and/or save the changes into each datapoint for the source.


Reference: Lineage and Recalculation With Generated Sources

Generated sources are primary sources for data in PMF. As such, PMF treats these source types as first generation in any lineage, along with loadable and user-entered datapoints.



Promoting a Generated Source

Generated sources are used when you lack real data to prove that your model works, or they are needed for a demonstration. Once you are ready to use a working model with real data, generated sources are no longer necessary to feed your metrics.

To promote the generated source:

  • Add it as a loadable source so that its data can be harvested from an existing file, table, or view.
  • Add it to a user-entered source, so that the data can be collected from end users.


Reference: Previewing Generated Data

You can preview the data that PMF will generate by clicking the Preview tab.

The Preview tab generally shows rows that are new, or will be updated or deleted.

Tips:
  • The Preview tab shows data to be handled before any operations occur. It divides this data up into the following sections, based on what will happen to the rows that are displayed:
    • New rows to be created by generation.
    • Rows to be updated by generation.
    • Rows to be deleted by generation (depending on the Wipe Data setting on the Advanced tab).
    • Rows that will be kept, but whose values do not match the new data to be generated (depending on the Wipe Data setting on the Advanced tab).
  • Opening the Preview tab will force the navigation bar to close, in order to use as much screen width as possible. You can reopen it by clicking the expand button at the top of the navigation bar.
  • You can re-sort the preview contents in any order by clicking the column headings. Note that the Preview tab shows data based on the display limit set in the Load settings. To change this value, see Load Settings.
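
The way a preview divides rows into sections can be approximated by comparing the existing data with the data about to be generated. The Python sketch below is a simplified illustration, not PMF's logic. The classify_preview function and its wipe_data flag are hypothetical stand-ins, and the sketch assumes that "kept" rows are existing rows with no counterpart in the new data.

```python
# Hypothetical classification of preview rows by comparing old and new data.

def classify_preview(existing, generated, wipe_data=True):
    """Split rows into preview sections.

    existing, generated: {intersection_key: value}
    wipe_data: stand-in for the Wipe Data setting on the Advanced tab
    """
    new = {k: v for k, v in generated.items() if k not in existing}
    updated = {
        k: v for k, v in generated.items()
        if k in existing and existing[k] != v
    }
    # Existing rows that the new data does not cover (assumed meaning of "kept").
    leftover = {k: v for k, v in existing.items() if k not in generated}
    deleted = leftover if wipe_data else {}
    kept = {} if wipe_data else leftover
    return {"new": new, "updated": updated, "deleted": deleted, "kept": kept}


existing = {"a": 1, "b": 2, "c": 3}
generated = {"b": 2, "c": 4, "d": 5}
print(classify_preview(existing, generated))
# {'new': {'d': 5}, 'updated': {'c': 4}, 'deleted': {'a': 1}, 'kept': {}}
```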


Load Now Panel

The Load Now panel lets you automatically refresh the entire PMF cube from Source data. It runs the following:

Note: If any Dimensions or Sources are incomplete, the Load Now operation will fail. However, running it is a quick way to find out which Dimensions, Sources, Derived Datapoints, or Metrics you still need to complete.



Procedure: How to Perform a Load Now Operation

To perform a Load Now operation:

  1. From the Manage page, click the Data Mart subtab.
  2. Click the Load Now panel button. The Load Now panel opens.
  3. Click the Load Now button.

    A confirmation dialog box opens.

  4. Click Load Now. PMF performs the operation and displays a status log, as shown in the following image.

    Note: The log is based on polling the status of the Load Now operation over time. If a load operation takes a long time, multiple rows with the same information might be shown in the status log.

