Constraints

You can seldom trust the data you have: there may have been network problems during import, the program that generated the data may have been wrongly parameterized or given invalid input, or the device used to collect the data may have been operated outside its operating conditions. For these reasons, it is good practice to identify constraints and check them after import and after more complex transformations. You should also check user input, and if it might cause hard-to-discover problems in later phases, report the problem as early as you can.
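
As a rough illustration of checking constraints right after import, the following plain Python sketch tests each row against a set of per-column rules and reports every violation. It does not use any KNIME API; the column names, the value ranges, and the `find_violations` helper are all hypothetical.

```python
# Hypothetical constraints: each column name maps to a predicate on its value.
CONSTRAINTS = {
    "temperature_c": lambda v: isinstance(v, (int, float)) and -40 <= v <= 85,
    "sensor_id": lambda v: isinstance(v, str) and v != "",
}


def find_violations(rows):
    """Return (row_index, column) pairs for every value that breaks a constraint."""
    violations = []
    for index, row in enumerate(rows):
        for column, is_valid in CONSTRAINTS.items():
            # A missing column counts as a violation too.
            if column not in row or not is_valid(row[column]):
                violations.append((index, column))
    return violations


# Made-up rows standing in for freshly imported data.
imported = [
    {"sensor_id": "a1", "temperature_c": 21.5},
    {"sensor_id": "", "temperature_c": 19.0},
    {"sensor_id": "a3", "temperature_c": 140.0},
]
violations = find_violations(imported)
print(violations)
```

Reporting the row index together with the column name makes it easier to trace a problem back to its source before later phases obscure it.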

The Flow Control/Switches nodes can be used to enable parts of the workflow selectively. This is useful when the constraint check is not always required, when it is too time consuming to be on by default, or when you want to try correcting the wrong data. The loop-related nodes (Flow Control/Loop Support) are also useful: they help when multiple columns should be tested, and they can handle complex conditions.

[Screenshot: Constraints]

In the preceding screenshot, a flow variable comes from outside of the meta node, the Java Edit Variable (simple) node transforms it, and the result goes to the Counting Loop Start node, where it can be used to set the parameters.

The IF Switch node is not really helpful in this regard, but when you create mock/artificial test data, you can use it to specify whether that data should be merged with the normal data. The actual merge can be done by either the End IF node or one of the Concatenate nodes.

The CASE Switch node works similarly, with three possible states (outputs) and better support for flow variables in the switch condition. The join at the end of the case switch can either signal a possible error when more than one branch is active (End (Model) CASE), or just concatenate the active branches (End CASE).

The Java IF node and the Empty Table Switch node are more automated. They also take the state of the input at the branching node into account, not just at the join. The latter simply forwards the data to the first output port if the input is not empty (has rows); otherwise, it forwards the data to the second output port. The Java IF node, on the other hand, can use flow variables and other state (such as the current date or a random number generator) to select the first or the second port as the destination for the input.

For example, when you remove the rows that contain missing values and no rows remain, the Empty Table Switch node can give you an alternative path to handle that situation and still finish the execution of the workflow. The Row Filter node can also be used in combination with it to check whether a certain number of rows is available.
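
The routing logic of these two switches can be sketched in plain Python. This only illustrates the decision that is made, not the nodes' implementation: representing an inactive port as `None`, the function names, and the `row_limit` variable are assumptions made for the example.

```python
def empty_table_switch(rows):
    """Return (first_port, second_port); the inactive port is None."""
    if rows:
        return rows, None
    return None, rows


def java_if_like_switch(rows, choose_first):
    """Route rows using any zero-argument condition (a stand-in for the Java IF body)."""
    if choose_first():
        return rows, None
    return None, rows


# Remove rows with missing values, then branch on whether anything is left.
rows = [{"id": 1, "value": None}, {"id": 2, "value": None}]
complete = [row for row in rows if row["value"] is not None]
first_port, second_port = empty_table_switch(complete)

# A condition based on a hypothetical flow variable instead of the table itself.
flow_variables = {"row_limit": 10}
limited_first, limited_second = java_if_like_switch(
    rows, lambda: len(rows) <= flow_variables["row_limit"]
)
```

Here every row has a missing value, so the filtered table is empty and only the second port carries it; the alternative branch can then supply defaults or a report instead of failing.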

When you want to signal an error, the best option is the Breakpoint node because it was designed for this purpose. You specify whether an empty table, an active or inactive branch, or a certain flow variable value is the erroneous condition, and if it is satisfied, the execution of the node will fail.
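
The idea of failing as soon as an erroneous condition is met can be sketched as follows. This is plain Python rather than the Breakpoint node itself; `BreakpointError` and both helper functions are hypothetical names chosen for the illustration.

```python
class BreakpointError(RuntimeError):
    """Raised when the erroneous condition is met (hypothetical name)."""


def fail_if_empty(rows, message="Breakpoint: input table is empty"):
    """Pass the rows through unchanged, or fail if there are none."""
    if not rows:
        raise BreakpointError(message)
    return rows


def fail_if_variable_equals(variables, name, bad_value):
    """Fail if the flow-variable stand-in `name` has the erroneous value."""
    if variables.get(name) == bad_value:
        raise BreakpointError(f"Breakpoint: {name} == {bad_value!r}")
    return variables


# A non-empty table passes through untouched.
rows = fail_if_empty([{"id": 1}])

# An empty table stops the execution at this point.
try:
    fail_if_empty([])
    failed = False
except BreakpointError:
    failed = True
```

Failing at the point where the condition is detected keeps the error close to its cause, instead of letting an empty table travel further down the workflow.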

The Try and Catch Errors family of nodes in the Error Handling category is useful when you want to handle the failures of the nodes in an alternative way.

Obviously, a Java Snippet node can be used to signal an error if the condition does not require more context than a single row, but it is not ideal for collecting the "bad" rows. For this purpose, the Java Snippet Row Filter node is a better choice. When it is combined with the previous constructs, you can create complex error-handling scenarios.
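
The difference between signaling an error and collecting the "bad" rows can be illustrated with a small Python sketch. It is not the code you would write inside either node; the `split_rows` helper and the `age` column are assumptions for the example.

```python
def split_rows(rows, is_valid):
    """Separate rows into (good, bad) using a row-level predicate."""
    good, bad = [], []
    for row in rows:
        (good if is_valid(row) else bad).append(row)
    return good, bad


def age_is_plausible(row):
    """Row-level check that needs no context beyond the row itself."""
    return row["age"] is not None and 0 <= row["age"] <= 130


people = [
    {"name": "Ann", "age": 34},
    {"name": "Bob", "age": -2},
    {"name": "Cy", "age": None},
]
good, bad = split_rows(people, age_is_plausible)
print([row["name"] for row in bad])
```

Keeping the rejected rows in their own table lets you inspect or repair them later, whereas raising an error on the first failing row only tells you that something went wrong.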

Some of the metadata of a table can be converted to another table using the Extract Table Dimension and the Extract Table Spec nodes. The former just computes how many rows and columns there are, while the latter extracts the min-max values, types, and column names of the input table.
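
To make the kind of metadata involved more concrete, here is a minimal Python sketch that derives similar information from a list of rows. The output layout is an assumption and does not reproduce the tables the two nodes produce; in particular, the sketch computes bounds only for numeric values.

```python
def table_dimension(rows, columns):
    """Count the rows and columns of the table."""
    return {"rows": len(rows), "columns": len(columns)}


def table_spec(rows, columns):
    """Collect the name, observed types, and numeric bounds of each column."""
    spec = []
    for column in columns:
        values = [row[column] for row in rows if row.get(column) is not None]
        numeric = [
            v for v in values
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        spec.append({
            "column": column,
            "types": sorted({type(v).__name__ for v in values}),
            "min": min(numeric) if numeric else None,
            "max": max(numeric) if numeric else None,
        })
    return spec


columns = ["name", "score"]
rows = [
    {"name": "a", "score": 3},
    {"name": "b", "score": 7.5},
    {"name": "c", "score": None},
]
dimension = table_dimension(rows, columns)
spec = table_spec(rows, columns)
```

Because the result is itself tabular, the same checks that apply to data can be applied to it, for example to verify that a column's maximum stays below an expected limit.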

The Set Operator node can be used to compare different tables; for example, if you have possibly removed rows (with the Missing Value node), you can use the Breakpoint node to check whether the difference from the original table is an empty table.
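
The comparison described above amounts to a set difference followed by an emptiness check. The sketch below shows that idea in plain Python, with rows represented as tuples; it is an assumption-laden illustration and says nothing about how the Set Operator node is configured.

```python
def removed_rows(original, filtered):
    """Return the rows of `original` that no longer appear in `filtered`."""
    remaining = set(filtered)
    return [row for row in original if row not in remaining]


original = [("a", 1.0), ("b", None), ("c", 3.0)]
# Stand-in for the step that may have removed rows with missing values.
filtered = [row for row in original if None not in row]

difference = removed_rows(original, filtered)
if difference:
    print(f"{len(difference)} row(s) were removed: {difference}")
```

An empty difference means that nothing was removed; a non-empty one is the condition you would treat as erroneous and report.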
