Creating an iWay Service Manager Slider Package

A Slider application package is a directory structure compressed with zip that contains the artifacts necessary to deploy and run the application. To deploy iSM in YARN, we create a Slider package that contains the iSM application, some Slider configuration files, and scripts that Slider will use to manage the application. The completed package described in the following sections will look like this:

[iway7@sandbox iway7master]$ unzip -l iway7master.zip
Archive:  iway7master.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
     1171  11-04-2015 16:46   metainfo.xml
        0  08-20-2015 14:25   package/
        0  08-20-2015 14:26   package/files/
190909934  11-04-2015 17:44   package/files/iway7.tar.gz
        0  11-10-2015 17:51   package/scripts/
     1683  11-10-2015 17:51   package/scripts/iway7master.py
      487  11-04-2015 16:29   package/scripts/params.py
      530  11-04-2015 17:48   appConfig-default.json
      277  10-24-2015 15:36   resources-default.json
      530  11-04-2015 17:48   appConfig_testconfig.json
  1. Build the iSM application (a configuration or an iWay Integration Application (iIA)) that will be deployed into the Hadoop cluster. We recommend installing standalone iSM for Windows or Linux on a node in the cluster and using the iWay Integration Tools to develop, test, and deploy your application to the standalone installation. This makes it easier to determine the library dependencies and configuration requirements for the components used in the application.

    In the Slider package, we can define Java system properties that will be available to the running application. These can be used to set parameters within the iSM application. Use the _SREG iFL function to reference these system properties. For example, a system property might specify the output directory for Avro files. If the property were called myapp.avro.outputdir, it could be accessed by the Avro emitter as _SREG("myapp.avro.outputdir").

  2. Create a tarball (tar archive) of the iWay home directory, including all of the files that are required to run your application.
    tar cvfz iway7.tar.gz iway7/

    In most cases, many files from the iWay installation need not be included, for example, the etc/setup directory and its contents, or jar files in the lib directory that are not required by components in the application. If resources in the Hadoop cluster are limited, we suggest you test with the standalone application to determine which files can be omitted.
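
    The same tarball can also be built from Python when some files should be left out. The following is a minimal sketch, not part of the iWay tooling: the function names and the exclusion list are hypothetical, and etc/setup is used only because it is the example given above.

```python
import tarfile

# Hypothetical exclusion list: archive paths assumed not to be needed by the
# application. Adjust it after testing with the standalone installation.
EXCLUDED_PREFIXES = ("iway7/etc/setup",)


def skip_unneeded(tarinfo):
    """Return None to drop a member from the archive, or the member to keep it."""
    for prefix in EXCLUDED_PREFIXES:
        if tarinfo.name == prefix or tarinfo.name.startswith(prefix + "/"):
            return None
    return tarinfo


def build_tarball(iway_home, output_path):
    """Create a gzip-compressed tarball of iway_home, rooted at 'iway7/'."""
    with tarfile.open(output_path, "w:gz") as archive:
        # Returning None from the filter for a directory also skips its contents.
        archive.add(iway_home, arcname="iway7", filter=skip_unneeded)
    return output_path
```

    Calling build_tarball("iway7", "iway7.tar.gz") from the parent of the iWay home directory would produce an archive comparable to the tar command above, minus the excluded paths.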

  3. Create a directory structure for the package. In this example, the package will be named iway7master, so for convenience, we can create the package in a directory named iway7master/. Create subdirectories of the package directory named package/files and package/scripts.

    This structure will hold the iSM tarball along with the Slider configuration files and scripts discussed in the following steps. It will then be compressed to create the actual package.

    You do not need to create the package on a host within the Hadoop cluster, though it may be convenient to do so.
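
    As an illustration, the directory layout described in this step could be created with a few lines of Python. This is only a sketch; the function name is hypothetical, and the package name iway7master is the one used in this example.

```python
import os


def create_package_skeleton(base_dir, package_name="iway7master"):
    """Create <base_dir>/<package_name>/package/{files,scripts}; return the root."""
    package_root = os.path.join(base_dir, package_name)
    for subdir in ("files", "scripts"):
        # exist_ok lets the function be re-run on an existing layout.
        os.makedirs(os.path.join(package_root, "package", subdir), exist_ok=True)
    return package_root
```

    The iSM tarball from step 2 would then be copied into package/files, and the scripts from step 7 into package/scripts.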

  4. Create a file called metainfo.xml in the root of the package directory. This file holds basic information about the application, specifies components, and defines exports.
    <metainfo>
      <schemaVersion>2.0</schemaVersion>
      <application>
        <name>IWAY7</name>
        <comment>Provides transformation and enrichment services</comment>
        <version>7.0.4</version>
        <exportedConfigs>None</exportedConfigs>
        <exportGroups>
          <exportGroup>
            <name>Servers</name>
            <exports>
              <export>
                <name>host_port</name>
                <value>${IWAY7_HOST}:${site.global.listen_port}</value>
              </export>
            </exports>
          </exportGroup>
        </exportGroups>
        <components>
          <component>
            <name>IWAY7</name>
            <category>MASTER</category>
            <compExports>Servers-host_port</compExports>
            <commandScript>
              <script>scripts/iway7master.py</script>
              <scriptType>PYTHON</scriptType>
            </commandScript>
          </component>
        </components>
        <osSpecifics>
          <osSpecific>
            <osType>any</osType>
            <packages>
              <package>
                <type>tarball</type>
                <name>files/iway7.tar.gz</name>
              </package>
            </packages>
          </osSpecific>
        </osSpecifics>
      </application>
    </metainfo>

    The metainfo must provide information about the tarball (here, files/iway7.tar.gz) and the command script (here, scripts/iway7master.py). In this example, we also export a value as Servers-host_port. Exporting a value makes it available through Slider's RESTful API. In this case, the exported value is the host on which the application will be running and the console port dynamically assigned by YARN.

    In the above example, we also declare that our application has only one component, which we name IWAY7.

    For more information about the metainfo specification, see:

    http://slider.incubator.apache.org/docs/slider_specs/application_definition.html
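
    Because the metainfo names the tarball and the command script by path, a quick check before packaging can catch a mismatch. The sketch below is not a Slider utility; the function name is hypothetical. It reads the metainfo text and lists the paths it refers to, which in this example are relative to the package/ subdirectory.

```python
import xml.etree.ElementTree as ET


def referenced_package_files(metainfo_xml):
    """Return the script and tarball paths named in a metainfo document."""
    root = ET.fromstring(metainfo_xml)
    # Command scripts: <commandScript><script>...</script></commandScript>
    scripts = [elem.text.strip() for elem in root.iter("script") if elem.text]
    # Tarballs: <package><type>tarball</type><name>...</name></package>
    tarballs = [
        pkg.findtext("name", default="").strip()
        for pkg in root.iter("package")
        if pkg.findtext("type", default="").strip() == "tarball"
    ]
    return scripts + tarballs
```

    Each returned path can then be compared against the contents of the package/ directory before the package is compressed.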

  5. Create the resources-default.json file. The file specifies the default YARN resource requirements for the application, including CPU and memory requirements, placement, and failure policy. When the application is created, the resources-default.json file can be used as is, or the user can edit it to customize resource settings for a specific environment.
    {
      "schema": "http://example.org/specification/v2.0.0",
      "metadata": {
      },
      "global": {
      },
      "components": {
        "slider-appmaster": {
        },
        "IWAY7": {
          "yarn.role.priority": "1",
          "yarn.component.instances": "1",
          "yarn.component.placement.policy": "1",
          "yarn.vcores": "1",
          "yarn.memory": "1024"
        }
      }
    }

    The resource specification must always include the slider-appmaster component. This is the Slider process that mediates between the iSM application and YARN. In this example, we accept the default settings for the slider-appmaster. Resources for iSM are assigned to the IWAY7 component. In this case, we ask for 1024 MB of memory and a single component instance. By setting the component placement policy to 1, we are requesting that, in case of a failure, YARN restart the iSM application on the same node in the cluster where it was running originally. This will ensure that services in the application like SOAP or HTTP listeners will be available at the same URL.

    The yarn.vcores and yarn.memory settings define how much memory and CPU capacity the YARN container for the iSM application requires. YARN will queue requests until there is enough capacity available within the cluster to satisfy them. When there is capacity, YARN allocates a container to Slider, which then deploys an instance of the iSM application.

    For more information on the resource specification, see:

    http://slider.incubator.apache.org/docs/configuration/resources.html
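
    To see what a resources file asks of the cluster, the requested totals can be computed per component. The following sketch is illustrative only; the function name is hypothetical, and it assumes every application component specifies yarn.component.instances, yarn.memory, and yarn.vcores as whole numbers, as in the example above.

```python
import json


def summarize_resources(resources_text):
    """Return the total memory (MB) and vcores requested by each component."""
    components = json.loads(resources_text).get("components", {})
    if "slider-appmaster" not in components:
        raise ValueError("resources must include the slider-appmaster component")
    summary = {}
    for name, settings in components.items():
        if name == "slider-appmaster":
            continue  # default settings are accepted for the app master here
        instances = int(settings["yarn.component.instances"])
        summary[name] = {
            "instances": instances,
            "total_memory_mb": instances * int(settings["yarn.memory"]),
            "total_vcores": instances * int(settings["yarn.vcores"]),
        }
    return summary
```

    For the file shown in this step, the summary for IWAY7 is one instance, 1024 MB, and one vcore. Raising yarn.component.instances multiplies both totals.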

  6. Create the default configuration template, normally called appConfig-default.json. This is another JSON file that provides a few mandatory parameters and can also supply application-specific parameters to the command script. If your iSM application requires system properties, set them here. As with the resources file, when the application is created, the default configuration template can be used as is or customized for the user's environment.
    {
      "schema": "http://example.org/specification/v2.0.0",
      "metadata": {
      },
      "global": {
        "application.def": ".slider/package/IWAY7/iway7master.zip",
        "java_home": "/usr/lib/jvm/java-1.7.0-openjdk.x86_64",
        "site.global.xmx_val": "1024m",
        "site.global.xms_val": "512m",
        "site.global.listen_port": "${IWAY7.ALLOCATED_PORT}{PER_CONTAINER}",
        "site.global.iia_name": "TestConfig"
      },
      "components": {
        "slider-appmaster": {
          "jvm.heapsize": "1024M"
        }
      }
    }

    The application configuration template must specify application.def as the location where the Slider package is installed, and java_home, pointing to the location of Java on the node. In most cases, Slider will store the installed package in a subdirectory of the home directory of the user who executes the Slider install command, under the package name specified in the install command, like .slider/package/<package name>/.

    Other parameters in this example specify JVM settings and system properties that will be passed to the application by the script. Note the setting of site.global.listen_port. The value ${IWAY7.ALLOCATED_PORT}{PER_CONTAINER} is a special instruction to Slider to request a dynamically allocated port from YARN for this application. Above, in the metainfo file, you can see how we export this value so that users can find the iSM console.

    For more information about application configuration options, see:

    http://slider.incubator.apache.org/docs/slider_specs/application_instance_configuration.html
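
    A customized template for a specific environment can be derived from the default one. The sketch below is one possible approach, not a Slider feature; the function name is hypothetical. It applies overrides to the global section and confirms that the two mandatory settings named above are still present.

```python
import json

# The two settings this document describes as mandatory.
MANDATORY_GLOBAL_KEYS = ("application.def", "java_home")


def customize_app_config(default_text, overrides):
    """Return a new appConfig JSON string with overrides applied to 'global'."""
    config = json.loads(default_text)
    config.setdefault("global", {}).update(overrides)
    missing = [key for key in MANDATORY_GLOBAL_KEYS if key not in config["global"]]
    if missing:
        raise ValueError("missing mandatory settings: " + ", ".join(missing))
    return json.dumps(config, indent=2)
```

    For example, passing {"site.global.xmx_val": "2048m"} as the overrides would yield a template that requests a larger maximum heap while leaving every other setting unchanged.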

  7. Basic lifecycle commands for the application are implemented in a Python script. This example actually uses two scripts: params.py calls Script.get_config() to import parameters from the application configuration template so that the main script can consume them more easily.
    from resource_management import *
    # server configurations
    config = Script.get_config()
    app_root = config['configurations']['global']['app_root']
    java64_home = config['hostLevelParams']['java_home']
    pid_file = config['configurations']['global']['pid_file']
    xmx_val = config['configurations']['global']['xmx_val']
    xms_val = config['configurations']['global']['xms_val']
    port = config['configurations']['global']['listen_port']
    iia_name = config['configurations']['global']['iia_name']

    The app_root and pid_file parameters are supplied by Slider. The others, like iia_name, are defined in the application configuration template as shown above. So our main script can access the value of site.global.iia_name as seen in the template using the variable iia_name.
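
    The naming relationship described above can be pictured with a small sketch. It only illustrates how template keys correspond to the names read in params.py. It is not how Slider performs the mapping internally, and the function name is hypothetical.

```python
SITE_GLOBAL_PREFIX = "site.global."


def script_visible_globals(app_config_global):
    """Map 'site.global.<name>' template keys to the '<name>' read by params.py."""
    return {
        key[len(SITE_GLOBAL_PREFIX):]: value
        for key, value in app_config_global.items()
        if key.startswith(SITE_GLOBAL_PREFIX)
    }


template_global = {
    "java_home": "/usr/lib/jvm/java-1.7.0-openjdk.x86_64",
    "site.global.xmx_val": "1024m",
    "site.global.iia_name": "TestConfig",
}
print(script_visible_globals(template_global))
```

    Keys without the site.global. prefix, such as java_home, are not part of this mapping. In params.py, java_home is read from hostLevelParams instead.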

    The main script, iway7master.py, implements commands to install, start, stop, and report the status of the application.

    #!/usr/bin/env python
    import sys
    from resource_management import *
    class Iway7(Script):
      def install(self, env):
        self.install_packages(env)
        pass
      def configure(self, env):
        import params
        env.set_params(params)
      def start(self, env):
        import params
        env.set_params(params)
        self.configure(env)
        # Classpath: iSM libraries plus the Hadoop client libraries.
        iway7cp = format(
            "{app_root}/iway7/lib/*:"
            "{app_root}/iway7/etc/manager/extensions/*:"
            "{app_root}/iway7/etc/manager/console/*:"
            "/usr/hdp/current/hadoop-hdfs-client/*:"
            "/usr/hdp/current/hadoop-hdfs-client/lib/*:"
            "/usr/hdp/current/hadoop-client/*:"
            "/usr/hdp/current/hadoop-client/lib/*")
        # Adjacent string literals are joined into one command line.
        process_cmd = format(
            "{java64_home}/bin/java -Xmx{xmx_val} "
            "-Xms{xms_val} "
            "-Diway.console.port.{iia_name}={port} "
            "-Diwaysoftware.af.idocument=com.ibi.edaqm.XDDocument "
            "-DIWAY7={app_root}/iway7 -cp {iway7cp} edaqm "
            "-service -srvr.masterconfig {port} "
            "-config {iia_name} >> {app_root}/iway7/bin/service.log &")
        Execute(process_cmd,
            logoutput=False,
            wait_for_finish=False,
            pid_file=params.pid_file,
            poll_after = 5
        )
      def stop(self, env):
        import params
        env.set_params(params)
        # The shutdown routine needs only the iSM libraries on the classpath.
        iway7cp = format(
            "{app_root}/iway7/lib/*:"
            "{app_root}/iway7/etc/manager/extensions/*:"
            "{app_root}/iway7/etc/manager/console/*")
        process_cmd = format(
            "{java64_home}/bin/java -DIWAY7={app_root}/iway7 "
            "-Diway.console.port.{iia_name}={port} "
            "-cp {iway7cp} com.ibi.service.edaqmServiceShutdown "
            "-c {iia_name} &")
        Execute(process_cmd,
            logoutput=False,
            wait_for_finish=False
        )
      def status(self, env):
        import params
        env.set_params(params)
        check_process_status(params.pid_file)
    if __name__ == "__main__":
      Iway7().execute()

    The above syntax has been formatted to improve readability.

    The install method tells Slider to expand the given tarball. The stop and start methods configure the classpath for iSM, then execute the requested iSM configuration or iIA as a Java command. Note that the classpath constructed in the start method includes Hadoop libraries, like /usr/hdp/current/hadoop-hdfs-client/lib/*. This is necessary when the application contains components like the Avro listener that depend on these libraries and the required resources are not included in the iSM lib/ directory. The classpath for the stop command can be much shorter, since it needs only to execute the iSM shutdown routine.

    Note also how the start method sets the Java system property IWAY7 using the Python variable app_root and the -D switch on the java command. This technique can be used to pass values from the application template into the iSM application, where they can be accessed using the _SREG function.
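
    When an application needs several system properties, the -D switches can be generated rather than written out by hand. The helper below is a hypothetical sketch, not part of the iWay or Slider libraries. It turns a dictionary of property names and values into a string of switches, and the /tmp/avro path is an invented example value.

```python
import shlex


def system_property_flags(properties):
    """Build a string of -Dname=value switches, sorted by property name."""
    return " ".join(
        # Quote each switch so values containing spaces survive the shell.
        shlex.quote("-D{0}={1}".format(name, properties[name]))
        for name in sorted(properties)
    )


print(system_property_flags({"myapp.avro.outputdir": "/tmp/avro"}))
# prints: -Dmyapp.avro.outputdir=/tmp/avro
```

    The resulting string could be placed in the java command built by the start method, after which the iSM application would read each value with _SREG, for example _SREG("myapp.avro.outputdir").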

    For more information about command scripts, see:

    http://slider.incubator.apache.org/docs/slider_specs/writing_app_command_scripts

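  8. Compress the entire Slider package directory structure into a .zip file. Run the command from inside the package directory so that metainfo.xml is at the root of the archive, as in the listing at the start of this topic:
    zip -r iway7master.zip .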
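
    Where the zip utility is not available, the package can be compressed with Python's zipfile module. This is a minimal sketch with a hypothetical function name. It stores every file under the package directory with a path relative to that directory, so metainfo.xml ends up at the root of the archive. Unlike the zip command, it does not write separate entries for directories.

```python
import os
import zipfile


def zip_package(package_dir, zip_path):
    """Compress package_dir into zip_path using paths relative to package_dir."""
    target = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _subdirs, files in os.walk(package_dir):
            for filename in sorted(files):
                full_path = os.path.join(folder, filename)
                if os.path.abspath(full_path) == target:
                    continue  # do not add the archive to itself
                relative = os.path.relpath(full_path, package_dir)
                archive.write(full_path, relative.replace(os.sep, "/"))
    return zip_path
```

    For the example package, zip_package("iway7master", "iway7master.zip") would be run from the parent of the package directory.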

iWay Software