A Slider application package is a directory structure compressed with zip that contains the artifacts necessary to deploy and run the application. To deploy iSM in YARN, we create a Slider package that contains the iSM application, some Slider configuration files, and scripts that Slider will use to manage the application. The completed package described in the following sections will look like this:
[iway7@sandbox iway7master]$ unzip -l "$@" iway7master.zip
Archive:  iway7master.zip
   Length      Date    Time    Name
---------  ---------- -----   ----
     1171  11-04-2015 16:46   metainfo.xml
        0  08-20-2015 14:25   package/
        0  08-20-2015 14:26   package/files/
190909934  11-04-2015 17:44   package/files/iway7.tar.gz
        0  11-10-2015 17:51   package/scripts/
     1683  11-10-2015 17:51   package/scripts/iway7master.py
      487  11-04-2015 16:29   package/scripts/params.py
      530  11-04-2015 17:48   appConfig-default.json
      277  10-24-2015 15:36   resources-default.json
      530  11-04-2015 17:48   appConfig_testconfig.json
In the Slider package, we can define Java system properties that will be available to the running application. These can be used to set parameters within the iSM application. Use the _SREG iFL function to reference these system properties. For example, a system property might specify the output directory for Avro files. If the property were called myapp.avro.outputdir, it could be accessed by the Avro emitter as _SREG("myapp.avro.outputdir").
tar cvfz iway7.tar.gz iway7/
In most cases, many files from the iWay installation need not be included, for example, the etc/setup directory and its contents, or jar files from the lib directory that are not required by components in the application. If resources in the Hadoop cluster are limited, we suggest you test with the standalone application to determine which files can be omitted.
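The pruning described above can also be scripted. The following is a minimal sketch, assuming Python's standard tarfile module and an installation directory named iway7. The function name build_tarball and the EXCLUDED_PREFIXES tuple are illustrative only and are not part of iWay or Slider. Confirm against the standalone application which paths are safe to exclude.

```python
import tarfile

# Hypothetical exclusion list: paths relative to the archive root.
# Only etc/setup is named in the text; extend it after testing.
EXCLUDED_PREFIXES = ("iway7/etc/setup",)


def skip_excluded(member):
    """Return None to drop a member from the archive, or the member to keep it."""
    for prefix in EXCLUDED_PREFIXES:
        if member.name == prefix or member.name.startswith(prefix + "/"):
            return None
    return member


def build_tarball(source_dir, output_path, arcname="iway7"):
    """Create a gzip-compressed tarball of source_dir, skipping excluded paths."""
    with tarfile.open(output_path, "w:gz") as archive:
        # A directory rejected by the filter is not descended into.
        archive.add(source_dir, arcname=arcname, filter=skip_excluded)
```

Calling build_tarball("iway7", "iway7.tar.gz") produces an archive with the same layout as the tar command above, minus the excluded directories.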
This structure will contain the iSM tarball along with the Slider configuration files and scripts, which are discussed later, and will be compressed to create the actual package.
You do not need to create the package on a host within the Hadoop cluster, though it may be convenient to do so.
<metainfo>
  <schemaVersion>2.0</schemaVersion>
  <application>
    <name>IWAY7</name>
    <comment>Provides transformation and enrichment services</comment>
    <version>7.0.4</version>
    <exportedConfigs>None</exportedConfigs>
    <exportGroups>
      <exportGroup>
        <name>Servers</name>
        <exports>
          <export>
            <name>host_port</name>
            <value>${IWAY7_HOST}:${site.global.listen_port}</value>
          </export>
        </exports>
      </exportGroup>
    </exportGroups>
    <components>
      <component>
        <name>IWAY7</name>
        <category>MASTER</category>
        <compExports>Servers-host_port</compExports>
        <commandScript>
          <script>scripts/iway7master.py</script>
          <scriptType>PYTHON</scriptType>
        </commandScript>
      </component>
    </components>
    <osSpecifics>
      <osSpecific>
        <osType>any</osType>
        <packages>
          <package>
            <type>tarball</type>
            <name>files/iway7.tar.gz</name>
          </package>
        </packages>
      </osSpecific>
    </osSpecifics>
  </application>
</metainfo>
The metainfo must provide information about the tarball (here, files/iway7.tar.gz) and the command script (here, scripts/iway7master.py). In this example, we also export a value as Servers-host_port. Exporting a value makes it available through Slider's RESTful API. In this case, the exported value is the host on which the application will be running and the console port dynamically assigned by YARN.
In the above example, we also declare that our application has only one component, which we name IWAY7.
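A quick consistency check can catch a metainfo file that names a tarball or script missing from the package. The following is a sketch only, assuming Python's standard xml.etree.ElementTree module and the element names used in the example metainfo. The function name declared_package_files is hypothetical.

```python
import xml.etree.ElementTree as ET


def declared_package_files(metainfo_xml):
    """Return (tarballs, scripts) named in a metainfo document string.

    Paths are returned as written in the file; in the example package
    they are relative to the package/ directory.
    """
    root = ET.fromstring(metainfo_xml)
    tarballs = [
        pkg.findtext("name")
        for pkg in root.iter("package")
        if pkg.findtext("type") == "tarball"
    ]
    scripts = [cs.findtext("script") for cs in root.iter("commandScript")]
    return tarballs, scripts
```

Each returned path can then be compared with the contents of the package directory before the package is compressed.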
For more information about the metainfo specification, see:
http://slider.incubator.apache.org/docs/slider_specs/application_definition.html
{
  "schema": "http://example.org/specification/v2.0.0",
  "metadata": {
  },
  "global": {
  },
  "components": {
    "slider-appmaster": {
    },
    "IWAY7": {
      "yarn.role.priority": "1",
      "yarn.component.instances": "1",
      "yarn.component.placement.policy": "1",
      "yarn.vcores": "1",
      "yarn.memory": "1024"
    }
  }
}
The resource specification must always include the slider-appmaster component. This is the Slider process that mediates between the iSM application and YARN. In this example, we accept the default settings for the slider-appmaster. Resources for iSM are assigned to the IWAY7 component. In this case, we ask for 1024 MB of memory and a single component instance. By setting the component placement policy to 1, we are requesting that, in case of a failure, YARN restart the iSM application on the same node in the cluster where it was running originally. This will ensure that services in the application like SOAP or HTTP listeners will be available at the same URL.
The yarn.vcores and yarn.memory settings define how much memory and CPU capacity the YARN container for the iSM application requires. YARN will queue requests until there is enough capacity available within the cluster to satisfy them. When there is capacity, YARN allocates a container to Slider, which then deploys an instance of the iSM application.
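Before installing the package, it can help to confirm that the resource specification contains the required slider-appmaster entry and to see what each component requests. The following is a minimal sketch, assuming Python's standard json module and the key names shown in the example. The function name summarize_resources is hypothetical, and the sketch deliberately requires each setting to be present rather than assuming any Slider default.

```python
import json


def summarize_resources(resources_json):
    """Check a resource specification string and return per-component requests."""
    spec = json.loads(resources_json)
    components = spec.get("components", {})
    if "slider-appmaster" not in components:
        raise ValueError("resource specification must include slider-appmaster")
    summary = {}
    for name, settings in components.items():
        if name == "slider-appmaster":
            continue  # the example accepts the defaults for this component
        summary[name] = {
            "instances": int(settings["yarn.component.instances"]),
            "vcores": int(settings["yarn.vcores"]),
            "memory_mb": int(settings["yarn.memory"]),
        }
    return summary
```

For the example specification, the summary contains a single IWAY7 entry requesting one instance, one virtual core, and 1024 MB of memory.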
For more information on the resource specification, see:
http://slider.incubator.apache.org/docs/configuration/resources.html
{
  "schema": "http://example.org/specification/v2.0.0",
  "metadata": {
  },
  "global": {
    "application.def": ".slider/package/IWAY7/iway7master.zip",
    "java_home": "/usr/lib/jvm/java-1.7.0-openjdk.x86_64",
    "site.global.xmx_val": "1024m",
    "site.global.xms_val": "512m",
    "site.global.listen_port": "${IWAY7.ALLOCATED_PORT}{PER_CONTAINER}",
    "site.global.iia_name": "TestConfig"
  },
  "components": {
    "slider-appmaster": {
      "jvm.heapsize": "1024M"
    }
  }
}
The application configuration template must specify application.def, the location where the Slider package is installed, and java_home, the location of Java on the node. In most cases, Slider will store the installed package in a subdirectory of the home directory of the user who executes the Slider install command, under the package name specified in the install command, like .slider/package/<package name>/.
Other parameters in this example specify JVM settings and system properties that will be passed to the application by the script. Note the setting of site.global.listen_port. The value ${IWAY7.ALLOCATED_PORT}{PER_CONTAINER} is a special instruction to Slider to request a dynamically allocated port from YARN for this application. Above, in the metainfo file, you can see how we export this value so that users can find the iSM console.
For more information about application configuration options, see:
http://slider.incubator.apache.org/docs/slider_specs/application_instance_configuration.html
from resource_management import *

# server configurations
config = Script.get_config()

app_root = config['configurations']['global']['app_root']
java64_home = config['hostLevelParams']['java_home']
pid_file = config['configurations']['global']['pid_file']
xmx_val = config['configurations']['global']['xmx_val']
xms_val = config['configurations']['global']['xms_val']
port = config['configurations']['global']['listen_port']
iia_name = config['configurations']['global']['iia_name']
The app_root and pid_file parameters are supplied by Slider. The others, like iia_name, are defined in the application configuration template as shown above. So our main script can access the value of site.global.iia_name as seen in the template using the variable iia_name.
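The naming relationship between the template and the script can be illustrated with a small sketch. This is not Slider's implementation. It only mirrors the mapping described above, where a template key such as site.global.iia_name is read in the script under the name iia_name. The function name script_visible_globals is hypothetical.

```python
SITE_GLOBAL_PREFIX = "site.global."


def script_visible_globals(app_config_global):
    """Map template keys such as site.global.iia_name to names such as iia_name.

    Keys without the site.global. prefix (for example java_home) are left
    out, because the script does not read them from the 'global' section.
    """
    return {
        key[len(SITE_GLOBAL_PREFIX):]: value
        for key, value in app_config_global.items()
        if key.startswith(SITE_GLOBAL_PREFIX)
    }
```

Applied to the "global" section of the example template, the result contains xmx_val, xms_val, listen_port, and iia_name, which are the names the parameters file reads.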
The main script, iway7master.py, implements commands to install, start, stop, and report the status of the application.
#!/usr/bin/env python
import sys
from resource_management import *


class Iway7(Script):
    def install(self, env):
        # Expand the tarball declared in metainfo.xml.
        self.install_packages(env)

    def configure(self, env):
        import params
        env.set_params(params)

    def start(self, env):
        import params
        env.set_params(params)
        self.configure(env)
        # Adjacent string literals are joined, so the classpath contains
        # no whitespace between entries.
        iway7cp = format(
            "{app_root}/iway7/lib/*:"
            "{app_root}/iway7/etc/manager/extensions/*:"
            "{app_root}/iway7/etc/manager/console/*:"
            "/usr/hdp/current/hadoop-hdfs-client/*:"
            "/usr/hdp/current/hadoop-hdfs-client/lib/*:"
            "/usr/hdp/current/hadoop-client/*:"
            "/usr/hdp/current/hadoop-client/lib/*")
        process_cmd = format(
            "{java64_home}/bin/java -Xmx{xmx_val} -Xms{xms_val} "
            "-Diway.console.port.{iia_name}={port} "
            "-Diwaysoftware.af.idocument=com.ibi.edaqm.XDDocument "
            "-DIWAY7={app_root}/iway7 "
            "-cp {iway7cp} edaqm -service "
            "-srvr.masterconfig {port} -config {iia_name} "
            ">> {app_root}/iway7/bin/service.log &")
        Execute(process_cmd,
                logoutput=False,
                wait_for_finish=False,
                pid_file=params.pid_file,
                poll_after=5)

    def stop(self, env):
        import params
        env.set_params(params)
        # The shutdown routine needs only the iSM libraries.
        iway7cp = format(
            "{app_root}/iway7/lib/*:"
            "{app_root}/iway7/etc/manager/extensions/*:"
            "{app_root}/iway7/etc/manager/console/*")
        process_cmd = format(
            "{java64_home}/bin/java -DIWAY7={app_root}/iway7 "
            "-Diway.console.port.{iia_name}={port} "
            "-cp {iway7cp} com.ibi.service.edaqmServiceShutdown "
            "-c {iia_name} &")
        Execute(process_cmd,
                logoutput=False,
                wait_for_finish=False)

    def status(self, env):
        import params
        env.set_params(params)
        check_process_status(params.pid_file)


if __name__ == "__main__":
    Iway7().execute()
The above syntax has been formatted to improve readability.
The install method tells Slider to expand the given tarball. The stop and start methods configure the classpath for iSM, then execute the requested iSM configuration or iIA as a Java command. Note that the classpath constructed in the start method includes Hadoop libraries, like /usr/hdp/current/hadoop-hdfs-client/lib/*. This is necessary when the application contains components like the Avro listener that depend on these libraries and the required resources are not included in the iSM lib/ directory. The classpath for the stop command can be much shorter, since it needs only to execute the iSM shutdown routine.
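The difference between the two classpaths can be expressed as one helper that always includes the iSM directories and appends extra entries only when they are needed. This is a sketch for illustration. The function name build_classpath and its extra_entries parameter are hypothetical and do not appear in the script above.

```python
def build_classpath(app_root, extra_entries=()):
    """Join the iSM classpath entries, plus any extras, with ':' separators."""
    entries = [
        app_root + "/iway7/lib/*",
        app_root + "/iway7/etc/manager/extensions/*",
        app_root + "/iway7/etc/manager/console/*",
    ]
    entries.extend(extra_entries)
    return ":".join(entries)


# Hadoop client libraries, needed only by the start command.
HADOOP_ENTRIES = (
    "/usr/hdp/current/hadoop-hdfs-client/*",
    "/usr/hdp/current/hadoop-hdfs-client/lib/*",
    "/usr/hdp/current/hadoop-client/*",
    "/usr/hdp/current/hadoop-client/lib/*",
)
```

Under these assumptions, the start method would call build_classpath(app_root, HADOOP_ENTRIES) and the stop method would call build_classpath(app_root).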
Note also how the start method sets the Java system property IWAY7 using the Python variable app_root and the -D switch on the java command. This technique can be used to pass values from the application template into the iSM application, where they can be accessed using the _SREG function.
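When several values must be passed this way, the -D switches can be generated from a dictionary instead of being written by hand. The following is a minimal sketch written for Python 3, where shlex.quote is available. The function name system_property_flags, the property name myapp.avro.outputdir, and the path /data/avro are illustrative assumptions.

```python
import shlex


def system_property_flags(properties):
    """Render a dict of property names and values as -D switches.

    Names are sorted so the command line is stable between runs, and each
    switch is shell-quoted in case a value contains spaces.
    """
    return " ".join(
        shlex.quote("-D{0}={1}".format(name, properties[name]))
        for name in sorted(properties)
    )


# Hypothetical property; inside iSM it would be read with
# _SREG("myapp.avro.outputdir").
flags = system_property_flags({"myapp.avro.outputdir": "/data/avro"})
```

The resulting string can be placed in the java command between the JVM options and the -cp switch.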
For more information about command scripts, see:
http://slider.incubator.apache.org/docs/slider_specs/writing_app_command_scripts
zip -r iway7master.zip
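The packaging step can also be scripted. The following is a sketch only, assuming Python's standard zipfile module and a staging directory that holds metainfo.xml, the JSON templates, and the package/ tree at its top level. The function name build_slider_package is hypothetical. Unlike the listing at the start of this section, the resulting archive contains file entries only, with no separate directory entries.

```python
import os
import zipfile


def build_slider_package(staging_dir, output_zip):
    """Zip the contents of staging_dir so entries sit at the archive root."""
    output_abs = os.path.abspath(output_zip)
    with zipfile.ZipFile(output_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _subdirs, files in os.walk(staging_dir):
            for filename in sorted(files):
                full_path = os.path.join(folder, filename)
                if os.path.abspath(full_path) == output_abs:
                    continue  # do not add the archive to itself
                # Store each file under its path relative to staging_dir.
                archive.write(full_path, os.path.relpath(full_path, staging_dir))
```

For example, build_slider_package("iway7master", "iway7master.zip") would archive a staging directory named iway7master.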