Heron Environment Setup
storm
2019-05-06 06:51:53
louyj
# Building Heron on CentOS 7

## Step 1 - Install the required dependencies

```
sudo yum install gcc gcc-c++ kernel-devel wget unzip zlib-devel zip git automake cmake patch libtool -y
yum install python-devel -y
yum install gmp gmp-devel -y
```

## Step 2 - Install libunwind from source

```
wget http://download.savannah.gnu.org/releases/libunwind/libunwind-1.1.tar.gz
tar xvf libunwind-1.1.tar.gz
cd libunwind-1.1
./configure
make
sudo make install
```

## Step 3 - Set the following environment variables

```
export CC=/usr/bin/gcc
export CXX=/usr/bin/g++
```

## Step 4 - Install JDK

[See Install JDK here](http://note.louyj.com/blog/post/louyj/Install-JDK)

## Step 5 - Install Bazel 0.1.2

```
wget https://github.com/bazelbuild/bazel/releases/download/0.1.2/bazel-0.1.2-installer-linux-x86_64.sh
chmod +x bazel-0.1.2-installer-linux-x86_64.sh
./bazel-0.1.2-installer-linux-x86_64.sh --user
```

## Step 6 - Download Heron and compile it

```
cd
git clone https://github.com/twitter/heron.git && cd heron
./bazel_configure.py
bazel build --config=centos heron/...
```

## Step 7 - Build the binary packages

```
bazel build --config=centos scripts/packages:binpkgs
bazel build --config=centos scripts/packages:tarpkgs
```

This will build the packages below the `bazel-bin/scripts/packages/` directory.

## Step 8 - Install Heron using the installation scripts

```
cd bazel-bin/scripts/packages/
./heron-client-install.sh --help
./heron-client-install.sh --user
./heron-api-install.sh --user
./heron-tools-install.sh --user
```

or using a prefix:

```
cd bazel-bin/scripts/packages/
./heron-client-install.sh --prefix=/root/software/heron
./heron-tools-install.sh --prefix=/root/software/heron
./heron-api-install.sh --prefix=/root/software/heron
```

# Launch topology in local mode (single node)

## Step 1 - Launch an example topology

```
export JAVA_HOME=/opt/jdk1.8.0_91
heron submit local ~/.heron/examples/heron-examples.jar com.twitter.heron.examples.ExclamationTopology ExclamationTopology --deploy-deactivated
```

This submits the topology to your locally running Heron cluster but does not activate it. The output shows whether the topology was launched successfully, as well as its working directory. To check what is under the working directory, run:

```
ls /root/.herondata/topologies/local/root/ExclamationTopology
```

All instances' log files can be found in `log-files` under the working directory:

```
ls /root/.herondata/topologies/local/root/ExclamationTopology/log-files
```

## Step 2 - Start Heron Tracker

The Heron Tracker is a web service that continuously gathers information about your Heron cluster. You can launch it by running the `heron-tracker` command:

```
heron-tracker --port 8888
```

You can reach Heron Tracker in your browser at http://localhost:8888 and see something like the following upon successful submission of the topology.

## Step 3 - Start Heron UI

Heron UI is a user interface that uses Heron Tracker to provide detailed visual representations of your Heron topologies. To launch Heron UI:

```
heron-ui --port=8889 --tracker_url="http://localhost:8888"
```

You can open Heron UI in your browser at http://localhost:8889 and see something like this upon successful submission of the topology.

## Step 4 - Explore topology management commands

In Step 1 you submitted a topology to your local cluster. The `heron` CLI tool also enables you to activate, deactivate, and kill topologies, and more:

```
heron activate local ExclamationTopology
heron deactivate local ExclamationTopology
heron kill local ExclamationTopology
```
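If you prefer to drive this lifecycle from a script instead of typing each command, a minimal sketch like the following ties the CLI and the Tracker together. The `/topologies` path is assumed from the Tracker REST API described later in this post; verify it against your Tracker version before relying on it.

```
#!/usr/bin/env bash
# Minimal sketch: submit, activate, inspect, and tear down the example topology.
# Assumes the heron CLI, a running heron-tracker on port 8888, and curl.
set -e

TOPOLOGY=ExclamationTopology
JAR=~/.heron/examples/heron-examples.jar
CLASS=com.twitter.heron.examples.ExclamationTopology

# Submit deactivated, then activate once the containers are up
heron submit local "$JAR" "$CLASS" "$TOPOLOGY" --deploy-deactivated
heron activate local "$TOPOLOGY"

# Ask the Tracker which topologies it currently knows about
# (endpoint path assumed; check your Tracker's REST API)
curl -s http://localhost:8888/topologies

# Tear the topology down again
heron kill local "$TOPOLOGY"
```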
# Deploying Heron (multi node)

A Heron deployment requires several components working together. The following must be deployed to run Heron topologies in a cluster:

- Scheduler: Heron requires a scheduler to run its topologies. It can be deployed on an existing cluster running alongside other big data frameworks, or on a cluster of its own. Heron currently supports several scheduler options: Aurora, Local, or Slurm.
- State Manager: The Heron state manager tracks the state of all deployed topologies. The topology state includes its logical plan, physical plan, and execution state. Heron supports the following state managers: Local File System or ZooKeeper.
- Uploader: The Heron uploader distributes the topology jars to the servers that run them. Heron supports several uploaders: HDFS, Local File System, or Amazon S3.
- Metrics Sinks: Heron collects several metrics during topology execution. These metrics can be routed to a sink for storage and offline analysis. Currently, Heron supports the following sinks: File Sink, Graphite Sink, or Scribe Sink.
- Heron Tracker: The Tracker serves as the gateway for exploring topologies. It exposes a REST API for exploring the logical plan and physical plan of topologies and for fetching their metrics.
- Heron UI: The UI provides the ability to find and explore topologies visually. It displays the DAG of the topology and how the DAG is mapped to physical containers running in the cluster. It also lets you view logs, take heap dumps and memory histograms, show metrics, and so on.

## Step 1 - Setting Up ZooKeeper State Manager

Heron relies on ZooKeeper for a wide variety of cluster coordination tasks. You can use either a shared or a dedicated ZooKeeper cluster. There are a few things you should be aware of regarding Heron and ZooKeeper:

- Heron uses ZooKeeper only for coordination, not for message passing, so ZooKeeper load should generally be fairly low. A single-node and/or shared ZooKeeper may suffice for your Heron cluster, depending on usage.
- Heron uses ZooKeeper more efficiently than Storm. This makes Heron less likely than Storm to require a bulky or dedicated ZooKeeper cluster, but your use case may still require one.
- We strongly recommend running ZooKeeper under supervision.

### ZooKeeper State Manager Configuration

You can make Heron aware of the ZooKeeper cluster by modifying the `/root/.heron/conf/aurora/statemgr.yaml` config file specific to the Heron cluster. You'll need to specify the following for each cluster:

- `heron.class.state.manager`: The class to be loaded (via reflection) for managing state in ZooKeeper. You should set this to `com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager`.
- `heron.statemgr.connection.string`: The host address and port used to connect to the ZooKeeper cluster, e.g. "127.0.0.1:2181".
- `heron.statemgr.root.path`: The root ZooKeeper node to be used by Heron. We recommend providing Heron with an exclusive root node; if you do not, make sure that the following child nodes are unused: /tmasters, /topologies, /pplans, /executionstate, /schedulers.
- `heron.statemgr.zookeeper.is.initialize.tree`: Whether the nodes under the ZooKeeper root (/tmasters, /topologies, /pplans, /executionstate, and /schedulers) should be created if they are not found. Set it to True if you would like Heron to create those nodes; if they already exist, set it to False. The absence of this configuration implies True.
- `heron.statemgr.zookeeper.session.timeout.ms`: How long, in milliseconds, to wait before declaring the ZooKeeper session dead.
- `heron.statemgr.zookeeper.connection.timeout.ms`: How long, in milliseconds, to wait before declaring the connection to ZooKeeper dead.
- `heron.statemgr.zookeeper.retry.count`: The number of retry attempts when connecting to ZooKeeper.
- `heron.statemgr.zookeeper.retry.interval.ms`: Time in milliseconds to wait between retries.

### Example ZooKeeper State Manager Configuration

```
# state manager class for managing state in ZooKeeper
heron.class.state.manager: com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager

# state manager connection string
heron.statemgr.connection.string: "louyj.top:2181"

# path of the root node used to store state in ZooKeeper
heron.statemgr.root.path: "/heron001"

# create the zookeeper nodes, if they do not exist
heron.statemgr.zookeeper.is.initialize.tree: True

# timeout in ms to wait before considering the zookeeper session dead
heron.statemgr.zookeeper.session.timeout.ms: 30000

# timeout in ms to wait before considering the zookeeper connection dead
heron.statemgr.zookeeper.connection.timeout.ms: 30000

# number of retry attempts when connecting to zookeeper
heron.statemgr.zookeeper.retry.count: 10

# duration of time to wait until the next retry
heron.statemgr.zookeeper.retry.interval.ms: 10000
```
## Step 2 - Setting Up the Aurora Cluster (Scheduler)

Aurora doesn't have a Heron scheduler per se. Instead, when a topology is submitted to Heron, the `heron` CLI interacts with Aurora to automatically deploy all the components necessary to manage topologies.

### ZooKeeper

To run Heron on Aurora, you'll need to set up a ZooKeeper cluster and configure Heron to communicate with it.

### Hosting Binaries

To deploy Heron, the Aurora cluster needs access to the Heron core binary, which can be hosted wherever you'd like, as long as it's accessible to Aurora. Once your Heron binaries are hosted somewhere accessible to Aurora, you should run tests to ensure that Aurora can successfully fetch them.

### Aurora Scheduler Configuration

To configure Heron to use the Aurora scheduler, modify the `scheduler.yaml` config file specific to the Heron cluster. The following must be specified for each cluster:

- `heron.class.scheduler`: The class to be loaded for the Aurora scheduler. You should set this to `com.twitter.heron.scheduler.aurora.AuroraScheduler`.
- `heron.class.launcher`: The class to be loaded for launching and submitting topologies. To configure the Aurora launcher, set this to `com.twitter.heron.scheduler.aurora.AuroraLauncher`.
- `heron.package.core.uri`: The location of the Heron core binary package. The scheduler uses this URI to download the core package to the working directory.
- `heron.directory.sandbox.java.home`: The Java home to be used when running topologies in the containers.
- `heron.scheduler.is.service`: Indicates whether the scheduler is a service. In the case of Aurora, it should be set to False.

### Example Aurora Scheduler Configuration

```
# scheduler class for distributing the topology for execution
heron.class.scheduler: com.twitter.heron.scheduler.aurora.AuroraScheduler

# launcher class for submitting and launching the topology
heron.class.launcher: com.twitter.heron.scheduler.aurora.AuroraLauncher

# location of the core package
heron.package.core.uri: file:///root/heron/bazel-bin/scripts/packages/heron-core.tar.gz

# location of java in the containers
heron.directory.sandbox.java.home: /opt/jdk1.8.0_91/

# invoke the IScheduler as a library directly, not as a service
heron.scheduler.is.service: False
```

### Working with Topologies

After setting up ZooKeeper and generating an Aurora-accessible Heron core binary release, any machine that has the `heron` CLI tool can be used to manage Heron topologies (i.e. submit, activate, and deactivate them, etc.). The most important thing at this stage is to ensure that the `heron` CLI is available across all machines. Once the CLI is available, Aurora can be enabled as the scheduler by specifying the proper configuration when managing topologies, as in the sketch below.
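For illustration, submitting the bundled example topology through an Aurora-backed cluster looks roughly like the following. The cluster/role/env triple (`aurora/root/devel` here) is a placeholder; it must match your Aurora setup and the name of the corresponding config directory under `~/.heron/conf`.

```
# Submit the example topology through the Aurora scheduler
# ("aurora/root/devel" is a hypothetical cluster/role/env triple)
heron submit aurora/root/devel \
  ~/.heron/examples/heron-examples.jar \
  com.twitter.heron.examples.ExclamationTopology ExclamationTopology

# The same cluster/role/env triple is used for lifecycle commands
heron activate aurora/root/devel ExclamationTopology
heron kill aurora/root/devel ExclamationTopology
```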
## Step 3 - Setting Up the Local File System Uploader

When you submit a topology to Heron, the topology jars are uploaded to a stable location. The submitter provides this location to the scheduler, which passes it to the executor in each container. Heron can use a local file system as stable storage for topology jar distribution. There are a few things you should be aware of regarding the local file system uploader:

- It is mainly used in conjunction with the local scheduler.
- It is ideal if you want to run Heron on a single server, laptop, or edge device.
- It is useful for Heron developers who want to test components locally.

### Local File System Uploader Configuration

You can make Heron aware of the local file system uploader by modifying the `uploader.yaml` config file specific to the Heron cluster. You'll need to specify the following for each cluster:

- `heron.class.uploader`: The uploader class to be loaded. You should set this to `com.twitter.heron.uploader.localfs.LocalFileSystemUploader`.
- `heron.uploader.localfs.file.system.directory`: The name of the directory where the topology jar should be uploaded. The directory name should be unique per cluster. You can use the Heron environment variable ${CLUSTER}, which will be substituted with the cluster name.

### Example Local File System Uploader Configuration

```
# uploader class for transferring the topology jar/tar files to storage
heron.class.uploader: com.twitter.heron.uploader.localfs.LocalFileSystemUploader

# name of the directory to upload topologies for the local file system uploader
heron.uploader.localfs.file.system.directory: ${HOME}/.herondata/topologies/${CLUSTER}
```
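After the next submission you can confirm that the uploader placed the topology package in the configured directory. A quick check, assuming the example configuration above and a cluster named `local`:

```
# Submit with the local scheduler/uploader, then inspect the upload directory
heron submit local ~/.heron/examples/heron-examples.jar \
  com.twitter.heron.examples.ExclamationTopology ExclamationTopology --deploy-deactivated

# ${CLUSTER} expands to "local", so the uploaded package should appear here
ls -l ${HOME}/.herondata/topologies/local/
```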