Presto Installation
2019-05-06 06:51:53
louyj
#Installing Presto

Download the Presto server tarball `https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.203/presto-server-0.203.tar.gz` and unpack it. The tarball will contain a single top-level directory, `presto-server-0.203`, which we will call the installation directory.

#Configuring Presto

Create an `etc` directory inside the installation directory. This will hold the following configuration:

- Node Properties: environmental configuration specific to each node
- JVM Config: command line options for the Java Virtual Machine
- Config Properties: configuration for the Presto server
- Catalog Properties: configuration for Connectors (data sources)

##Node Properties

The node properties file, `etc/node.properties`, contains configuration specific to each node. A node is a single installed instance of Presto on a machine. This file is typically created by the deployment system when Presto is first installed. The following is a minimal `etc/node.properties`:

```
node.environment=production
node.id=ffffffff-ffff-ffff-ffff-ffffffffffff
node.data-dir=/var/presto/data
```

The above properties are described below:

- `node.environment`: The name of the environment. All Presto nodes in a cluster must have the same environment name.
- `node.id`: The unique identifier for this installation of Presto. This must be unique for every node. This identifier should remain consistent across reboots or upgrades of Presto. If running multiple installations of Presto on a single machine (i.e. multiple nodes on the same machine), each installation must have a unique identifier.
- `node.data-dir`: The location (filesystem path) of the data directory. Presto will store logs and other data here.

##JVM Config

The JVM config file, `etc/jvm.config`, contains a list of command line options used for launching the Java Virtual Machine. The format of the file is a list of options, one per line.
These options are not interpreted by the shell, so options containing spaces or other special characters should not be quoted.

The following provides a good starting point for creating `etc/jvm.config`:

```
-server
-Xmx16G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError
```

Because an OutOfMemoryError will typically leave the JVM in an inconsistent state, we write a heap dump (for debugging) and forcibly terminate the process when this occurs.

##Config Properties

The config properties file, `etc/config.properties`, contains the configuration for the Presto server. Every Presto server can function as both a coordinator and a worker, but dedicating a single machine to only perform coordination work provides the best performance on larger clusters.

The following is a minimal configuration for the coordinator:

```
coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://example.net:8080
```

And this is a minimal configuration for the workers:

```
coordinator=false
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=1GB
discovery.uri=http://example.net:8080
```

Alternatively, if you are setting up a single machine for testing that will function as both a coordinator and worker, use this configuration:

```
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=5GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://example.net:8080
```

These properties require some explanation:

- `coordinator`: Allow this Presto instance to function as a coordinator (accept queries from clients and manage query execution).
- `node-scheduler.include-coordinator`: Allow scheduling work on the coordinator.
For larger clusters, processing work on the coordinator can impact query performance because the machine’s resources are not available for the critical task of scheduling, managing and monitoring query execution.
- `http-server.http.port`: Specifies the port for the HTTP server. Presto uses HTTP for all communication, internal and external.
- `query.max-memory`: The maximum amount of distributed memory that a query may use.
- `query.max-memory-per-node`: The maximum amount of memory that a query may use on any one machine.
- `discovery-server.enabled`: Presto uses the Discovery service to find all the nodes in the cluster. Every Presto instance will register itself with the Discovery service on startup. In order to simplify deployment and avoid running an additional service, the Presto coordinator can run an embedded version of the Discovery service. It shares the HTTP server with Presto and thus uses the same port.
- `discovery.uri`: The URI to the Discovery server. Because we have enabled the embedded version of Discovery in the Presto coordinator, this should be the URI of the Presto coordinator. Replace `example.net:8080` to match the host and port of the Presto coordinator. This URI must not end in a slash.

You may also wish to set the following properties:

- `jmx.rmiregistry.port`: Specifies the port for the JMX RMI registry. JMX clients should connect to this port.
- `jmx.rmiserver.port`: Specifies the port for the JMX RMI server. Presto exports many metrics that are useful for monitoring via JMX.

##Log Levels

The optional log levels file, `etc/log.properties`, allows setting the minimum log level for named logger hierarchies. Every logger has a name, which is typically the fully qualified name of the class that uses the logger. Loggers have a hierarchy based on the dots in the name (like Java packages).
For example, consider the following log levels file:

`com.facebook.presto=INFO`

This would set the minimum level to INFO for both `com.facebook.presto.server` and `com.facebook.presto.hive`. The default minimum level is INFO (thus the above example does not actually change anything). There are four levels: DEBUG, INFO, WARN and ERROR.

##Catalog Properties

Presto accesses data via connectors, which are mounted in catalogs. The connector provides all of the schemas and tables inside of the catalog. For example, the Hive connector maps each Hive database to a schema, so if the Hive connector is mounted as the `hive` catalog, and Hive contains a table `clicks` in database `web`, that table would be accessed in Presto as `hive.web.clicks`.

Catalogs are registered by creating a catalog properties file in the `etc/catalog` directory. For example, create `etc/catalog/jmx.properties` with the following contents to mount the `jmx` connector as the `jmx` catalog:

```
connector.name=jmx
```

#Running Presto

The installation directory contains the launcher script in `bin/launcher`. Presto can be started as a daemon by running the following:

```
bin/launcher start
```

Alternatively, it can be run in the foreground, with the logs and other output being written to stdout/stderr (both streams should be captured if using a supervision system like daemontools):

`bin/launcher run`

After launching, you can find the log files in `var/log`.

#Command Line Interface

The Presto CLI provides a terminal-based interactive shell for running queries. The CLI is a self-executing JAR file, which means it acts like a normal UNIX executable.
Download `presto-cli-0.203-executable.jar`, rename it to `presto`, make it executable with `chmod +x`, then run it:

https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.203/presto-cli-0.203-executable.jar

`./presto --server localhost:8080 --catalog hive --schema default`

By default, the results of queries are paginated using the `less` program, which is configured with a carefully selected set of options. This behavior can be overridden by setting the environment variable `PRESTO_PAGER` to the name of a different program such as `more`, or to an empty value to disable pagination completely.

#JDBC Driver

Presto can be accessed from Java using the JDBC driver. Download `presto-jdbc-0.203.jar` and add it to the class path of your Java application. The driver is also available from Maven Central:

```
<dependency>
    <groupId>com.facebook.presto</groupId>
    <artifactId>presto-jdbc</artifactId>
    <version>0.203</version>
</dependency>
```

The following JDBC URL formats are supported:

```
jdbc:presto://host:port
jdbc:presto://host:port/catalog
jdbc:presto://host:port/catalog/schema
```

For example, use the following URL to connect to Presto running on `example.net` port `8080` with the catalog `hive` and the schema `sales`:

`jdbc:presto://example.net:8080/hive/sales`

The DBeaver connection is set up as follows:
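Returning to the JDBC URL formats listed in the JDBC Driver section, the following is a minimal Java sketch of how such a URL might be assembled and used through the standard `java.sql` API. The class name `PrestoJdbcExample`, the helper `buildUrl`, the user name `test`, and the `SELECT 1` query are illustrative assumptions, not part of the original post. Running a real query also assumes that `presto-jdbc-0.203.jar` is on the class path and that a coordinator is reachable at the given host and port.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;

public class PrestoJdbcExample {

    // Hypothetical helper: builds one of the three supported URL forms.
    // Pass null for catalog (and schema) to leave them out; a schema is
    // only appended when a catalog is also given.
    static String buildUrl(String host, int port, String catalog, String schema) {
        StringBuilder url = new StringBuilder("jdbc:presto://")
                .append(host).append(':').append(port);
        if (catalog != null) {
            url.append('/').append(catalog);
            if (schema != null) {
                url.append('/').append(schema);
            }
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Same host, port, catalog and schema as the example URL in the text.
        String url = buildUrl("example.net", 8080, "hive", "sales");

        Properties props = new Properties();
        // Assumption: a placeholder user name; adjust for your cluster.
        props.setProperty("user", "test");

        // try-with-resources closes the result set, statement and connection.
        try (Connection conn = DriverManager.getConnection(url, props);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT 1")) {
            while (rs.next()) {
                System.out.println(rs.getInt(1));
            }
        } catch (SQLException e) {
            // Reached when the driver is missing or the server is unreachable.
            System.err.println("Query failed: " + e.getMessage());
        }
    }
}
```

Keeping the URL construction in a small helper makes the three formats easy to compare: omitting the schema, or both catalog and schema, yields the shorter forms, in which case queries need to reference tables by their fully qualified names.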