Mount HDFS by NFS
Environment Setup · hadoop
2019-05-06 06:51:53
louyj
# Overview

The NFS Gateway supports NFSv3 and allows HDFS to be mounted as part of the client's local file system. Currently the NFS Gateway supports and enables the following usage patterns:

- Users can browse the HDFS file system through their local file system on NFSv3-compatible client operating systems.
- Users can download files from the HDFS file system to their local file system.
- Users can upload files from their local file system directly to the HDFS file system.
- Users can stream data directly to HDFS through the mount point. File append is supported, but random write is not supported.

The NFS gateway machine needs everything required to run an HDFS client, such as the Hadoop JAR files and a HADOOP_CONF directory. The NFS gateway can be on the same host as a DataNode, the NameNode, or any HDFS client.

# Configuration

In core-site.xml of the NameNode, the following must be set (in non-secure mode):

```xml
<property>
  <name>hadoop.proxyuser.nfsserver.groups</name>
  <value>nfs-users1,nfs-users2</value>
  <description>
    The 'nfsserver' user is allowed to proxy all members of the
    'nfs-users1' and 'nfs-users2' groups. Set this to '*' to allow
    the nfsserver user to proxy any group.
  </description>
</property>
<property>
  <name>hadoop.proxyuser.nfsserver.hosts</name>
  <value>nfs-client-host1.com</value>
  <description>
    This is the host where the nfs gateway is running. Set this to
    '*' to allow requests from any hosts to be proxied.
  </description>
</property>
```

**Tip: nfsserver is the user who starts the NFS service.**

For Kerberized Hadoop clusters, the following configurations need to be added to hdfs-site.xml:

```xml
<property>
  <name>nfs.keytab.file</name>
  <value>/etc/hadoop/conf/nfsserver.keytab</value> <!-- path to the nfs gateway keytab -->
</property>
<property>
  <name>nfs.kerberos.principal</name>
  <value>nfsserver/_HOST@YOUR-REALM.COM</value>
</property>
```

Users are also expected to update the file dump directory. NFS clients often reorder writes, so sequential writes can arrive at the NFS gateway in random order.
This directory is used to temporarily save out-of-order writes before writing to HDFS. For each file, the out-of-order writes are dumped after they accumulate beyond a certain threshold (e.g., 1 MB) in memory. Make sure the directory has enough space. For example, if an application uploads 10 files of 100 MB each, it is recommended that this directory have roughly 1 GB of space in case a worst-case write reorder happens to every file. Only the NFS gateway needs to be restarted after this property is updated.

```xml
<property>
  <name>nfs.dump.dir</name>
  <value>/tmp/.hdfs-nfs</value>
</property>
```

By default, the export can be mounted by any client. To better control access, users can update the following property. The value string contains a machine name and an access privilege, separated by whitespace. The machine name can be a single host, a Java regular expression, or an IPv4 address. The access privilege uses `rw` or `ro` to grant the machines read/write or read-only access to the export. If the access privilege is not provided, the default is read-only. Entries are separated by `;`. For example: `192.168.0.0/22 rw ; host.*\.example\.com ; host1.test.org ro;`. Only the NFS gateway needs to be restarted after this property is updated.

```xml
<property>
  <name>nfs.exports.allowed.hosts</name>
  <value>* rw</value>
</property>
```

# Start and stop the NFS gateway service

Three daemons are required to provide NFS service: rpcbind (or portmap), mountd, and nfsd. The NFS gateway process includes both nfsd and mountd. It shares the HDFS root `/` as the only export. It is recommended to use the portmap included in the NFS gateway package. Although the NFS gateway works with the portmap/rpcbind provided by most Linux distributions, the package-included portmap is needed on some Linux systems, such as RHEL 6.2, due to an rpcbind bug.

```shell
yum install nfs-utils
```
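Before starting the daemons, it is worth sanity-checking the `nfs.dump.dir` sizing rule from the configuration section: in the worst case every in-flight file is fully buffered in the dump directory, so the required space is simply file count times file size. A minimal sketch, where the file count and size are example values and `/tmp` stands in for the dump directory's file system:

```shell
#!/bin/sh
# Worst case: every upload is fully buffered in nfs.dump.dir before
# being flushed to HDFS, so the directory needs (files * size) space.
FILES=10        # example: number of concurrent uploads
SIZE_MB=100     # example: size of each file in MB
NEEDED_MB=$((FILES * SIZE_MB))
echo "nfs.dump.dir should have at least ${NEEDED_MB} MB free"

# Compare against the space actually available on the dump directory's
# file system (default nfs.dump.dir lives under /tmp):
df -Pk /tmp | awk 'NR==2 {printf "space available under /tmp: %d MB\n", $4/1024}'
```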
1. Stop the nfs/rpcbind/portmap services provided by the platform (commands can differ on various Unix platforms):

   ```shell
   service nfs stop
   service rpcbind stop
   ```

2. Start the package-included portmap (needs root privileges):

   ```shell
   hadoop portmap
   # OR
   hadoop-daemon.sh start portmap
   ```

3. Start mountd and nfsd. No root privileges are required for this command. However, ensure that the user starting the Hadoop cluster and the user starting the NFS gateway are the same.

   ```shell
   hadoop nfs3
   # OR
   hadoop-daemon.sh start nfs3
   ```

   Note: if the hadoop-daemon.sh script starts the NFS gateway, its log can be found in the Hadoop log folder.

4. Stop the NFS gateway services:

   ```shell
   hadoop-daemon.sh stop nfs3
   hadoop-daemon.sh stop portmap
   ```

Optionally, you can forgo running the Hadoop-provided portmap daemon and instead use the system portmap daemon on all operating systems if you start the NFS Gateway as root. This allows the HDFS NFS Gateway to work around the aforementioned bug and still register with the system portmap daemon. To do so, start the NFS gateway daemon as you normally would, but make sure to do so as the "root" user, and also set the "HADOOP_PRIVILEGED_NFS_USER" environment variable to an unprivileged user. In this mode the NFS Gateway starts as root to perform its initial registration with the system portmap, and then drops privileges to the user specified by HADOOP_PRIVILEGED_NFS_USER for the rest of the lifetime of the NFS Gateway process. Note that if you choose this route, you should skip steps 1 and 2 above.

# Verify validity of NFS related services

1. Execute the following command to verify that all the services are up and running:

   ```shell
   rpcinfo -p $nfs_server_ip
   ```

   You should see output similar to the following:

   ```
   program vers proto   port
    100005    1   tcp   4242  mountd
    100005    2   udp   4242  mountd
    100005    2   tcp   4242  mountd
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100005    3   udp   4242  mountd
    100005    1   udp   4242  mountd
    100003    3   tcp   2049  nfs
    100005    3   tcp   4242  mountd
   ```
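The step-1 check can be scripted by parsing the `rpcinfo` listing for the registrations the gateway needs. A sketch that runs against an inlined sample of the output above; in practice, replace the `rpcinfo_output` here-doc with a real call to `rpcinfo -p $nfs_server_ip`:

```shell
#!/bin/sh
# Confirm that the portmapper reports mountd, nfs, and portmapper
# registrations. Sample output is inlined so the script is
# self-contained; substitute `rpcinfo -p $nfs_server_ip` in practice.
rpcinfo_output() {
cat <<'EOF'
   program vers proto   port
    100005    1   tcp   4242  mountd
    100000    2   tcp    111  portmapper
    100003    3   tcp   2049  nfs
EOF
}

for svc in mountd nfs portmapper; do
    # Column 5 of each registration line is the service name.
    if rpcinfo_output | awk '{print $5}' | grep -qx "$svc"; then
        echo "$svc: registered"
    else
        echo "$svc: MISSING"
    fi
done
```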
2. Verify that the HDFS namespace is exported and can be mounted:

   ```shell
   showmount -e $nfs_server_ip
   ```

   You should see output similar to the following:

   ```
   Exports list on $nfs_server_ip :
   / (everyone)
   ```

# Mount the export "/"

Currently NFSv3 only uses TCP as the transport protocol. NLM is not supported, so the mount option `nolock` is needed. It is recommended to use a hard mount: even after the client sends all data to the NFS gateway, the gateway may need some extra time to transfer the data to HDFS when writes were reordered by the NFS client kernel. If a soft mount has to be used, give it a relatively long timeout (at least no less than the default timeout on the host).

Users can mount the HDFS namespace as shown below:

```shell
mount -t nfs -o vers=3,proto=tcp,nolock,noacl $server:/ $mount_point
```

Then users can access HDFS as part of the local file system, except that hard links and random writes are not supported yet.

# Unmount

```shell
fuser -k /hdfs   # kill all processes using this directory
umount -f /hdfs
```
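Once mounted, a quick end-to-end check is to write a file through the mount point and read the same file back through the HDFS client. A dry-run sketch: the mount point `/hdfs` and the test path are example values, and the `run` helper prints each command instead of executing it, so the steps can be reviewed on any machine; swap `echo "+ $*"` for `"$@"` to run them for real on a mounted system.

```shell
#!/bin/sh
# Dry-run sketch: verify the NFS mount by writing through it and
# reading the same file back through the HDFS client.
MOUNT_POINT="/hdfs"            # example mount point from the steps above
TEST_FILE="nfs-smoke-test.txt" # hypothetical scratch file name
run() { echo "+ $*"; }         # replace 'echo "+ $*"' with '"$@"' to execute

run sh -c "echo hello > $MOUNT_POINT/tmp/$TEST_FILE"  # write via the NFS mount
run hdfs dfs -cat "/tmp/$TEST_FILE"                   # read back via the HDFS client
run rm "$MOUNT_POINT/tmp/$TEST_FILE"                  # delete works; random write does not
```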