Ambari Study 9: Setting up HDP 2.1 with non-standard users for Hadoop services
2016-11-10 12:40

Setting up HDP 2.1 with non-standard users for Hadoop services (why not use a non-standard user for Ambari, too?)
Created by Lester Martin, last modified on Jan 08, 2015
If you find yourself needing to set up Hortonworks Data Platform (HDP) with Ambari in an environment where users and groups need to be pre-provisioned rather than created during the install process, don't fret: Ambari has you covered. This write-up piggybacks on the HDP Documentation site and uses HDP 2.1.2 along with Ambari 1.5.1 as the baseline to build against. It will also build a 4-node cluster (2 worker nodes, 1 master node, and 1 node to run Knox on), all running CentOS 6.5.
As this whole post is about using non-standard users and groups, let's refer to http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-users_2x.html, which lists the default user and group names. For our purposes, let's just go ahead and prefix all of these with "ryo" (example: the "mapred" user account for the MapReduce2 service will now be "ryomapred"). We'll also want to install, and execute, Ambari as a user other than root. Using this same general theme, we'll use "ryoambari" for both the user and group name.
To get started, follow the general flow of building a virtualized 5-node HDP 2.0 cluster (all within a Mac) to set up the servers, up until you get to the section called Install Cluster via Ambari. And yes, there are faster ways to get to this point, such as the approach described at https://blog.codecentric.de/en/2014/04/hadoop-cluster-automation/ using tools such as Vagrant, but there is value in walking through these steps for those shoring up their Linux admin skills.
Back to those lengthier instructions: start out building the First Master Node as described previously, but remember that we want to build CentOS 6.5 boxes (I grabbed my ISO here). For this host, call the VirtualBox entry 4N-HDP212-M1 to differentiate it from the older 2.0 cluster. It's OK to reuse the m1.hdp2 hostname.
Work through the same network setup instructions for this first node, but pause for a second when you get to the SSH Setup section. The instructions at http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-chap1-5-2.html are there to make sure that Ambari can make password-less SSH connections, and they are based on the default approach of letting Ambari run as root. As we'll be using "ryoambari" (mentioned above), we need to set this up as that user. To do that, we'll need the user in the first place. We can create the user (which creates a group with the same name) and set the password to "hadoop" for simplicity's sake.
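A minimal sketch of that user creation (assuming you are root, or prefix each command with sudo; the "hadoop" password is strictly dev-only):

```shell
# Run on every node. On CentOS, useradd also creates a "ryoambari"
# group with the same name by default.
useradd -m ryoambari
echo 'ryoambari:hadoop' | chpasswd   # dev-only password, as noted above
```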
The HDP documentation states, "All new service user accounts, and any existing user accounts used as service users, must have a UID >= 1000." Unfortunately, as described here, CentOS & RHEL begin their user numbering at 500. That's probably not a big deal for the account we are going to use for Ambari itself, but checking it and taking corrective action on this single account now will let us breeze through this concern later in the verification/modification step.
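A quick way to check, and if necessary fix, the UID — note that 1500 is just the value this walkthrough happens to end up with, not a requirement:

```shell
# Create the account if it does not exist yet, then make sure its UID
# clears the >= 1000 bar that the HDP docs call out.
id -u ryoambari >/dev/null 2>&1 || useradd -m ryoambari
if [ "$(id -u ryoambari)" -lt 1000 ]; then
    usermod -u 1500 ryoambari
fi
id -u ryoambari   # print the final UID for verification
```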
A note in http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-chap1-5-2.html states (in regard to installing/running Ambari), "It is possible to use a non-root SSH account, if that account can execute sudo without entering a password." On that note, there are great write-ups out there (like the one found here), but I cheated since this is a dev-only setup (virtualized, even, within my Mac) and every "real" environment will have a sysadmin who knows how to do this best for their setup. I followed this thread and simply granted everyone the ability to run password-less sudo commands.
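The exact commands from that thread weren't captured here, but one common way to do this (an assumption on my part, and scoped to the single ryoambari account rather than to everyone) is a drop-in sudoers file:

```shell
# Allow ryoambari to run any command via sudo without a password.
# The default sudoers file on CentOS 6 includes /etc/sudoers.d.
mkdir -p /etc/sudoers.d
echo 'ryoambari ALL=(ALL) NOPASSWD: ALL' > /etc/sudoers.d/ryoambari
chmod 0440 /etc/sudoers.d/ryoambari
```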
Now we can circle back and work on the SSH instructions found at http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-chap1-5-2.html, but do them as ryoambari instead of root, as shown below (i.e. replaying the instructions from building a virtualized 5-node HDP 2.0 cluster (all within a Mac)).
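In sketch form, the replayed steps look like this when done as ryoambari (the host names are the ones used for this cluster; ssh-copy-id will prompt for the "hadoop" password once per host):

```shell
# As ryoambari on the Ambari host: generate a password-less key pair.
mkdir -p ~/.ssh && chmod 700 ~/.ssh
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa -q
# Then push the public key to every node so Ambari can log in without
# a password (run interactively; prompts once per host):
#   for h in k1.hdp2 m1.hdp2 w1.hdp2 w2.hdp2; do ssh-copy-id "ryoambari@$h"; done
```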
As called out earlier, we are going to create users & groups ahead of time. The following commands will create the three groups that we need and, even though there is no special warning about their IDs needing to be >= 1000, we'll do just that. The fact that the ryoambari UID was 1500 suggested these new users would get IDs >= 1000 as well, which can easily be verified.
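As an illustration only — the real names come from the Ambari defaults table with the "ryo" prefix applied; the group and account names below are examples, and the explicit IDs just pin everything at or above 1000:

```shell
# Example: a shared group plus two prefixed service accounts,
# all with IDs deliberately set >= 1000.
groupadd -g 1501 ryohadoop
useradd -m -u 1502 -g ryohadoop ryohdfs
useradd -m -u 1503 -g ryohadoop ryomapred
id ryohdfs   # quick verification of the UID/GID
```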
Pick back up (i.e. keep on chugging through the install docs) after the SSH Setup section and work all the way through Rinse, Lather, and Repeat..., as long as you take into account the change to the hosts we are trying to build and set them up as identified below. We can call the newly created VirtualBox appliance 4N-HDP212-template (with a file name of 4N-HDP212-template.ova).
| VirtualBox Name | OS Hostname | IP Address |
|---|---|---|
| 4N-HDP212-K1 | k1.hdp2 | 192.168.56.31 |
| 4N-HDP212-M1 | m1.hdp2 | 192.168.56.41 |
| 4N-HDP2-W1 | w1.hdp2 | 192.168.56.51 |
| 4N-HDP2-W2 | w2.hdp2 | 192.168.56.52 |
Log in as ryoambari to make all the identified changes needed to make each box unique (prefixing everything with sudo, of course).
As I was working through the Ensure Connectivity Between Nodes section of the other post, I ran into the problem below, and thanks to some quick googling I found a quick fix at http://www.omnicode.com/qa.php?id=69 that did the trick. Unfortunately, I had to do this on each box.
Now continue with the Install Cluster via Ambari steps from building a virtualized 5-node HDP 2.0 cluster (all within a Mac). Of course, this is HDP 2.1, not 2.0, but the process is very similar. I'll call out the major differences below.
Just to make my install a bit more interesting, I decided to take the optional step identified at http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-chap1-6.html, since most datacenter-based clusters will not have internet access. Thankfully, I had already done all of this setup on my m1.hdp2 master node (where I will be installing Ambari) in an earlier post: using your mac to install a virtualized hadoop cluster? (then setup a local repo on it). Feel free to skip that if your fingers are already getting tired from this exercise.
It looks like during the "sudo ambari-server setup" exercise discussed in http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-chap2-2.html, I lost my chance to use the "ryopostgres" user I created earlier by not choosing to enter the advanced database configuration. This one is OK, as most enterprises will already have a centralized database they want to use anyway, and the setup for that will trump this concern.
If you went down the local repo route like I did, then when you get to http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-chap3-2a_2x.html, make sure your screen looks something like the following (i.e. use the URL you created in the prior posting).
There is a subtle callout on http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-chap3-3_2x.html about running Ambari as a user other than root, and the screenshot below shows what that looks like when declared to run as ryoambari.
Unfortunately, the Confirm Hosts page showed failures for all nodes with the following messages. Following the information provided, I found out that I needed a newer version of OpenSSL, which I got by running "sudo yum upgrade openssl" on all the boxes to get past these errors.
I also had a warning from each of the four nodes that the ntpd service was not running. I thought I had taken care of this earlier, but either way, I just followed the instructions on this back in building a virtualized 5-node HDP 2.0 cluster (all within a Mac) and the warnings cleared up.
Unlike the other cluster install instructions, for this setup we want all services checked on the Choose Services page and then you can take some creative liberty on the Assign Masters page. Here's a snapshot of my selections.
Here's a view of how the Assign Slaves and Clients screen looked during the installation.
The prior instructions help you clear the Hive and Oozie warnings on the Customize Services screen. The Nagios ones can be handled just as easily (i.e. just keep using "hadoop" as the password). Also, since we only have two DataNodes, it makes sense to throttle back the HDFS > Advanced > Block replication value from 3 to 2.
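For reference, that Ambari field maps to the dfs.replication property in hdfs-site.xml — shown here only to clarify what the UI is changing; let Ambari manage the file itself:

```xml
<!-- hdfs-site.xml: default block replication factor, lowered to
     match our two-DataNode cluster -->
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
```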
Critical to this blog posting is the Misc tab on the Customize Services screen, which should be set up as follows. I basically put "ryo" in front of everything except a couple of items, but be sure to read the notes below the screenshot before moving on.
Looking back at http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-users_2x.html, it seems I had some earlier misunderstandings about how this would all go. The following table identifies these and the remedial action, if any, I took to resolve each.
| Item to Resolve | Action Taken |
|---|---|
| The Misc tab had a "Proxy group for Hive, WebHCat, Oozie and Falcon" field that I wasn't expecting | I simply left it as "users" |
| The Misc tab had no place to identify the Ganglia group of "ryonobody" that I previously created and used as the primary group for the "ryonobody" user | Knowing there is a user and a group both named "nobody" on the base OS install (and considering the bolt-on nature of Ganglia to HDP), I left the user as "nobody" |
| The Misc tab had no place to identify the RRDTool "ryorrdcached" user that I previously created | Reading the notes again, I decided (maybe realized?) this is another bolt-on service for HDP and didn't worry about the user I previously created |
| The Misc tab had no place to identify the "ryoapache" user that is associated with Ganglia | Same as prior action |
| The Misc tab had no place to identify the "ryopostgres" user that Ambari itself uses | No worries, but this could have been resolved during the CLI setup of Ambari as mentioned earlier |
As in building a virtualized 5-node HDP 2.0 cluster (all within a Mac), we need to make sure the ambari-server and ambari-agent services are started upon reboot. After taking care of that, restarting the virtual machines, and starting all Hadoop services from Ambari, all heck broke loose. As you do in situations like this, I worked the logs and finally got some answers.
It seems the YARN service was actually kicking off as the user "yarn" when the box started, and my user "ryoyarn" was then failing when started from Ambari. Medium story short, I realized I needed to remove all the users with the default names that Ambari told me I could clean up later. Of course, I started with the "yarn" user, but was told I couldn't delete it because it was logged in, which I resolved with a bit of brute force: I ran "sudo su - yarn", then "kill -9 -1", then "exit" to return to "ryoambari", and then kicked off "sudo userdel yarn", which completed. I did this on all four boxes.
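Consolidated, that cleanup amounts to the following on each node — a sketch of the steps above, with a guard added so it is a no-op on a box where the default "yarn" account was never created:

```shell
# Dev-only brute force: kill anything still running as "yarn",
# then delete the account so only "ryoyarn" remains.
if id yarn >/dev/null 2>&1; then
    pkill -9 -u yarn 2>/dev/null || true   # same effect as `kill -9 -1` from a yarn shell
    userdel yarn
fi
```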
I then had some permission problems with the home directories of the users I created (not sure why HDP needed to do anything with these directories), which I also solved with some brute force: "sudo chmod 777 /home/*". I also decided I'd better go ahead and delete the other default user accounts on all boxes.
This environment surely needs to be tested some more to verify all is as well as it looks. Also, there were a number of brute-force hacks that I did because this is a dev cluster, and each of them warrants revisiting if you are considering going forward with something like this.
The real driver for this posting was to show that you can use different user & group names for the services, as well as run Ambari as a user other than root. I believe I've shown that this is the case, but... I would not recommend doing this for a production cluster.
Final Words: Don't do this! Stick with the default user and group names to prevent as many problems (those found and those yet to be uncovered) as possible.