

Setting up HDP 2.1 with non-standard users for Hadoop services (why not use a non-standard user for Ambari, too)


Created by Lester Martin, last modified on Jan 08, 2015


If you find yourself needing to set up Hortonworks Data Platform (HDP) with Ambari in an environment where users and groups need to be pre-provisioned instead of simply created during the install process, then don't fret, as Ambari has you covered. This write-up piggybacks the HDP Documentation site and uses HDP 2.1.2 along with Ambari 1.5.1 as a baseline to build against. It will also build a 4-node cluster (2 worker nodes, 1 master node, and 1 node to run Knox on), all running on CentOS 6.5.

As this whole post is about using non-standard users and groups, let's refer to http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-users_2x.html, which lists the default user and group names. For our purposes, let's just go ahead and prefix all of these with "ryo" (example: the user account "mapred" for the MapReduce2 service will now be "ryomapred"). Also, we'll want to install, and execute, Ambari as a user other than root. Using this same general theme, we'll use "ryoambari" for the user and group name.

To get started, follow the general flow of building a virtualized 5-node HDP 2.0 cluster (all within a mac) to set up the servers, up until you get to the section called Install Cluster via Ambari. And yes, there are faster ways to get to this point, such as described in https://blog.codecentric.de/en/2014/04/hadoop-cluster-automation/ using tools such as Vagrant, but there is value in walking these steps for those shoring up their Linux admin skills. Back in those more lengthy instructions, start out building the First Master Node as described previously, but remember that we want to build CentOS 6.5 boxes (grabbed my ISO here). For this host name, call the VirtualBox entry 4N-HDP212-M1 to differentiate it from the older 2.0 cluster. It's OK to reuse the m1.hdp2 hostname.

Work through the same network setup instructions for this first node, but pause for a second when you get to the SSH Setup section. The instructions at http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-chap1-5-2.html are to make sure that Ambari can make password-less SSH connections and are based on the default approach of letting Ambari run as root. As we'll be using "ryoambari" (mentioned above), we need to set this up as that user. To do that, we'll need the user in the first place. We can create the user (which creates a group with the same name) and set the password to "hadoop" for simplicity's sake.
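The original command listing didn't survive this repost; run as root on the node, the step described above might look like the following sketch (the password "hadoop" comes from the text; chpasswd is just one portable way to set it non-interactively):

```shell
# Create the ryoambari account; with no primary group specified,
# useradd also creates a matching ryoambari group.
useradd -m ryoambari

# Set the dev-only password "hadoop" mentioned above.
echo 'ryoambari:hadoop' | chpasswd

# Quick confirmation that the user and its group exist.
id ryoambari
```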

A note is identified in http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-users_2x.html that
states "All new service user accounts, and any existing user accounts used as service users, must have a UID >= 1000." Unfortunately, as described here,
CentOS & RHEL begin their numbering at 500.

That's probably not a big deal for the account we are going to use for Ambari itself, but checking it out & making corrective actions on this single account will allow us later to breeze through this concern in the verification/modification step.

I simply added 1000 to the number. This will come in handy later as the UIDs for users should be greater than this value.
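The actual commands were lost in this copy; a sketch of that UID bump, assuming the account landed at 500 as CentOS 6 does by default (adjust the numbers to whatever `id ryoambari` reports on your box):

```shell
# Recreate the account only if it is missing, so this snippet runs standalone.
id ryoambari >/dev/null 2>&1 || useradd -m ryoambari

# Move the group first, then point the user at the new UID and primary group
# (500 + 1000 = 1500, satisfying Ambari's "UID >= 1000" requirement).
groupmod -g 1500 ryoambari
usermod -u 1500 -g ryoambari ryoambari

# usermod re-owns the files under the home directory automatically.
id ryoambari
```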

A note is identified in http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-chap1-5-2.html that (in regards to installing/running Ambari) states, "It is possible to use a non-root SSH account, if that account can execute sudo without entering a password."

On that note, there are great write-ups out there like the one found here, but I cheated since this is a dev-only setup (virtualized even, within my Mac) and every "real" environment will have a sysadmin who knows how to do this best for their setup. I followed this thread and just did the following to grant everyone the ability to run password-less sudo commands.
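The edit itself was lost in this repost; one dev-only way to match the description (every account gets password-less sudo -- loudly NOT for production) is a drop-in sudoers file, assuming /etc/sudoers includes /etc/sudoers.d as CentOS 6 does by default:

```shell
# DEV-ONLY: allow every user to run any command via sudo without a password.
echo 'ALL ALL=(ALL) NOPASSWD: ALL' > /etc/sudoers.d/zz-dev-nopasswd
chmod 440 /etc/sudoers.d/zz-dev-nopasswd

# Always syntax-check sudoers changes before logging out.
visudo -cf /etc/sudoers.d/zz-dev-nopasswd
```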

Yep, that worked! I quickly backed out of that edit, but I got in without getting prompted for a password. Again, this is surely NOT the way to do this on a production environment!

Now we can circle back and work on the SSH instructions found at http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-chap1-5-2.html, but do them with ryoambari instead of root as shown below (i.e. replaying the instructions back in building a virtualized 5-node HDP 2.0 cluster (all within a mac)).
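The replayed SSH commands are missing from this copy; as ryoambari on the Ambari host, they boil down to something like this sketch (the hostnames are the four cluster nodes listed in the table further down):

```shell
# Generate a password-less key pair for ryoambari.
mkdir -p ~/.ssh && chmod 700 ~/.ssh
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa

# Push the public key to every node; enter the "hadoop" password when prompted.
for host in m1.hdp2 w1.hdp2 w2.hdp2 k1.hdp2; do
  ssh-copy-id "ryoambari@${host}"
done

# Verify that password-less login actually works.
ssh ryoambari@w1.hdp2 hostname
```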

Now for an important step of this deviation from a standard install. As http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-users_2x.html calls
out, we are going to create users & groups ahead of time. The following commands will create the three groups that we need and, even though there is no special warning about their IDs being >= 1000, we'll do just that.
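The groupadd commands themselves were lost in the repost; here is a sketch with GIDs pushed above 1000. The three names (ryohadoop, ryonobody, ryoknox) are my assumption based on the prefix scheme, so take the authoritative list from the Ambari defaults page linked above:

```shell
# Pre-provision the groups with explicit GIDs >= 1000.
# Group names are illustrative ("ryo" + the assumed Ambari default name).
groupadd -g 1501 ryohadoop
groupadd -g 1502 ryonobody
groupadd -g 1503 ryoknox
```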

These IDs can be verified as shown in the abbreviated output below.
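The abbreviated output itself was also lost; the check is just a getent lookup (same assumed group names as above):

```shell
# Create any missing group first so this check runs standalone.
for g in ryohadoop ryonobody ryoknox; do
  getent group "$g" >/dev/null || groupadd "$g"
done

# Each line prints name:x:GID: -- confirm the GIDs are >= 1000.
getent group ryohadoop ryonobody ryoknox
```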

Now we need to run the following commands to get all the users created and put in their appropriate groups.
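The user-creation listing is missing from this copy as well; sketched out, with an illustrative subset of the prefixed service names (the full set, and which group each account belongs in, comes from the Ambari defaults table linked above):

```shell
# Make sure the target groups exist so this snippet runs standalone.
for g in ryohadoop ryonobody; do
  getent group "$g" >/dev/null || groupadd "$g"
done

# Create the prefixed service accounts with UIDs >= 1000 in the ryohadoop group.
uid=1510
for svc in hdfs yarn mapred hive hcat oozie zookeeper hbase falcon storm sqoop; do
  useradd -u "$uid" -g ryohadoop "ryo${svc}"
  uid=$((uid + 1))
done

# Ganglia's account gets the ryonobody primary group, per the notes further down.
useradd -u "$uid" -g ryonobody ryonobody
```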

Knowing that the ryoambari UID was 1500 suggested these new users would have UIDs >= 1000 as well, which can easily be verified.

Now you can return to the notes in building a virtualized 5-node HDP 2.0 cluster (all within a mac) (i.e. keep on chugging through the install docs) after the SSH Setup section and work all the way through Rinse, Lather, and Repeat... as long as you take into account the change to the hosts we are trying to build and set them up as identified below. We can call the newly created VirtualBox appliance 4N-HDP212-template (with a file name of 4N-HDP212-template.ova).

VirtualBox Name    OS Hostname    IP Address
4N-HDP212-K1       k1.hdp2        192.168.56.31
4N-HDP212-M1       m1.hdp2        192.168.56.41
4N-HDP2-W1         w1.hdp2        192.168.56.51
4N-HDP2-W2         w2.hdp2        192.168.56.52
When you start each VM up for the first time, log in as ryoambari to make all the identified changes to make the box unique (prefixing everything with sudo, of course).

As I was working through the Ensure Connectivity Between Nodes section of the other post I ran into the problem below and thanks to some quick googling I found a quick fix at http://www.omnicode.com/qa.php?id=69 that
did the trick. Unfortunately, I had to do this on each box.

Phew... that's done! Now we (finally) get to start installing Ambari. For the most part, just follow the Install Cluster via Ambari ramblings back in building a virtualized 5-node HDP 2.0 cluster (all within a mac). Of course, this is HDP 2.1, not 2.0, but they are very similar. I'll call out the major differences below.

Just to make my install a bit more interesting, I decided to go down the optional path identified at http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-chap1-6.html since most datacenter-based clusters will not have internet access. Thankfully, I already did all this setup on my m1.hdp2 master node (where I will be installing Ambari) in an earlier post: using your mac to install a virtualized hadoop cluster? (then setup a local repo on it). Feel free to just skip that if your fingers are already getting tired on this exercise.

It looks like during the "sudo ambari-server setup" exercise discussed in http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-chap2-2.html I lost my chance to use the "ryopostgres" user I created earlier by not choosing to enter the advanced database configuration. This one is OK, as most enterprises will already have a centralized database they want to use anyway, and the setup for that will trump this concern.

If you went down the local repo option like I did, then when you get to http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-chap3-2a_2x.html make
sure your screen looks something like the following (i.e. use the URL you created in the prior posting).



There is a subtle callout on http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-chap3-3_2x.html for running Ambari as another user besides root, and the screenshot below shows what that looks like when declared to run as ryoambari.



Unfortunately, the Confirm Hosts page showed failures for all nodes with the following messages.

The good news is that message was referenced in http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-chap3-4_2x.html and, following the information provided, I found out that I needed a newer version of OpenSSL, which was accomplished by running "sudo yum upgrade openssl" on all the boxes to get past these errors.

I also found out that I had a warning from each of the four nodes that the ntpd service was not running. I thought I took care of this earlier, but either way I just followed the instructions on this back in building a virtualized 5-node HDP 2.0 cluster (all within a mac) and the warnings cleared up.
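For reference, on CentOS 6 the fix those linked instructions amount to is just:

```shell
# Start the time daemon now and enable it across reboots, on each node.
sudo service ntpd start
sudo chkconfig ntpd on
```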

Unlike the other cluster install instructions, for this setup we want all services checked on the Choose Services page and then you can take some creative liberty on the Assign Masters page. Here's a snapshot of my selections.



Here's a view of how the Assign Slaves and Clients screen looked during the installation.



The prior instructions help you clear the Hive and Oozie warnings on the Customize Services screen. The Nagios ones can be handled just as easily (i.e. just keep using "hadoop" as the password). Also, since we only have two DataNodes, it would make sense to throttle back the HDFS > Advanced > Block replication value from 3 to 2.

Critical to this blog posting is the Misc tab here on the Customize Services screen, which should be set up as the following. I basically put "ryo" in front of all except a couple of items, but be sure to read the notes below the screenshot before moving on.



Looking back on http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-users_2x.html it
seems I had some earlier misunderstandings about how this would all go. The following table identifies these and the remedial action, if any, I took to resolve each.

Item to Resolve: The Misc tab had a "Proxy group for Hive, WebHCat, Oozie and Falcon" field that I wasn't expecting.
Action Taken: I simply left it as "users".

Item to Resolve: The Misc tab had no place to identify the Ganglia Group of "ryonobody" that I previously created and used as the primary group for the "ryonobody" user.
Action Taken: Knowing there is a user and a group both named "nobody" on the base OS install (and considering the bolt-on nature of Ganglia to HDP), I left the user as "nobody".

Item to Resolve: The Misc tab had no place to identify the RRDTool "ryorrdcached" user that I previously created.
Action Taken: Reading the notes again, I decided (maybe realized?) this is another bolt-on service for HDP and didn't worry about the user I previously created.

Item to Resolve: The Misc tab had no place to identify the "ryoapache" user that is associated with Ganglia.
Action Taken: Same as the prior action.

Item to Resolve: The Misc tab had no place to identify the "ryopostgres" user that Ambari itself uses.
Action Taken: No worries, but this could have been resolved during the CLI setup of Ambari as mentioned earlier.
Thankfully... the Install, Start and Test screen cleared with all green bars. Now, like shown at the end of building a virtualized 5-node HDP 2.0 cluster (all within a mac), we need to make sure the ambari-server and ambari-agent services are started upon reboot. After taking care of that, restarting the virtual machines, and starting all Hadoop services from Ambari, all heck broke loose. As you do in situations like this, I worked the logs and finally got some answers.
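For the reboot piece just mentioned, the CentOS 6 incantation is simply:

```shell
# Make the Ambari daemons start at boot (CentOS 6 init scripts).
sudo chkconfig ambari-server on   # on the Ambari host (m1.hdp2) only
sudo chkconfig ambari-agent on    # on all four nodes
```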

It seems the service YARN was actually kicking off as the user "yarn" as the box started, and my user "ryoyarn" was then failing when being started from Ambari. Medium story short, I ended up realizing I needed to remove all the users with the default names that Ambari told me I could clean up later. Of course, I started with the "yarn" user, but was told I couldn't as it was logged in, which was resolved with a bit of brute force (i.e. I ran "sudo su - yarn", then "kill -9 -1", and "exit" to return to "ryoambari", and then kicked off the "sudo userdel yarn", which completed). I did this on all four boxes.
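Scripted rather than interactive, that brute-force cleanup looks roughly like this (pkill stands in for the su/kill -9 -1 dance; this is my paraphrase, not the original listing):

```shell
# Kill everything the default "yarn" user owns, then remove the account.
# Repeat on all four boxes. "|| true" covers the case of no running processes.
sudo pkill -9 -u yarn || true
sudo userdel yarn
```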

I then had some permission problems with the home directories of the users I created (not sure why HDP needed to do anything with these directories) that I also solved with some brute force: "sudo chmod 777 /home/*". I also decided I'd better go ahead and delete the other default user accounts on all boxes.
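Sketched out, the cleanup on each box was along these lines (the user list is illustrative -- delete whichever default accounts actually exist on your nodes):

```shell
# DEV-ONLY: blow the doors off home-directory permissions.
sudo chmod 777 /home/*

# Remove the remaining default service accounts so only the "ryo" ones are left.
for u in hdfs mapred hive hcat oozie zookeeper hbase falcon storm sqoop; do
  sudo userdel "$u" 2>/dev/null || true
done
```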

I was able to stop and start all services to green bars (although I did it service by service) after these modifications, and I also restarted the OS on all the boxes to make sure I could get back to this "green" place.



This environment surely needs to be tested some more to verify all is as well as it looks. Also, there were a number of brute-force hacks that I did because this is a dev cluster, and they each warrant revisiting if you were considering going forward with something like this.

The real driver for this posting was to show that you could use different user & group names for the services as well as run Ambari as a user other than root. I believe I've shown that this is the case, but... I would not recommend this be done for a production cluster.

Final Words: Don't do this! Stick with the default user and group names to prevent as many problems (those found and those yet uncovered) as possible.