Monday, March 29, 2010

More fun with Chef in the Cloud

I've been working with Chef on the RightScale platform for about six months now. I've written several cookbooks to stand up self-service applications for internal use at my company. My colleague and I think we now know enough about how it works to do it right.

When we started out, RightScale hadn't yet released their chef infrastructure. So I began by installing two chef servers: one in public EC2 and one in our VPC. I used these to start writing and testing my cookbooks so that they'd be ready once RightScale released.

Once RightScale did release, I shut down my chef servers and forgot about them. But recently we've been talking to Adam Jacob and his crew at Opscode, and we want to see what a hybrid RightScale / Opscode environment would look like. That means going back to owning our own chef servers (and possibly later replacing them with the Opscode Platform).

But Chef is moving pretty quickly. Since I last had chef servers running, they've made a major release that changes the authentication system dramatically. Sadly, the CentOS RPM at ELFF hasn't caught up, so to take advantage of the new 0.8.x features, I had to install it myself.

It hasn't been easy, so I'll post what I've done here in case someone else has the same problems I did. I'm using a custom AMI of CentOS 5.3 on Amazon's EC2 in our VPC, launched with RightScale.

First off, I followed the directions at the Chef Wiki for preparing a CentOS host to become a chef server. Next I followed the directions for bootstrapping a server. The things that tripped me up were:
  • RightScale runs their own AMQP service on their instances to manage communications with their infrastructure. You'll have to remove it, turn it off, or change the chef RabbitMQ port to something else. I just turned it off for this experiment. I don't know the implications of changing the chef RabbitMQ port.
  • Don't forget the final bit at the bottom of the first page that describes setting up the chef user, setting permissions on the run directories, and starting the services for CentOS.
  • Be sure to use "init_style": "init" in your json file when bootstrapping.
  • Be sure to modify your init scripts to add -P $pidfile as in CHEF-1074 so service stop/restart work properly.
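
For the "init_style" point above, here is a minimal sketch of writing the json attributes file before the bootstrap run. Only "init_style": "init" comes from my notes; the file name, the surrounding keys, the hostname, and the run_list entry are assumptions modeled on what a bootstrap attributes file of this era typically looked like, so check them against the wiki page you are following.

```shell
#!/bin/sh
# Sketch: write an attributes file for the chef-solo bootstrap run.
# Only "init_style": "init" is taken from the post. Every other key,
# value, and the file name itself is an assumed placeholder.
CHEF_JSON="${CHEF_JSON:-./chef.json}"

cat > "$CHEF_JSON" <<'EOF'
{
  "bootstrap": {
    "chef": {
      "init_style": "init",
      "server_fqdn": "chef.example.com",
      "webui_enabled": true
    }
  },
  "run_list": [ "recipe[bootstrap::server]" ]
}
EOF

echo "wrote $CHEF_JSON"
```

You would then hand the file to chef-solo with its -j option (alongside -c for your solo config), as the bootstrap directions describe.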
Even after this I still had problems. Because I had re-run chef-solo a number of times, the default admin password in /etc/chef/server.rb had changed (though the one in the db had not). Lucky for me, chef backs up templated files so I was able to recover it. Once I had changed the admin account password, I got a 500 error from merb: "named route not found: new_nodes".

Google found this IRC chat log (search for "BobFunk" and read on), which suggested I downgrade merb from 1.1.0 to 1.0.15. That fixed the problem. Here are the commands to do that:

gem install merb-core merb-assets merb-haml merb-helpers merb-param-protection merb-slices -v=1.0.15
gem uninstall -I merb-core merb-assets merb-haml merb-helpers merb-param-protection merb-slices -v=1.1.0
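
To confirm the downgrade took, something like the following rough check may help. It is only a sketch: the function name check_merb is made up for illustration, and the grep is a plain string match on the version number rather than a real version comparison.

```shell
#!/bin/sh
# Sketch: report whether any merb gem at version 1.1.0 is still installed.
# check_merb is a hypothetical helper name, not part of Chef or RubyGems.
check_merb() {
  if ! command -v gem >/dev/null 2>&1; then
    echo "gem command not found" >&2
    return 2
  fi
  # `gem list merb` prints locally installed gems whose names match "merb",
  # each followed by its installed versions in parentheses.
  if gem list merb | grep -q '1\.1\.0'; then
    echo "merb 1.1.0 gems still installed"
    return 1
  fi
  echo "no merb 1.1.0 gems found"
  return 0
}
```

Run check_merb after the two gem commands; a non-zero exit status means either RubyGems is missing or a 1.1.0 gem is still around.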

So now I seem to have a working chef server once again. I just need to hide it behind an Apache proxy so I can serve it over SSL on port 443 (though I guess I'll need two secure ports, one for the webui and one for the api). Next I'll need to actually install my cookbooks and test them out on some new clients.

1 comment:

Jeroen said...

Hi Chris,

Thanks man, this really helped prevent some heavy-duty Googling. Had gotten to the IRC (and the mailtrail in the mailinglist) but this was key to the workaround/solution.

Kind regards,
Jeroen.