Upgrading Discourse Servers on EC2
For the last few months, we’ve been moving our online support forums from a ton of independent sites to a single Discourse server with categories for each product. We hope this will let us communicate better with our customers by giving them a single location where their questions can be rapidly answered by a large audience of both McNeel employees and other expert users.
Unfortunately (for us), Discourse is not yet a simple turn-key online service that we can just switch on while whomever we are paying makes sure things continue to run smoothly and all of the parts stay up to date. Discourse is constantly being improved with new features and bug fixes based on the feedback that users of our site and other Discourse sites provide to the Discourse team. It is exciting to see these changes show up every week, but in order to keep providing reliable service to our users we need a few processes in place.
This post describes our process for upgrading our server to use the latest code from discourse.
First, I should note that we host our server on Amazon EC2 and that we are currently using an m1.medium instance, which appears to run smoothly under the current load of users on our site. The server was set up using the configuration defined here.
- Figure out if you need to update
It is a good idea to look at the version number on the admin dashboard every once in a while. If you notice that the running version is old, it’s time to update.
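If you would rather check from a shell, here is a minimal sketch of the comparison itself. The two version strings are examples only: read the running version off the admin dashboard and the latest version from https://github.com/discourse/discourse/releases.

```shell
# Sketch: decide whether an update is needed by comparing version strings.
# "running" and "latest" below are example values, not live data.
running="0.9.5"
latest="0.9.6.1"

# sort -V orders version numbers numerically, so the last line is the newest
newest=$(printf '%s\n%s\n' "$running" "$latest" | sort -V | tail -n 1)

if [ "$newest" != "$running" ]; then
    echo "update available: $latest"
else
    echo "up to date"
fi
```

With the example values above, this prints `update available: 0.9.6.1`.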
- Log on to the Amazon EC2 management console
- Bring up “Under Construction” Server
We have a small web server that only displays this image for any request. This server is named “Discourse Under Construction” in EC2. Find this server by going to the Instances tab and make sure it is running. If it isn’t running, right click on the server and click on the “start” option.
Once the server is running, click on the “Elastic IPs” tab and find the entry for “54.277.248.2”. This is where discourse.mcneel.com requests go. Right click on that entry, select “associate”, and then select “Discourse Under Construction”.
Try reloading discourse.mcneel.com until you see the under construction page.
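These console clicks could also be scripted. A rough sketch using the AWS CLI, assuming it is installed and configured with our credentials; the instance ID i-0abc123 is a hypothetical stand-in for the “Discourse Under Construction” instance:

```shell
# Start the "Discourse Under Construction" instance (hypothetical instance ID)
aws ec2 start-instances --instance-ids i-0abc123

# Block until EC2 reports the instance as running
aws ec2 wait instance-running --instance-ids i-0abc123

# Point the Elastic IP at it so discourse.mcneel.com serves the construction page
aws ec2 associate-address --public-ip 54.277.248.2 --instance-id i-0abc123
```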
- Clone the current production server
Go to the “Instances” tab on EC2 and find the active “discourse” server. Right click on it and stop it. Once the server is stopped, right click on it again and select “Create an Image (EBS AMI)”. This saves an image of the server in its current state so we can launch a new instance with this state. I stopped the server first for paranoid reasons, to make sure nothing is in a transient state that doesn’t make sense when saving. This may not be necessary, but I don’t know enough about the inner workings of EC2 to make that judgement.
Once the image has been created, go to the AMIs tab, right click on the image that you just created, and select launch to bring up a clone of the production server.
- set the instance type to m1.medium
- name the instance based on the version you plan to update to. Use the version number of the latest Discourse release. For example, the image in step one shows version 0.9.6.1 as the latest version available; in this case, name the instance “discourse_0_9_6_1”
- for keypairs and security groups, use the discourse ones
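For reference, the clone step can also be sketched with the AWS CLI. All IDs below (the production instance, the resulting AMI) are hypothetical placeholders:

```shell
# Stop the production "discourse" instance before imaging it (hypothetical ID)
aws ec2 stop-instances --instance-ids i-0prod456
aws ec2 wait instance-stopped --instance-ids i-0prod456

# Save the stopped server's current state as an AMI
aws ec2 create-image --instance-id i-0prod456 --name "discourse_0_9_6_1"

# Launch the clone from the new AMI (substitute the AMI ID that create-image returned)
aws ec2 run-instances --image-id ami-0def789 --instance-type m1.medium \
    --key-name discourse --security-groups discourse
```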
- Update the clone
Once the new server has been launched, test it by going to the public URL that Amazon allocated for it. You should see the Discourse site with the latest content, and you won’t be logged in (because this is a new server).
Log onto the server by right clicking on the instance and selecting “Connect” (make sure the user name is ubuntu and use our discourse private key). These instructions are based on the updating discourse instructions, with a few tweaks:
    sudo su - discourse
    bluepill stop
    bluepill quit
    cd discourse
    git checkout master
    git pull
    git fetch --tags
    #git checkout latest-release
    #where latest-release is the most recent tag. I usually just look at
    #https://github.com/discourse/discourse/releases and use the latest
    #version tag. At the time of this writing, I would enter
    git checkout v0.9.6.1

    # Check sample configuration files for new settings
    # (see the section on this on the "updating discourse" page).
    # This is something I need to learn more about since I don't know quite
    # what to do when new changes appear

    # this could take some time. get a cup of coffee
    bundle install --without test --deployment

    # this is generally fast
    RUBY_GC_MALLOC_LIMIT=90000000 RAILS_ENV=production rake db:migrate

    # EDIT: since we are using a multisite configuration now, we also need to perform the following
    RAILS_ENV=production bundle exec rake multisite:migrate

    # this could take some time; try walking off that cup of coffee
    RUBY_GC_MALLOC_LIMIT=90000000 RAILS_ENV=production rake assets:precompile

    # make the hostname discourse.mcneel.com
    exit
    sudo pico /etc/hostname
    # edit the file so the content is discourse.mcneel.com
    # also change the hostname without rebooting
    sudo hostname discourse.mcneel.com

    sudo su - discourse
    # copy the full line below; it is one long command
    RUBY_GC_MALLOC_LIMIT=90000000 RAILS_ROOT=~/discourse RAILS_ENV=production NUM_WEBS=2 /home/discourse/.rvm/bin/bootup_bluepill --no-privileged -c ~/.bluepill load ~/discourse/config/discourse.pill

    # shut down the shell application
    exit
    exit
Test the site by
- visit the site in your browser
- log in
- look at the dashboard on the admin page
- if everything is working, you’ll get a message that says everything is up to date
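A quick command-line smoke test can complement the browser checks. This sketch only confirms that the front page answers with HTTP status 200; the hostname is a placeholder you would replace with the clone’s temporary public DNS name:

```shell
# Print only the HTTP status code of the front page; expect 200
curl -s -o /dev/null -w "%{http_code}\n" http://ec2-xx-xx-xx-xx.compute-1.amazonaws.com/
```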
Reboot the clone using the EC2 console and make sure it is still working after a reboot
- Make the clone the new production server
Click on the “Elastic IPs” tab and find the entry for “54.277.248.2”. Associate this IP with the freshly updated server.
- Turn off the “under construction” server
There is no need to pay for this server when it isn’t needed.
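These last two steps can likewise be sketched with the AWS CLI; both instance IDs are hypothetical placeholders:

```shell
# Re-point the Elastic IP at the freshly upgraded server
aws ec2 associate-address --public-ip 54.277.248.2 --instance-id i-0new789

# Stop the "under construction" server so we are not paying for it
aws ec2 stop-instances --instance-ids i-0abc123
```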
I realize this process is a bit of overkill, and I plan on simplifying it once I’ve gained a bit more confidence in it. We should be able to
- Create the AMI
- treat the AMI as the backup
- upgrade the running instance without bringing up a clone
This should reduce the “out of service” time, but baby steps first :)