Installing and setting up Grok under mod_wsgi

Grok can run behind any WSGI server. This tutorial will show how to install Grok behind the Apache web server and mod_wsgi on a brand new Linux virtual server.

Introduction

Grok and WSGI

For a growing number of Python web developers, WSGI holds the key to the future of Python web development. There are already a number of important web frameworks, and the power of Python makes it easy to create new ones quickly, so combining best-of-breed applications developed in multiple frameworks could soon be the best way to build a new Python web site.

Until relatively recently, Zope 3 and some of its derived applications, like Grok, ran the risk of missing the WSGI party, but not anymore. Grok 1.1 is WSGI compatible and can therefore be integrated with the wide range of WSGI-based technologies available in the Python world today.

Why use Apache and mod_wsgi for Grok?

There are many WSGI servers available. Why is mod_wsgi a good option?

This tutorial focuses on mod_wsgi, a WSGI adapter module for Apache, for several reasons.

First, Apache is the most popular web hosting platform, so there are many web developers and site administrators already familiar with it. Grok, for example, has been installed behind Apache for production servers using mod_rewrite.

Second, many Python applications already run under Apache using mod_python, and there are a few WSGI adapters for that module as well. However, mod_wsgi is written in C and has lower memory overhead and better performance than those adapters.

Also, one of the goals of mod_wsgi is to break into the low-cost commodity web hosting market, which would be good for Python and ultimately for Grok and Zope.

Setting up a clean Linux server

When starting with a new server, it's important to get all required packages in place before beginning.

I decided to cover the whole setup from new server to Grok startup in this tutorial, to offer a complete guide for the whole process in a single place. I chose Linux as the operating system because it's by far the most popular way to deploy web applications right now. Ubuntu is my distribution of choice, but these steps apply equally well to any Debian-based distribution. Other distributions use different package managers and probably other system paths, but you should be able to work out what you need in each case.

You can start with a clean install of a recent version of Ubuntu GNU/Linux. The first step is to install the necessary packages for the correct Python version (Grok 1.1 requires Python 2.6/2.5) and also for the Apache server.

Before that, it is necessary to install the packages required to compile and build software on Ubuntu (other distributions usually don't need this). Be aware that both package installation and Apache module additions usually require root access. In the command blocks, a '$' prompt is a user prompt and a '#' prompt is a root prompt, which you can get with sudo -s. In this part, you'll use a root terminal so that you don't have to prefix each command with sudo. In the other parts, you'll use a user terminal and add sudo before a command when you need to execute something as root. You can keep one terminal open as root and another as a regular user.

$ sudo -s
# apt-get install build-essential

Next, the packages for Python and Apache. Like most packaged Linux distributions, Ubuntu requires a separate install for the development libraries of each piece of software:

# apt-get install python2.5 python2.5-dev
# apt-get install apache2

The apache2 package usually installs apache2-mpm-worker, but you may have the other version, apache2-mpm-prefork, installed instead. To see which one is installed, you can execute:

# dpkg -l | grep apache2

Then install the corresponding development package, apache2-threaded-dev if apache2-mpm-worker is installed, apache2-prefork-dev if apache2-mpm-prefork is installed:

# apt-get install apache2-threaded-dev
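
The choice between the two development packages can be sketched in Python. This is only an illustration: the function name dev_package_for and the sample text (including its version strings) are made up for this example and are not part of any Apache or Debian tool. The function simply looks for the two MPM package names mentioned above in text shaped like the output of dpkg -l.

```python
# Hypothetical helper: choose the Apache development package that matches
# the installed MPM, based on text in the format printed by `dpkg -l`.

# Mapping taken from the tutorial text: MPM package -> development package.
DEV_PACKAGES = {
    "apache2-mpm-worker": "apache2-threaded-dev",
    "apache2-mpm-prefork": "apache2-prefork-dev",
}


def dev_package_for(dpkg_output):
    """Return the dev package for the first installed MPM found, else None."""
    for line in dpkg_output.splitlines():
        fields = line.split()
        # dpkg marks fully installed packages with the status "ii".
        if len(fields) >= 2 and fields[0] == "ii" and fields[1] in DEV_PACKAGES:
            return DEV_PACKAGES[fields[1]]
    return None


# Made-up sample lines, for illustration only.
sample = """\
ii  apache2             2.2.x  Apache HTTP Server metapackage
ii  apache2-mpm-worker  2.2.x  Apache HTTP Server - threaded model
"""
print(dev_package_for(sample))
```

With the sample text, the function returns apache2-threaded-dev, which is the package installed in the command above.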

Grok uses Python's setuptools, so that package is needed as well:

# apt-get install python-setuptools

The version provided by the Ubuntu package may not be the latest. If you want more control over the installed version of setuptools and want to update it yourself when a new version is available, you can use the following method instead. Manually download setuptools-0.6c9-py2.5.egg (or the latest version, choosing the py2.5 build) and execute the command:

# sh setuptools-0.6c9-py2.5.egg

You can later update it with sudo easy_install-2.5 -U setuptools.

Installing and configuring mod_wsgi

mod_wsgi is installed the same way as any Apache module

Now, the server is ready to install mod_wsgi. There is a package libapache2-mod-wsgi on Ubuntu, but it's recommended to build the latest version, in part because mod_wsgi has to be compiled with the same Python used by Grok. Please remove the libapache2-mod-wsgi package if you have previously installed it. We need to get the source directly from the download site and build it:

$ wget http://modwsgi.googlecode.com/files/mod_wsgi-2.6.tar.gz
$ tar xzf mod_wsgi-2.6.tar.gz
$ cd mod_wsgi-2.6
$ ./configure --with-python=/usr/bin/python2.5
$ make
$ sudo make install

Again, note that it is necessary to compile mod_wsgi using the same Python you will use to run your web site. Since we installed Python 2.5 for Grok, the --with-python option was used to point to that interpreter.

Once mod_wsgi is installed, the Apache server needs to be told about it. On Apache 2, this is done by adding the load declaration and any configuration directives to the /etc/apache2/mods-available/ directory.

The load declaration for the module needs to go in a file named wsgi.load (in the /etc/apache2/mods-available/ directory), which contains only this:

LoadModule wsgi_module /usr/lib/apache2/modules/mod_wsgi.so

The configuration directives reside in a file named wsgi.conf next to wsgi.load. We won't create it now, but it can be useful later for adding directives if you have more than one WSGI application to serve.

Then you have to activate the wsgi module with:

# a2enmod wsgi

Note: a2enmod stands for "apache2 enable mod"; this executable creates the symlinks for you. In fact, a2enmod wsgi is equivalent to:

# cd /etc/apache2/mods-enabled
# ln -s ../mods-available/wsgi.load
# ln -s ../mods-available/wsgi.conf # if it exists

For Apache 1.3, or Apache 2 with an old directory layout, you may need to put the LoadModule line and the configuration directives you will see later inside the httpd.conf file in your Apache configuration directory under /etc. The symbolic links above will not be necessary in that case.

Configuring a Grok site under mod_wsgi

Grok can be installed with setuptools and a Grok site can be easily created using the included grokproject tool.

As mentioned before, Grok can run behind any WSGI server, not just the paster server used by default. Now that we have a working mod_wsgi, we will show how to run Grok behind it.

We assume that you may already have a working application that you want to integrate with mod_wsgi. If that's not the case, please create one following the installation tutorial that you can find elsewhere on this site. For the purposes of this example, we'll assume that a Grok application named 'hello' was just created and that we want to serve it from behind Apache using mod_wsgi.

Getting the Grok application ready for mod_wsgi

WSGI applications use 'entry points' to tell the WSGI server how to run the program. The entry point is usually a simple Python script that provides a function for calling the application and passing in the appropriate initialization file to the server. Some servers, like the paster server, need just the path to the .ini file, which is what we normally use to start up a Grok application. That doesn't mean there's no entry point script for paster. The entry point is in fact defined in the setup.py file that is created by grokproject when a new project is initialized. Take a look at the last few lines of the file:

[paste.app_factory]
main = grokcore.startup:application_factory
debug = grokcore.startup:debug_application_factory

The heading paste.app_factory tells the server where to find the factory functions for each section of the .ini file. In Grok, a general application factory function is defined in the grokcore.startup package, which is what paster uses to start applications.
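
To make the idea of an application factory concrete, here is a minimal sketch of a function written in the paste.app_factory style. It is not Grok's actual factory (that one lives in grokcore.startup); the names hello_app_factory, greeting and call_app are invented for this illustration. The only thing assumed about PasteDeploy is its calling convention: the factory receives the global configuration as a dictionary plus the options of the .ini section as keyword arguments, and returns a WSGI callable.

```python
from wsgiref.util import setup_testing_defaults


def hello_app_factory(global_conf, **local_conf):
    """Hypothetical factory in the paste.app_factory style."""
    # Options from the .ini section arrive as keyword arguments.
    greeting = local_conf.get("greeting", "Hello from a WSGI app")

    def application(environ, start_response):
        body = greeting.encode("utf-8")
        start_response("200 OK", [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    return application


def call_app(app):
    """Call a WSGI app once with a fake request; return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal WSGI environ
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


app = hello_app_factory({}, greeting="Hello, Grok")
print(call_app(app))
```

A WSGI server does essentially what call_app does here, only with a real request: it asks the factory for the application once and then calls that application for every request.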

However, mod_wsgi requires the path to a script file containing the factory. Pointing it at a file inside the grokcore.startup egg would be cumbersome: egg directory names include the version number, so a simple update could break our site if the old egg is removed. It would be better to have our own factory defined inside the application package.

Given that the factory code can be almost identical from project to project, it would be good to have it included automatically when we create the project, to avoid having to recreate the same script every time. Fortunately for us, Grok's use of buildout turns out to be very helpful in this case, since there is a buildout recipe available that creates the WSGI application factory for us.

The recipe is called collective.recipe.modwsgi. To use it, simply add a part to the buildout named, for example, wsgi_app. The recipe requires two parameters, the first being the eggs that have to be made available to the Python process that will run the app under WSGI and the second the path to the configuration file that will be used for the site. This last parameter value is the usual parts/etc/deploy.ini path we have been using for running the application under paster. That's it. Edit the buildout.cfg file parts list to look like this:

parts =
    eggbasket
    app
    i18n
    test
    mkdirs
    zpasswd
    zope_conf
    site_zcml
    zdaemon_conf
    wsgi_app
    deploy_ini
    debug_ini

Next, add the following section anywhere in the file:

[wsgi_app]
recipe = collective.recipe.modwsgi
eggs = ${app:eggs}
config-file = ${buildout:directory}/parts/etc/deploy.ini

There. Note that the eggs parameter simply points back to the main egg section defined at the start of the buildout, to avoid repetition.

When the buildout is run again, we'll find a parts/wsgi_app directory (or whichever name we used for the buildout part). Inside that directory, there will be a wsgi file that can be used as is by mod_wsgi to run the application.
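
The wsgi file is a small Python script. The sketch below renders an approximation of what such a script contains. It is an assumption based on how mod_wsgi and PasteDeploy work (mod_wsgi imports the script and looks for a module-level callable named application, and paste.deploy.loadapp builds an application from a config: URI), not a copy of the recipe's real output. The helper name render_wsgi_script and the example egg path are invented for illustration.

```python
# Approximate shape of a mod_wsgi script for a PasteDeploy application.
# This template is an illustration, not the recipe's actual output.
WSGI_SCRIPT_TEMPLATE = '''\
import sys

# Make the buildout's eggs importable inside the mod_wsgi process.
sys.path[0:0] = [
%(paths)s
]

from paste.deploy import loadapp

# mod_wsgi looks for a module-level callable named "application".
application = loadapp("config:%(config_file)s")
'''


def render_wsgi_script(egg_paths, config_file):
    """Return the text of a WSGI script for the given eggs and .ini file."""
    paths = "\n".join("    %r," % path for path in egg_paths)
    return WSGI_SCRIPT_TEMPLATE % {"paths": paths, "config_file": config_file}


script = render_wsgi_script(
    ["/home/cguardia/grok/hello/src"],  # hypothetical egg location
    "/home/cguardia/grok/hello/parts/etc/deploy.ini",
)
print(script)
```

The point of the design is that the script lives in the buildout's parts directory rather than inside a versioned egg, so the path given to Apache stays stable across upgrades.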

Configuring an Apache site to use mod_wsgi

The last step is to add a site to the Apache server that uses mod_wsgi to serve our application. This is standard mod_wsgi configuration; we'll just add the path to the application factory that we created in the previous section.

To set up the virtual host, create a file in the /etc/apache2/sites-available directory and call it, for example, grok. Put the following in it, assuming your Grok application is at /home/cguardia/grok/hello:

WSGIPythonHome /usr
WSGIDaemonProcess grok user=cguardia group=cguardia threads=4 maximum-requests=10000

<VirtualHost *:80>
  ServerName wsgi.example.com
  WSGIScriptAlias /hello /home/cguardia/grok/hello/parts/wsgi_app/wsgi
  WSGIProcessGroup grok
  WSGIPassAuthorization On
  WSGIReloadMechanism Process
  SetEnv HTTP_X_VHM_HOST http://wsgi.example.com/hello
</VirtualHost>

This will run mod_wsgi in 'daemon' mode, which means it will launch one or more separate processes to run the configured WSGI application instead of using the Apache processes. If you are using virtualenv, the root directory of the virtual environment (the value of sys.prefix for its Python) needs to be passed in the WSGIPythonHome directive. To tell mod_wsgi which WSGI application to run, we use the WSGIScriptAlias directive and pass it the path to the application factory that we created earlier.

Note that we assign a user and group to run the process. This user must have access to the application directory.

The PYTHON_EGG_CACHE directory

Note that when the application is started, all eggs will be automatically extracted into the PYTHON_EGG_CACHE directory, normally "~/.python-eggs". This directory depends on the HOME environment variable. The home directory of the Apache user www-data is /var/www, so you may get the error "[Errno 13] Permission denied: '/var/www/.python-eggs'" in Apache's error.log file if you don't set the user or python-eggs option in the WSGIDaemonProcess directive. The python-eggs option tells mod_wsgi to use an alternative directory for the egg cache:

WSGIDaemonProcess grok threads=4 maximum-requests=10000 python-eggs=/tmp/python-eggs

In this example, the process runs as user and group www-data, and the egg cache directory will be "/tmp/python-eggs".
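
The lookup described in this section can be sketched in a few lines of Python. This is a simplified model of the behaviour explained above, not a call into setuptools: the function names egg_cache_dir and can_create_cache are invented for this example, and the check only approximates what the daemon process user would be allowed to do, since it runs as whoever executes the script.

```python
import os
import tempfile


def egg_cache_dir(environ, home):
    """Return the egg cache directory for the given environment and home."""
    # An explicit PYTHON_EGG_CACHE wins; otherwise fall back to ~/.python-eggs.
    return environ.get("PYTHON_EGG_CACHE") or os.path.join(home, ".python-eggs")


def can_create_cache(path):
    """True if the cache directory is writable, or could be created."""
    probe = path
    # Walk up to the closest ancestor that already exists.
    while not os.path.exists(probe):
        parent = os.path.dirname(probe)
        if parent == probe:
            return False
        probe = parent
    return os.path.isdir(probe) and os.access(probe, os.W_OK | os.X_OK)


# Default location for a user whose home directory is /var/www.
print(egg_cache_dir({}, "/var/www"))
# Whether the current user could create a cache under the temp directory.
print(can_create_cache(os.path.join(tempfile.gettempdir(), "python-eggs")))
```

Running a check like this as the user that owns the daemon process is one way to anticipate the "Permission denied" error before it shows up in the Apache log.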

Running the application

Once the configuration is ready, we need to enable the site in Apache, since we just created it. This is only necessary the first time we run it:

$ sudo a2ensite grok

Then, we can start serving our application from Apache, simply by reloading the configuration for the server:

$ sudo /etc/init.d/apache2 reload

When you visit the site in a browser (http://wsgi.example.com/hello), you should see the Grok Admin UI. You should be able to log in using the admin login name and password (found in 'etc/users.zcml').

Adding a ZEO server

By default, mod_wsgi will use a single process to run the application. Since this configuration is intended for production use, it may be desirable to have more processes available to serve the application. The ZODB that Grok uses comes with a server named ZEO (Zope Enterprise Objects) that lets several processes share the same database, so we can add as many processes to our configuration as our system permits. Typically, the recommended number of processes is one for each processor core in the system. Let's set up a ZEO server and configure the Grok process to connect to it.

Once again, the easiest way to get ZEO running is to use an existing buildout recipe. This time we'll use one named zc.zodbrecipes. Add a zeo_server part to your buildout.cfg file, like this:

parts =
    eggbasket
    app
    i18n
    test
    mkdirs
    zpasswd
    zope_conf
    site_zcml
    zdaemon_conf
    zeo_server
    wsgi_app
    deploy_ini
    debug_ini

Next, add a zeo_server section like the following:

[zeo_server]
recipe = zc.zodbrecipes:server
zeo.conf =
    <zeo>
      address 8100
    </zeo>
    <blobstorage 1>
      blob-dir ${buildout:directory}/var/blobstorage
      <filestorage 1>
        path ${buildout:directory}/var/filestorage/Data.fs
      </filestorage>
    </blobstorage>
    <eventlog>
      level info
      <logfile>
        path ${buildout:directory}/parts/log/zeo.log
      </logfile>
    </eventlog>

This will add the ZEO server and configure it to listen on port 8100. The rest of the configuration is pretty much boilerplate, so just copy it to new projects when you need ZEO there.

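Next, we need the buildout to add scripts for starting and stopping ZEO. This is easily accomplished by adding the ZODB3 egg to our app section. In the listing below, gwsgi is the name of the project's own egg; substitute the name of your project: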

[app]
recipe = zc.recipe.egg
eggs = gwsgi
   z3c.evalexception>=2.0
   Paste
   PasteScript
   PasteDeploy
   ZODB3

Configuring the ZEO client

Currently, the Grok application that we are using is working with the regular Zope server. To use ZEO, we need to change the configuration to connect to the server at port 8100. Fortunately, the required settings are already included, commented out, in the regular zope.conf file that is created inside the Grok project, so we only need to uncomment them. Uncomment the following lines in the zope.conf.in file in the etc directory of your Grok project:

# Uncomment this if you want to connect to a ZEO server instead:
  <zeoclient>
    server localhost:8100
    storage 1
    # ZEO client cache, in bytes
    cache-size 20MB
    # Uncomment to have a persistent disk cache
    #client zeo1
  </zeoclient>

The important line in there is the one with the ZEO server address, in this case the same host as the Grok application and the port we defined in our ZEO configuration in the previous section.

Launching the ZEO server

After running the buildout again, we'll be ready to start the ZEO server in the background. To do that, we only have to run the server script that was automatically created for us. The name of the script is the same as the name of the part in the buildout where we configured the server:

$ bin/zeo_server start

The ZEO server is now running. To stop it:

$ bin/zeo_server stop
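
While the ZEO server is started, a quick way to check that something is accepting connections on port 8100 is a plain TCP connection attempt. The sketch below is only a reachability check and does not speak the ZEO protocol; the function name port_is_open is made up for this example.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, unknown host, and similar failures.
        return False


if __name__ == "__main__":
    # Port 8100 is the address configured in the zeo_server part.
    print(port_is_open("localhost", 8100))
```

If this reports False right after bin/zeo_server start, the zeo.log file configured in the eventlog section is the first place to look.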

Increasing the number of processes

Recall that we mentioned earlier that mod_wsgi runs the application in a single process by default. To really take advantage of ZEO, we want to have more processes available. We need to make a small addition to our mod_wsgi Apache virtual host configuration for that. Change the WSGIDaemonProcess line near the top to look like this:

WSGIDaemonProcess grok user=cguardia group=cguardia processes=2 threads=4 maximum-requests=10000

In this example, we'll have two processes running, with four threads each. Using ZEO and mod_wsgi, we now have a scalable site.
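
The arithmetic behind this setting is simple enough to sketch. The helper names below are invented for illustration; the directive format is the one used in this tutorial, and the one-process-per-core figure is the rule of thumb mentioned in the ZEO section rather than a hard requirement.

```python
import os


def daemon_process_line(name, user, group, processes, threads,
                        maximum_requests=10000):
    """Format a WSGIDaemonProcess directive like the one used above."""
    return ("WSGIDaemonProcess %s user=%s group=%s processes=%d threads=%d "
            "maximum-requests=%d"
            % (name, user, group, processes, threads, maximum_requests))


def max_concurrent_requests(processes, threads):
    """Each thread handles one request at a time."""
    return processes * threads


# The configuration from the tutorial: 2 processes with 4 threads each.
print(daemon_process_line("grok", "cguardia", "cguardia", 2, 4))
print(max_concurrent_requests(2, 4))

# Rule of thumb: one process per CPU core on the machine running this script.
cores = os.cpu_count() or 1
print(daemon_process_line("grok", "cguardia", "cguardia", cores, 4))
```

With two processes and four threads, the site can work on up to eight requests at once; every one of those processes connects to the same ZEO server.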