Installing and setting up Grok under mod_wsgi using Repoze

Repoze allows Grok to run behind any WSGI server. This tutorial will show how to install Grok behind the Apache web server and mod_wsgi using Repoze on a brand new Linux virtual server.

Introduction

What is Repoze?

To more and more Python web developers, WSGI holds the key to the Python web development future. Since there are a number of important web development frameworks and the power of Python makes it really easy to create new ones quickly, interacting with best of breed applications developed in multiple frameworks could soon be the best way to create a new Python web site.

Until relatively recently, Zope 3 and some of its derived applications, like Grok, ran the risk of missing the WSGI party, but not anymore, now that Repoze is here.

Repoze is a bridge between Zope and WSGI, which has the objectives of both helping Zope developers publish applications using WSGI and, equally important, letting non-Zope web developers use parts of Zope independently.

Why use Apache and mod_wsgi for Repoze?

There are many WSGI servers available. Why is mod_wsgi a good option?

This tutorial focuses on mod_wsgi, a WSGI adapter module for Apache, for several reasons.

First, Apache is the most popular web hosting platform, so there are many web developers and site administrators already familiar with it. Grok, for example, has been installed behind Apache for production servers using mod_rewrite.

Second, many Python applications already run under Apache using mod_python, and a few WSGI adapters exist for that module as well, but mod_wsgi is written in C and has lower memory overhead and better performance than those adapters.

Also, one of the goals of mod_wsgi is to break into the low cost commodity web hosting market, which would be good for Python and ultimately for Grok and Zope.

Setting up a clean Linux server

When starting with a new server, it's important to get all required packages in place before beginning.

I decided to cover the whole setup, from new server to Grok startup, in this tutorial, to offer a complete guide to the process in a single place. I chose Linux as the operating system, again because it's by far the most popular way to deploy web applications right now. Ubuntu is my distribution of choice, but these steps apply equally well to any Debian-based distribution. Other distributions use different package managers and probably other system paths, but you should be able to figure out easily what you need in any case.

I started with a clean install of Ubuntu GNU/Linux Hardy Heron 8.04 on a new virtual server. The first step is to install the necessary packages for the correct Python version (Zope currently requires Python 2.4) and also for the Apache server.

Before that, it was necessary to install the packages required to compile and build software on Ubuntu (other distributions usually don't need this). Be aware that both package installation and Apache module additions usually require root access. In the command blocks, '$' is a user prompt and '#' is a root prompt, which you can get with sudo -s. In this part, you'll use a root terminal so that you don't have to prefix each command with sudo. In the other parts, you'll use a user terminal and add sudo before any command that must run as root. I usually keep one terminal open as root and another as a regular user.

$ sudo -s
# apt-get install build-essential

Next, the packages for Python and Apache. Like most packaged Linux distributions, Ubuntu requires a separate install for the development libraries of each piece of software:

# apt-get install python2.4 python2.4-dev
# apt-get install apache2

The apache2 package usually installs apache2-mpm-worker, but you may have the alternative apache2-mpm-prefork installed instead. To check which one is installed, run:

# dpkg -l|grep apache2

Then install the corresponding development package: apache2-threaded-dev if apache2-mpm-worker is installed, or apache2-prefork-dev if apache2-mpm-prefork is installed:

# apt-get install apache2-threaded-dev

Repoze uses Python's setuptools, so that package is needed as well:

# apt-get install python-setuptools

The version provided by the Ubuntu package may not be the latest. If you want more control over the installed version of setuptools and want to update it yourself when a new version is available, you can use the following method instead. Manually download setuptools-0.6c8-py2.4.egg (or the latest version, choosing the py2.4 egg) and execute the command:

# sh setuptools-0.6c8-py2.4.egg

You can later update it with sudo easy_install-2.4 -U setuptools.

Installing and configuring mod_wsgi

mod_wsgi is installed the same way as any Apache module

Now the server is ready for mod_wsgi. Ubuntu 8.04 provides a libapache2-mod-wsgi package, but it contains the old 1.3 version, so we will build the latest mod_wsgi version from source. If you previously installed the libapache2-mod-wsgi package, remove it first. Then get the source directly from the download site and build it:

$ wget http://modwsgi.googlecode.com/files/mod_wsgi-2.1.tar.gz
$ tar xzf mod_wsgi-2.1.tar.gz
$ cd mod_wsgi-2.1
$ ./configure --with-python=/usr/bin/python2.4
$ make
$ sudo make install

Note that it is necessary to compile mod_wsgi using the same Python you will use to run your web site. Since Zope requires 2.4, the --with-python option was used to point to the newly installed Python.

Once mod_wsgi is installed, the Apache server needs to be told about it. On Apache 2, this is done by adding the load declaration and any configuration directives to the /etc/apache2/mods-available/ directory.

The load declaration for the module goes in a file named wsgi.load (in the /etc/apache2/mods-available/ directory), which contains only this:

LoadModule wsgi_module /usr/lib/apache2/modules/mod_wsgi.so

The configuration directives reside in the file named wsgi.conf next to wsgi.load. We don't create it now, but it can be useful later to add directives in it if you have more than one WSGI application to serve.

Then you have to activate the wsgi module with:

# a2enmod wsgi

Note: a2enmod stands for "apache2 enable mod"; this executable creates the symlinks for you. In fact, a2enmod wsgi is equivalent to:

# cd /etc/apache2/mods-enabled
# ln -s ../mods-available/wsgi.load
# ln -s ../mods-available/wsgi.conf # if it exists

For Apache 1.3, or Apache 2 with an old directory layout, you may need to put the LoadModule line and the configuration directives you will see later inside the httpd.conf file in your Apache directory under /etc. The soft links above will not be necessary in that case.

Installing and configuring a Grok site under Repoze

Repoze can be installed with setuptools, and a Grok site can easily be created using the included repoze.grok tool.

Once we have mod_wsgi configured, the server is finally ready for Repoze. The first step is to create a new Repoze sandbox. Since Repoze, Zope and friends require lots of packages, the idea is to get a clean environment for your project, where packages do not conflict with existing or future packages from your normal Python installation. We use setuptools to install the virtualenv package:

$ sudo easy_install-2.4 virtualenv

Creating the sandbox

Then we create the actual sandbox. Replace ${sandbox} with the directory where you would like to install the Grok sandbox:

$ virtualenv --no-site-packages ${sandbox}
$ cd ${sandbox}
$ . bin/activate

The --no-site-packages option ensures that the new Python installation does not inherit any packages from the normal installation.
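Later steps depend on the sandbox Python being the one that is actually running. As a minimal sketch, the helper below (the name in_virtualenv is hypothetical, not part of virtualenv) guesses whether the current interpreter lives inside a sandbox. It assumes that classic virtualenv sets sys.real_prefix and that newer venv-style environments set sys.base_prefix instead.

```python
import sys


def in_virtualenv(sys_module=sys):
    """Return True when the interpreter appears to run inside a sandbox.

    `sys_module` is a parameter only so the check can be exercised
    with stand-in objects; by default the real `sys` module is used.
    """
    # Classic virtualenv (the tool used in this tutorial) sets sys.real_prefix.
    if hasattr(sys_module, "real_prefix"):
        return True
    # Newer venv-style environments expose sys.base_prefix instead.
    base_prefix = getattr(sys_module, "base_prefix", sys_module.prefix)
    return base_prefix != sys_module.prefix


if __name__ == "__main__":
    print("prefix: %s" % sys.prefix)
    print("inside a virtualenv: %s" % in_virtualenv())
```

Running this with the sandbox activated and again from a fresh shell should show different prefixes if the activation worked.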

Installing repoze.grok

After this is done, we finally install Grok using a specially packaged egg from the repoze.org repository. The next step installs Grok into the sandbox. Normally you should only execute the following:

$ easy_install -i http://dist.repoze.org/grok/latest/simple repoze.grok

But repoze.grok 0.1.6 ships with grok 0.11, which is not the latest version. So instead, fetch repoze.grok in editable mode and edit its setup.py to require grok 0.13 (or the latest):

$ easy_install -b . -e -i http://dist.repoze.org/grok/latest/simple repoze.grok
$ cd repoze.grok
$ vi setup.py  # set GROK_VERSION = '0.13'
$ vi setup.cfg # comment easy_install section
$ python setup.py develop
$ cd ..

This command was run using my regular user account and created a grok directory which is actually the Repoze sandbox. The repoze.grok egg that installs this also installs a few sample configuration files, which may be used almost "as is" to run the new site.

The last thing you need to do is to create the Grok instance:

$ mkgrokinstance .

This creates all the required configuration files in etc, the var and log directories for a Zope instance, and a 'bin/grok.wsgi' script for use with mod_wsgi. The most important files are located in the etc directory of the sandbox:

  • 'grok.ini', a Paste configuration file used to establish the Paste (WSGI) pipeline which will be used to serve up repoze.grok.
  • 'zope.conf', a classic Zope 2 configuration file which can be used to adjust Zope settings.
  • 'site.zcml', a boilerplate site.zcml that should be used to control ZCML processing.
  • 'sample-users.zcml', a file that declares a user, useful for copying into users.zcml when you want to start the site.

To configure a password for the manager, first copy the sample-users.zcml file:

$ cp etc/sample-users.zcml etc/users.zcml

Then change the password in etc/users.zcml.

Now all you need to do is copy the apache2.conf file from the etc directory in your sandbox into your Apache configuration (with the appropriate edits for your site, of course). This file contains a configuration almost identical to the one shown on the deployment page of the repoze.org site. Copy it into your /etc/apache2/sites-available/ directory:

$ sudo cp etc/apache2.conf /etc/apache2/sites-available/grok

You can give the file any name you want; here it's called grok.

Configuration of your grok site

Final file

Here is the final /etc/apache2/sites-available/grok file, after applying all the modifications explained below:

WSGIPythonHome ${sandbox}
WSGIDaemonProcess grok user=${youruser} group=${youruser} threads=1 processes=4 maximum-requests=10000 python-path=${sandbox}/lib/python2.4/site-packages
# python-eggs=/tmp/python-eggs
# please remove processes=4 if you don't use a ZEO server!

<VirtualHost *>
  ServerName my.machine.local
  WSGIScriptAlias /site ${sandbox}/bin/grok.wsgi
  WSGIProcessGroup grok
  WSGIPassAuthorization On
  WSGIReloadMechanism Process
  SetEnv HTTP_X_VHM_HOST http://my.machine.local/site
  SetEnv PASTE_CONFIG ${sandbox}/etc/grok.ini
</VirtualHost>

This will run mod_wsgi in 'daemon' mode, which means it will launch a number of processes to run the configured WSGI application instead of using the apache process. Since Repoze uses virtualenv, the site-packages directory of the virtual Python used to run it needs to be passed in the python-path variable. To tell mod_wsgi which WSGI application to run, we use the WSGIScriptAlias directive and pass it the path to the desired application.
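The file named in WSGIScriptAlias is an ordinary Python module from which mod_wsgi picks up a callable named application. The generated bin/grok.wsgi is not reproduced in this tutorial, so the following is only a stand-in sketch, not its contents. Pointing WSGIScriptAlias at a file like this is one way to check that the daemon process runs the Python you expect before wiring in Grok.

```python
import sys


def application(environ, start_response):
    """Minimal WSGI application: a stand-in, not the generated grok.wsgi."""
    # Report which interpreter and prefix the daemon process is using.
    text = "Python %s\nprefix %s\n" % (sys.version.split()[0], sys.prefix)
    body = text.encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

If the reported prefix is not your sandbox directory, the WSGIPythonHome or python-path settings are the first things to review.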

PASTE_CONFIG variable

Please be sure to modify PASTE_CONFIG to point to grok.ini, not zope2.ini; this is an error in repoze.grok 0.1.6. Actually, I think this line can be deleted, since it's not used in grok.wsgi...

Note on processes attribute in WSGIDaemonProcess directive

Graham Dumpleton says:

It is best not to specify 'processes=1' to WSGIDaemonProcess if you only want one process, let mod_wsgi fallback to its default of creating one process if 'processes' is not defined.

The difference is significant, because if you use the 'processes' option, whether or not it is set to '1', it will be regarded as being multiprocess in WSGI world. That is, wsgi.multiprocess is True. If you don't specify 'processes' and let default of one process apply wsgi.multiprocess will be False.

This all matters as stuff like interactive debuggers such as EvalException from Paste will not work when WSGI says it is multiprocess.
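The flag mentioned in the quote is an ordinary key in the WSGI environ, so it can be inspected from any application. As a minimal sketch (the names describe_process_model and application are illustrative, not part of mod_wsgi or Paste), the following reports what the server claims about its process model:

```python
def describe_process_model(environ):
    """Summarise the concurrency flags a WSGI server puts in the environ."""
    multiprocess = bool(environ.get("wsgi.multiprocess", False))
    multithread = bool(environ.get("wsgi.multithread", False))
    return {
        "multiprocess": multiprocess,
        "multithread": multithread,
        # Per the quote above, in-browser debuggers need a single process.
        "interactive_debugger_ok": not multiprocess,
    }


def application(environ, start_response):
    """Tiny WSGI app that prints the flags as plain text."""
    info = describe_process_model(environ)
    lines = ["%s: %s" % (key, info[key]) for key in sorted(info)]
    body = "\n".join(lines).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Serving this once with processes=4 and once without the processes option should show the difference described above.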

For more details see:

Note on PYTHON_EGG_CACHE directory

simplejson-1.7.1-py2.4-linux-i686.egg was installed as a zipped egg, so when the application starts, this egg is automatically extracted into the PYTHON_EGG_CACHE directory, normally "~/.python-eggs". See the following sections of http://code.google.com/p/modwsgi/wiki/ApplicationIssues:

  • "Access Rights Of Apache User"
  • "User HOME Environment Variable"

This directory depends on the HOME environment variable. The HOME of the Apache user www-data is /var/www. You will get the error "[Errno 13] Permission denied: '/var/www/.python-eggs'" in your Apache error.log file if you don't configure the user or python-eggs option in the WSGIDaemonProcess directive. Tip: use tail -f /var/log/apache2/error.log in another console.

Example: replace ${sandbox} with the path where you created your sandbox, i.e. your virtualenv directory. The file included in repoze.grok contains:

WSGIDaemonProcess grok threads=1 processes=4 maximum-requests=10000 python-path=${sandbox}/lib/python2.4/site-packages

The process belongs to www-data.www-data and the python-eggs cache directory will be "/var/www/.python-eggs".

You can add the python-eggs option:

WSGIDaemonProcess grok threads=1 processes=4 maximum-requests=10000 python-path=${sandbox}/lib/python2.4/site-packages python-eggs=/tmp/python-eggs

The process belongs to www-data.www-data and the python-eggs cache directory will be "/tmp/python-eggs".

Or you can specify the user and group options:

WSGIDaemonProcess grok user=${youruser} group=${youruser} threads=1 processes=4 maximum-requests=10000 python-path=${sandbox}/lib/python2.4/site-packages

The process belongs to youruser.youruser and the python-eggs cache directory will be "/home/youruser/.python-eggs".

You can also set all three options:

WSGIDaemonProcess grok user=${youruser} group=${youruser} threads=1 processes=4 maximum-requests=10000 python-path=${sandbox}/lib/python2.4/site-packages python-eggs=/tmp/python-eggs

The process belongs to youruser.youruser and the python-eggs cache directory will be "/tmp/python-eggs".

Be careful: the var/ directory and all files in it have to be writable by the user the process runs as. I use the third option (user and group only). It allows me to test with both paster serve and mod_wsgi (not at the same time, of course!) without changing the permissions on var/.
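To see which cache directory a given user would end up with, the lookup described above can be approximated in a few lines. This is a sketch of the documented behaviour (PYTHON_EGG_CACHE if set, otherwise .python-eggs under HOME), not the actual setuptools code, and the function names are hypothetical.

```python
import os


def egg_cache_dir(environ=None):
    """Approximate where zipped eggs get extracted for this environment."""
    environ = os.environ if environ is None else environ
    explicit = environ.get("PYTHON_EGG_CACHE")
    if explicit:
        return explicit
    # Without an explicit setting, the cache lives under the user's HOME.
    home = environ.get("HOME") or os.path.expanduser("~")
    return os.path.join(home, ".python-eggs")


def can_create_cache(path):
    """True if `path`, or its nearest existing ancestor, is a writable directory."""
    probe = os.path.abspath(path)
    while not os.path.exists(probe):
        parent = os.path.dirname(probe)
        if parent == probe:  # ran out of ancestors
            return False
        probe = parent
    return os.path.isdir(probe) and os.access(probe, os.W_OK | os.X_OK)


if __name__ == "__main__":
    cache = egg_cache_dir()
    print("egg cache: %s (usable: %s)" % (cache, can_create_cache(cache)))
```

Run it as the user the daemon process will use (for example with sudo -u www-data) to predict whether the "Permission denied" error will appear.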

Reload mechanism

"WSGIReloadMechanism Process" is the default for daemon mode when running mod_wsgi 2.0c5 or later. It was previously "WSGIReloadMechanism Module". Configuring it explicitly doesn't hurt. For more details on the different reload mechanisms, see http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode, in particular the "Process Reloading Mechanism" section.

Configuring ZEO server

Since this configuration is intended for production use, you will also need to set up a ZEO server and configure the Zope process to connect to it.

The ZEO server is already configured in etc/zeo.conf, and the two scripts bin/zeoctl and bin/runzeo were generated as entry points when the ZODB3 egg was installed.

Note: The bin/mkzeoinst . command seems to be obsolete. It generates the bin/zeoctl and bin/runzeo scripts and etc/zeo.conf only if they don't exist. If you remove these files and execute the command, you get an old zeo.conf that looks for zdaemon/zdrun.py in the ZODB3 egg instead of the zdaemon egg. The two scripts it generates do the same thing as the generated entry points but already include the "-C etc/zeo.conf" option. So don't execute this command; you already have all the files you need.

You have to configure the ZEO client in the zope.conf file. Here is an example:

<zodb_db main>
  cache-size 10000
  <zeoclient>
      server localhost:8100
      storage 1
      cache-size 100MB
      name zeostorage
      var $INSTANCE/var
  </zeoclient>
  mount-point /
</zodb_db>

Normally, you just have to uncomment the existing section in the zope.conf file and comment out the other zodb section.

Launch the server

To run Grok behind mod_wsgi, you have to start the ZEO server and restart Apache. First start the ZEO server (before reloading apache2):

$ bin/zeoctl -C etc/zeo.conf start
. daemon process started, pid=7538

But ZEO then dies, and I don't know why. log/zeo.log shows:

2008-07-28T15:32:06 INFO root daemon manager started
2008-07-28T15:32:06 INFO root spawned process pid=7538
2008-07-28T15:32:06 INFO root sleep 1 to avoid rapid restarts
2008-07-28T15:32:06 INFO root pid 7538: exit status 2; exiting now

Actually, only this command works:

$ bin/runzeo -C etc/zeo.conf

In this case, you may prefer to deactivate the ZEO server by reverting the change in zope.conf and removing the processes attribute from the WSGIDaemonProcess directive.
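Since the daemon can exit right after reporting that it started, it may be worth confirming that something is really listening on the ZEO address before reloading Apache. A minimal sketch follows; the helper name port_is_open is hypothetical, and localhost:8100 is taken from the zeoclient example above, so adjust it if your zeo.conf differs.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Refused, unreachable or timed out: nothing usable is listening.
        return False
    conn.close()
    return True


if __name__ == "__main__":
    # Address assumed from the zeoclient example in this tutorial.
    if port_is_open("localhost", 8100):
        print("ZEO port is accepting connections")
    else:
        print("nothing is listening on localhost:8100")
```

This only proves that a TCP listener exists, not that it is a healthy ZEO server, so log/zeo.log remains the place to look when something is wrong.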

The first time, enable your site and reload apache2:

$ sudo a2ensite grok
$ sudo /etc/init.d/apache2 reload

When you visit the site in a browser (http://localhost/site), you should see the Grok Admin UI. You should be able to log in using the admin login name and password (found in 'etc/users.zcml').

To stop your site:

$ sudo a2dissite grok
$ sudo /etc/init.d/apache2 reload

To reload your site:

$ touch bin/grok.wsgi
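With the process reload mechanism, mod_wsgi restarts the daemon processes when the modification time of the script file changes, which is all that touch does here. If you want to trigger the reload from a deployment script, a rough Python equivalent is shown below. It is a sketch; unlike touch, it deliberately fails if the file does not exist, so that a typo in the path does not create an empty script.

```python
import os


def touch_existing(path):
    """Set the modification time of an existing file to now.

    Raises OSError if the file is missing instead of creating it.
    """
    os.utime(path, None)


if __name__ == "__main__":
    # Path relative to the sandbox directory, as in the command above.
    touch_existing(os.path.join("bin", "grok.wsgi"))
```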

Note on restarting apache2

You probably have other applications configured with Apache, so you don't want to abort currently open connections. To avoid that, don't use apache2ctl restart or apache2ctl stop. Use apache2ctl graceful and apache2ctl graceful-stop instead, respectively. See man apache2ctl for more details. On Ubuntu Hardy Heron, you can also use:

# /etc/init.d/apache2 reload
# /etc/init.d/apache2 stop

Currently open connections are not aborted.

Serve a grok project

You may create a new grok project by invoking:

$ easy_install -U grokproject # to be sure we have the latest version (here 0.8)
$ grokproject --run-buildout=no helloworld
$ cd helloworld
$ python2.4 setup.py develop

Be careful: your virtual environment has to be activated so that the right Python is used. The helloworld/buildout.cfg and helloworld/bootstrap.py files will not be used.

Then create a ZCML slug, a file ${sandbox}/etc/grok-apps/helloworld-configure.zcml with only one line:

<include package="helloworld" />

You have to reload your site by touching grok.wsgi. Your Grok application is now ready to run under mod_wsgi, and perhaps to be combined with other Python WSGI applications on the same site.

To use an existing grok project, change into your project directory, install the egg in development mode (python2.4 setup.py develop), and create a ZCML slug as above.
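If you add projects often, creating the slug can be scripted. The helper below is a small sketch (write_zcml_slug is a hypothetical name); it follows the file naming and one-line content described above and creates the etc/grok-apps directory if it is missing.

```python
import os


def write_zcml_slug(sandbox, package):
    """Create ${sandbox}/etc/grok-apps/<package>-configure.zcml and return its path."""
    slug_dir = os.path.join(sandbox, "etc", "grok-apps")
    if not os.path.isdir(slug_dir):
        os.makedirs(slug_dir)
    path = os.path.join(slug_dir, "%s-configure.zcml" % package)
    with open(path, "w") as handle:
        # The slug is a single include directive for the project's package.
        handle.write('<include package="%s" />\n' % package)
    return path


if __name__ == "__main__":
    import sys
    # Usage: python write_slug.py /path/to/sandbox helloworld
    print(write_zcml_slug(sys.argv[1], sys.argv[2]))
```

The site still has to be reloaded afterwards by touching bin/grok.wsgi.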

Troubleshooting

LockError: Couldn't lock '${sandbox}/var/Data.fs.lock'

If you see this error in your Apache error.log file, it means more than one process is trying to lock your Data.fs. There are two solutions:

  • configure zope.conf to use ZEO server as the main storage
  • or remove the processes attribute from the WSGIDaemonProcess directive and deactivate ZEO in zope.conf.
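The conflict itself is easy to reproduce outside Zope. The sketch below uses fcntl.flock as a stand-in for whatever ZODB does internally (an assumption made for illustration, not ZODB's own locking code) and shows a second attempt to lock the same file failing while the first holder is alive. This is the situation that processes=4 creates when every daemon process opens Data.fs directly. It is Unix-only and assumes a local filesystem.

```python
import fcntl


def try_lock(path):
    """Try to take an exclusive, non-blocking lock on `path`.

    Returns the open file (keep it open to hold the lock) or None
    when another holder already has the lock.
    """
    handle = open(path, "a")
    try:
        fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except (IOError, OSError):
        handle.close()
        return None
    return handle


if __name__ == "__main__":
    import os
    import tempfile
    lock_path = os.path.join(tempfile.mkdtemp(), "Data.fs.lock")
    first = try_lock(lock_path)
    second = try_lock(lock_path)
    print("first holder got the lock: %s" % (first is not None))
    print("second holder got the lock: %s" % (second is not None))
```

With a ZEO server, only the ZEO process opens the storage file, and the daemon processes talk to it over the network, so this conflict does not arise.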