Repoze allows Grok to run behind any WSGI server. This tutorial will show how to install Grok behind the Apache web server and mod_wsgi using Repoze on a brand new Linux virtual server.
To more and more Python web developers, WSGI
holds the key to the future of Python web development. Since there are a
number of important web development frameworks, and the power of Python
makes it really easy to create new ones quickly, combining
best-of-breed applications developed in multiple frameworks could soon be
the best way to create a new Python web site.
Until relatively recently, Zope 3 and some of its derived applications,
like Grok, ran the risk of missing the WSGI party, but not anymore,
now that Repoze is here.
Repoze is a bridge between Zope and WSGI, which has the objectives
of both helping Zope developers publish applications using WSGI and,
equally important, letting non-Zope web developers use parts of Zope
independently.
There are many WSGI servers available. Why is mod_wsgi a good option?
There are a number of WSGI servers available, but this tutorial will focus on using mod_wsgi, which is a WSGI adapter module for Apache. There are a number of reasons for this.
First, Apache is the most popular web hosting platform, so there are many web developers and site administrators already familiar with it. Grok, for example, has been installed behind Apache for
production servers using mod_rewrite.
Second, there are also lots of Python applications that already run
under Apache using mod_python, and there are a few WSGI adapters for
this module as well, but mod_wsgi is written in C code and has lower
memory overhead and better performance than those adapters.
Also, one of the goals of mod_wsgi is to break into the low cost
commodity web hosting market, which would be good for Python and
ultimately for Grok and Zope.
When starting with a new server, it's important to get all required packages in place before beginning.
I decided to cover the whole setup from new server to Grok startup
in this tutorial, to offer a complete guide for the whole process in a
single place. I chose to use Linux as the operating system, again
because it's by far the most popular way to deploy web applications
right now. Ubuntu is my distribution of choice, but these steps apply
equally well to any Debian based distribution. Other distributions use
different package managers and probably other system paths, but you
should be able to figure out easily what you need in any case.
I started with a clean install of Ubuntu GNU/Linux Hardy Heron 8.04 on a new virtual
server. The first step is to install the necessary packages for the
correct Python version (Zope currently requires Python 2.4) and also
for the Apache server.
Before that, it was necessary to install the packages required to
compile and build software on Ubuntu (other distributions
usually don't need this). Be aware that both package installation and Apache
module additions usually require root access.
In the command blocks, '$' is a user prompt and
'#' is a root prompt, which you can get with sudo -s.
In this part, you'll use a root terminal so that you don't have to prefix each command with sudo.
In the other parts, you'll use a user terminal and add sudo before a command only when it has to run as root. I usually keep one terminal open as root and another as my regular user.
$ sudo -s
# apt-get install build-essential
Next, the packages for Python and Apache. Like most packaged Linux
distributions, Ubuntu requires a separate install for the development
libraries of each piece of software:
# apt-get install python2.4 python2.4-dev
# apt-get install apache2
The apache2 package usually installs apache2-mpm-worker, but you may have the other variant, apache2-mpm-prefork, installed instead. To check which one is installed, run:
# dpkg -l|grep apache2
Then install the corresponding development package,
apache2-threaded-dev if apache2-mpm-worker is installed,
apache2-prefork-dev if apache2-mpm-prefork is installed:
# apt-get install apache2-threaded-dev
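The choice between the two development packages can be sketched as a small helper. This is only an illustration, written for Python 3 rather than the Python 2.4 used in this tutorial; the function name dev_package_for and the sample dpkg output (including its version strings) are made up for the example.

```python
def dev_package_for(dpkg_output):
    """Return the Apache development package matching the installed MPM."""
    # Map each MPM package named in the tutorial to its development package.
    mapping = {
        "apache2-mpm-worker": "apache2-threaded-dev",
        "apache2-mpm-prefork": "apache2-prefork-dev",
    }
    for line in dpkg_output.splitlines():
        fields = line.split()
        # Installed packages are listed as: "ii  <name>  <version>  <description>"
        if len(fields) >= 2 and fields[0] == "ii" and fields[1] in mapping:
            return mapping[fields[1]]
    return None


# Hypothetical sample of `dpkg -l | grep apache2` output (version strings invented).
sample = """\
ii  apache2             2.2.8-1  Next generation, scalable, extendable web server
ii  apache2-mpm-worker  2.2.8-1  High speed threaded model for Apache HTTPD
"""
print(dev_package_for(sample))
```

If neither MPM package is listed as installed, the helper returns None, which is a hint to install apache2 first.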
Repoze uses Python's setuptools, so that package is needed as well:
# apt-get install python-setuptools
It's possible the version provided by the Ubuntu package is not the latest. If you want to have more control of the installed version of setuptools and want to update it yourself when a new version is available, you can use the following method instead.
Manually download setuptools-0.6c8-py2.4.egg (or the latest version; choose the py2.4 build) and run:
# sh setuptools-0.6c8-py2.4.egg
You can later update it with sudo easy_install-2.4 -U setuptools.
Now the server is ready for mod_wsgi. Ubuntu 8.04 provides a libapache2-mod-wsgi package, but it contains the old 1.3 version, so we will build the latest mod_wsgi version instead. Remove the libapache2-mod-wsgi package if you have previously installed it. We need to get the source directly from the download site and build it:
$ wget http://modwsgi.googlecode.com/files/mod_wsgi-2.1.tar.gz
$ tar xzf mod_wsgi-2.1.tar.gz
$ cd mod_wsgi-2.1
$ ./configure --with-python=/usr/bin/python2.4
$ make
$ sudo make install
Note that it is necessary to compile mod_wsgi using the same Python you will use to run your web site. Since Zope requires 2.4, the --with-python option was used to point to the newly installed Python.
Once mod_wsgi is installed, the Apache server needs to be told about it. On Apache 2, this is done by adding the load declaration and any configuration directives to the /etc/apache2/mods-available/ directory.
The load declaration for the module goes in a file named wsgi.load (in the /etc/apache2/mods-available/ directory), which contains only this:
LoadModule wsgi_module /usr/lib/apache2/modules/mod_wsgi.so
The configuration directives reside in the file named wsgi.conf next to wsgi.load. We don't create it now, but it can be useful later to add directives in it if you have more than one WSGI application to serve.
Then you have to activate the wsgi module with:
# a2enmod wsgi
Note: a2enmod stands for "apache2 enable mod"; this executable creates the symlinks for you.
In fact, a2enmod wsgi is equivalent to:
# cd /etc/apache2/mods-enabled
# ln -s ../mods-available/wsgi.load
# ln -s ../mods-available/wsgi.conf # if it exists
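The linking step above can be reproduced in a few lines of Python. This is a sketch of the idea only, not a2enmod's actual implementation; it is written for Python 3, the helper name enable_module is hypothetical, and it runs against a throwaway directory instead of the real /etc/apache2.

```python
import os
import tempfile


def enable_module(apache_root, name):
    """Link mods-available/<name>.load (and .conf, if present) into mods-enabled."""
    enabled = os.path.join(apache_root, "mods-enabled")
    linked = []
    for ext in (".load", ".conf"):
        filename = name + ext
        if not os.path.exists(os.path.join(apache_root, "mods-available", filename)):
            continue  # wsgi.conf is optional in this tutorial
        link = os.path.join(enabled, filename)
        if not os.path.lexists(link):
            # Relative target, as in `ln -s ../mods-available/wsgi.load`.
            os.symlink(os.path.join("..", "mods-available", filename), link)
        linked.append(filename)
    return linked


# Demonstration in a throwaway directory rather than the real /etc/apache2.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "mods-available"))
os.makedirs(os.path.join(root, "mods-enabled"))
with open(os.path.join(root, "mods-available", "wsgi.load"), "w") as f:
    f.write("LoadModule wsgi_module /usr/lib/apache2/modules/mod_wsgi.so\n")
print(enable_module(root, "wsgi"))
```

Because wsgi.conf does not exist in the demonstration directory, only wsgi.load is linked, which mirrors the "if it exists" remark above.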
For Apache 1.3, or Apache 2 with an old directory layout, you may need to put the LoadModule line and the configuration directives you will see later inside the httpd.conf file in your Apache's /etc directory. The soft links above will not be necessary in that case.
Repoze can be installed with setuptools, and a Grok site can easily be created using the included repoze.grok tool.
Once we have mod_wsgi configured, the server is finally ready for Repoze. The first step is to create a new Repoze sandbox. Since Repoze, Zope and friends require lots of packages, the idea is to get a clean environment for your project, where packages do not conflict with existing or future packages from your normal Python installation. We use setuptools to install the virtualenv package:
$ sudo easy_install-2.4 virtualenv
Then we create the actual sandbox. Replace ${sandbox} with the directory where you would like to install the Grok sandbox:
$ virtualenv --no-site-packages ${sandbox}
$ cd ${sandbox}
$ . bin/activate
The --no-site-packages option ensures that the new Python installation does not inherit any packages from the normal installation.
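After activating the sandbox, it can be reassuring to confirm that the python on your path really is the sandbox's interpreter. The following is a minimal sketch, written for Python 3; the function name in_virtualenv is hypothetical. It relies on the classic virtualenv tool setting sys.real_prefix, and on newer environments making sys.base_prefix differ from sys.prefix.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # The classic virtualenv tool sets sys.real_prefix; the newer venv module
    # (and recent virtualenv releases) make sys.base_prefix differ from sys.prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


# sys.prefix should point inside ${sandbox} when the sandbox is active.
print(sys.prefix)
print(in_virtualenv())
```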
After this is done, we finally install Grok into the sandbox, using a specially packaged egg from the repoze.org repository.
Normally you would only need to run the following:
$ easy_install -i http://dist.repoze.org/grok/latest/simple repoze.grok
However, repoze.grok 0.1.6 ships with grok 0.11, which is not the latest version.
Instead, fetch repoze.grok in editable mode and edit its setup.py to require grok 0.13 (or the latest version):
$ easy_install -b . -e -i http://dist.repoze.org/grok/latest/simple repoze.grok
$ cd repoze.grok
$ vi setup.py # set GROK_VERSION = '0.13'
$ vi setup.cfg # comment out the easy_install section
$ python setup.py develop
$ cd ..
This command was run using my regular user account and created a grok directory which is actually the Repoze sandbox. The repoze.grok egg that installs this also installs a few sample configuration files, which may be used almost "as is" to run the new site.
The last thing you need to do is to create the Grok instance:
$ mkgrokinstance .
This creates all the required configuration files in etc, the var and log directories for a Zope instance, and a 'bin/grok.wsgi' script for use with mod_wsgi.
The most important files are located in the etc directory of the sandbox:
- 'grok.ini', a Paste configuration file used to establish the Paste (WSGI) pipeline which will be used to serve up repoze.grok.
- 'zope.conf', a classic Zope 2 configuration file which can be used to adjust Zope settings.
- 'site.zcml', a boilerplate site.zcml that should be used to control ZCML processing.
- 'sample-users.zcml', a file that declares a user, useful for copying into users.zcml when you want to start the site.
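A quick way to confirm that mkgrokinstance produced everything described above is to check for each path. This is a sketch only, written for Python 3; the names EXPECTED and missing_paths are hypothetical, and the path list is taken from the text above rather than from the tool itself.

```python
import os
import tempfile

# Paths named in the text above, relative to the sandbox directory.
EXPECTED = [
    "etc/grok.ini",
    "etc/zope.conf",
    "etc/site.zcml",
    "etc/sample-users.zcml",
    "bin/grok.wsgi",
    "var",
    "log",
]


def missing_paths(sandbox):
    """Return the expected instance paths that do not exist under sandbox."""
    return [p for p in EXPECTED if not os.path.exists(os.path.join(sandbox, p))]


# Demonstration on an empty throwaway directory: everything is reported missing.
empty = tempfile.mkdtemp()
print(missing_paths(empty))
```

Run against your real sandbox directory, an empty list means all the files and directories are in place.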
To configure a password for the manager, first copy the sample-users.zcml file:
$ cp etc/sample-users.zcml etc/users.zcml
Then change the password in etc/users.zcml.
Now all you need to do is copy the apache2.conf file from the etc directory in your sandbox into your Apache configuration (with the appropriate edits for your site, of course).
This file contains an almost identical configuration to the one shown on the repoze.org site, on the deployment page.
Copy it into your /etc/apache2/sites-available/ directory:
$ sudo cp etc/apache2.conf /etc/apache2/sites-available/grok
You can give the file any name you want; here it's called grok.
Here is the final /etc/apache2/sites-available/grok file, after applying all the modifications explained below:
WSGIPythonHome ${sandbox}
WSGIDaemonProcess grok user=${youruser} group=${youruser} threads=1 processes=4 maximum-requests=10000 python-path=${sandbox}/lib/python2.4/site-packages
# python-eggs=/tmp/python-eggs
# please remove processes=4 if you don't use a ZEO server!
<VirtualHost *>
ServerName my.machine.local
WSGIScriptAlias /site ${sandbox}/bin/grok.wsgi
WSGIProcessGroup grok
WSGIPassAuthorization On
WSGIReloadMechanism Process
SetEnv HTTP_X_VHM_HOST http://my.machine.local/site
SetEnv PASTE_CONFIG ${sandbox}/etc/grok.ini
</VirtualHost>
This will run mod_wsgi in 'daemon' mode, which means it will launch a number of processes to run the configured WSGI application instead of using the apache process. Since Repoze uses virtualenv, the site-packages directory of the virtual Python used to run it needs to be passed in the python-path variable. To tell mod_wsgi which WSGI application to run, we use the WSGIScriptAlias directive and pass it the path to the desired application.
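For readers new to WSGI, it may help to see what kind of object a script named in WSGIScriptAlias has to provide: mod_wsgi looks for a callable named application. The example below is a generic illustration and not the contents of bin/grok.wsgi; it is written for Python 3 (under Python 2.4 the body would be a plain string), and it calls the application directly instead of going through Apache.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Smallest useful WSGI application: always answers with plain text."""
    body = b"Hello from a WSGI application\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


# Call the application directly, without any server, to see what it returns.
environ = {}
setup_testing_defaults(environ)
captured = {}


def start_response(status, headers):
    # Record what the application reports instead of sending it to a client.
    captured["status"] = status
    captured["headers"] = headers


print(b"".join(application(environ, start_response)))
print(captured["status"])
```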
Be sure to modify PASTE_CONFIG so that it points to grok.ini, not zope2.ini. This is an error in repoze.grok 0.1.6.
Actually, I think this line can be deleted, since it's not used in grok.wsgi...
Graham Dumpleton says:
It is best not to specify 'processes=1' to WSGIDaemonProcess if you
only want one process, let mod_wsgi fallback to its default of
creating one process if 'processes' is not defined.
The difference is significant, because if you use the 'processes'
option, whether or not it is set to '1', it will be regarded as being
multiprocess in WSGI world. That is, wsgi.multiprocess is True. If you
don't specify 'processes' and let default of one process apply
wsgi.multiprocess will be False.
This all matters as stuff like interactive debuggers such as
EvalException from Paste will not work when WSGI says it is
multiprocess.
For more details see:
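The distinction described in the quote is visible to the application through the wsgi.multiprocess key of the WSGI environment. The sketch below, written for Python 3, simulates both cases with hand-built environments; the helper name debugger_allowed is hypothetical and simply encodes the rule that an interactive debugger should only be enabled when the server reports a single process.

```python
from wsgiref.util import setup_testing_defaults


def debugger_allowed(environ):
    """Interactive debuggers need a single process, so require multiprocess=False."""
    # If the key is missing, assume the conservative answer (multiprocess).
    return not environ.get("wsgi.multiprocess", True)


# Simulate the two cases from the quote with hand-built environments.
single = {}
setup_testing_defaults(single)
single["wsgi.multiprocess"] = False   # no 'processes' option given

multi = dict(single)
multi["wsgi.multiprocess"] = True     # 'processes' given, even 'processes=1'

print(debugger_allowed(single))
print(debugger_allowed(multi))
```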
simplejson-1.7.1-py2.4-linux-i686.egg was installed as a zipped egg, so when the application starts, this egg is automatically extracted into the PYTHON_EGG_CACHE directory, normally "~/.python-eggs".
See http://code.google.com/p/modwsgi/wiki/ApplicationIssues sections:
- "Access Rights Of Apache User"
- "User HOME Environment Variable"
This directory depends on the HOME environment variable. The HOME of the Apache user www-data is /var/www. You will get the error "[Errno 13] Permission denied: '/var/www/.python-eggs'" in your Apache error.log file if you don't configure the user or python-eggs variable in the WSGIDaemonProcess directive. Tip: use tail -f /var/log/apache2/error.log in another console.
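The way the cache directory follows from the environment, and why the Apache user runs into a permission error, can be modelled in a few lines. This is a simplified sketch written for Python 3, not the actual setuptools code; the function names are hypothetical, and falling back to "/" when HOME is unset is an assumption made only to keep the example self-contained.

```python
import os
import tempfile


def egg_cache_dir(environ):
    """Resolve the egg extraction directory from a mapping of environment variables."""
    # An explicit PYTHON_EGG_CACHE wins; otherwise fall back to $HOME/.python-eggs.
    explicit = environ.get("PYTHON_EGG_CACHE")
    if explicit:
        return explicit
    return os.path.join(environ.get("HOME", "/"), ".python-eggs")


def can_extract_eggs(path):
    """True if path, or its nearest existing parent, is writable by this process."""
    while not os.path.exists(path):
        parent = os.path.dirname(path)
        if parent == path:
            return False
        path = parent
    return os.access(path, os.W_OK | os.X_OK)


print(egg_cache_dir({"HOME": "/var/www"}))
print(egg_cache_dir({"HOME": "/var/www", "PYTHON_EGG_CACHE": "/tmp/python-eggs"}))

# A directory this process owns is writable, so extraction would succeed there.
scratch = tempfile.mkdtemp()
print(can_extract_eggs(os.path.join(scratch, ".python-eggs")))
```

Running can_extract_eggs as the user that owns the daemon process shows in advance whether the "[Errno 13]" error would occur.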
Example:
Replace ${sandbox} with the path where you created your sandbox, i.e. your virtualenv directory.
The file included in repoze.grok contains:
WSGIDaemonProcess grok threads=1 processes=4 maximum-requests=10000 python-path=${sandbox}/lib/python2.4/site-packages
The process belongs to www-data.www-data and the python-eggs cache directory will be "/var/www/.python-eggs".
You can add the python-eggs variable:
WSGIDaemonProcess grok threads=1 processes=4 maximum-requests=10000 python-path=${sandbox}/lib/python2.4/site-packages python-eggs=/tmp/python-eggs
The process belongs to www-data.www-data and the python-eggs cache directory will be "/tmp/python-eggs".
Or you can specify the user and group variables:
WSGIDaemonProcess grok user=${youruser} group=${youruser} threads=1 processes=4 maximum-requests=10000 python-path=${sandbox}/lib/python2.4/site-packages
The process belongs to youruser.youruser and the python-eggs cache directory will be "/home/youruser/.python-eggs".
You can also set all three variables:
WSGIDaemonProcess grok user=${youruser} group=${youruser} threads=1 processes=4 maximum-requests=10000 python-path=${sandbox}/lib/python2.4/site-packages python-eggs=/tmp/python-eggs
The process belongs to youruser.youruser and the python-eggs cache directory will be "/tmp/python-eggs".
Be careful: the var/ directory and all files in it have to be writable by the user that owns the process.
I use the third form (user and group set, without python-eggs, as in the final file shown earlier). It allows me to test with both paster serve and mod_wsgi (not at the same time, of course!) without changing the var/ permissions.
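The four variants above follow one rule: the user option decides who owns the process and therefore which HOME applies, and python-eggs overrides the cache location. The sketch below encodes that rule for illustration, written for Python 3. The function names are hypothetical, "alice" is a made-up user, home directories are assumed to be /home/<user>, and the parser is deliberately naive (every option must have the key=value form, with no quoting).

```python
def parse_daemon_process(directive):
    """Split a WSGIDaemonProcess line into its group name and key=value options."""
    words = directive.split()
    if len(words) < 2 or words[0] != "WSGIDaemonProcess":
        raise ValueError("not a WSGIDaemonProcess directive")
    # Naive parsing: assumes every remaining word looks like key=value.
    options = dict(word.split("=", 1) for word in words[2:])
    return words[1], options


def effective_settings(options, apache_user="www-data", apache_home="/var/www"):
    """Return (user, egg cache directory) following the four cases above."""
    user = options.get("user", apache_user)
    # Assumption: regular users have their home directory under /home.
    home = apache_home if user == apache_user else "/home/" + user
    cache = options.get("python-eggs", home + "/.python-eggs")
    return user, cache


name, opts = parse_daemon_process(
    "WSGIDaemonProcess grok user=alice group=alice threads=1 processes=4"
)
print(name)
print(effective_settings(opts))
```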
"WSGIReloadMechanism Process" is the default for daemon mode when running mod_wsgi 2.0c5 or later. It was previously "WSGIReloadMechanism Module".
Configuring it explicitly doesn't hurt.
For more details on the different reload mechanisms, see:
http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode
and in particular "Process Reloading Mechanism" section.
Since this configuration is intended for production use, you will also need to
set up a ZEO server and configure the Zope process to connect to it.
The ZEO server is already configured in etc/zeo.conf, and the two scripts
bin/zeoctl and bin/runzeo were generated as entry points when the ZODB3 egg was installed.
Note: the bin/mkzeoinst . command seems to be obsolete. It generates the bin/zeoctl and bin/runzeo scripts and etc/zeo.conf only if they don't exist. If you remove these files and run the command, you get an old zeo.conf that looks for zdaemon/zdrun.py in the ZODB3 egg instead of the zdaemon egg. The two scripts it generates do the same thing as the entry-point scripts but already include the "-C etc/zeo.conf" option. So don't run this command; you already have all the files you need.
You have to configure ZEO client in the zope.conf file.
Here is an example:
<zodb_db main>
cache-size 10000
<zeoclient>
server localhost:8100
storage 1
cache-size 100MB
name zeostorage
var $INSTANCE/var
</zeoclient>
mount-point /
</zodb_db>
Normally, you only have to uncomment the existing section in the zope.conf file and comment out the other zodb section.
To run Grok behind mod_wsgi, you have to start the ZEO server and restart Apache.
First start the ZEO server (before reloading apache2):
$ bin/zeoctl -C etc/zeo.conf start
. daemon process started, pid=7538
But ZEO dies and I don't know why. log/zeo.log shows:
2008-07-28T15:32:06 INFO root daemon manager started
2008-07-28T15:32:06 INFO root spawned process pid=7538
2008-07-28T15:32:06 INFO root sleep 1 to avoid rapid restarts
2008-07-28T15:32:06 INFO root pid 7538: exit status 2; exiting now
Actually only this command works:
$ bin/runzeo -C etc/zeo.conf
In this case, you may prefer to deactivate the ZEO server by reverting the change in zope.conf and removing the processes attribute from the WSGIDaemonProcess directive.
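Whichever way ZEO is started, a simple way to see whether it is actually accepting connections is to try to open a TCP connection to its port. The following is a minimal sketch written for Python 3; the helper name port_is_open is hypothetical, and it only tells you that something is listening on the port, not that it is a healthy ZEO server.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
    finally:
        sock.close()


# 8100 is the port from the zeoclient example; 127.0.0.1 is the loopback
# address that "localhost" normally resolves to.
print(port_is_open("127.0.0.1", 8100))
```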
The first time, enable your site and reload apache2:
$ sudo a2ensite grok
$ sudo /etc/init.d/apache2 reload
When you visit the site in a browser (http://localhost/site), you should see the Grok Admin UI. You should be able to log in using the admin login name and password (found in 'etc/users.zcml').
To stop your site:
$ sudo a2dissite grok
$ sudo /etc/init.d/apache2 reload
To reload your site:
$ touch bin/grok.wsgi
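If you would rather trigger the reload from a deployment script than from the shell, updating the script's modification time has the same effect as touch. This is a small sketch written for Python 3; it uses a throwaway file standing in for bin/grok.wsgi and, unlike the touch command, does not create the file if it is missing.

```python
import os
import tempfile
import time


def touch(path):
    """Set the file's access and modification times to now, like `touch`."""
    os.utime(path, None)


# Demonstration on a throwaway file standing in for bin/grok.wsgi.
fd, script = tempfile.mkstemp(suffix=".wsgi")
os.close(fd)
old = time.time() - 3600
os.utime(script, (old, old))       # pretend the script was last changed an hour ago
touch(script)
print(os.path.getmtime(script) > old)
```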
I assume that you have other applications configured with Apache, so we don't want to abort currently open connections.
To keep them alive, don't use apache2ctl restart or apache2ctl stop. Use apache2ctl graceful and apache2ctl graceful-stop instead. See man apache2ctl for more details.
Note: on Ubuntu Hardy Heron, you can also use:
# /etc/init.d/apache2 reload
# /etc/init.d/apache2 stop
Currently open connections are not aborted.
You may create a new grok project by invoking:
$ easy_install -U grokproject # to be sure we have the latest version (here 0.8)
$ grokproject --run-buildout=no helloworld
$ cd helloworld
$ python2.4 setup.py develop
Be careful: your virtual environment has to be activated so that the right Python is used.
The helloworld/buildout.cfg and helloworld/bootstrap.py files will not be used.
Then create a ZCML slug, that is, a file ${sandbox}/etc/grok-apps/helloworld-configure.zcml containing only one line:
<include package="helloworld" />
You have to reload your site by touching grok.wsgi.
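If you register several projects, writing the slug by hand gets repetitive. The helper below sketches how the file could be generated; it is written for Python 3, the name write_slug is hypothetical, and the demonstration uses a throwaway directory in place of ${sandbox}.

```python
import os
import tempfile


def write_slug(sandbox, package):
    """Create etc/grok-apps/<package>-configure.zcml containing one include line."""
    apps_dir = os.path.join(sandbox, "etc", "grok-apps")
    if not os.path.isdir(apps_dir):
        os.makedirs(apps_dir)
    path = os.path.join(apps_dir, package + "-configure.zcml")
    with open(path, "w") as slug:
        slug.write('<include package="%s" />\n' % package)
    return path


# Demonstration in a throwaway directory standing in for ${sandbox}.
sandbox = tempfile.mkdtemp()
path = write_slug(sandbox, "helloworld")
with open(path) as slug:
    print(slug.read())
```

As with a slug written by hand, the site still has to be reloaded afterwards.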
Your Grok application is ready to run under mod_wsgi, and perhaps be combined with other Python WSGI applications on the same site.
To use an existing Grok project, change into your project directory, develop the egg and create a ZCML slug as above.
If you have this error in your Apache error.log file, it means you have
more than one process trying to lock your Data.fs. There are two solutions:
- configure zope.conf to use the ZEO server as the main storage
- or remove the processes attribute from the WSGIDaemonProcess directive and deactivate ZEO in zope.conf.