Tuesday, June 29, 2010

logrotate: rotating logs in multiple directories

I've got a server with almost one hundred web-sites. Each site lives in its own directory and writes its own logs to /usr/local/www/SITENAME/logs/*.log. When the logs grew too large, I decided to set up logrotate. Since there were so many web-sites, my first idea was to create one configuration file per site in logrotate.d and leave just one line in logrotate.conf:

include /usr/local/etc/logrotate.d

Only an hour later I realized that I could write a single path with several wildcards and cover ALL the logs with just one line:

/usr/local/www/*/logs/*.log {
    rotate 60
    postrotate
          [ ! -f /var/run/nginx.pid ] || kill -USR1 `cat /var/run/nginx.pid`
    endscript
}
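
To see which files a two-wildcard path like this picks up, here is a minimal Python sketch. It assumes that Python's glob module expands this simple pattern the same way logrotate does, which I believe holds for plain * wildcards. The site names, the file names and the temporary directory standing in for /usr/local/www are all made up for illustration.

```python
import glob
import os
import tempfile

# Hypothetical site names and files, created in a temp directory that
# stands in for /usr/local/www.
root = tempfile.mkdtemp()
for site in ("alpha.example", "beta.example"):
    os.makedirs(os.path.join(root, site, "logs"))
    for name in ("access.log", "error.log", "notes.txt"):
        open(os.path.join(root, site, "logs", name), "w").close()

# Same shape as /usr/local/www/*/logs/*.log: one wildcard for the site
# directory, one for the log file name.
pattern = os.path.join(root, "*", "logs", "*.log")
matched = sorted(os.path.relpath(p, root) for p in glob.glob(pattern))

for path in matched:
    print(path)
```

Every site's .log files are matched, while notes.txt is left alone, so adding a new site needs no change to the configuration.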


Thursday, June 24, 2010

PINBA: PHP Is Not a Bottleneck Anymore

Yesterday I installed Pinba on one of my servers. Pinba is a set of tools for monitoring the performance of PHP scripts. The Pinba MySQL storage engine keeps a list of timers and automatically fills the report tables. The Pinba PHP extension uses two functions to start and stop these timers. Besides, there are default timers, which start when a script begins executing and stop when it finishes. If you put a timer around a critical piece of code, you can find out how often it runs and how long it takes. Timers can be tagged, and the data can be grouped by tags. So, in some pieces of code you can set the tags "author" and "task" with corresponding values. Then you can compare the performance of code written by different developers and identify the most time-consuming parts. The most interesting thing is that when you create database tables following certain rules, these tables are automagically filled with the necessary data. So, if you use the tags "author" and "task" to group the data, the reports will include all valid combinations of these tags and show summaries for them: how much time Joe's scripts spent parsing new documents and how often Jackie's front-end scripts were called. Very impressive.
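
The idea of tagged timers grouped into reports can be modelled in a few lines of Python. This is only a toy sketch of the concept, not Pinba's API: the real extension is written for PHP and ships its data to the Pinba server. The timer() helper, the report dictionary and the author and task values are all hypothetical.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# (author, task) -> [hit count, total seconds]; a toy stand-in for a report table.
report = defaultdict(lambda: [0, 0.0])

@contextmanager
def timer(author, task):
    """Time the enclosed block and file the result under its tags."""
    started = time.perf_counter()
    try:
        yield
    finally:
        entry = report[(author, task)]
        entry[0] += 1
        entry[1] += time.perf_counter() - started

# Hypothetical authors and tasks.
for _ in range(3):
    with timer("joe", "parse"):
        sum(range(1000))
with timer("jackie", "frontend"):
    sum(range(1000))

# One summary row per tag combination that actually occurred.
for (author, task), (hits, total) in sorted(report.items()):
    print(f"{author:8} {task:10} hits={hits} total={total:.6f}s")
```

Only the tag combinations that were actually seen get a row, which mirrors the "all valid combinations" behaviour described above.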

By default, Pinba stores this information for some limited period of time (15 minutes, IIUC), so you need some way to make the data persistent. Since we use Munin to monitor various system indicators, I wrote a couple of plugins (in TCL and Lua, just for fun :)) to display the frequency of execution and the average execution time for each timer. Our developers added a handful of timers in various places of code and here it is:

The last graph looks cluttered and not too informative, so I plan to employ Munin's 'suggest' feature to draw some diagrams using one script. Perhaps, organizing the graphs will be the most difficult part of the deployment. I have to say, though, that the installation was not simple, either. Prerequisites include compiled sources of the installed MySQL (Percona Server 10.2 in my case), Google Protocol Buffers, Judy library, libevent 1.4.1+ (Ubuntu's default one will do) and Hoard memory allocator. And here is the installation process (paths will be different for you, so check carefully):

wget http://pinba.org/files/pinba_engine-0.0.5.tar.gz
tar -xzf pinba_engine-0.0.5.tar.gz
wget http://pinba.org/files/pinba_extension-0.0.5.tgz
tar -xzf pinba_extension-0.0.5.tgz
wget http://protobuf.googlecode.com/files/protobuf-2.3.0.tar.gz
tar -xzf protobuf-2.3.0.tar.gz
wget http://downloads.sourceforge.net/project/judy/judy/\
tar -xzf Judy-1.0.5.tar.gz
wget http://www.cs.umass.edu/%7Eemery/hoard/hoard-3.8/source/hoard-38.tar.gz
tar -xzf hoard-38.tar.gz
sudo aptitude install libevent-1.4-2 libevent-dev
cd protobuf-2.3.0
./configure
make -j
sudo make install
cd ../judy-1.0.5/
./configure
make
sudo make install
cd ../hoard-38/src
make linux-gcc-x86-64
sudo cp libhoard.so /usr/local/lib
sudo cp *.h /usr/local/include
sudo ldconfig
cd ../../pinba_engine-0.0.5/
./configure --with-mysql=/home/minaev/Percona-Server-10.2/ \
--with-judy=/usr/local --with-protobuf=/usr/local \
--with-event=/usr --libdir=/usr/lib/mysql/plugin/
sudo make install
echo "INSTALL PLUGIN pinba SONAME 'libpinba_engine.so'"|mysql
echo "CREATE DATABASE pinba"|mysql
mysql -D pinba <default_tables.sql
cd ../pinba-0.0.5/
sed -i 's/NOTICE/CHECKING/' config.m4
phpize
./configure --with-pinba=/usr/local
sudo make install

I had to edit config.m4 because my version of autoconf was a bit buggy. After this process you'll have to add three lines to your php.ini:

pinba.server=[MySQL server address]

And here is one of the Munin plugins, written in TCL. It collects data on how often certain API parts were called.


#!/usr/bin/env tclsh
package require mysqltcl 3.05

# Replace every non-letter character with "_" to get a usable field name.
proc clean_fieldname arg {
    return [regsub -all {[^A-Za-z]} $arg "_"]
}

set dbuser "pinba"
set db "pinba"

set conn [::mysql::connect -user $dbuser -db $db]

# One field per module/action pair known to Pinba.
set fields [::mysql::sel $conn \
    "select concat(module_value, '+', action_value) from \
tag_info_module_action" -list]

if {$argc > 0} {
    switch [lindex $argv 0] {
        "config" {
            puts "graph_title PHP Actions per second"
            puts "graph_vlabel reqs per second"
            puts "graph_category Pinba"
            foreach fld $fields {
                set clean [clean_fieldname $fld]
                puts "$clean.label $fld"
                puts "$clean.draw LINE3"
            }
        }
        "autoconf" {
            puts "yes"
        }
    }
} else {
    # No argument: print the current value of every field.
    foreach fld $fields {
        set clean [clean_fieldname $fld]
        set data [::mysql::sel $conn \
            "select req_per_sec from tag_info_module_action where \
concat(module_value, '+', action_value)='$fld'" -list]
        puts "$clean.value $data"
    }
}

::mysql::close $conn

BTW, you may find it interesting that the TCL scripts performed almost the same as the Lua ones and about 3-4 times faster than Bash.
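
The plugin above follows the usual Munin convention: with "config" it describes the graph, with "autoconf" it answers "yes", and with no argument it prints the current values. Here is the same shape as a minimal Python sketch, with the database query replaced by a hard-coded dictionary. The render() helper and the sample module+action pairs and rates are hypothetical, not data from Pinba.

```python
import re
import sys

def clean_fieldname(name):
    """Replace every non-letter character with "_", as the Tcl plugin does."""
    return re.sub(r"[^A-Za-z]", "_", name)

def render(rates, mode=None):
    """Return the lines a plugin prints for 'config', 'autoconf' or a plain fetch."""
    if mode == "autoconf":
        return ["yes"]
    if mode == "config":
        lines = ["graph_title PHP Actions per second",
                 "graph_vlabel reqs per second",
                 "graph_category Pinba"]
        for field in rates:
            clean = clean_fieldname(field)
            lines += [f"{clean}.label {field}", f"{clean}.draw LINE3"]
        return lines
    if mode is None:
        return [f"{clean_fieldname(field)}.value {rate}"
                for field, rate in rates.items()]
    return []  # unknown argument: print nothing, like the Tcl switch

# Hypothetical module+action pairs and rates standing in for the Pinba query.
sample = {"api+search": 12.5, "api+login": 3.0}

if __name__ == "__main__":
    mode = sys.argv[1] if len(sys.argv) > 1 else None
    print("\n".join(render(sample, mode)))
```

Keeping the output in a function that returns lines makes it easy to check the three modes without a running Munin node.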

Tuesday, June 15, 2010

Dark sides of Python

While reading about Python and playing around with its objects and classes (OOP being a perversion in itself), I witnessed some slightly weird behaviour. Define a class with a class variable:

class Parent:
  variable = "parent 1"

Then define a descendant class that inherits the class variable:

class Child(Parent):
  pass

(That funny lone pass stands for an empty class body.) Now, let's have a look at the value of variable in Parent and Child:

print Parent.variable
parent 1
print Child.variable
parent 1

Then, change the value of the variable in the parent class and it should also change in the child class:

Parent.variable = "parent 2"
print Parent.variable
parent 2
print Child.variable
parent 2

Sounds good. The variable must be shared between the two classes. Now, let's change the value of this allegedly shared variable in the child class:

Child.variable = "child 1"
print Parent.variable
parent 2
print Child.variable
child 1

All of a sudden, the variable turns out to be two separate variables. We have somehow broken the link that connected them, and now, even if we change the value of the variable in Parent, this will not affect the variable in Child anymore:

Parent.variable = "parent 3"
print Parent.variable
parent 3
print Child.variable
child 1
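
As far as I can tell, the explanation is that reading an attribute and assigning to it follow different rules: a read on Child falls back to Parent when Child has no entry of its own, while an assignment always creates an entry on the class you assign to. Here is a small sketch that peeks into the classes with vars(). It is written with print() as a function so that it runs on Python 3, unlike the session above.

```python
class Parent:
    variable = "parent 1"

class Child(Parent):
    pass

# Reading Child.variable finds nothing in Child itself and falls back to Parent.
print("variable" in vars(Child))             # False
print(Child.variable)                        # parent 1

# Assignment never follows that fallback: it creates a new entry on Child.
Child.variable = "child 1"
print("variable" in vars(Child))             # True
print(Parent.variable, "/", Child.variable)  # parent 1 / child 1

# Deleting Child's own entry makes the lookup reach Parent again.
del Child.variable
Parent.variable = "parent 3"
print(Child.variable)                        # parent 3
```

So the "link" is never really broken. Child's own entry just shadows Parent's, and removing it brings the inherited value back.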

And how can a language with such non-trivial idiosyncrasies be promoted as a newbie-friendly, "does-what-you-want" language? My interest in Python is evaporating so fast that I will probably never get to the famous included "batteries".