Tuesday, October 3, 2017

Why Jenkins is a wrong CI tool

Jenkins is one of the most awkward tools I have ever met outside of the Windows world (where such tools are the norm). I would say that this favorite tool of DevOps engineers reflects the identity problems of DevOps engineers themselves. Basically, Jenkins is a build automation tool built on top of other build automation tools, like Maven or Gradle. If you ask me, you should avoid it like the plague, because it brings so many problems. Here are some of the flaws of Jenkins (or what I consider flaws), some of them innate, others acquired later, in no particular order.

  1. Plugins may be a good thing when all you need is to add a feature or two. Jenkins is nearly useless without plugins. You'll need dozens of them, if not hundreds. Is it bad? It is. Plugins make every Jenkins server a snowflake, unique and unreproducible. The modern trend for automation stalls when it meets Jenkins. Jenkins cannot be represented as code: you can't automate the rollout of a new build server with all the plugins you will need.
  2. Plugins may be incompatible. If all those features were built into Jenkins, the incompatibilities could have been identified during the testing phase, but they aren't. Have you ever tried to clone a Git repo and its submodules while using a multibranch pipeline? No way.
  3. Inconsistent GUI. More than once I saw a Jenkins job that had been physically copied to another server look different there, because the plugins were of different versions, or were installed in a different order, or whatever. The order of boxes in the pre-build section could be completely different, making the copy of the job unrecognizable.
  4. While we are at it, the Jenkins GUI is a typical example of the "Made by developers"™ approach, meaning it's just disgusting (no offense meant to the developers here :). See "This Is What Happens When You Let Developers Create UI" for the details. There's an attempt to redesign the Jenkins GUI called Blue Ocean, but it's just a superficial change.
  5. Groovy as a scripting language is a pretty strange choice. Scripting languages are supposed to hide implementation details and abstract certain operations. Groovy scripts in Jenkins, on the contrary, require extensive knowledge of Jenkins internals: class names, methods and so on.
  6. When you first see a working Jenkins server, it's nothing but an alphabetic list of jobs. Some of them are only called by other jobs, others may be never-used remnants of the past, while still others are real top-level jobs triggered either manually or by a scheduler. The only way to understand what is what, which job calls which other jobs and in what order, is to physically browse all of them and sketch out a graph on a piece of paper. It may take more than one day to finally grasp the web of their interdependencies.
  7. In a similar vein, the parameters for the jobs can be defined manually, calculated by a script, passed from another job, inherited from a parent job, etc. Together with the graph of the calls you'll have to compile a list of parameters and their possible values. Once again, use your eyes to find them all by skimming over all the jobs.
  8. When you're trying to build a new job, there's no way to test it. You can only run the job on the server and hope for the best. It's one more obstacle on the road to CI as code.
  9. There are too many ways to write a job. You can make a vanilla freestyle job out of ready-made Lego-like blocks. Or you can write a freestyle job that builds the project by running a shell (or Groovy, or Python) script. Or you can create a pipeline. But you'd better watch out, because there are two different ways to write a pipeline, a scripted syntax and a declarative one. Or you can put your premade pipeline into a file called Jenkinsfile and store it with your code, and Jenkins will create a job from the Jenkinsfile automatically. There's more than one way to do it in Perl, too, but in Perl it's still the same language. In Jenkins, two jobs can be two completely different build systems.
  10. The changes in the build process always depend on the changes in code. Hence, the jobs evolve just like the project itself, but those changes are not kept anywhere in Jenkins. Of course, "there's a plugin for that"©, but that plugin doesn't keep the correspondence between the state of the project and the job itself, as a version control system could.

The last point is mitigated by the concept of the Jenkinsfile, but it's not a silver bullet, either. Not all plugins can be used in a Jenkinsfile. Not all libraries can be called from a Jenkinsfile. It runs in a sandboxed Groovy. AFAIK, a Jenkinsfile doesn't support post-build steps. And so on.

Okay, but how should a better CI tool work? I'm not sure, I haven't seen one. First, it should be more like a DSL, kept together with the source code. The DSL may be based on a well-known scripting language, like Tcl, Scheme or Python. Second, the build process used by the tool should be easily reproducible by developers locally. Third, the build results may be represented in HTML, but the web interface should be clearly separated from the build process itself. It would take too long to keep counting; that's a topic for another article. For now, I'm not aware of a tool like that, but the time is nigh. Someone should write it. I just can't wait to ditch Jenkins for something better.

Thursday, April 3, 2014

Developers vs Humans

The bits of modern desktop environments I sometimes see send me into the depths of despair. All familiar ways of modifying the system behaviour fail. No more scripts. No more pipes. No more rc-files.

My monitor is usually rotated 90° clockwise. Books and magazines look better this way. To rotate the picture, I usually edit /etc/X11/xorg.conf. The exact spell depends on the video card. Now, it says:

Option "Rotate" "left"

Once upon a time, after an OS upgrade, it stopped working. That is, GDM was displayed in the usual landscape mode. If Xorg was started with 'startx', the orientation was correct, but GDM ignored the "Rotate left" option. And the display remained that way even after control was passed from GDM to the window manager. For many days I had to type xrandr --rotate left every time I logged in. I was sure that the problem was in the Xorg settings. Then, one day, I decided to install a different display manager, Slim. And it suddenly appeared the right way, in portrait mode! So, the problem was in the GDM settings, but I just couldn't find them. I still haven't and I never will. But the logic of the developers who made that piece of crap ignore the X11 configuration files escapes me.

By the way, the switch to Slim solved another problem. I never liked the dumb default notification daemons in Ubuntu, NotifyOSD and KNotify. I wanted to switch to the much more configurable 'dunst' notifications, but I couldn't get rid of knotify4, which was started automatically. In the end I found the place where it was defined: /usr/share/dbus-1/system-services/org.kde.knotify.service. I can only ask: how was I supposed to find this place or to change the defaults? Or was I supposed not to change them at all?

Now, tell me, please. Did the developers think they were doing the right thing when they took these decisions? Why do distribution owners accept these weird, uncomfortable and unconfigurable solutions? And, finally, isn't it the reason why modern Linux failed on the desktop?

Monitoring multiline logs by email in full color

I use tools like 'logwatch' and 'logcheck' to monitor new events in logs, but they have serious drawbacks. These are most obvious when it comes to multiline logs like the PHP error log or the MySQL slow log.

So, I wrote a trivial script that sends email messages with new events from the logs. In addition, the messages had to be htmlized/colorized, highlighting the SQL/PHP syntax.

So, here's the script:

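#!/bin/bash
# Usage: logmail <recipients> <log file> <syntax>

# Fetch only the lines added to the log since the previous run.
L=$(/usr/sbin/logtail2 -f "$2")
if [ -n "$L" ]; then
        # Here -a adds a mail header; in some mail implementations -a means "attach" instead.
        echo "${L}" | source-highlight -s "$3" -f html | mail $1 -a "Content-type: text/html" \
                -s "$(hostname): $2"
fi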

The syntax is: logmail <recipients> <log file> <syntax>

To extract only new events from the log, I use 'logtail2' from 'logcheck' package. For syntax highlighting I chose 'source-highlight' (it was also used to highlight the script code above).

So, to process MySQL slow log, call the script like this:

logmail name@mail.host /var/log/mysql/mysql.slow.log sql

Or, to produce a report from the php-fpm slow log:

logmail name@mail.host /var/log/php-fpm.slow.log php

The results may look like this:

# Time: 140403 12:01:14
# Thread_id: 12983054  Schema: dsa  Last_errno: 0  Killed: 0
# Query_time: 12.672162  Lock_time: 0.000246  Rows_sent: 300  Rows_examined: 5604906  Rows_affected: 0  Rows_read: 5604906
# Bytes_sent: 14625
SET timestamp=1396512074;
SELECT
       document_id
     , external_document_id
     , DATE_FORMAT(created_at, "%Y-%m-%d") AS created_at
     , source_id
    FROM
     document
    WHERE
     (
      #(
      # status = 'preparsed'
      # AND flag = ""
      #)
      #OR (
       status = 'converted'
      #)
     )
     
     
    ORDER BY
     document.source_priority DESC
    LIMIT
     0, 300;

Monday, April 29, 2013

Extending HP SmartArray RAID array and expanding root file system using LVM

Reasonably recent Linux distributions use LVM even when installed on a RAID array (I used to mistake LVM for RAID some time ago). The multitiered construction becomes a bit overcomplicated: you use RAID to merge hard drives into one logical drive, which is split into partitions, which are then merged into logical volumes. What should we do to grow our root file system onto new hard drives when the RAID size is not enough anymore?

Firstly, insert the new drives into the slots and then use the hpacucli utility to check that they are recognized:

hpacucli ctrl all config show

New drives should appear in the 'unassigned' section. Now, merge them into the existing array:

hpacucli ctrl slot=1 array A add drives=allunassigned

Now, wait till the RAID controller finishes the work. It may take some hours. When it's over, extend the existing logical drive to use the new space:

hpacucli ctrl slot=1 ld 1 modify size=max

If you try to run fdisk/gdisk now, you will not see the free space, because it was not available to the operating system at boot time. You'll have to reboot. After the reboot, use fdisk or gdisk to create a new partition on the free disk space. The partition must be assigned the Linux LVM type (8E00 in gdisk, 8e in fdisk). The operating system will not recognize the new partition immediately. You may try the partprobe command, but I preferred to reboot once again.

Next, in sequence, create an LVM physical volume, extend the LVM volume group onto the new volume and extend the LVM logical volume to use the free space:

pvcreate /dev/cciss/c0d0p3
vgextend VolGroup00 /dev/cciss/c0d0p3
lvextend -l+429214 /dev/mapper/VolGroup00-LogVol01

To determine the number of extents you will use in the lvextend command, find the information about free physical extents in the output of vgdisplay:

vgdisplay | grep Free
  Free  PE / Size       429214 / 1.64 TiB
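If you'd rather not copy the number by hand, the free extent count can be pulled out of the vgdisplay output with awk. This is only a sketch: the helper name free_extents is made up for this post, and it assumes the "Free  PE / Size" line has the layout shown above.

```shell
#!/bin/sh
# free_extents (hypothetical helper): read vgdisplay output on stdin and
# print the number of free physical extents.
# Assumes a line of the form "Free  PE / Size   429214 / 1.64 TiB".
free_extents() {
    awk '$1 == "Free" && $2 == "PE" { print $5; exit }'
}

# Usage on a real system (VolGroup00 is the volume group from the example):
#   vgdisplay VolGroup00 | free_extents
```

Alternatively, lvextend accepts -l +100%FREE, which takes all the remaining free extents of the volume group without looking the number up.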

And, finally, grow the file system. If you are conservative enough to use the default ext4 file system, you can do this using:

resize2fs -p /dev/mapper/VolGroup00-LogVol01

Thursday, August 23, 2012

Find files modified on certain date or in a date range

When you want to find files matching certain conditions, this is most probably a job for the find command. The problem, though, is that you can't specify a certain date.

Options like -atime, -ctime or -mtime receive an argument that specifies the number of 24-hour periods. This means that when you run the following command:

find /var/log/mysql -type f -mtime -1 -exec ls -l {} \;

you will get not only the files modified today, but those modified within the last 24 hours. You can change this behaviour by adding the option -daystart, which means that the time periods are calculated from the beginning of the current day:

find /var/log/mysql -type f -daystart -mtime -1 -exec ls -l {} \;

This command will produce the list of files modified today. IMPORTANT! Note that the -daystart option must precede all date-comparing options to have an effect.

To find files modified between two dates you can join two conditions using the -a option:

find /var/log/mysql -type f -daystart -mtime -3 -a -mtime +0 -exec ls -l {} \;

The result will include the files modified yesterday or the day before yesterday.

Sometimes, though, you may want to specify the dates as they are, not as a relative number of days from today. Traditionally, it was done using an awkward technique that involved creating two empty files with modification dates corresponding to the lower and upper borders of the range (using touch -t) and then referring to these files in the -newer (or -cnewer) option and its negation, since find has no -older option:

touch temp -t 200604141130
touch ntemp -t 200604261630
find /data/ -cnewer temp -and ! -cnewer ntemp

(example taken from here)
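The same trick can be wrapped into a small function that cleans up after itself. This is a sketch, not a canonical recipe: the name files_between is made up, and it uses -newer (which compares modification times) instead of the -cnewer from the quoted example (which compares the status change time of the found files).

```shell
#!/bin/sh
# files_between (hypothetical helper): list regular files under DIR whose
# modification time is later than START and not later than END.
# START and END use the touch -t format: [[CC]YY]MMDDhhmm
# The body runs in a subshell, so the variables stay local to the function.
files_between() (
    dir=$1
    lo=$(mktemp) || exit 1
    hi=$(mktemp) || exit 1
    touch -t "$2" "$lo"        # lower border of the range
    touch -t "$3" "$hi"        # upper border of the range
    find "$dir" -type f -newer "$lo" ! -newer "$hi"
    rm -f "$lo" "$hi"
)

# Usage:
#   files_between /data 200604141130 200604261630
```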

New versions of find allow you to do just that using -newerXY. Letters X and Y here stand for some one-letter codes corresponding to various comparison types. The combinations are pretty much incomprehensible, but what we need is -newermt. With this option, life gets simple and sunny!

find /var/log/mysql -type f -newermt 2012-08-21 ! -newermt 2012-08-23 -exec ls -l {} \;

This command produces the list of files modified exactly between the beginning of August 21 and the beginning of August 23.
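To list the files modified on one particular day, the upper border can be computed instead of typed. A minimal sketch, assuming GNU find and GNU date; the function name modified_on is made up for illustration.

```shell
#!/bin/sh
# modified_on (hypothetical helper): list regular files under DIR that were
# modified on the calendar day DAY (YYYY-MM-DD).
# Assumes GNU find (-newermt) and GNU date (-d).
modified_on() (
    dir=$1
    day=$2
    # The day after DAY, in the same YYYY-MM-DD format.
    next=$(date -d "$day + 1 day" +%F) || exit 1
    # -newermt is a strict "later than" test, so a file stamped exactly at
    # midnight is counted as belonging to the previous day.
    find "$dir" -type f -newermt "$day" ! -newermt "$next"
)

# Usage:
#   modified_on /var/log/mysql 2012-08-21
```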

And a little bonus for those who made it to the end! To sum the sizes of the found files (so you can find out, for example, how many gigabytes were written to binlogs in the last two days) use du -c:

du -c `find /var/log/mysql -type f -newermt 2012-08-21 ! -newermt 2012-08-23`|tail -n1
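The backticks break on file names with spaces and on very long file lists. A variant that avoids both, assuming GNU find and GNU du, passes the names NUL-separated. The function name total_bytes is made up; note that it reports apparent size in bytes (du -b), not disk usage in blocks as the command above does.

```shell
#!/bin/sh
# total_bytes (hypothetical helper): sum the apparent sizes, in bytes, of the
# regular files under DIR modified after START and not after END.
# Assumes GNU find (-newermt, -print0) and GNU du (-b, --files0-from).
total_bytes() {
    find "$1" -type f -newermt "$2" ! -newermt "$3" -print0 |
        du -cb --files0-from=- |   # read the NUL-separated names from stdin
        tail -n1 |                 # keep only the "total" line
        cut -f1                    # and only the number from it
}

# Usage:
#   total_bytes /var/log/mysql 2012-08-21 2012-08-23
```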

Thursday, July 26, 2012

Chroot, but don't chroot

FTP, SFTP and SCP accounts are often restricted to their home directory, so the users don't mess around with the system. This is done using 'chroot'. You can set up the SSH daemon or the FTP server so the user cannot leave his home directory. But what if you want to give him access to some other directory outside his home directory?

The first thing that comes to mind is links. Soft links don't work, though, because the user cannot see anything outside his directory. Hard links won't work, either, because you usually cannot hardlink directories. They'll do if all you want is access to a single file, though. Also, you can change the user's home directory to the directory you want him to modify. But you may want to grant access to more than one directory. Besides, the implementation of chroot requires that if the user is chrooted to /var/data/lib/img, all directories up the tree (i.e., /var, /var/data and /var/data/lib) must belong to root and nobody else should have write permissions there. This is not always possible.

The right solution is to mount the directory inside the user's home directory using the bind option. Create the mountpoint inside the home directory and mount:

mkdir /home/remote/img
mount --bind /var/data/lib/img /home/remote/img

Now, the user will be able to work with /var/data/lib/img, but not with any other data on the server.

To make the mount persistent across reboots, add the corresponding entry to /etc/fstab:

/var/data/lib/img /home/remote/img none bind 0 0

Friday, February 17, 2012

More than one 'exec' action in 'find' command

When you run the find command, you can pass the names of the found files to an arbitrary command using the -exec option:

find /tmp -mtime +3 -exec rm {} \;

The curly braces get replaced by the name of each found file and the command is executed for every file. However, if you want to run more than one command on the file or use the filename more than once in one command, you cannot do that:

Only one instance of `{}' is allowed within the command.

To bypass the limitation, you can execute a shell, passing the filename as an argument. In the commands executed by the shell, the argument will be available as $0:

find /tmp -mtime +3 -exec sh -c 'ls -ld "$0"; rm "$0"' {} \;
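A common variant of the same trick puts a placeholder word into $0 and passes the filename as $1, which reads more like an ordinary script. The sketch below wraps it into a function; the name list_then_remove is made up, and -type f is my addition so that directories are skipped.

```shell
#!/bin/sh
# list_then_remove (hypothetical wrapper): for every regular file under DIR
# that is more than three days old, print its listing and then delete it.
# The word "sh" after the inline script only fills $0, which the shell uses
# as its own name in error messages; the filename arrives as $1.
list_then_remove() {
    find "$1" -type f -mtime +3 -exec sh -c 'ls -ld "$1"; rm "$1"' sh {} \;
}

# Usage:
#   list_then_remove /tmp
```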