Friday, October 29, 2010

Upstart, a better init or a more painful one

For some time now Upstart has slowly been taking over from the old SysV init scripts on most Linux distributions, for better or worse. Some of the things they are trying to do with Upstart are good and have merit, though they come at the cost of simplicity and ease of use. I fear the developers and supporters are worrying more about the 5% use case than the 95% use case, and that they have thrown the old mantra of “keep it simple, stupid” out the window. The problem is that when you give the 5% too much focus, you end up creating a solution that is horribly broken and resembles solutions from the little company up in Redmond, Washington.

For those of you struggling to find your way through the new hell of Upstart, here are a few pointers:

1. Dependency hell is brought to a whole new level of pain and frustration. Imagine a system where everything can emit an event and things “should” start or stop based on those events. Hey, that is a great theory. But how on earth do you troubleshoot it? How do you get your system to start things in an exact order without building a chain like the one we had back with SysV init scripts? In the old days of SysV, you gave all of your scripts a number, and they started in order from lowest to highest. If you wanted something to start sooner, you just lowered the number. Simple. Now you have to go through and figure out what starts when and what needs to happen before everything else. There is no tool that will print out a dependency map and there is no logging or debugging to speak of.

2. Upstart configuration files live in /etc/init/. The remaining SysV init scripts that can not yet be converted to Upstart live in /etc/init.d/.

3. The scripting support in the “script” blocks of Upstart is very limited and extremely basic. Basic things like echoing data out to the console do not work, which makes it hard to use the scripts for anything other than shotgun starting a service or kill -9ing it.

4. The default answer to stopping a daemon in Upstart is to send it a SIGTERM, wait a few seconds, and if it has not stopped, send it a SIGKILL (a kill -9). Holy crap Batman, there goes my database server that takes a while to shut down cleanly.

5. You can not create any custom function blocks to pull extra status or details from a server that might support it. So your only option is to create a separate tool to manage the daemon. If you are going to do that, why not just stay with the SysV init scripts?

6. Most Linux distributions are pushing this down our throats fast and hard, even though it is not fully baked or fully functional yet. This reminds me of the tactics that come from a little company in Redmond, Washington.

7. Upstart can not manage all services yet, and some just do not start at all with Upstart. In the effort to build the most advanced leaky abstraction, sometimes a daemon will fork once or twice and Upstart can not track it, and thus you get into a state where Upstart “thinks” the service is stopped or started when it really is not. So imagine if that was your database server again; it would suck if the system went ahead and shut down the filesystem even though the database server was not even close to finished shutting down.
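
To make point 4 concrete, here is roughly what a TERM-then-KILL stop sequence looks like when written out as a shell function. This is only a sketch of the behavior described above, not Upstart's actual code; stop_with_grace and its grace argument are names invented for this illustration.

```shell
#!/bin/sh
# stop_with_grace PID [GRACE_SECONDS]
# Hypothetical helper: ask politely with SIGTERM, wait up to GRACE_SECONDS,
# then fall back to SIGKILL. Returns 0 on a clean stop, 1 if SIGKILL was used.
stop_with_grace() {
    pid=$1
    grace=${2:-5}

    # If the signal can not be delivered, the process is already gone.
    kill -TERM "$pid" 2>/dev/null || return 0

    waited=0
    while [ "$waited" -lt "$grace" ]; do
        # kill -0 sends nothing; it only tests whether the process exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        waited=$((waited + 1))
    done

    # Grace period used up: this is the kill -9 that hurts a slow database.
    kill -KILL "$pid" 2>/dev/null
    return 1
}
```

With a grace period of a few seconds, a database that needs a minute to flush its data always ends up in the SIGKILL branch. Calling something like stop_with_grace 1234 120 would give it two minutes instead.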

I have spoken with the lead developer of Upstart several times, and he seems reasonable and wants to fix at least some of the problems and weaknesses in Upstart. However, I fear that there are some that he will not fix. Some of the things that we need it to do are not part of the vision for it.

The things I would change about Upstart, sooner rather than later, are:

1) Make the scripting engine inside the script blocks work with all script commands and be a fully functional scripting engine.

2) Provide a tool that prints out the exact order that things will start at boot with a parenthetical list next to each entry with its dependencies.

3) Provide the ability to add arbitrary functions/scripts to the Upstart configuration files so you can do something like “initctl apache super-details” and have it run a script block inside the configuration file called “super-details” which does whatever you need it to do.
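
As a rough idea of what the tool in item 2 could print, here is a minimal sketch that lists every job file in a directory with its “start on” condition in parentheses. It does not work out the real boot order, and it only reads the first line of a “start on” stanza (conditions that span several lines are cut short). list_start_conditions is a made-up name, not an Upstart command.

```shell
#!/bin/sh
# list_start_conditions [DIR]
# Hypothetical helper: print "job (start condition)" for each *.conf in DIR.
# DIR defaults to /etc/init, where Upstart job files live.
list_start_conditions() {
    dir=${1:-/etc/init}
    for conf in "$dir"/*.conf; do
        [ -f "$conf" ] || continue
        job=$(basename "$conf" .conf)
        # Keep only the text after "start on", first occurrence only.
        cond=$(sed -n 's/^[[:space:]]*start on[[:space:]]*//p' "$conf" | head -n 1)
        printf '%s (%s)\n' "$job" "${cond:-no start on stanza}"
    done
}
```

Running list_start_conditions with no argument reads /etc/init. It is a long way from a dependency map, but it puts every start condition on one screen.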


I hope that Upstart and other proposed replacements to SysV init scripts can focus on making a better solution and not focus on trying to solve a 5% use case problem that causes pain and frustration for the other 95%.

Wednesday, October 6, 2010

PDF Document Manipulation

With the advent of advanced mobile reading devices that support full graphic displays, such as the iPad and the soon-to-be-released Android-based tablets, we in the network security space find ourselves wanting to bring our entire reference library of books, ISO standards, RFCs, and diagrams with us everywhere we go. We do not want to worry about whether we have Internet connectivity, or to have to remember where all of the documents are and dig around the Internet for them.

The problem, however, is that a lot of the reference documents that we use are published on-line in text or PDF format and are only available in pieces (chapter by chapter).  This was probably done to save bandwidth and for ease of downloading and viewing.

This type of on-line publishing can work well when using a full sized computer that has disk storage or a real file system and a keyboard/mouse.  It does not work well for mobile readers that do not have a disk storage file system to hold hundreds of separate files and may be used in places where you do not have Internet access (on a plane over the Atlantic or in that basement corner conference room that is just out of wifi range).

We are going to address how we can manipulate on-line PDF documents that are published in parts to turn them into one PDF “book” for ease of use on a mobile reader.  We will also discuss in Example 2 how to fix bookmarks that no longer point to the right place due to adding or taking pages away.

Example 1:
We have found a book online that has 4 separate chapters that we would like to make a single PDF “book” out of for our mobile reader.  Each chapter is a separate download called chapter1.pdf, chapter2.pdf, etc.  The first page of each PDF is actually a blank page and the second page is a note/summary/title page; we do not want to keep either of these pages.

Step 1.1:
Download all related files to your computer. You will need to have local copies of all the files for the book or standard document in question.  So let's download them to, say, /home/pdf/

Step 1.2:
Remove any leading or artifact pages that are not relevant to the combined book.  Sometimes you will find that PDF documents have a leading blank page, a leader “summary” page, or a master title page for every chapter.  When this is not desirable, we can easily remove them prior to stitching the chapters together.  (Be mindful of any copyright or usage rules when making changes.)  The tool we will be using is called “pdftk”.

[jordan]:/home/pdf-> pdftk chapter1.pdf cat 3-end output new-chap1.pdf

What this command will do is take the current PDF, “chapter1.pdf”, and create a new PDF document called “new-chap1.pdf” that skips the first two pages.  It will start on page 3 and go to the end of the PDF document.  Let's do this for the other 3 chapters as well.


[jordan]:/home/pdf-> pdftk chapter2.pdf cat 3-end output new-chap2.pdf
[jordan]:/home/pdf-> pdftk chapter3.pdf cat 3-end output new-chap3.pdf
[jordan]:/home/pdf-> pdftk chapter4.pdf cat 3-end output new-chap4.pdf
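
The same command repeats for every chapter, so a small loop can save some typing when a book has many chapters. Below is a minimal sketch, assuming the files are named chapter1.pdf, chapter2.pdf, and so on as in our example. trim_chapters and the PDFTK variable are names made up for this illustration; they are not part of pdftk.

```shell
#!/bin/sh
# trim_chapters FIRST_PAGE FILE...
# Hypothetical helper: for every chapterN.pdf given, write new-chapN.pdf
# containing FIRST_PAGE through the end of the document.
# Set PDFTK=echo to print the pdftk arguments instead of running pdftk.
trim_chapters() {
    first_page=$1
    shift
    for src in "$@"; do
        n=${src#chapter}      # chapter3.pdf -> 3.pdf
        n=${n%.pdf}           # 3.pdf -> 3
        ${PDFTK:-pdftk} "$src" cat "${first_page}-end" output "new-chap${n}.pdf"
    done
}
```

Calling trim_chapters 3 chapter1.pdf chapter2.pdf chapter3.pdf chapter4.pdf runs the four commands shown above. Doing a dry run with PDFTK=echo first is a cheap way to check the file names before anything is written.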


Step 1.3:
Now that we have removed all of the leading blank pages and title pages, let us combine the 4 chapters into one PDF document called “book.pdf”.


[jordan]:/home/pdf-> pdftk new-chap1.pdf new-chap2.pdf new-chap3.pdf new-chap4.pdf cat output book.pdf


Step 1.4:
Now that we have a single PDF document that has all the chapters stitched together, we can look into adding PDF bookmarks to the file to make jumping around easier on our mobile reader.  This can easily be done with a tool called “jpdfbookmarks” that you can download here: http://sourceforge.net/projects/jpdfbookmarks/.  As of this writing I am using version 2.4.1.  With this tool you can add bookmarks and sub-bookmarks very easily from its graphical interface.  We will create a bookmark for Chapter 1, Chapter 2, etc., and we will also create sub-bookmarks called Chapter 1.1, Chapter 1.2, etc. for all of the sub elements in each chapter. Once you are done, save your changes.


Example 2:
We have a PDF document, either the one we created in Example 1 or one we downloaded from the Internet, and we would like to add a title page or a picture page (a picture of the cover of the book) to the front of it.  Ideally, if this is the book from Example 1, we would have done this during Step 1.3 so as not to mess up all of our bookmarks.  If we just use the pdftk command by itself, all of our bookmarks will be off by the number of pages that we insert.  The same method also works if we need to remove a page and want to keep all of our bookmarks.  So what we can do is the following:

Step 2.1:
Create our title page as a PDF document or convert the PNG/JPG picture of the book to a PDF document and call it “cover.pdf”.

Step 2.2:
Let's add “cover.pdf” to the front of the “book.pdf” document.  NOTE: when we do this all of the bookmarks will be off by the number of pages that we insert, but I will show you how to fix this. For the sake of explanation, let's assume that “cover.pdf” is only 1 page long.


[jordan]:/home/pdf-> pdftk cover.pdf book.pdf cat output bookwithcover.pdf


This will create a new PDF document called “bookwithcover.pdf” that will have the new title page or picture cover added to the front of the book.  For our example, all of the bookmarks will now be off by one page.
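
Our example assumes a one page cover. If you are not sure how many pages you inserted, pdftk can tell you: its dump_data operation prints a NumberOfPages line for the document. Here is a minimal sketch that pulls the number out of that report; pages_from_dump_data is a made-up helper name.

```shell
#!/bin/sh
# pages_from_dump_data
# Hypothetical helper: read the output of "pdftk FILE dump_data" on stdin
# and print only the page count from its "NumberOfPages: N" line.
pages_from_dump_data() {
    awk '/^NumberOfPages:/ { print $2 }'
}
```

You would use it as pdftk cover.pdf dump_data | pages_from_dump_data, and the number it prints is how far the bookmarks have moved.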

Step 2.3:
First we need to export our current bookmarks so we can fix all of them en masse.  If we use the “jpdfbookmarks” tool that we used in Step 1.4, we can “dump”/export all of the current bookmarks for the “bookwithcover.pdf” document to a text file.  This file will look like:


Chapter 1 - BookmarkNameAAA/1,Black,notBold,notItalic,closed,FitPage
    1.1 BookmarkNameBBB/1,Black,notBold,notItalic,open,FitPage
    1.2 BookmarkNameCCC/2,Black,notBold,notItalic,open,FitPage
Chapter 2 - BookmarkNameDDD/3,Black,notBold,notItalic,closed,FitPage
    2.1 BookmarkNameEEE/3,Black,notBold,notItalic,open,FitPage
    2.2 BookmarkNameFFF/4,Black,notBold,notItalic,open,FitPage
etc….


The “BookmarkName???” is just the name that you gave the bookmark when you created it and is what shows up in the bookmarks pane in your PDF viewer.  The number after the slash “/” that follows the name is the page in the PDF document that the bookmark points to.
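
Before changing anything, it can help to look at the dump as a plain list of page numbers and names. The sketch below assumes the line format shown above (name, slash, page number, comma, then the style fields); list_bookmark_pages is a made-up helper name.

```shell
#!/bin/sh
# list_bookmark_pages
# Hypothetical helper: read a bookmark dump on stdin and print "PAGE NAME"
# for every line that ends its name with "/PAGE,". Indentation is dropped.
list_bookmark_pages() {
    sed -n 's|^[[:space:]]*\(.*\)/\([0-9][0-9]*\),.*$|\2 \1|p'
}
```

Running list_bookmark_pages < dump.txt gives a quick view of where each bookmark currently points, which is handy for spot checking against the PDF.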

Step 2.4:
We can now use a short piece of Perl to go in and fix all of the bookmarks en masse.  This assumes that you saved the bookmarks as “dump.txt”; the changes will be saved to “dump1.txt”.  We will be moving all of the bookmarks by one page since we only added one page.  NOTE: this is written in longhand Perl for readability.


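#!/usr/bin/perl
use strict;
use warnings;

# Read the exported bookmarks into memory.
my $file = "dump.txt";
open(DATAIN, "<", $file) or die "Can not open $file: $!";
my @fileline = <DATAIN>;
close(DATAIN);

my $file1 = "dump1.txt";
open(DATAOUT, ">", $file1) or die "Can not open $file1: $!";

foreach (@fileline)
{
    # Only touch lines that really contain a "/page," field.
    # \d+ matches page numbers of any length.
    if (m/\/(\d+),/)
    {
        # This is where we increase it by 1.
        my $newpage = $1 + 1;
        s/\/\d+,/\/$newpage,/;
    }
    print DATAOUT $_;
}
close(DATAOUT);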
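
If you insert more than one page, or remove pages instead, the same fix works with a different offset. Here is a minimal sketch of that idea as a shell function around awk, again assuming the dump format shown in Step 2.3; shift_bookmarks is a made-up name and a negative offset moves the bookmarks toward the front.

```shell
#!/bin/sh
# shift_bookmarks OFFSET
# Hypothetical helper: read a bookmark dump on stdin and write it to stdout
# with the first "/PAGE," field on every line changed to PAGE + OFFSET.
# Lines without such a field pass through untouched.
shift_bookmarks() {
    awk -v offset="$1" '
        match($0, /\/[0-9]+,/) {
            # RSTART points at the slash; the digits sit between it and the comma.
            page = substr($0, RSTART + 1, RLENGTH - 2) + offset
            $0 = substr($0, 1, RSTART) page substr($0, RSTART + RLENGTH - 1)
        }
        { print }
    '
}
```

For a three page cover you would run shift_bookmarks 3 < dump.txt > dump1.txt, and after deleting one page, shift_bookmarks -1 < dump.txt > dump1.txt.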


Step 2.5:
Now that we have a file called “dump1.txt” that has all of the corrected bookmarks, we need to import it back into our “bookwithcover.pdf” document.  Let’s use the “jpdfbookmarks” tool once again to do this for us.  Open the PDF document with jpdfbookmarks and then use the “Load” function in the “Tools” menu to load “dump1.txt”, which holds the new bookmarks.  Save the PDF and you are done.