Offensive Programming

Way back in the good ole days when I went to college to learn programming, one of the things I was taught was to program defensively.  I was to check for error returns (remember, this was before exception handling became the rule, not the, um, exception) and try to deal with them appropriately.  When your code is full of error checking, it is, naturally, longer; it’s also harder to read, in my opinion.

But like I said, it was a long time ago.  I learned on BASIC when you had to use line numbers, and then I learned Pascal and FORTRAN and PL/1 (the first on Apple IIe hardware, the other two on an IBM mainframe running MUSIC of all things).  Meanwhile, I took a part-time job programming a Xenix system, and while the main work was in a proprietary system called Profile 16, I took the opportunity to learn K&R C.

Perhaps it was C where I learned my bad habits.  You see sample code in books bereft of error handling (since it’s much clearer to leave it out if it’s not actually needed for the example) and you start writing code like that.  At least, I did.  I mostly checked for error returns on file opening, since that’s when it seemed most likely to be needed.  But my code worked, at least, after I got the obvious bugs out.  Fortunately for me, very little of that code is still in service anywhere… maintaining it would be a bear.

I kind of fell out of programming for a while, as I started a business selling and servicing computers (yes, I’m a programmer AND a technician).  When I finally accepted some work, it was in Visual Basic (well, VBA actually), and I fell back into my old bad habits.

But it was different, this time.  VBA gave me useful error messages when my program failed, even going so far as to drop me into the debugger.  I trained my users to make appropriate notes if an error message came up, and then I went back and either fixed the code or added the appropriate error handling.  Later I had the joy (that’s not sarcasm, it really was joy) of learning Python, and again, useful error messages; in Python, it turned out to be very easy to log the errors for later review, which is wonderful.

Lately, I’ve noticed how much more productive this “bad” habit has turned out to be.  My users are all pretty happy, as far as I can tell anyway, and in two cases I’ve had nice compliments from outsiders who have experienced these programs.  Oh, they aren’t sexy or anything… just programs that do work that needs doing.

I’m calling my method offensive programming.  I charge into the code and create, well, whatever is needed, and except for the most obvious cases, I don’t bother trying to figure out what might go wrong in advance.  Instead, I deal with the problems when they arise.

More to the point, I’ve come to the conclusion that, unless you’re designing really critical things (air traffic control software, medical software, etc.), defensive programming is insane.  Your goal is to figure out what might go wrong before it ever does, and then deal with it in advance.  But programs (and the computers they run on) have more failure modes than anything else humanity has ever created.  Trying to anticipate them all is a waste of time, as far as I’m concerned, unless of course your software holds human lives in its virtual hands.
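To make that concrete, here's a minimal sketch of the idea in Python. It isn't lifted from any of my real programs; the names run_logged, average, and the "offensive" logger are made up for illustration. The point is just that the working code has no defensive checks at all, and one wrapper logs whatever blows up so I can look at it later.

```python
import logging

# Hypothetical logger name, chosen only for this example.
log = logging.getLogger("offensive")


def run_logged(task, *args, **kwargs):
    """Run task; if it raises, log the traceback and return None."""
    try:
        return task(*args, **kwargs)
    except Exception:
        # log.exception records the message plus the full traceback.
        log.exception("unhandled error in %s", getattr(task, "__name__", task))
        return None


def average(values):
    # No defensive checks: an empty list raises ZeroDivisionError.
    return sum(values) / len(values)
```

Calling run_logged(average, [1, 2, 3]) returns the average as usual, while run_logged(average, []) logs the ZeroDivisionError traceback and returns None. Pointing the logging module at a file at startup (logging.basicConfig with a filename argument) keeps those tracebacks around for later review.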

Useful git alias

I found this on Hacker News and I love it.  Since I use this blog as my project notebook, I thought I should perhaps document it here (lest I forget where I got it later).

Here’s the alias:

git config --global alias.lg "log --color --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit"

More updates to PollyReports

Made my first backwards-incompatible change today; instead of a right = … value in the Element initializer, I’m using align = … for a more general solution.  Before, your choices were right = 0 (the default) for left-aligned text, and right = 1 for right-aligned text.  But I needed something centered, and so I looked into the Reportlab docs and found drawCentredString(); to use it, I had to change the parameter, obviously.  While I was in there, I discovered drawAlignedString(), which is really cool, so I went ahead and added it to PollyReports also.

align may be set to “left” (the default), “right”, “center” (or “centre”, I’m not picky), or “align” (which uses drawAlignedString()) to select the corresponding mode of alignment.
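Roughly speaking, the align value just picks which canvas drawing method gets called. Here's a sketch of that dispatch. This is not the actual PollyReports code: draw_text is a hypothetical helper, and RecordingCanvas is a stand-in that only records calls, so the example runs without Reportlab installed. The four method names are the Reportlab Canvas ones mentioned above.

```python
class RecordingCanvas:
    """Stand-in for a Reportlab canvas; it only records which method ran."""

    def __init__(self):
        self.calls = []

    def drawString(self, x, y, text):
        self.calls.append(("drawString", x, y, text))

    def drawRightString(self, x, y, text):
        self.calls.append(("drawRightString", x, y, text))

    def drawCentredString(self, x, y, text):
        self.calls.append(("drawCentredString", x, y, text))

    def drawAlignedString(self, x, y, text):
        self.calls.append(("drawAlignedString", x, y, text))


def draw_text(canvas, x, y, text, align="left"):
    """Hypothetical helper: call the canvas method matching `align`."""
    methods = {
        "left": canvas.drawString,
        "right": canvas.drawRightString,
        "center": canvas.drawCentredString,
        "centre": canvas.drawCentredString,  # both spellings accepted
        "align": canvas.drawAlignedString,
    }
    if align not in methods:
        raise ValueError("unknown align value: %r" % (align,))
    methods[align](x, y, text)
```

With canvas = RecordingCanvas(), calling draw_text(canvas, 100, 50, "Total", align="centre") records a single drawCentredString call, and an unrecognized align value raises ValueError instead of silently left-aligning.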

I guess that’s the only “real” change in version 1.4; it’s uploaded to PyPI and github, as usual.

http://pypi.python.org/pypi/PollyReports

https://github.com/Solomoriah/PollyReports

Minor PollyReports update

I found an issue with the ordering and printing of group headers and footers, and I fixed that; the current release, 1.3, is now correct, as far as I know.  Also, I’ve revised the code to act intelligently when no detail band is defined, since every once in a while, it makes sense to omit it.

The documentation on PyPI has been updated to reflect these changes:

http://packages.python.org/PollyReports/

Hm.  Guess that’s all I had to say.

PollyReports Tutorial

I’ve noticed that acceptance of a new software module or package for developers in the Open Source/Free Software world is greatly affected by the availability of a good tutorial. I mean, it seems obvious, doesn’t it? But I’ve also noticed that the original author of a project rarely writes a good tutorial.

EDIT 6/20/2012: I’ve moved the tutorial to PyPI; find it here:

http://packages.python.org/PollyReports/tutorial.html

What do they say about battle plans?

So, after posting that PollyReports was ready for use, I actually used it last night with a small report for one of my clients.  Turns out, it still needed work.

But now, it works.  There were a couple of things I had just plain forgotten, like… what if there are newlines in an Element’s text?  Answer: break up the text into lines and print them one after the other in vertical alignment, using the given font size and leading to space them out.  What about page numbers?  Well, oops.  I’ve added a sysvar parameter to Element initialization that can be used to access any of the parent Report’s variables.  All I really want is “Report.pagenumber” but I can see that there may well be other uses for this.
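Here's a small sketch of both ideas, not the real PollyReports code. The names draw_multiline and sysvar_text are hypothetical, and so is the stand-in canvas and the bare-bones Report class. I'm also assuming here that y grows down the page; with Reportlab's native bottom-up coordinates you would subtract the leading instead.

```python
class RecordingCanvas:
    """Stand-in canvas that records drawString calls."""

    def __init__(self):
        self.calls = []

    def drawString(self, x, y, text):
        self.calls.append((x, y, text))


def draw_multiline(canvas, x, y, text, leading):
    """Draw each line of `text` at the same x, one `leading` apart.

    Assumes y increases down the page (an assumption of this sketch).
    """
    for i, line in enumerate(text.split("\n")):
        canvas.drawString(x, y + i * leading, line)


class Report:
    """Hypothetical minimal report exposing a page number."""

    def __init__(self):
        self.pagenumber = 1


def sysvar_text(report, name, fmt="%s"):
    """Look up a report attribute by name and format it for printing."""
    return fmt % (getattr(report, name),)
```

For example, draw_multiline(canvas, 72, 100, "Line one\nLine two", 12) draws the two lines at y = 100 and y = 112, and sysvar_text(Report(), "pagenumber", "Page %d") gives "Page 1".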

Though this particular report didn’t use it, I have other clients’ reports that used Geraldo’s event system (mainly so the user wouldn’t decide a slow-generating report was borked).  Rather than add all those event hooks to PollyReports, I added just one: an onrender parameter added to Element, which is automatically passed to the Renderer when it’s instantiated.  When Renderer.render is called (i.e. when the data is actually output), onrender is called with a single parameter, a reference to the Renderer.  Assuming you called that parameter “obj”, the Element which spawned the Renderer is accessible as obj.parent, and the Report as obj.parent.report.
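To show how the pieces hang together, here's a stripped-down sketch. These are stand-in classes written just for this example, not the real PollyReports ones; the constructors and the make_renderer method are assumptions. Only the onrender, parent, and report names follow the description above.

```python
class Report:
    """Stand-in report; only carries a page number here."""

    def __init__(self):
        self.pagenumber = 1


class Renderer:
    """Stand-in renderer: calls onrender(self) just before output."""

    def __init__(self, parent, onrender=None):
        self.parent = parent        # the Element that spawned this Renderer
        self.onrender = onrender

    def render(self, output):
        if self.onrender is not None:
            self.onrender(self)
        output.append(self.parent.text)


class Element:
    """Stand-in element that passes its onrender hook to its Renderer."""

    def __init__(self, report, text, onrender=None):
        self.report = report
        self.text = text
        self.onrender = onrender

    def make_renderer(self):        # hypothetical method name
        return Renderer(self, self.onrender)


progress = []


def note_progress(obj):
    # obj is the Renderer; walk back to the Element and the Report.
    progress.append((obj.parent.text, obj.parent.report.pagenumber))
```

Building Element(Report(), "hello", onrender=note_progress) and rendering it appends ("hello", 1) to progress right before the text is output, which is all a "still working" indicator really needs.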

Making progress…

Wow, PollyReports.py is already usable!

I started on PollyReports yesterday morning, and as of right now, it’s usable.  It’s true, PollyReports lacks some functionality from Geraldo Reports, but as I said in my post yesterday, that was the plan.  Keep it simple, Stanley, or something to that effect.

Using Robin Parmar’s lines-of-code counter found here, I’ve counted the code lines in both projects.  Geraldo Reports consists of 90,138 lines of code (in my current fork, which is pretty close to the standard in terms of length), while PollyReports has just 1,345 lines.  These are the “minimal” numbers, with comments and blank lines ignored, and they include all the Python files in each of the respective directories.  This includes the sample data file for PollyReports… which is 1,002 code-lines long.

The actual PollyReports.py is 382 lines long, including comments and blank lines!

I’m pretty proud of Polly.  She’s managed to mature nicely while keeping her girlish figure.  I’m sure, as time goes by, she’ll gain a little more weight, but hopefully she’ll never get close to the mass of Geraldo.

Okay, enough silliness.  What is still missing?  Two things come to mind:

1.  A means of adding fonts other than the standard PDF fonts.  Geraldo Reports handled this internally… but Polly doesn’t “know” you are using Reportlab, nor import any parts of it directly.  Therefore, if you want nonstandard fonts, register them with Reportlab before you pass your canvas to PollyReports.  You’ll be able to call on those fonts using whatever names you have registered, just as normal when using Reportlab directly.

What does this buy me?  The ability to use a wrapper and run PollyReports with something other than Reportlab.  The less of Reportlab’s API the wrapper has to replicate, the easier it will be.  Here’s the whole list of Canvas methods and attributes PollyReports uses:

canvas.drawRightString()
canvas.drawString()
canvas.line()
canvas._pagesize
canvas.restoreState()
canvas.saveState()
canvas.setFont()
canvas.setLineWidth()
canvas.setStrokeGray()
canvas.showPage()
canvas.translate()

There’s just no need to add anything to that list, other than perhaps the rect() method at some point (for a Box class, no doubt).

2.  Subreports.  I can’t think of a clean way to handle subreports, since there must be some way to retrieve the external recordset.  Perhaps an Element subclass where you register a callback to get the data?  Hmm.  Might do it just that way.

Anyway, I’m very pleased with this project.  I expect to be using PollyReports for several of my custom software clients very soon.

Are you interested in PollyReports?  Let me know!

Feeling much better now… throwing out Geraldo in favor of Polly

Gee, it sounds like I’ve changed my sexual orientation or something.  But it’s not like that at all.  As I noted in my previous post, I’ve found significant flow issues with Geraldo Reports which I have found rather more intractable than I then thought.  So I got to thinking, in my best Jeremy Clarkson mode, “how hard can it be?”

This morning I hacked out the first version of PollyReports.py.  You can see it here:

https://github.com/Solomoriah/PollyReports

The current version handles detail bands and page headers and footers.  I intend to add grandtotal and subtotal bands shortly.

With this module I’m taking a different approach than that applied by the developer(s) of Geraldo Reports.  First of all, PollyReports will never be as ambitious.  If I can manage to do so at all, PollyReports will always be contained within a single source file.  I am a bear of little brain and prefer my code small and simple.  In fact, I’m trying to follow the adage to create the “simplest thing that can possibly work.”

By contrast, Geraldo Reports has numerous source files, with the generator modules separated from the formatting modules.  I’ve felt from the start that this was not necessary.  PollyReports is designed around Reportlab, but does not import it at all; rather, it assumes that the Canvas object you pass into it will follow the Reportlab Canvas interface.  Pure duck typing.  Creating a wrapper that implements that interface adequately for PollyReports’ purposes shouldn’t be all that difficult; though I have no current plans to do so, I can easily imagine wrapping my MSWinPrint.py module in that way.
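Here's what I mean by duck typing, as a quick sketch. TextCanvas and draw_title are hypothetical names invented for this example. The stand-in class implements a few methods with the same names and arguments as the Reportlab Canvas (setFont, drawString, line, showPage) plus the _pagesize attribute, and the drawing code neither knows nor cares that it isn't the real thing.

```python
class TextCanvas:
    """Hypothetical wrapper: same method names as a Reportlab Canvas,
    but it only writes a plain-text log of what was asked for."""

    def __init__(self, pagesize=(612, 792)):
        self._pagesize = pagesize
        self.log = []

    def setFont(self, name, size):
        self.log.append("font %s %s" % (name, size))

    def drawString(self, x, y, text):
        self.log.append("text %s,%s %s" % (x, y, text))

    def line(self, x1, y1, x2, y2):
        self.log.append("line %s,%s %s,%s" % (x1, y1, x2, y2))

    def showPage(self):
        self.log.append("page")


def draw_title(canvas, title):
    """Works with any object that offers the methods used below."""
    width, height = canvas._pagesize
    canvas.setFont("Helvetica", 12)
    canvas.drawString(72, height - 72, title)
    canvas.line(72, height - 76, width - 72, height - 76)
    canvas.showPage()
```

Pass a TextCanvas to draw_title and you get a log of four entries; pass a real Reportlab Canvas and the same function should draw a title and a rule on a PDF page.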

Right now, in fact, PollyReports.py imports nothing (except in the test rig, where Reportlab’s Canvas is imported).  Though it’s not really a good idea, doing:

from PollyReports import *

would likely work just fine for most people… there’s just not that much in PollyReports’ namespace, and I don’t plan to put much there.  I’m trying to implement all my utility functions as methods to avoid any excess names being imported.

So anyway, sayonara, Geraldo Reports.  It was fun while it lasted.  Well, not so much there at the end… like many relationships, this one is ending on a sour note.

Geraldo gives me a headache

My last two commits had to do with generator flow issues.  It all comes down to generators/base.py, and it’s giving me a headache… it still doesn’t work right.

The reports I’m generating have multiple levels of group header bands.  Each active group header band should reprint on each new page; this didn’t work correctly sometimes, mainly if a group header pushed a detail band off the page.  So I fixed that, no problem.

But then, I found that child bands would flow off the page if the parent band landed at the bottom (as utils.calculate_size() did not take child bands into account).  I added that calculation to the height check I had already committed, and it worked.  No problem.

Except… there’s still a problem.  If you use auto_expand_height, the generator obligingly adjusts the space consumed by the band.  But this additional space is consumed in the render_band function, after my extra size check.  I can’t just move the size check down, as the band will have then been rendered; I need to advance the page (and reprint those darn group headers) before that happens.

Gah.  So I’m stuck, for now.  One of the reports I’m generating has a subreport that expands the detail band, and this sometimes screws up the page position (and those headers).  I can’t see any way to avoid having the subreport, and the whole reason I used a child band was to ensure that a specific block of information floats at the bottom below that subreport.

I’m coming to the conclusion that Geraldo is structurally flawed.  Here’s how I see the flow working correctly:

  • Each row of the data source generates an in-memory band structure of some sort.  A simple list would actually work, where every item in the list would have a relative vertical position and height associated with it.  The whole band, including subreports and child bands, would be generated into this structure, and the height could be calculated by a simple iterative bounding box algorithm.  If you were really clever, the band structure could adjust the height automatically each time an element was generated into it.
  • But, the band structure I’ve just described hasn’t been rendered yet, just queued.  The generator loop would check to see if there is enough room left on the page for this band; if there is not, it would trigger a new page, re-render the current group header bands, and only then render the detail band.
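Here's a rough sketch of that flow in Python. It has nothing to do with Geraldo's actual code: Band and paginate are names I made up, a single band stands in for all the active group headers, and y is measured downward from the top of the page. Each band is queued as a list of items with a relative top and a height, its overall height is the bounding box of those items, and the loop decides on a page break before anything is placed.

```python
class Band:
    """A queued (not yet rendered) band: items with relative positions."""

    def __init__(self):
        self.items = []

    def add(self, top, height, text):
        self.items.append((top, height, text))

    @property
    def height(self):
        # Bounding box: the lowest bottom edge of any item, 0 if empty.
        return max((top + h for top, h, _ in self.items), default=0)


def paginate(detail_bands, header, page_height):
    """Place bands on pages, repeating `header` at the top of each page.

    Returns a list of pages; each page is a list of (y, band) pairs.
    A band taller than the space under the header is still placed
    (it would overflow); this sketch does not split bands.
    """
    pages = []
    page = None
    y = 0
    for band in detail_bands:
        # Check the full height, child items included, before placing.
        if page is None or y + band.height > page_height:
            page = [(0, header)]
            pages.append(page)
            y = header.height
        page.append((y, band))
        y += band.height
    return pages
```

With a 10-unit header, three 30-unit detail bands, and an 80-unit page, the first two details land on page one and the third starts page two under a reprinted header.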

The problem I see with Geraldo is, obviously, that this isn’t being done.

But Geraldo represents a lot of otherwise-good work.  I don’t want to just throw it away.  Can it be fixed?  Possibly…

What needs doing is that whole render-the-detail-first thing.  If, instead of rendering directly into the Reportlab canvas, the detail band could be rendered into a sort of side-canvas or pseudo-canvas first, the correct size could then be figured accurately.  One of the main issues with this is the fact that the generator is looping; my fixes described above involve breaking out of the loop before the rendering takes place.  I suspect I’ll have to render into my side-canvas, breaking if necessary, and thus causing the band to render twice (as the next pass through the loop would find the same record ready to process, just as it does now).

I don’t like that solution on the surface; it will be important to ensure that functions called from detail band rendering do not produce side effects, since they would effectively double up.  (This “shouldn’t” happen, of course.)  I’ll also have to verify that this procedure won’t throw off any running totals being maintained by Geraldo… and this is something that certainly will happen, so I can’t handwave it.

Egad.  Time to call it a night, and hope I wake up smart enough to do this.