Smoking toooooo much PHP
Tuesday 29 March 2005
I had the pleasure (pun intended) of installing a small framework of code today, which broke at least half of the rules I've been building up for the projects I've been working on. The code illustrated very clearly why explicitly typing require_once is not only a good idea, it can make the difference between clear, readable code and poor magic.
The error "Fatal error: Call to undefined function: somefunction_xyz() in..." appears after installing the code. Looking at the file, it only contains one line: somefunction_xyz()! The framework is supposed to have loaded the code behind it, but as it's not set up correctly, that didn't happen.
To me, the assumption that the file is loaded is flawed to begin with. Ignoring the issue that the framework relies on function libraries, the other fatal flaw is that frameworks should rarely load more than one 'action' file, which in turn should be reasonably self-explanatory about where it is getting things from. The missing require_once makes the code very difficult to follow without inside knowledge (or heavy use of grep) of how the framework works, and very few clues are given as to what should have happened prior to the error occurring.
I guess this harks back to the idea that __autoload() will encourage people to write code that is less self-documenting. Almost all languages (C#, Java, Python...) usually have a list at the top of the file indicating what they 'import' or 'use' to achieve the aim of the program. PHP uses require_once to document the source of your libraries. It helps others read your code, and in PHP it can also be placed close to the place where you actually use the library method. A lot of these languages have ways around ending up with a large import list, and often some imports implicitly load others, but it's often worth duplicating them just to ensure that the code is readable.
So, from the trenches here: please try to make your code readable. Other people have to install it, set it up, and understand as quickly as possible what you intended to do....
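To make the point concrete, here is a minimal sketch of an 'action' file that states its own dependency. The library file name (somelibrary.php) and its contents are hypothetical, and the library is written to a temporary directory only so that the example runs on its own; somefunction_xyz() is the function from the error message above.

```php
<?php
// Hypothetical library file, written to a temp directory so this sketch is self-contained.
$libDir = sys_get_temp_dir() . '/explicit_require_demo_' . getmypid();
if (!is_dir($libDir)) {
    mkdir($libDir);
}
file_put_contents(
    $libDir . '/somelibrary.php',
    "<?php\nfunction somefunction_xyz() { return 'loaded from somelibrary.php'; }\n"
);

// Put the library directory on the include path, as an installer might.
set_include_path($libDir . PATH_SEPARATOR . get_include_path());

// --- the 'action' file proper starts here ---
// The explicit require_once documents where somefunction_xyz() comes from.
require_once 'somelibrary.php';

$result = somefunction_xyz();
echo $result, "\n";
```

Anyone reading the last three lines can see which file to open next, without knowing how the surrounding framework boots.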
Sunday 27 March 2005
For a change, I've taken a break from bashing internals, and got back to real work. (More on DBDO later this week, hopefully.) One of my ongoing projects, which has been dragging on longer than I would have liked, is a shipping management application. I think it's mentioned in the archives, but for anyone who missed it, it is a mid-sized XUL application which deals primarily with the management of a trading company's shipping requirements. I originally outsourced the main development, and have been tidying up and refining the code as we near final deployment (which, as usual, has taken longer than expected).
This week I sat down and focused on the last major part of the project: reporting. Almost all the requirements for reporting include the ability to download an Excel file of the data, so previously I had been making heavy use of PEAR's Spreadsheet_Excel_Writer. In using it, I had gone through various stages of evolution:
- Writing raw Excel_Writer code in PHP. This, however, becomes very tedious, is not amazingly readable, and kind of breaks the separation of display and computation. It also tends to be less flexible over a long period of time.
- Using a Gnumeric file as a template, using XML_Tree to merge data with it, and outputting via Spreadsheet_Excel_Writer. This helped in terms of enabling a simpler API for spreadsheet writing, and moved some of the layout/look and feel into the Gnumeric template. But the code for doing this was not quite as elegant as I would have liked.
- Using Javascript to read HTML tables and create a CSV file that is sent to the server, and back again with a text/csv mimetype (forcing the browser to open it in Excel/OpenOffice etc.). This was nice from an architectural point of view, but lacked any formatting.
- And finally, this week: using Javascript to generate a Spreadsheet_Excel_Writer specific XML file (by mixing an XML template file and the HTML content of the page), sending it to the server, and then letting PHP use the DOM extension and simple iteration with Spreadsheet_Excel_Writer to generate the spreadsheet.
This week's solution, while not quite complete, has a number of key advantages, some of which only appeared after I started using it:
- No display-level code goes into the Action->Data manipulation stage (we just store the data ready for the template engine / template to render).
- It is possible to visualize the data before it ends up in the Excel file,
  - hence debugging the data output and finding issues is a lot quicker.
- More code reuse:
  - the library for XML to Excel is simple to reuse,
  - the code for extracting the data from the HTML and generating XML is simple enough to copy & paste, and it may be possible to create a JS library eventually.
- It offers infinite possibilities for formatting and changing layout.
- It is less memory intensive: the data retrieval/storage and the Excel file creation are broken up into two separate processes.
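The server-side half of this approach can be sketched roughly as below. The XML layout (sheet/row/cell with a format attribute) is hypothetical, since the real format is not shown in this post, and the cells are collected into a plain array rather than passed to Spreadsheet_Excel_Writer so that the sketch runs without PEAR installed.

```php
<?php
// Hypothetical XML layout: <sheet><row><cell format="...">text</cell></row></sheet>.
// The real format used in the post is not shown, so this is only an illustration.
$xml = <<<XML
<sheet name="Shipments">
  <row><cell format="header">Container</cell><cell format="header">Weight</cell></row>
  <row><cell>ABC123</cell><cell>1200</cell></row>
</sheet>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

$cells = array(); // [rowIndex][colIndex] => array('text' => ..., 'format' => ...)
$rowIndex = 0;
foreach ($doc->getElementsByTagName('row') as $row) {
    $colIndex = 0;
    foreach ($row->getElementsByTagName('cell') as $cell) {
        // In the real thing, this is where the spreadsheet writer would be called.
        $cells[$rowIndex][$colIndex] = array(
            'text'   => $cell->textContent,
            'format' => $cell->getAttribute('format'), // '' when the attribute is absent
        );
        $colIndex++;
    }
    $rowIndex++;
}

print_r($cells);
```

Because the loop only walks rows and cells, the same code can serve any report whose page produces XML in the agreed layout.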
The extended entry includes a few more details....
Friday 18 March 2005
Someone asked on a few of my other posts why I refer to __autoload as evil (well, apart from making sensationalist statements to keep the blog interesting). Let's start with what it's supposed to get rid of:

    require_once 'SomeClass.php';
    $x = new SomeClass;

The code above is reasonably predictable: require_once will look in the include path and find the first match of SomeClass.php; the second line will create an instance of the class, which looks like it's probably in SomeClass.php. The only magic here is which of the include paths SomeClass.php might be in.
Now enter __autoload.
I first saw __autoload on the Zend developers list. It's one of those features that you look at and think, 'is this really a good idea?'. But I ignored it, since like all features of any language, 'you don't have to use it'... or so I thought.
Autoload basically hooks into a few places, so that when you would normally get a 'this class does not exist' message, autoload is called to let you try and load it, and hence avoid this message.
It also hooks into class_exists(), and gets called to let you try and load the class then, hence the purpose of

    class_exists('PEAR') or require_once 'PEAR.php';

On the face of it, the above looks like it is just saving you a file call that require_once isn't really supposed to be doing... but no: class_exists is really secret code for 'you can have a go at loading the PEAR class from wherever you like'.
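The class_exists() hook is easy to observe. The following is a small sketch; the class names are made up, and because __autoload() itself was removed in PHP 8 the handler is registered with spl_autoload_register(), which is hooked in at the same points.

```php
<?php
// __autoload() no longer exists in PHP 8, so this sketch registers the hook
// with spl_autoload_register() instead.
$asked = array();
spl_autoload_register(function ($class) use (&$asked) {
    // A real handler would try to load a file here; this one only records the request.
    $asked[] = $class;
});

// class_exists() gives the autoloader a chance to load the class first...
$first = class_exists('Hypothetical_Missing_Class');

// ...unless the second argument is false, which skips autoloading entirely.
$second = class_exists('Another_Missing_Class', false);

var_dump($first, $second, $asked);
```

Only the first class name ends up in $asked, which shows that a plain class_exists() call is also an invitation to run whatever loading code happens to be registered.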
The justifications I've seen for this are twofold:
- It's better from a performance point of view.
- It's more flexible.
The first argument is extremely questionable: the microseconds that you may be saving, compared to parsing all the code that you have in all the classes, are probably so tiny as to not be a particular issue. And almost all of it could be removed by using APC or similar, if you really were that desperate for performance tweaks.
The flexibility issue is also questionable. What can you do with autoload that can't be done with the include path? Or, more to the point, what are you doing messing around with include path and autoload locations in the first place... trying to dig a bigger bug hole for you or someone else to discover later?
And finally into the fray spl_autoload
Meanwhile, as all this was going on, Marcus added an autoloading toolkit to SPL, the new 'Standard PHP Library', or perhaps the 'I need more classes library'. What has been added is the missing ability for autoload to be implemented multiple times. You can only define one __autoload function per instance of PHP; spl_autoload, however, allows you to register as many handlers as you like, hence multiplying an already magic tool ad infinitum. Now you may as well pray that what you typed is actually going to be run as you requested.
Is include_path so complex, troublesome or inflexible that it needs to be replaced with something so much more complex and flexible?
Or does somebody want to do something so horrific with __autoload that they are dying for this tool?
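As an illustration of the stacking, here is a sketch with two registered handlers. Hypothetical_Widget is a made-up class name, and the eval() in the second handler is only a stand-in for including a file from some location the caller cannot see.

```php
<?php
$log = array();

// First handler: does not know the class, so PHP moves on to the next one.
spl_autoload_register(function ($class) use (&$log) {
    $log[] = "first handler asked for $class";
});

// Second handler: defines the class on the fly (a stand-in for including a file).
spl_autoload_register(function ($class) use (&$log) {
    $log[] = "second handler asked for $class";
    if ($class === 'Hypothetical_Widget') {
        eval('class Hypothetical_Widget {}');
    }
});

// Nothing at this call site says where the class comes from.
$widget = new Hypothetical_Widget();
print_r($log);
```

Handlers run in registration order until one of them defines the class, so the answer to 'which file did this come from?' depends on every handler registered anywhere in the application.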
Thursday 17 March 2005
I noticed a small thread on pear-dev about require_once. The idea that using require_once to lazy load files slows things down seems to crop up every so often. The crux of the issue appears to be the impression that lazy loading is slowing things down somehow, and that doing something like this may improve performance:

    class_exists('PEAR') or require_once 'PEAR.php';

Or even worse, thinking about using __autoload magic...
In early versions of PHP 4.3, and before, each require_once call had to do quite a bit of work to determine whether a file had already been included. It made the assumption that you might have changed the include path, and therefore that the file you were requesting might not actually have been loaded. So each call went through every path in your include_path, made sure each part of the directory existed, and then tried to open the file. This resulted in quite a few stat calls (via realpath), as well as a few opens.
How much this was slowing things down was never really examined in detail (although from what I remember, Rasmus indicated that Y! had done a few patches to address this), but the existence of those patches and the general assumption that stat and open were relatively expensive made the situation sound kind of serious.
After considering the issue, a few of the core developers (Andi and Rasmus, I think) added a stat cache feature. So rather than stat'ing the whole path on each require, it looked it up in a cache. The result can be seen by running this:

    strace php4 -r 'require_once "PEAR.php"; require_once "PEAR.php";' 2>&1 \
      | grep -E '(stat|open|close|read)' | tail -30

As you will see from the output, what happens now is that the second call to require_once calls open once on each possible location of the file (normally something like ./PEAR.php and /usr/share/pear/PEAR.php). This should be pretty efficient, as long as you don't modify the path during your PHP script (like moving a directory or something).
However, as the discussion this week shows, this questionable performance issue still hasn't disappeared. So I got bored today and wondered what would be involved in making it even more efficient (basically optimizing the second call to any [require|include]_once). This is the result; not a working patch, more just a concept:
http://docs.akbkhome.com/simple_cache.patch.txt
The idea is that, assuming most people don't change the include path that often (probably only once, when the app starts), caching the strings that get sent to [require|include][_once] and testing them before doing any file operations could basically kill this kind of talk. The concept and code are simple enough that it shouldn't have too many knock-on effects, and it shouldn't use up too many resources to save a few open()s. The question, though, is whether this is really an issue or just the impression of an issue....
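The caching idea can be sketched in userland PHP, purely as an illustration: the actual proposal is a C patch inside the engine, and cached_require_once() and demo_lib.php are hypothetical names invented for this example. The cache key combines the current include path with the requested string, so changing the include path naturally bypasses old entries.

```php
<?php
// Userland illustration only: the actual proposal is a C patch inside the engine.
// cached_require_once() is a hypothetical helper name.
function cached_require_once($file)
{
    static $seen = array(); // keys: include_path + requested string
    $key = get_include_path() . "\0" . $file;
    if (isset($seen[$key])) {
        return false;       // answered from the cache: no file operations at all
    }
    $seen[$key] = true;
    require_once $file;     // first request: normal include_path lookup
    return true;
}

// Hypothetical library file in a temp directory, so the sketch is self-contained.
$dir = sys_get_temp_dir() . '/require_cache_demo_' . getmypid();
if (!is_dir($dir)) {
    mkdir($dir);
}
file_put_contents($dir . '/demo_lib.php', "<?php\nfunction demo_lib_loaded() { return true; }\n");
set_include_path($dir . PATH_SEPARATOR . get_include_path());

$firstCall  = cached_require_once('demo_lib.php'); // true: file lookup performed
$secondCall = cached_require_once('demo_lib.php'); // false: answered from the cache
var_dump($firstCall, $secondCall, function_exists('demo_lib_loaded'));
```

The second call never reaches require_once, which is the saving the concept patch aims for at engine level.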
Thursday 17 March 2005
My new pet hate about PHP5 is currently the rather stupid warning: "Strict Standards: is_a(): Deprecated. Please use the instanceof operator in". Why on earth is that in there? instanceof requires that the class or interface you are testing against exists. That means loading code that may not actually be used if you are using negative testing. That shouts out inefficiency, and doesn't really give you any readability or particularly major gains in terms of code doing the testing for you.
It's about the only warning that PHP4 code emits when running under E_STRICT if you disable it when loading the code. Here is the simple patch to get rid of this craziness:

    --- zend_builtin_functions.c 1 Feb 2005 19:05:56 -0000 1.256
    +++ zend_builtin_functions.c 17 Mar 2005 08:20:42 -0000
    @@ -672,7 +672,6 @@
       Returns true if the object is of this class or has this class as one of its parents */
     ZEND_FUNCTION(is_a)
     {
    -	zend_error(E_STRICT, "is_a(): Deprecated. Please use the instanceof operator");
     	is_a_impl(INTERNAL_FUNCTION_PARAM_PASSTHRU, 0);
     }
     /* }}} */
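For readers on later PHP versions, the following sketch shows how negative testing behaves there. The class names are hypothetical. Note that the complaint above is about PHP 5.0; the E_STRICT notice on is_a() was dropped again in PHP 5.3, and in current versions neither form of the test tries to load the missing class.

```php
<?php
// Record any autoload attempts so we can see whether a test tried to load code.
$autoloadCalls = array();
spl_autoload_register(function ($class) use (&$autoloadCalls) {
    $autoloadCalls[] = $class;
});

class Loaded_Thing {}
$obj = new Loaded_Thing();

// Negative test against a class that has never been loaded (hypothetical name).
$viaIsA = is_a($obj, 'Hypothetical_Unloaded_Class');

// In current PHP versions instanceof also returns false here without loading anything.
$viaInstanceof = $obj instanceof Hypothetical_Unloaded_Class;

var_dump($viaIsA, $viaInstanceof, $autoloadCalls);
```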
Saturday 5 March 2005
As part of the QA for the first release of DBDO, I'm migrating my current website from PHP4 to PHP5. While the original code ran on PHP5 with only a few minor changes (adding clone() to a few locations), obviously there was more that could/should be done to it:
- migrate DB_DataObject code to DBDO, which consists of:
  - adding the DBDO::config() lines
  - changing DB_DataObject::factory calls to DBDO::factory, and adding the database alias as the first arg
  - changing find() and find(true) to query() and query() followed by fetch()
  - commenting out the bits I haven't finished yet (like escape(), which is pretty critical)
- replace XML_Tree with PHP's DOM...
The navigation code on my site uses an HTML file which is just a simple <UL> etc.; it is chopped up, a few extra tags are added, and it is then rendered with CSS when you view a page. It was done as a proof of concept, as I really liked the single-file nav concept (based off of Paul's Wiki thing), but using wiki/just text and those dumb wiki DancingCaps names everywhere just sucked big time. I also rather liked the idea of modifying the thing in an HTML editor (until I could be bothered creating a real editor), so that's how the current idea came about.
To do the rendering / URL rewriting, the file is parsed and modified by XML_Tree and a few node iterators (good ole function calls and foreach(array_keys($node->children) as $i) ...). I've still not been convinced that PHP5 Iterators have anything going for them other than being 'cool'; I suspect they will make code less readable, and more magic.
After having done quite a bit of DOM with Javascript, I made the decision to replace the XML_Tree code with DOM. It turned out, however, that Javascript implements DOM++, and PHP only implements DOM... The biggest difference between PHP's and Javascript's DOM is that Javascript has effectively decided to implement a lot of SimpleXML's features within the DOM model. These are the differences, which I find more than a little annoying in PHP.
Fetching children
    Javascript: child = node.childNodes[12];
    PHP: $child = $node->childNodes->item(12);
Fetching attribute value
    Javascript: href = node.attributes['href']
    PHP: $href = $node->getAttribute('href');
Setting attribute value
    Javascript: node.attributes['href'] = 'somevalue';
    PHP: $node->setAttribute('href','somevalue');
There are a few other things that would be nice, that both miss out on.
Appending elements
    Javascript: node = doc.createElement("span"); parent_node.appendChild(node);
    PHP: $node = $document->createElement("span"); $parent_node->appendChild($node);
    PHP 'Natural Way': $node = new DOMElement("span"); // AFAIK this may work.
                       $parent_node->childNodes[] = $node;
From what I remember, you can convert a SimpleXML document to a DOM document, but it would be far better if DOM just implemented a few of SimpleXML's features. PHP should really be about clarity, simplicity, and getting things done. DOM is pretty close, but could really do with being pushed the last mile.
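Put together, the PHP side of the comparison looks something like the sketch below. The navigation markup and the /old/ to /new/ rewrite are made up for the example; the DOM calls themselves (item(), getAttribute(), setAttribute(), createElement(), appendChild()) are the standard ones from PHP's DOM extension.

```php
<?php
// A tiny navigation list, similar in spirit to the single-file nav described above.
$html = '<ul><li><a href="/old/about.html">About</a></li><li><a href="/old/blog.html">Blog</a></li></ul>';

$doc = new DOMDocument();
$doc->loadXML($html); // the fragment is well-formed, so the XML loader is enough

$list = $doc->documentElement;

// Fetching children: item() instead of array syntax.
$firstItem = $list->childNodes->item(0);
$link = $firstItem->childNodes->item(0);

// Fetching and setting an attribute value.
$oldHref = $link->getAttribute('href');
$link->setAttribute('href', str_replace('/old/', '/new/', $oldHref));

// Appending an element.
$span = $doc->createElement('span', 'new');
$firstItem->appendChild($span);

echo $doc->saveXML($list), "\n";
```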