Smoking toooooo much PHP
Sunday 30 January 2005
There were some interesting comments on my last post, on hyping by blog. Jackson Miller pointed out very bluntly: isn't that what blogs are for? While he is partly correct, if a blog is purely an advertising / news feed for a project, then it is not really a blog, but more of a project site. What normally makes blogs interesting is not that they publish release announcements, but that you get some insight into other things a developer may be doing, often unrelated to the project they are well known for. So I guess the conclusion was: if someone starts hinting you are hyping and not blogging, perhaps you are.
I saw a few posts on Artima more recently that included detailed analysis of RoR, pointing to a lot of what I had considered RoR to be. Ruby, while having interesting features, doesn't appear to have what could be called an elegant language construct, or a particularly huge following, both of which for me are part of the consideration on whether to invest time in experimenting with it (which C#/ASP.NET did justify, but produced similar returns). RoR turns out to be little more than a clever combination of tools to write skeletons and some reasonable libraries, which, while useful, really doesn't justify the excitement. But I guess it's an improvement on ASP.NET: the solution is not forced in your face so much, whereas in .NET alternatives are frowned upon (try googling for the equivalent of mysql_escape_string in .NET, and you will see what I mean).
Imap Continued.
Bincimap unfortunately was unable to deliver on the promise it looked like it had. This week I got a call saying that Outlook Express (or more like 'lookout express') users were having problems. It's pretty common knowledge that although Outlook says it supports IMAP, its implementation is buggy to the point of being unusable. I know this from googling mailing lists and seeing the number of kludges and workarounds that appear to have gone into IMAP servers, just to support this pile of crap.
Normally when you get problems with Outlook and IMAP, you brush them off as intermittent problems with a crap piece of software, and suggest they upgrade to a real email client (Thunderbird or Evolution come to mind). But sometimes company owners or important sales staff are not really that open to changing their ill-gotten ways, so Outlook support has to be suffered (at least at an hourly rate!!!).
So this time (after a few goes at modifying the settings in Outlook) I decided to examine what was going on a little closer, including doing protocol dumps. The key problem was that deleted messages (and ones that had been moved to another folder) would reappear as unread and new when you pressed send/receive. To my amazement, Outlook spawns new connections and does a lot of IMAP operations concurrently, without a care in the world for how complex this may be for the server (e.g. 3 connections all doing operations on the inbox folder). Menu operations often open new connections, while drag and drop operations don't. It's all a bit like a beginner's VB program: completely undesigned, and thrown together a few minutes after hello world worked.
I've given the protocol dumps to the bincimap developers, but over the weekend I also discovered that my wife's palm phone was unable to read email. I can postpone problems with companies for a few days, but I had better fix my wife's issues faster! So after another marathon protocol dumping session, it became clear that bincimap was sending a little too much information for snappermail to understand. So I quickly switched over to dovecot imap.
I feel a bit disappointed here; us fickle users jump from one ship to another so easily. I did get the chance to look at bincimap's source, and it was very clean C++, and pretty well designed. And having given the author (Andreas Aardal Hanssen, who was very responsive) a reasonably high quality set of bug reports, I didn't feel too bad deserting to another application.
Dovecot on Debian proved amazingly simple to migrate to. The only change required after apt-getting it was modifying /etc/dovecot/dovecot.conf and changing the line
    protocol = imaps
Other than that, it was just a matter of restarting Evolution (which should provide another good blog review). I have now finally tested, used and configured all 5 major open source IMAP servers.
Sunday 23 January 2005
A lot of people started using blogs as a slightly better medium for technical information, but it's becoming evident, with subjects like Ruby on Rails and some of MS's astroturfing with marketing material, that blog aggregators like Artima have been abused heavily with rather second-rate blogging about 3rd-rate tools.
While Ruby on Rails is probably a good tool, it fails in large part from the flawed thinking that one provider can deliver a complete solution. It took me a long while to realize that attempting the complete toolkit that RoR promises is often fruitless. It rarely delivers much beyond the initial demonstrable examples. What normally happens is that in designing for a single solution (a super fast web interface to databases), you often end up with libraries that are rather poor for generic usage.
The fact that PHP already has 3 or 4 projects based on PEAR that deliver pretty much the same solution as RoR indicates that the concept of small flexible libraries, maintained by separate individuals, rather than one super mega project, is always more valuable, although I guess you miss out on the hype more.
Thursday 20 January 2005
mbox must be about the worst designed format ever. This week, in a small office I consult for, a few of the staff started complaining they couldn't open some of their mailboxes. It didn't take long to realize that the server was overloaded: 10 people, each with an inbox averaging 500MB, and Outlook checking email every 3 minutes, a few of them also doing a full scan of their IMAP folders (which range from 2 to 4GB) checking for new mail.
The poor server was suffering badly, so yet again I investigated IMAP servers. I've tried cyrus, courier and uw-imap, and while each has advantages, cyrus and courier have tended to be a little annoying to set up, messing around with auth and protocol issues. uw-imap is the root cause of the above issues (although the mbx format does help a lot).
I was interested to find bincimap (binc is not courier). The overview points out that it's a pure maildir backend mail server (maildir backends usually perform pretty well with a cache, and don't have the interface issue of folders that contain folders not being usable). It also appeared to be pretty simple to set up, although no examples for use with exim were obvious.
I tested the installation on my development box, and after a bit of hunting around and guesswork, I put together simple instructions for converting an exim4/uw-imap installation to exim4/bincimap. The only downside to the conversion was that I lost all my "important" flags from Thunderbird. I was actually quite impressed that my own instructions were so amazingly simple.
mono/ASP.NET and C#
My experiments with ASP.NET and C# have been continuing. Highlights of this week were discovering that codebehind without VS.NET is a real waste of time: the codebehind concept assumes that you want to have a compiled .dll. So if you are developing a web page with codebehind, your roundtrip testing becomes edit/build/(install)/test. I may as well write it in C!!! At least you can compile that on the fly (tcc is a very nice small C compiler). After playing around with various <% language="C#" src="..."> options, I eventually came up with the kludge of doing quasi virtual includes:
    <!-- #include "lib.cs" -->
It's far from clean (it feels a bit like working with function libraries), but at least it works.
Next on my challenges was getting ASP.NET working with MySQL. This is a challenge in itself (I ended up copying the bytefx.dll into my web root's bin directory to get the import working). From what I gather, ByteFX, one of the main MySQL connection toolkits for .NET (which looks like it is now owned by MySQL AB), is unfortunately not very well documented. The example on the go-mono.org site works, but it did not take long to be reminded that if you want to work with C#, you have to think the Microsoft way: the one true solution, or the one true solution (both of which look good on the face of it, but are shit round the edges).
Any good PHP programmer knows that sending raw data from a URL to the database is a security nightmare waiting to happen. So we have these wonderful features like addslashes() and mysql_real_escape_string(), normally hidden nicely in a DB abstraction layer. We also get to use bound parameters in some database backends that are designed that way. The .NET way is take parameters or give up: you can't escape strings, you must use parameters. (Excuse my memory here, this example may need fixing.) e.g.
    mycmd.CommandText = "select * from sometable where name=@name";
    MySqlParameter param = new MySqlParameter("@name", MySqlDbType.VarChar);
    param.Value = "some'test";
    mycmd.Parameters.Add(param);
A couple of problems with the above code come to mind (apart from it suffering the .NET problem of adding as much noise as possible to a piece of code, as if trying to hide its purpose):
- MySqlParameter docs are difficult to find, and mostly in javadoc-style format, which is not exactly informative.
- @ is used by MySQL for variables, so apparently the @ will get changed to ? later...
- Debugging what is actually going to the server is impossible (as far as I could tell)! I ended up turning on debugging on the server, just to be sure what data was ending up there.
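To make the escaping-versus-parameters difference concrete, here is a toy sketch in JavaScript of roughly what a driver does when it binds @name parameters on the client side. This is not the ByteFX API: escapeSqlString and bindParams are names made up for illustration, and the escaping rules are deliberately simplified (a real driver also has to care about the connection character set), so it should not be pointed at a real database.

```javascript
// Toy escaper (assumption: simplified rules). Backslash-escapes the characters
// that could end or alter a quoted SQL string: backslash, quotes and NUL.
function escapeSqlString(value) {
  return String(value).replace(/[\\'"\0]/g, function (ch) {
    return ch === '\0' ? '\\0' : '\\' + ch;
  });
}

// Replace each @name placeholder with the quoted, escaped value from params.
// Throws if the SQL mentions a parameter that was never supplied.
function bindParams(sql, params) {
  return sql.replace(/@(\w+)/g, function (match, name) {
    if (!Object.prototype.hasOwnProperty.call(params, name)) {
      throw new Error('No value supplied for parameter ' + match);
    }
    return "'" + escapeSqlString(params[name]) + "'";
  });
}

var sql = bindParams('select * from sometable where name=@name', { name: "some'test" });
console.log(sql); // select * from sometable where name='some\'test'
```

Being able to print the final string like this is exactly the visibility that was missing when debugging what reached the server.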
I can't say it's all bad, but it gets very difficult to see the gems between the rocks when you spend your time writing simple methods to solve common problems all the time.
Client ever asked you to turn a HTML table into a excel spreadsheet?
Given one hour to convert a complex piece of data retrieval code to output to an Excel file, the thought came to me: why not do it in javascript, based on the existing HTML? Write a small piece of javascript that iterates through an HTML table, and posts a form with the data as CSV to a 2 line PHP script. This is the HTML:
    <form method="POST" action="quickexcel.php" onsubmit="return toExcel('data')">
      <input type="hidden" id="exceldata" name="exceldata" value="">
      <input type="submit" name="_submit" value="Download as Excel">
    </form>
    <table id="data">
      ...... table with data ......
    </table>
You can see the javascript here (it's pretty simple), and the two line PHP file:
    <?php
    header('Content-type: application/vnd.ms-excel');
    echo $_POST['exceldata'];
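The linked javascript is not reproduced on this page, so what follows is only a guess at its shape, not the original script. It is a minimal sketch of a toExcel handler matching the form above: it walks the table with the given id, quotes each cell as CSV, and stores the result in the hidden exceldata field. The helper names csvField and rowsToCsv are made up for illustration, and textContent is a modern-browser assumption.

```javascript
// Quote one cell for CSV: wrap it in double quotes and double any embedded quotes.
function csvField(value) {
  return '"' + String(value).replace(/"/g, '""') + '"';
}

// Turn an array of rows (each an array of cell strings) into CSV text.
function rowsToCsv(rows) {
  return rows.map(function (cells) {
    return cells.map(csvField).join(',');
  }).join('\r\n');
}

// Browser-side onsubmit handler: read the table with the given id, put the CSV
// into the hidden 'exceldata' input, and return true so the form still posts.
function toExcel(tableId) {
  var table = document.getElementById(tableId);
  var rows = [];
  for (var r = 0; r < table.rows.length; r++) {
    var cells = [];
    for (var c = 0; c < table.rows[r].cells.length; c++) {
      cells.push(table.rows[r].cells[c].textContent.trim());
    }
    rows.push(cells);
  }
  document.getElementById('exceldata').value = rowsToCsv(rows);
  return true;
}
```

Keeping the CSV building in rowsToCsv, separate from the DOM walking, means the quoting can be checked without a browser.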
Monday 17 January 2005
After quite a few email conversations about HTML_Template_Flexy, I finally got round to documenting one of the javascript libraries I've modified. It is still a bit buggy around the edges, but is pretty much working.
Javascript Calendar started off as a few hacks around the dynCalendar on pear.php.net, and evolved into a nice, clean, simple calendar renderer. The main design criterion was that it should be independent of the HTML page, not requiring any page-specific javascript or changes to the backend template code. Just include the javascript file, set the style, and add a few extra attributes to the input elements.
Wednesday 5 January 2005
First impressions matter, and today I started with a blank piece of paper and tried to port a very simple piece of code from PHP to aspx. While you can't judge a book by its cover, from today's experience even reading the first few pages was enough to make me wonder what all the fuss is about with C# and ASPX; it appeared to be a poor joke in comparison to PHP.
I've hacked at C# projects before, having looked very heavily at mcs and phpLex (a modification of csLex to generate PHP code). But I'd never really had any need to look at aspx. Today was a little different: a prototype I had thrown together for a project began being rebuilt in C# (don't ask why...).
The first part of the project was to get aspx + mono up and going on the Debian box I was using. In general this was a matter of firing up synaptic, searching for asp.net, selecting install, and applying. There was a small issue: the default Debian packages do not create a /var/www/.{cant remember the name} directory and give it write permissions. I eventually spotted the error by starting xsp.exe manually with the verbose flag. (It would be better if the xsp.exe install on Debian defaulted to logging verbosely to /var/log/xsp.log.)
Some of the impressions I got were of things that mono could improve on; others were a little more fundamental, in terms of what at present appears to be questionable philosophy in C#, and maybe a lack of experience with aspx on my part.
Having solved the missing directory issue, I went on to try out all the demos that came with it. A lot of these demonstrate the web controls, but in general they are good enough for a quick feel for the language. I was a little perturbed by the examples of web controls; the first thing that came to mind was that the template files, which include <asp:input name="...."> tags, would never render in a browser. It would be near impossible to send a file over to a designer and let them focus on the layout.
I did like the way that the controls were available as objects (almost like DOM) to the page controller class, but there was a sense that the base HTML template had to be altered far too much to enable this interaction with aspx. On top of this, although I'm sure it's possible to avoid, I got the sense that you were running an HTML page, rather than an application that happened to render HTML (most PHP I've done in the last year treats the HTML as a skin available to the application, and is nowhere near the page starting/running process). The examples also showed only crude ways of including other files, so I'm left pondering if something as simple as a conditional include is even feasible.
But as I got my head around the task at hand, I began to see the other unusual decisions that appear to hamper productivity while developing in the language. Classes for everything looks like an ideal for OOP purists, but in reality it turns simple tasks into a challenge of huge proportions: locating methods of unknown objects.
The first task I set myself was sending a P3P header, from which I rather detoured into testing out sending a Location header. A little googling / digging on the MSDN pages, and I discovered Response.sendHeaders(key,value). I tested this out on the page with key=location, and tried to redirect to google... nothing happened (the browser got the header, but never redirected to anything, presumably because the response status was still 200 rather than a 3xx). A little further hunting revealed a Response.Redirect() method. There was just something smelly about the idea that what should be simple was made complex by an attempt to make it simple.
The next issue was dealing with cookies. It goes without saying that you need to know two things in PHP to deal with cookies: $_COOKIE (reading) and setcookie() (writing). In C#/aspx I had to deal with creating 2 objects to retrieve a potential cookie, and I had to check it was not null, rather than just see if it was false (no native casting of null => false, e.g.
if (!mycookie) { .... }). And having begun to understand that I needed to use Request.Cookies, I made the simple mistake of trying to use Request.Cookies to save my new cookie. Again, the complexity of having Response and Request (the difference between which is easily overlooked when skimming documentation) appeared to be another case of trying to make something too simple, and yet adding complexity.
The one that really finished off the effort was looking at md5(), a nice PHP function that returns a string md5 representation of another string. While C# obviously has this, the amount of work required to do what should be a simple task appeared painful at best. Two different classes were needed (one to convert the string to a format that the md5 class understood, and the md5 one itself), then a foreach loop to read the bytes returned and convert them back to a nice string.
Along the way I also noticed a few other counter-intuitive behaviours (along with one or two slightly cuter features):
- Page exceptions without a clear backtrace or code reveal. When you try to assign values that are not strings to asp form elements (I think), I only got an exception, with little explanation.
- Full page code reveals for some exceptions, with highlighting of the offending code (something PHP turned off by default about 2 years ago due to security concerns). Although it helps a lot to have it, I'd consider it a bigger problem in the wider picture.
- Vague and often meaningless error messages. I decided to write a small function/method to do the md5 stuff, and defined the method
    public String qmd5(String in) {.....}
  I kept getting compile errors on the constructor, so assumed it was something to do with String using the wrong case. I eventually guessed that in might be a reserved word or something, changed it to inStr, and that solved it.
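For comparison, the three md5 steps described above (string to bytes, bytes through the hash, raw bytes back to a hex string) can be sketched in JavaScript using Node's built-in crypto module. This is not the C# code from that day, just the same shape in another language; UTF-8 is assumed as the string encoding, and the name qmd5 is borrowed from the method above.

```javascript
var crypto = require('crypto');

// Mirrors PHP's md5(): string in, 32-character lowercase hex string out.
function qmd5(inStr) {
  // Step 1: convert the string to bytes (UTF-8 is an assumption here).
  var bytes = Buffer.from(inStr, 'utf8');
  // Step 2: run the bytes through MD5, which hands back 16 raw bytes.
  var digest = crypto.createHash('md5').update(bytes).digest();
  // Step 3: loop over the raw bytes and format each one as two hex digits.
  var hex = '';
  for (var i = 0; i < digest.length; i++) {
    hex += (digest[i] < 16 ? '0' : '') + digest[i].toString(16);
  }
  return hex;
}

console.log(qmd5('hello')); // 5d41402abc4b2a76b9719d911017c592
```

The explicit loop in step 3 is only there to match the description; digest('hex') would do the same job in one call.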
The sense I walked away with today was that this small taster, if scaled up to a larger application, would mean that I would end up typing numerous variable types and continually using long object.method names for simple tasks (or constantly writing simple methods to do obvious tasks), constantly fixing exceptions caused by typos, and trying to second guess the compiler. It would probably take 3-4 times as much code, without gaining any clarity of purpose in the code, compared to a project done in PHP.
Sunday 2 January 2005
Just before Christmas, I ran into another roadblock with DBDO, one of those problems that you know isn't going to be fixed quickly, so you postpone it until you really have a bit of free time to solve it.
This particular issue was partly brought on by trying to implement sleep/serialization, but it had cropped up before and I had tried to ignore it. However, sleep/serialization basically forced it to be addressed.
A 'post-query' DBDO object represents a row of the database. If you modify the values, it is supposed to store them separately so that it knows you modified them. Hence when you update an object, it can work out what has changed and build a nice efficient query. This makes the read_property and get_properties methods a little complex (as they have to decide whether you asked for a property that was returned from the database or one you assigned afterwards).
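The idea of keeping fetched and assigned values apart can be pictured with a small JavaScript sketch. DBDO itself is a PHP extension and this is not its implementation: the Row class, its method names, and the person table in the usage note are all hypothetical, and the generated SQL assumes ? placeholders with the values passed separately.

```javascript
// Hypothetical row wrapper: values loaded from the database and values assigned
// afterwards live in two separate maps, so changes can be detected.
function Row(table, keyColumn, fetched) {
  this.table = table;
  this.keyColumn = keyColumn;
  this.fetched = fetched; // as returned by the query
  this.changed = {};      // assigned after the fetch
}

// Reading checks the changed map first, then falls back to the fetched value.
Row.prototype.get = function (name) {
  return Object.prototype.hasOwnProperty.call(this.changed, name)
    ? this.changed[name]
    : this.fetched[name];
};

// Writing only records the value if it differs from what the database returned.
Row.prototype.set = function (name, value) {
  if (this.fetched[name] === value) {
    delete this.changed[name];
  } else {
    this.changed[name] = value;
  }
};

// Build an UPDATE touching only the modified columns. Values are returned
// separately for use as bound parameters. Returns null if nothing changed.
Row.prototype.buildUpdate = function () {
  var columns = Object.keys(this.changed);
  if (columns.length === 0) { return null; }
  var self = this;
  var sets = columns.map(function (col) { return col + ' = ?'; });
  var values = columns.map(function (col) { return self.changed[col]; });
  values.push(this.fetched[this.keyColumn]);
  return {
    sql: 'UPDATE ' + this.table + ' SET ' + sets.join(', ') +
         ' WHERE ' + this.keyColumn + ' = ?',
    values: values
  };
};
```

With a row fetched as {id: 7, name: 'alan', age: 30}, setting name to a new value and age to the same 30 yields an UPDATE that mentions only name. Anything that writes to the storage without going through set is invisible to this bookkeeping.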
Unfortunately, the read and write access to object properties in PHP's internals is a little haphazard (funnily enough it appears to violate all the data encapsulation ideas that OO principles are supposed to encourage, and that DBDO's direct access usage also breaks).
A normal PHP object instance stores all its properties in a single hash table. Access for writing and reading this hash is not strictly enforced (e.g. by always using the get/set property handlers on the object), and many parts of PHP's internals write to and read from it directly.
This has been causing all sorts of problems when attempting to decide if something has been changed, and hence should be updated.
Functions like print_r() call the get_properties method of the object, but it also appears they might assign the object's properties to the returned value... Serialization, however, calls the __sleep method, then accesses the property hash directly, rather than using either the object's read_property or get_properties internal methods.
Add to this the uncertainty over whether zvals returned from get_properties are actually freed or dtor'd (as far as I could tell, I have to store the return value of get_properties and hash_destroy it at object destruct time), and the whole clever object storage stuff becomes a little more complex than initially envisioned.
But hopefully a first beta should be out by the end of January, when I'll see if I can reduce the 3 hashtables that a DBDO object stores internally a little.
Sunday 2 January 2005
"So should we throw away that old portable?", my wife frequently asks if I leave it lying around. It's an 8yr+ old Toshiba PII/300 portable that has Windows 95, supposedly for her use, but generally ignored due to the fact that it's too slow to fart. It's a perfectly good machine except for the speed, and the battery life is now about 2 seconds. So I occasionally embark on a hunt for a nice small linux distribution on a live CD that would run a desktop via XDMCP. Unfortunately I've yet to succeed. Today's efforts included:
- DSL (Damn Small Linux), ~50MB: boots up the machine, finds all the hardware and network perfectly (including a wireless card), and has a reasonably useful looking desktop for browsing the web. But unfortunately it uses tinyx compiled without XDMCP support.
- PXES, ~11MB: a 'network ready', really thin client that boots with a nice option to specify XDMCP, but totally fails to start the pcmcia cards and hence the network (kind of necessary for XDMCP!).
So unless someone suggests an answer (as I don't really want to get into the live CD building business), the box will go back to annoying my wife by being left around the house.
On a slightly more successful note, I am still reeling with amazement at how well Debian linux runs on a Sun Enterprise 2 machine a friend of mine had lying around his office: a more than 6 year old machine that, apart from the crappy display card, performs at about the same speed as my 3 year old development Intel box (and has pretty much the same software installed, thanks to Debian).