Archive for July, 2007

Thoughts on the Tao of Programming

Tuesday, July 31st, 2007

I recently came across The Tao of Programming. While humorous, it also features some ideas which I believe programmers should listen to and work into the way they think.

The page is split into several sections, ranging from design to code and on to management and corporate thinking. It is well worth a full read. I have compiled a few notes on the subject, mainly on how I think key sections provide a great insight into the programming mindset.

On Design, Code and Management

Thus spake the master programmer:
“A well-written program is its own heaven; a poorly-written program is its own hell.”

I think this one is self-explanatory. The better you write your code, the easier it is to add to and modify.

A manager went to the master programmer and showed him the requirements document for a new application. The manager asked the master:
“How long will it take to design this system if I assign five programmers to it?”
“It will take one year,” said the master promptly.
“But we need this system immediately or even sooner! How long will it take if I assign ten programmers to it?”
The master programmer frowned. “In that case, it will take two years.”
“And what if I assign a hundred programmers to it?”
The master programmer shrugged. “Then the design will never be completed,” he said.

Now this is very true: the more people you assign to a project, the longer it takes. This is because people debate and change their minds, and you end up getting nowhere. It is far better to have a small team and to assign each member their own section of the system to design and code. Set up a programming standard that everyone can follow and you end up with maintainable code that works and, of course, accountability: if a section of the system doesn’t work, it is clear which programmer worked on it.

There once was a master programmer who wrote unstructured programs. A novice programmer, seeking to imitate him, also began to write unstructured programs. When the novice asked the master to evaluate his progress, the master criticized him for writing unstructured programs, saying, “What is appropriate for the master is not appropriate for the novice. You must understand the Tao before transcending structure.”

This confused me at first, but as I read the rest of the page all became clear. The message here is not that the master writes unstructured programs, but that the master understands the code to the extent where structure doesn’t matter. I’m sure many of you have your own coding styles and idiosyncrasies that make it difficult for others to understand your programming logic. The novice, however, must obey all the rules before transcending them.

A manager went to his programmers and told them: “As regards to your work hours: you are going to have to come in at nine in the morning and leave at five in the afternoon.” At this, all of them became angry and several resigned on the spot.

So the manager said: “All right, in that case you may set your own working hours, as long as you finish your projects on schedule.” The programmers, now satisfied, began to come in at noon and work to the wee hours of the morning.

I think this is a perfect way of describing the mind of many programmers. I know for me it is perfectly true: any software I create, I cherish, and it isn’t work when I work on it. I am sure that for many of you your programs are labours of love too. In that sense, working on something for more than 12 hours a day is nothing extreme. This section also gives a great insight into the anarchism that lurks in the mind of many programmers: most don’t like corporate structure and hate the 9-5 view of working life. Allowing programmers to be flexible with their time gives them more freedom, and thus they are better able to master the Tao.

And I shall leave you with this last one:

A novice asked the Master: “Here is a programmer that never designs, documents or tests his programs. Yet all who know him consider him one of the best programmers in the world. Why is this?”

The Master replies: “That programmer has mastered the Tao. He has gone beyond the need for design; he does not become angry when the system crashes, but accepts the universe without concern. He has gone beyond the need for documentation; he no longer cares if anyone else sees his code. He has gone beyond the need for testing; each of his programs are perfect within themselves, serene and elegant, their purpose self-evident. Truly, he has entered the mystery of Tao.”

How to Create an XML Dialect - Part 2 - Document Type Definitions and Schema

Monday, July 30th, 2007

Last time we discussed how to design an XML dialect. This time I want to focus on how to get the design of your language down in a formal manner.

At the moment there are several ways of formalising the structure of your markup language. I am going to discuss two of them, DTD and Schema. There are others, but DTD is the de facto standard and Schema is the W3C’s standard.

DTD

I can do no better than quote w3schools so:

A Document Type Definition defines the legal building blocks of an XML document. It defines the document structure with a list of legal elements and attributes.

It has a very basic structure and a few drawbacks, which we will discuss later, but the main advantage of DTD is that there are several very good validators around to help you check any documents you produce in your language. These validators check your XML against your DTD and help spot any errors. This is very useful and helps prevent errors in parsing.
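To make the idea of validation concrete, here is a minimal Python sketch of the kind of check a validator performs. It is not a real DTD validator: the standard-library ElementTree parser does not validate against a DTD, so the rules below are hand-written in a dictionary instead. The element names (entry, name, firstname, surname) are borrowed from the FFML draft in Part 1 of this series, and the helper name find_errors is purely hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: each parent element maps to the children it must contain.
# In a DTD this would read roughly: <!ELEMENT name (firstname, surname)>
REQUIRED_CHILDREN = {
    "entry": ["name"],
    "name": ["firstname", "surname"],
}


def find_errors(xml_text):
    """Return a list of human-readable problems found in the document."""
    root = ET.fromstring(xml_text)
    errors = []
    # Walk every element and check that its required children are present.
    for element in root.iter():
        for child_tag in REQUIRED_CHILDREN.get(element.tag, []):
            if element.find(child_tag) is None:
                errors.append(f"<{element.tag}> is missing <{child_tag}>")
    return errors


good = (
    "<ffml><entry><name><firstname>Jamie</firstname>"
    "<surname>Lewis</surname></name></entry></ffml>"
)
bad = "<ffml><entry><name><firstname>John</firstname></name></entry></ffml>"

print(find_errors(good))
print(find_errors(bad))
```

A real validator does far more than this (ordering, attributes, allowed content), but the principle is the same: the rules live outside the document, and the document is checked against them before any parsing code relies on its shape.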

Schema

Schema is supposed to be the standard, but like all standards it hasn’t been widely accepted or used yet. Most applications have yet to be updated to include it, and I have yet to find a validator for checking documents against it. In any case, it is still a good idea to produce a schema document for your language, as Schema allows you to go into more detail and so you will have a better, more rounded view of your language.

How do I write a DTD and a Schema?

I can do no better than to refer you to w3schools’ DTD Tutorial and their Schema Tutorial. Now that you have your main design, you can walk through the tutorials and produce an outline document.

You can then validate your DTD using the W3C’s Validator.

Examples

Below are the DTD and Schema documents I produced for FFML.

So there we have it. Hopefully by now you have a design and a formalised structure. The next step is up to you: you can now do whatever you want with your language. You can use it in applications, style it and put it on the web, or even just leave it and start on another. Please let me know if you have created your own markup language and how it went. Did you do anything differently? I would love to hear about your own processes.

Resource Time: How to Markup Hyperlinks

Sunday, July 29th, 2007

I came across a very interesting article the other day, Monday By Noon’s “Click Here to Read this Article”. It provides a very good explanation of why you should pick your anchor text correctly. While it is old, the principles still apply and, as I am sure you have seen, the practice of bad linking described in the article is all too present, even in today’s more standards-aware environment.

There are many other great articles on Monday By Noon, and I would encourage you to have a good look through all of them. And as always, the link will be available on the iDevs Del.icio.us page.

<acronym> VS <abbr> - The Final Word

Saturday, July 28th, 2007

For many years now many developers like myself have been confused by the grey area surrounding the use of <acronym> and <abbr>. Even the specification isn’t entirely clear on the issue.

The truth is that, when asked, most people have no idea of the difference between an abbreviation, an acronym and an initialism.

Even the documents outlining what <abbr> and <acronym> do confuse the two:

ABBR:
Indicates an abbreviated form (e.g., WWW, HTTP, URI, Mass., etc.).
ACRONYM:
Indicates an acronym (e.g., WAC, radar, etc.).

Can you spot the mistakes? Here is a nice guide to help you; maybe the document authors will take a look at it sometime.

An abbreviation is a word or phrase that has been shortened. All initialisms and acronyms are types of abbreviations.

An initialism is an abbreviation that has been shortened using the first letter of each term, e.g. XHTML. Unfortunately, and slightly ironically, XHTML doesn’t have an <initialism> tag, so the current standard is to use the general <abbr>.

An acronym is a type of initialism that can be said as a whole word, e.g. Laser or Scuba. The <acronym> tag should be used for these.

When in doubt use <abbr>, and remember that to qualify as an acronym the term should be an initialism and you should be able to pronounce it as a whole word.
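The rule of thumb above is small enough to write down as code. The following Python sketch is only an illustration of that decision; the function names choose_tag and mark_up are made up for this example, and whether a term is an initialism or can be said as a word is something a human still has to decide and pass in.

```python
from html import escape


def choose_tag(is_initialism, said_as_word):
    """Pick the element for a shortened term, following the guide above."""
    # Only an initialism that can be pronounced as a word is an acronym.
    if is_initialism and said_as_word:
        return "acronym"
    # Everything else, including plain initialisms, falls back to <abbr>.
    return "abbr"


def mark_up(term, expansion, is_initialism, said_as_word):
    """Wrap the term in the chosen tag, with the expansion as its title."""
    tag = choose_tag(is_initialism, said_as_word)
    return f'<{tag} title="{escape(expansion, quote=True)}">{escape(term)}</{tag}>'


print(mark_up("XHTML", "Extensible HyperText Markup Language", True, False))
print(mark_up("Scuba", "Self-Contained Underwater Breathing Apparatus", True, True))
print(mark_up("Mass.", "Massachusetts", False, False))
```

Only the Scuba example ends up in an <acronym> element; XHTML and Mass. both fall back to <abbr>.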

How to write Semantic Markup

Friday, July 27th, 2007

When building a website it is sometimes all too easy to forget that the main thing you are doing is marking up text. Many developers use heading and paragraph tags, but with the rise of CSS many forget about the semantic elements of markup.

Semantic tags are used to mark up text that has a meaning beyond its content. For example, if I enclose text in the <code> tag, it indicates that the text is an example of source code for a computer program.

This not only has an effect on the display of the text, it also informs search spiders and user agents that the piece of text you have marked up contains computer code. This means that they could perform other actions on it, e.g. place it in a special index for programmers or inform the user that the site contains code snippets. This could be a useful feature if the browser was built into an IDE.
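As a rough illustration of what such a user agent might do, here is a small Python sketch that pulls the contents of every <code> element out of a page using the standard-library html.parser module. The class name CodeCollector and the sample page are invented for this example; a real spider would be considerably more involved.

```python
from html.parser import HTMLParser


class CodeCollector(HTMLParser):
    """Collect the text found inside <code> elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # how many <code> elements we are currently inside
        self.snippets = []  # one string per outermost <code> element

    def handle_starttag(self, tag, attrs):
        if tag == "code":
            self.depth += 1
            if self.depth == 1:
                self.snippets.append("")

    def handle_endtag(self, tag):
        if tag == "code" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Text only counts when it sits inside a <code> element.
        if self.depth > 0:
            self.snippets[-1] += data


page = "<p>Call <code>print('hi')</code> then <code>exit()</code>.</p>"
collector = CodeCollector()
collector.feed(page)
collector.close()
print(collector.snippets)
```

Because the author marked the snippets up semantically, the program never has to guess which parts of the paragraph are code.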

Here is a list of inline semantic tags that may be useful to you when marking up documents in the future:

  • cite - Demarcate a source citation
  • code - Demarcate a code snippet
  • dfn - Indicate a term is defined in the current location
  • em - Demarcate that text should be emphasised
  • kbd - Indicate keyboard input
  • samp - Indicate that the contents reflect sample output, as from software
  • strong - Indicate that text should have a strong emphasis
  • sub - Indicate text should be rendered as subscript
  • sup - Indicate text should be rendered as superscript
  • var - Demarcate text as a variable name

I hope the above list will help you see how you can use markup to convey not only the text but also its meaning.

How to Create Effective Metadata

Thursday, July 26th, 2007

Creating effective metadata can be difficult. How do you know when to use metadata and when not to? And is meta-metadata ever a good thing?

To Meta or Not to Meta??

Imagine a picture. For argument’s sake, let’s say it is of a beach at sunset; the beach is rocky and the tide is just starting to turn. *Click*. Right, now that that is on the digital camera, let’s take it back to the computer and start tinkering with the data.

Now, in a typical situation the camera records some metadata when you take the picture: usually the time and date, and often the model and serial number of the camera. If the camera supports it, you might even have the longitude and latitude of where the camera was when the picture was taken. This is all very well, but apart from the date and time, the rest of the camera’s automatic metadata is of little use in a data warehouse scenario.

If I search for this image I want to be able to look up “rocky beach” or “sunset” or maybe something like “inspirational photography”. I might even want to find “pictures taken by Jamie”. This is all metadata which must be added by humans; in the web 2.0 world these are called “tags”. The same strategy applies to any data.

There are obviously limits to the metadata you should add, e.g. the tag “sun” may be a bit too general and of no real use, since the majority of people opt for a long-tail search strategy.

So, if we take my example image what metadata could we come up with?

  • Sunset
  • Beach-Rocky
  • Photographer-Jamie
  • Exact Location, Date and Time (Note: these would not be considered tags, but they are useful when organising data)

You may have noticed that I conventionally tag in the form General-Expanded, e.g. Beach-Type. This isn’t the only convention; others prefer to tag full descriptions, e.g. RockyBeach or Rocky-Beach. Because metadata is subjective, it depends entirely on how you think the information should be represented. I would of course recommend my approach, as it allows for structured parsing and means that data can easily be categorised, e.g. Beach » Rocky.
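To show what structured parsing of the General-Expanded convention might look like, here is a minimal Python sketch. The function name group_tags is hypothetical, and the extra tag Beach-Sandy is added only to show two expansions landing under the same general tag.

```python
from collections import defaultdict


def group_tags(tags):
    """Group General-Expanded tags by their general part."""
    groups = defaultdict(list)
    for tag in tags:
        # Split on the first hyphen only: "Beach-Rocky" -> ("Beach", "Rocky").
        general, _, expanded = tag.partition("-")
        if expanded:
            groups[general].append(expanded)
        else:
            # A tag with no hyphen is kept as a general tag with no expansion.
            groups.setdefault(general, [])
    return dict(groups)


tags = ["Sunset", "Beach-Rocky", "Photographer-Jamie", "Beach-Sandy"]
print(group_tags(tags))
```

With full-description tags such as RockyBeach there is no reliable place to split, which is the practical argument for keeping the general part and the expanded part separate.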

Meta-MetaData?

Meta-metadata is data about metadata. This may seem like a lot of redundancy, but it can actually be quite useful. For example, if someone in my data collection tagged a photograph as Beach-Bed, this would contradict my metadata design, i.e. you cannot have a beach made of beds, or at least not a natural one. Therefore I may want to create a file that details the acceptable expanded definitions of the Beach tag.
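Here is a rough Python sketch of how such a file of acceptable definitions could be used. The vocabulary is held in a dictionary rather than a file to keep the example short, and both the allowed values (Rocky, Sandy, Pebble) and the function name is_acceptable are assumptions made for illustration.

```python
# Hypothetical meta-metadata: the expanded values each general tag may take.
ALLOWED = {
    "Beach": {"Rocky", "Sandy", "Pebble"},
}


def is_acceptable(tag, allowed=ALLOWED):
    """Return True if the tag fits the vocabulary, or has no rules at all."""
    general, _, expanded = tag.partition("-")
    if general not in allowed:
        return True  # no rules recorded for this general tag
    return expanded in allowed[general]


print(is_acceptable("Beach-Rocky"))
print(is_acceptable("Beach-Bed"))
print(is_acceptable("Sunset"))
```

Beach-Bed is rejected because Bed is not in the list for Beach, while Sunset passes simply because no rules have been written for it yet.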

Resource Time: Structure and Semantics on the Web?

Wednesday, July 25th, 2007

I came across a great introduction to the structured web. I would encourage you all to have a read; the quality of the series so far has been excellent.

While on my travels I also came across a good introduction to the semantic web. While it is slightly outdated, a lot of what it discusses is still relevant today.

So enjoy those two links. I will also be placing them on the iDevs del.icio.us page for future reference.

Ikao - Humor for Code Monkeys

Tuesday, July 24th, 2007

Dan and I have recently launched Ikao. It is a web comic aimed at code monkeys like ourselves and it details some of our thoughts and feelings towards certain topics. And because I can, here is a quick sneak peek at Thursday’s upcoming comic. Warning - this comic is intended for people with a sense of humor and some themes may be unsuitable for young children or mature audiences.

Ikao - Converting C into Perl

How to Create an XML Dialect - Part 1 - Design and Draft

Tuesday, July 24th, 2007

XML is a great basis for creating a markup language. It is structured but allows for complete freedom in how everything is arranged. It also allows you to incorporate existing languages using namespaces, but that is for further on in this series.

The first thing you should do is get accustomed to using XML. There are several great tutorials on the net that teach the basics of the markup. Got that? Great, we will move on.

Now, you should have an idea about what you want to mark up. It could be data about anything: cars, planes, books, cutlery, crisps… you get the idea. For this series I am going to be creating a language to mark up contact details of friends and family. I am going to call it FFML.

Stage One - Design

The first big step is to get down on paper exactly what data you need to mark up. Coming up is a list of everything I could think of, but don’t forget we can easily extend this later on if anything else comes up:

  • Name
  • Phone Number(s)
  • Address(es)
  • Birthday
  • Email(s)
  • My relationship to them

Stage Two - Expand

OK, so we have a basic list. What now? Well, first we need to go through our list and see what can be expanded, e.g. Name can become title, first name, middle names and surname. This will help us come up with a tag structure later on.

So after some thought this is the list I came up with:

  • Title
  • First Name
  • Middle Name(s)
  • Surname
  • House Number / House Name
  • Street Address
  • Locality
  • Town / City
  • Country
  • Phone Number
  • Birthday
  • Email(s)
  • Relation

Stage Three - Tags and Attributes

Now, as you can see, some have been expanded quite a bit while others haven’t been at all. The next step is to think about which of these pieces of data become elements and which become attributes. For example, it would be confusing if the language had two elements for house name and house number. It makes more sense to have a single house element with a type attribute which allows either name or number to be entered.

Below is my completed element list, with attributes included:

  • <title> - Optional
  • <firstname> - Required
  • <surname> - Required
  • <middlename> - Optional
  • <house type="name/number"> - Optional
  • <street> - Optional
  • <locality> - Optional
  • <country> - Optional
  • <number type="house/mobile/work"> - Optional
  • <email type="personal/school/work"> - Optional
  • <date type="birthday/anniversary/other"> - Optional
  • <relation to="me/friend"> - Optional
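To see the element-plus-attribute idea in practice, here is a small Python sketch that builds a <house> element with the standard-library ElementTree module. The helper make_house and the sample values are made up for this example; the only thing taken from the list above is that the type attribute should be either name or number.

```python
import xml.etree.ElementTree as ET

# The two values the draft allows for the type attribute of <house>.
HOUSE_TYPES = {"name", "number"}


def make_house(kind, value):
    """Build a <house> element whose type attribute is 'name' or 'number'."""
    if kind not in HOUSE_TYPES:
        raise ValueError(f"type must be one of {sorted(HOUSE_TYPES)}")
    house = ET.Element("house", {"type": kind})
    house.text = value
    return house


print(ET.tostring(make_house("number", "42"), encoding="unicode"))
print(ET.tostring(make_house("name", "Rose Cottage"), encoding="unicode"))
```

One element covers both cases, and anything reading the document only has to look at the type attribute to know how to treat the value.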

Stage Four - Group

To finish off this basic draft, the last thing we need to do is create sections in the language. For example, <title>, <firstname>, <surname> and <middlename> can be placed inside a parent <name> tag. Following is an indented list of all the tags and their parents.

  • <name>
    • <title>
    • <firstname>
    • <surname>
    • <middlename>
  • <address>
    • <house>
    • <street>
    • <locality>
    • <postcode>
    • <country>
  • <contact>
    • <phone>
    • <email>
  • <personal>
    • <date>
    • <relation>

Don’t forget that we also need enclosing tags for the language, <ffml></ffml>. And each friend or family member should have their own entry, we will surround each one in a set of <entry> tags. So a basic FFML document might look like this:

<ffml>
    <entry>
        <name>
            <firstname>Jamie</firstname>
            <surname>Lewis</surname>
        </name>
    </entry>
    <entry>
        <name>
            <firstname>John</firstname>
            <surname>Smith</surname>
        </name>
    </entry>
</ffml>
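As a quick sanity check that a draft like this is usable, here is a minimal Python sketch that reads the document above with the standard-library ElementTree module and lists each person’s full name. The function name full_names is hypothetical, and the sketch assumes every entry keeps its name inside a <name> element as shown.

```python
import xml.etree.ElementTree as ET

FFML = """
<ffml>
    <entry>
        <name>
            <firstname>Jamie</firstname>
            <surname>Lewis</surname>
        </name>
    </entry>
    <entry>
        <name>
            <firstname>John</firstname>
            <surname>Smith</surname>
        </name>
    </entry>
</ffml>
"""


def full_names(xml_text):
    """Return 'firstname surname' for every <entry> in an FFML document."""
    root = ET.fromstring(xml_text.strip())
    names = []
    for entry in root.findall("entry"):
        # findtext returns the default when an element is missing.
        first = entry.findtext("name/firstname", default="")
        last = entry.findtext("name/surname", default="")
        names.append(f"{first} {last}".strip())
    return names


print(full_names(FFML))
```

If the names come out as expected, the grouping is doing its job: each piece of data can be reached by a short, predictable path.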

Stage Five - Rest

And there you have it, our first draft of a language! The best thing you can do now is to put it aside somewhere and come back to it in just under a week’s time to check it over and potentially spot anything you need to add, or see if you can move some things into another subset.

Next time: We will look at developing a DTD for our language so that it can be validated!

3 Reasons We Should All Start Marking Up Files Semantically

Monday, July 23rd, 2007

I love semantics! Now, here’s a question: how did I just say that? Was I being sarcastic? Joyful? Was I shouting? All will be revealed as you read the three reasons we should all start marking up files semantically.

  1. Better Computer Understanding - Imagine a README file with sections marked up to relate to different software functions. Combine that with software that has been programmed to read the file and follow user actions, and you could have a very powerful user interface.
  2. Easier Search - Computers love structure, so the more structured a file is, the easier it is for a computer to understand. Context and meaning can be applied, so the document makes more sense and is therefore easier to search.
  3. Accessibility - Ever tried to sound sarcastic over the internet? It is impossible to do through plain text. However, if I wrapped my text in “<sarcasm>” tags the context becomes apparent. Now obviously tags in communicative text become messy, so we would reserve the tags for audio output or display the text in some other way, e.g. bolded for a strong emphasis.

So, as you can see, I was probably joyfully shouting “I love semantics!”, and as you can hopefully now see, marking up works best.