The factors that led them to choose IE..

Posted over 9 years back at Ryan Tomayko's Writings

News.com.com.com is reporting that Firefox is gaining on IE faster than expected. Amsterdam-based OneStat.com has IE’s market share as low as 88.9%. I can’t help but wonder if those guys didn’t hit the hookah a few too many times before running the numbers. The Mozilla Organization has been saying that they hope to have 10% of the market by the end of 2005. You could project that they might reach that by the end of 2004 if these OneStat stats are accurate, which they most probably are not.

Sigh

At any rate, check out this gem from Microsoft’s director of product management for Windows, Gary Schare (pronounced Gair-ee Share-ee ;)

“I still believe in the end that most users will decide that IE is the best choice when they take into account all the factors that led them to choose IE in the first place,” Schare said. “Meanwhile, we’re happy that they’re primarily (using Firefox) on Windows, and that Firefox is part of the large ecosystem of software products available on the Windows platform.”

The “factors” he references are covered briefly here, while more on the “Windows ecosystem” he mentions can be found here.


Adam Bosworth, Sloppy KISSes, and WS-Mess

Posted over 9 years back at Ryan Tomayko's Writings

About two months ago, I linked to a tiny little paragraph Adam Bosworth wrote at the end of a completely unrelated weblog entry, where he mentions that he had been trying to justify all of the WS-Complexity when simple XML over HTTP works so well. People have been proposing that simple XML over HTTP hits the 80/20 for a while and it’s beginning to catch on, but today might have been a watershed event for the Loyal WS-Opposition. Adam evidently thought about this stuff really hard over the past two months and has just published the transcript of a brilliant talk he gave at ISCOC04 where he emphasizes simplicity and organicness over complexity and cathedral building in the Web Services space. Herewith some notes and speculation on What It All Might Mean.

What makes this talk so special?

This talk is about this conflict as it relates to computing on the Internet. This talk is also a polemic in support of KISS. As such it is unfair, opinionated, and perhaps even unconscionable. Indeed, at times it will verge on a jeremiad.

Well, for starters, Adam is a complete bad-ass as is obvious by his use of words like jeremiad, which turns out to mean exactly the kind of thing bad-asses talk about all the time:

jer-e-mi-ad : A literary work or speech expressing a bitter lament or a righteous prophecy of doom.

But seriously, this eWeek article from July 2004 talks about Bosworth leaving his Chief Architect/SVP of development post at BEA for Google and gives some history behind Bosworth’s other adventures in technology. He’s been involved in, and is often given credit for, the success of many applications and technological achievements over the past decade or so.

The other reason this is an important event for the REST people, and the KISS/YAGNI people in general, is because Bosworth worked primarily on WS technology when he was at BEA. So not only is he a really smart guy in general but his really smart brain has been cranking away on concepts surrounding Web Services for the past couple of years. And now he just casually plops the following out on his weblog:

On the one hand we have RSS 2.0 or Atom. The documents that are based on these formats are growing like a bay weed. Nobody really cares which one is used because they are largely interoperable. Both are essentially lists of links to content with interesting associated metadata. Both enable a model for capturing reputation, filtering, stand-off annotation, and so on. There was an abortive attempt to impose a rich abstract analytic formality on this community under the aegis of RDF and RSS 1.0. It failed. It failed because it was really too abstract, too formal, and altogether too hard to be useful to the shock troops just trying to get the job done. Instead RSS 2.0 and Atom have prevailed and are used these days to put together talk shows and play lists (podcasting) photo albums (Flickr), schedules for events, lists of interesting content, news, shopping specials, and so on. There is a killer app for it, Blogreaders/RSS Viewers. Anyone can play. It is becoming the easy sloppy lingua franca by which information flows over the web. As it flows, it is filtered, aggregated, extended, and even converted, like water flowing from streams to rivers down to great estuaries. It is something one can get directly using a URL over HTTP. It takes one line of code in most languages to fetch it. It is a world that Google and Yahoo are happily adjusting to, as media centric, as malleable, as flexible and chaotic, and as simple and consumer-focused as they are.

On the other hand we have the world of SOAP and WSDL and XML SCHEMA and WS_ROUTING and WS_POLICY and WS_SECURITY and WS_EVENTING and WS_ADDRESSING and WS_RELIABLEMESSAGING and attempts to formalize rich conversation models. Each spec is thicker and far more complex than the initial XML one. It is a world with which the IT departments of the corporations are profoundly comfortable. It appears to represent ironclad control. It appears to be auditable. It appears to be controllable. If the world of RSS is streams and rivers and estuaries, laden with silt picked up along the way, this is a world of Locks, Concrete Channels, Dams and Pure Water Filters. It is a world for experts, arcane, complex, and esoteric. The code written to process these messages is so early bound that it is precompiled from the WSDL’s and, as many have found, when it doesn’t work, no human can figure out why. The difference between HTTP, with its small number of simple verbs, and this world with its innumerable layers which must be composed together in Byzantine complexity cannot be overstated. It is, in short, a world only IBM and MSFT could love. And they do.

What does that mean? I mean, other than the obvious things he’s saying like simple is better than complex. What did he just say in that last sentence? Did he just say that IBM and Microsoft, the two biggest contributors to WS-Madness, stand to gain significantly from making things require complex toolkits as well as certified experts? I think he did.

In fact, that’s what really pisses me off more than anything about the whole WS-Situation. I've never really been able to put my finger on it but I think that he just nailed it for me. When the very first SOAP specs were being published five or six years ago, SOAP was extremely simple and light-weight and was more of a concept than a specification. It was all about “hey, why don’t you expose that customer record as XML over HTTP and then I don’t need access to your database and we won’t have to mess with CORBA and,” pause/think “well, shit! If we slap some SSL on that pipe we could even do this over the public internet.” At some point, the wrong people got involved and turned these simple ideas into another piece of massive complexity and it became a tool for vendor lock-in.

Real quick, I want to make sure I'm not giving the impression that Bosworth was some kind of WS-Nazi and suddenly saw the light in the REST architectural style, joined Google and is now working off the evil points he earned at Microsoft to meet some kind of not-evil quota required by Google; that’s not the case. In fact, I believe he was one of the first people to really champion loosely coupled, late bound, message-based SOAP web services as opposed to tightly coupled, early bound, RPC style web services. But today was the first time I've seen him go so far as to state publicly that the WS stack probably isn’t going to work in a large number of scenarios.

I think we just need to get more enterprise developers hanging out on the public web and seeing what kind of things are possible with a simple set of semi-standard protocols and formats. Bosworth had to leave BEA (enterprise) for Google (web) before he could recognize and think objectively about the value of simple concepts like REST and loosely specified XML messages.

Splice

Posted over 9 years back at Ryan Tomayko's Writings

I've been working on a weblog/micro-content management system with what I believe are some unique qualities. I've wanted to write about some of the approaches I've taken and how they are (and are not) working out but feel I should provide some kind of context for my ramblings. So I'm going to try to summarize the main aspects of the system real quick so I can start digging in to the more specific stuff.

I've settled on the name “Splice” because I think it has a nice ring to it and is a pretty good one-word description of a major goal of the platform. Nothing has been officially released yet but I have allocated a project on sourceforge.net and plan on bombing the existing code up there within the next week. An initial 0.1 release should follow after about a month or so, assuming I can keep my current pace.

So without further ado, here’s a quick breakdown of the planned features of Splice…

Licensed under the GPL

I'm a firm believer in viral Free Software when I'm not being compensated for development by a corporation, university, government body, or organized-crime division. I also like a lot of the side-effects that resulted from WordPress being GPL, such as contributed plugins, themes, and templates being licensed liberally.

Written in Python

This is actually my third attempt at creating a Python based publishing platform and I think most of the doing-things-the-wrong-way stuff is behind me. There are a few promising Python weblog tools out there today, and I've toyed with most of them. I used pyblosxom for a while but am not particularly attracted to the blosxom direct file-system storage model. There’s a myriad of Zope based tools but Zope is a little too heavy and high-level for me in most cases. That said, I am looking at using some of the individual pieces of Zope such as ZODB and the TAL templating language.

Database backed

I find the MoveableType and WordPress models of having a backing database to be superior to the blosxom files-and-directories model for many reasons. I still have not decided whether going the ZODB/Durus route might be beneficial but generally speaking, I think having managed storage is necessary for some of the things I'd like to attempt with the system; not the least of which has to do with the next item – simple tagging, which doesn’t fit well with the flat files and directories model.

Simple Tagging

I'm convinced that the simple tagging model used by del.icio.us, and more recently Flickr, is an exceptional development for classifying various types of content. I want to milk simple tags for all they’re worth and plan on building quite a bit of functionality with the assumption that simple tags are attached to content. This was a major motivating factor in deciding to take on the project. Attempts at bolting simple tags onto existing publishing platforms don’t seem to be faring well. I think this is something that needs to be worked in at a fundamental level so that other functionality can be based off of it.

I plan on writing a lot about these concepts in future posts.
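To make the simple tagging idea a little more concrete, here is a minimal sketch of the kind of index I have in mind. Nothing here is actual Splice code: the `TagIndex` class, its method names, and the post ids are all made up for illustration, and a real version would presumably sit on top of the database rather than in memory.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical in-memory index mapping simple tags to post ids."""

    def __init__(self):
        self._posts_by_tag = defaultdict(set)

    def tag(self, post_id, *tags):
        # Tags are flat, case-insensitive strings; there is no hierarchy.
        for name in tags:
            self._posts_by_tag[name.strip().lower()].add(post_id)

    def posts_with(self, *tags):
        # Return the ids of posts that carry every one of the given tags.
        sets = [self._posts_by_tag.get(t.strip().lower(), set()) for t in tags]
        return set.intersection(*sets) if sets else set()

    def related_tags(self, tag):
        # Tags that appear together with `tag` on at least one post.
        key = tag.strip().lower()
        posts = self._posts_by_tag.get(key, set())
        return {
            other
            for other, ids in self._posts_by_tag.items()
            if other != key and ids & posts
        }


# Made-up posts and tags, purely for illustration.
index = TagIndex()
index.tag("splice-intro", "python", "weblog", "gpl")
index.tag("ws-mess", "rest", "soap", "weblog")
index.tag("weapons", "python", "java")
```

Once tags are a first-class part of the storage model, things like "everything tagged python and weblog" or "tags related to python" fall out as simple set operations, which is roughly why I think bolting them on after the fact is the wrong approach.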

Consuming and Exposing Content

I think there is considerable value in having strong integration services in a content management platform. There are services popping up all over the web for managing various pieces of content and I think the weblog is becoming less of an authoring environment and more of a central aggregation point for that content.

For example, del.icio.us provides bookmarking/linklogging services and exposes a REST based API for pulling/posting link content. Many people are integrating their delicious links into their weblog these days. Bloglines provides a web based news-reading environment and exposes your blogroll through a public REST based API. And finally, just about everything else you manage is exposed via the most successful web service of all: RSS. I'd like to provide a framework for plugins that need to synchronize content to and from external services.

Along similar lines, I'd also like to blow out the syndication concept a bit and expose not only feeds but XML representations of each post. I hope to implement Atom format and protocol support. The general idea being that exposing as much as possible through some machine readable format is A Good Thing.
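As a rough illustration of what "an XML representation of each post" could look like, here is a small sketch using Python's standard `xml.etree.ElementTree` module. The element names (`post`, `title`, `published`, `tags`, `content`) are a hypothetical schema invented for this example; it is not Atom and it is not what Splice actually emits.

```python
import xml.etree.ElementTree as ET


def post_to_xml(post):
    """Serialize a post dict to a small XML document (hypothetical schema)."""
    root = ET.Element("post", id=post["id"])
    ET.SubElement(root, "title").text = post["title"]
    ET.SubElement(root, "published").text = post["published"]
    tags = ET.SubElement(root, "tags")
    for name in post["tags"]:
        ET.SubElement(tags, "tag").text = name
    ET.SubElement(root, "content").text = post["body"]
    return ET.tostring(root, encoding="unicode")


def xml_to_post(document):
    """Parse the same representation back into a dict."""
    root = ET.fromstring(document)
    return {
        "id": root.get("id"),
        "title": root.findtext("title"),
        "published": root.findtext("published"),
        "tags": [t.text for t in root.iter("tag")],
        "body": root.findtext("content"),
    }


# Placeholder values; the id and date are not real.
example = {
    "id": "example-post",
    "title": "An example post",
    "published": "2004-01-01",
    "tags": ["python", "weblog"],
    "body": "Plain text body of the post.",
}
document = post_to_xml(example)
```

The point of the round trip is that anything able to speak HTTP and parse XML can pull a post out of the system and push it somewhere else, without knowing anything about how it is stored.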


I think that covers most of the distinguishing characteristics of the system. I guess I should also mention that I plan on implementing all of the standard weblog functionality such as comments, trackback/pingback, feed auto-discovery, etc.

More to come soon. Thanks for reading and stay tuned..

Java and Open Source

Posted over 9 years back at Ryan Tomayko's Writings

Mark Stone has an article on Newsforge entitled Java and open source where he breaks down the reasons businesses choose to embrace open source licenses, why Sun has not with Java, and why Java developers/vendors shouldn’t care. Herewith some ramblings in support of Mark’s business oriented look at OSS and an issue with his conclusion.

What got me excited about this article was that, throughout most of the text, Mark is talking about issues surrounding open source purely from a business perspective. There’s not enough of that happening in the community. The sooner we work out the business value OSS licenses provide, the sooner OSS will become an accepted and ubiquitous concept in business. Duh, right?

I've met a handful of people that I like to call “business hackers”. They are business people but they are, paradoxically, also complete bad-asses in that they use heavy math, reasoning, and analytical skills to make cold, smart business decisions. If I were to tell one of these guys, “hey, if you license your wares under the GPL, you can build a community and provide customers with their natural liberties,” they would look at me with blank stares and wonder wtf I was rambling on about. If I were to instead say something like, “Well, if you’re really worried about market penetration, you may want to look at the GPL license due to the X, Y, and Z qualities of the license,” they would be much more receptive. Information like that flows nicely into their existing worldview.

I'm going to sidebar into a bad analogy here because I think this point is extremely important. A friend of mine approached me about a year ago and asked if I would be interested in a little business venture. He said that he was considering purchasing a “California race horse” to race against the inferior Ohio race horses at the local tracks. He went on about how the Cali. horses were bred, trained, and cared for much better than the Ohio horses. It was an interesting crash course in horse rearing.

Now, there are some things I care about but horses aren’t among them. When it comes to putting money on the table, I'm not interested in the damn horse's bloodline. I want to hear about the costs of purchasing, maintaining, and running the horse and how they compare to the horse's earnings. This is the way a lot of business people are seeing OSS. We’re trying to sell ethics and process when the business interest is in results.

Anyway, back to the article. I found Mark’s business model breakdown to be right on. He makes the claim that companies lacking an ethical attachment to OSS generally choose OSS licenses for one of the following business reasons:

  • Market growth
  • Market penetration
  • Market preemption

That is, freely distributable software has a natural advantage over non-freely distributable software in these areas. More companies are accepting this as simple truth and are incorporating that truth into their decision making processes around how new software should be licensed. For example, when market growth is more important than revenues, you may want to consider a license that includes free modification and distribution. It just makes sense; no religion attached.

After Mark takes a look at what is driving Sun’s decisions around Java and how OSS doesn’t really benefit them, he comes to the following conclusion:

Businesses and developers who fret about whether or not Java is or will become open source are missing the point. The free availability and near ubiquity of Java in the enterprise software market means that the open source software being created with Java is much more interesting than the open source status of Java.

I have one issue with this. The main problem with Java not being open source, IMO, is not so much that the source isn’t available/modifiable (which also sucks) but that the package from Sun cannot be freely redistributed. This really hurts taintless GNU/Linux distributions like Fedora and Debian because they cannot provide a Java package with the core distro. If they cannot provide a Java package with the core distro, it is impossible to provide any packages that rely on Java (i.e. “software being created with Java”) as part of the core distro. Until very recently, this meant that writing an app in Java meant that app had no chance of being included with a Free GNU/Linux distribution. The open source software projects created with Java that Mark talks about are at a tremendous disadvantage to projects using C/C++, Perl, Python, Lisp, bash, and any other language with a free compiler that can be packed into a standard distribution.

Fedora started taking Java packages into the core late in the Fedora 2 development cycle. The only reason they were able to do this is because of the excellent gcj and GNU Classpath projects reaching semi-stability. These projects provide a Java compiler/interpreter and an implementation of most of the standard runtime. Both are completely Free – licensed under the GPL (with slight modifications in the case of Classpath). And now that an OSS compiler/runtime exists, there’s a whole boat-load of packages over at jpackage.org that Fedora packagers are scurrying to fit into the core, which means that Java developers will be able to target GNU/Linux environments and know that there are a ton of libraries already available and tested. None of this is possible without a freely redistributable Java compiler and runtime.

Got a gun

Posted over 9 years back at Ryan Tomayko's Writings

I was listening to Pearl Jam’s Vs. album today when the song “Glorified G” came on. I always liked the song but I never thought it had any real meaning. It was eerie how relevant these lyrics feel today:

Got a gun, fact I got two
That’s o.k. man, cuz I love god

That pretty much sums up my narrow understanding of the red state mentality.

Weapons and Coding

Posted over 9 years back at Ryan Tomayko's Writings

My kid brother, Private Jesse D. Fronk, recently joined the US Marine Corps and completed combat training. This is where a bunch of 18-year-old kids spend six weeks shredding moving and stationary targets using various projectile, mounted, and hand propelled weaponry including grenades, grenade launchers, hand guns, rifles, and machine guns. He talked a lot about the SAW (big/sometimes-mounted machine gun) and the grenade launcher but when I asked which weapon he would prefer if he were to find himself in a hostile situation where he was unsure of what kind of crap to expect, he replied, “The M16 rifle – hands down.”

I thought this was a bit odd after he had just gone on about the pure bliss in blowing things up with the grenade launcher but the reason is simple really: the M16 is the most efficient and productive general purpose tool for producing casualties (or so believes the US military) on the planet. It is light, quick, powerful, reliable, and cheap. However, it is not the only weapon and it would be absurd to suggest that a single weapon could outperform the specialized ones at specific tasks.

Coding is a lot like killing people, insofar as choosing the tool is a bit of a science in itself. In terms of productivity, reliability, maintainability, and performance, the language you choose will likely have more impact than any other decision you make, so it is important that every developer have a good understanding of what languages are available to them and their relative merits. Most coders have a language that plays the same role as the M16 plays for my brother. Unfortunately, most also have strong religious attachments to this One True Language and this is bad because religious attachments to things like programming languages usually result in a lack of objectivity. It’s important that we place ourselves in a position where we are able to constantly weigh our languages and practices against what else is out there.

When Java came on the scene, the industry adopted it very quickly, religious attachments aside. The language and standard library were a huge improvement on what most had been exposed to at the time. The fact that so many saw Java as a better general-purpose language and moved quickly to adopt it is not surprising in the least – it’s exactly what you would expect to happen. Something better came along and you were faced with the choice of either jumping all over it or becoming irrelevant.

The benefits of Java came with some downside though, and there was a lot of nay-saying going on. You couldn’t build drivers that talked to hardware with Java like you could in C, you couldn’t build Windows apps as quickly as you could with VB, you didn’t have high performance templating features like C++, you didn’t have the God given liberty of performing pointer arithmetic, etc. But for the most common tasks, let’s say 80%-90% of the code that needed to be written, Java hit the sweet spot and so became the M16 of enterprise development. Of course, Microsoft followed that up with .NET and gave Java some good competition in the statically-typed/interpreted space. So much so that recent numbers seem to be showing something close to a 50/50 split in the market. But even with all the adoption of byte-code languages you still had 10%-20% of code that needed a lower level language like C.

The adoption rate of Java and .NET provides a pretty good baseline for what you should expect from a sweet spot language in a functioning technical community. That is, when something comes along that does 80% of what you’re doing today better than what you’re using today, you need to switch. But something strange has happened over the past couple of years that, I believe, has shown that sweet spot languages aren’t guaranteed to overthrow their predecessors just because they can do 80% of their chores better. If 80/20 were an absolute law, dynamic languages like Python, Ruby, and most recently Groovy would be eating away a far greater number of lines of Java and .NET code right now.

The case for dynamic languages is, in my opinion, not something we can look down on from our static-typing ivory towers and laugh at. The amount of code required to perform almost any task in Java can be reduced three-fold in any of the dynamic languages I mentioned. Combine this with the fact that unit testing has reached massive adoption and you end up with an interesting situation: using a dynamic language means that you can write the main code and the unit tests with less code than it takes to write just the main code in a statically typed language. Now combine this again with the fact that the dynamic languages I mentioned are, contrary to popular belief, strongly typed, and that with unit tests to cover the code, you get most of the type checking benefits of a strong/statically typed language plus the added benefits of unit tests and I think it’s hard to argue that we’re not looking at a new M16.
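Here's a small sketch of the "strongly typed plus unit tests" point, in Python. The `total_length` function and the test names are made up for illustration; the only things it relies on are the standard `unittest` module and the fact that Python raises `TypeError` instead of silently coercing mismatched types.

```python
import unittest


def total_length(words):
    """Sum the lengths of the strings in `words`."""
    return sum(len(word) for word in words)


class TotalLengthTest(unittest.TestCase):
    def test_sums_lengths(self):
        self.assertEqual(total_length(["static", "dynamic"]), 13)

    def test_rejects_wrong_types(self):
        # Dynamically but strongly typed: an int has no length and is
        # not quietly converted to a string, so this fails loudly.
        with self.assertRaises(TypeError):
            total_length(["static", 42])

    def test_no_implicit_coercion(self):
        # Mixing a str and an int is an error, not a guess.
        with self.assertRaises(TypeError):
            "1" + 1


def run_tests():
    # Load and run the test case programmatically, returning the result.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TotalLengthTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The function and its tests together are still only a couple dozen lines, and the tests catch the kind of type mistake a compiler would have caught, just at test time instead of compile time.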

This isn’t to say that we get rid of statically typed languages altogether – they are still extremely relevant and I believe static and dynamic languages should be used together. In fact, this is where most arguments for dynamic languages make a huge mistake in my opinion. I've been convinced that trading off compile-time type checking for less code, more tests, and greater flexibility is always the right decision. Always! But I am not convinced you can make the same tradeoff for some of the other benefits of statically typed languages.

Static typing is useful for more than just compile-time type checking and you lose a lot of these values when you move to a dynamic language. One benefit that should immediately come to mind is performance. Static typing will always provide speed improvements over dynamic typing. Here’s another one that I think is overlooked and that I've only come to really value after working with Python for a while: static typing is documentation. In fact, I'm going to say that this is the primary benefit of static typing. I really like being able to go look at a method signature and determine exactly what needs to be passed. Not only that but the tools are able to use this information as well to do all kinds of nifty stuff like provide code-completion, generate API documentation (Javadoc), or even automatically spit out WSDL to turn a Java interface into a SOAP based service. These are things you just cannot do reliably in a dynamic language like Python and it’s one of the very few things I miss from Java.

I think we need to start looking at the things we build a little more closely. OOP taught us to think about everything as a set of objects aligned on a single plane but I'm finding that the really strong OOP characteristics start fading as you move up out of reusable-component land and into real application land. It seems like we should be able to draw a line between reusable object components/libraries and the glue code that puts the reusable objects together to make them do something useful.

Static and dynamic languages seem to fit these two planes nicely. Reusable object models generally provide lower level functionality. It’s important for these to have well defined interfaces, strong documentation, and this is also usually the most effective place to be thinking about optimization. I think using static languages at this level makes a lot of sense. Now move up into the messy glue code area where a real application starts to take shape. Things like optimization become less of an issue, and code in this plane is rarely reused. Further, it’s important that we’re able to try a lot of different variations with frequent refactoring. Dynamic languages make this type of work much easier.

And now a prediction: the first environment to successfully mesh static and dynamic languages into a coherent platform will win the interpreted byte-code market. Right now I see two-and-a-half contenders.

Sun has the best situation but they are stupid. Groovy and Java are a perfect match. But unless Sun takes Groovy into the core and makes it an official sibling of Java, it’s just not going to have the uptake it needs to be successful. Groovy has to be everywhere javac is. The only thing I can speculate here is that Sun went through so much trouble to ensure that Microsoft didn’t hijack Java that they are now having a hard time justifying a variation like Groovy. Or it may be because Gosling is holding onto the belief that Java is just C++ for incompetent ass-hats. Another language would just send the average Java developer into a coma. Sad.

Sun:

  1. Stop what you're doing.
  2. Call Bob.
  3. Pay him and his minions whatever they want.
  4. Push your little JSR through for Groovy.
  5. Profit!

Microsoft is in a worse situation but they are smart (and evil). Case in point (and tragically hysterical too). First, Jim Hugunin writes an entire implementation of Python in Java that runs on a JVM and can even compile Python down to Java byte-code. He does this in his spare time, to make a point, I suppose. Next, Sun doesn’t notice. Seven years later, Jim releases IronPython, an entire implementation of Python in .NET that runs on the .NET framework and can even compile Python down to .NET CLR byte-code. The next day, Microsoft hires Jim on as a core CLR hacker to improve support for dynamic languages on the CLR.

And last but not least, the dark horse in all of this: The Mono Project. Here’s why I think these guys might be in the best position of all: first, they aren’t stupid. Second, they aren’t evil. Last, they don’t have 50 other languages complicating things. I have a feeling that Python might get lost next to the other languages supported with .NET. In my opinion, you need to take the best static language and the best dynamic language, mash them together and get rid of the rest of the cruft. Mono is in exactly this position with C# and Python. But they have other issues, not the least of which is that they have a hard sell in trying to convince the GNOME hackers to adopt technology spewed forth from Redmond. It’s hard to blame them with Microsoft’s recent patent activity.

I have a feeling I’ll be sticking with good ol' CPython backed by good ol' C code for the next year or so but this will be fun to watch… maybe.

Web Antipatterns

Posted over 9 years back at Ryan Tomayko's Writings

Check out this Review of Firefox at Net Gazette. It looks like a good old fashioned site at first but look closer. There is no actual text! The review consists completely of images… images of text. Each paragraph is a separate image arranged into tables for layout. What’s even more interesting is how they, uh, “implemented” links. Each image-of-text has an image-map attached to it defining each link’s coordinates. I realize we need all the good press we can get but this site championing your browser is a bit like having the Ku Klux Klan back your presidential candidate.

I thought this was a really great example of a web architecture antipattern and it reminded me of something I wanted to point people to. The w3c recently put out the Architecture of the World Wide Web as a Proposed Recommendation. I really cannot express how much I like this document. It’s very different from the recent slew of crap that has been flowing from the w3c. It isn’t a specification at all, really, but rather a look at what the web got right. Sadly, using images-of-text-wrapped-in-tables isn’t covered.

All content that comes to the web goes through a sucky phase before reaching true Web Zen. Reaching this elevated state requires answering the question, “What is it about the web medium that improves on existing forms of content delivery?” and then seizing on those improvements.

The benefits for most types of textual content are well known. That’s why sites like the Net Gazette are so baffling. Bringing your rag to the web means that the text can be indexed by search engines. I can select some text, copy it, and paste it into an email to a friend. Innovative services like Technorati can do amazing stuff by drawing associations based on linking characteristics. And the list goes on… However, when content first comes to the web, it takes a while for the providers to realize these benefits and so you sometimes see glitches in the system like the Net Gazette.

Publishers aren’t the only ones that have this problem. A large portion of the technical community still considers anything that can be had through a web browser to be web-based. This misconception has led to a mess of “web apps” built using technologies like Active X, Java Applets, and Flash to create pseudo web-apps: apps that run in a browser but provide none of the real benefits of being part of the web. You don’t see a lot of these floating around on the public web but they are rampant in the enterprise/intranet space (which, btw, can benefit from web architecture just as much as the public web). “Web-based” means taking advantage of the web’s core architecture: URLs, open (and somewhat standard) data formats, and HTTP. The browser is just one tool for tapping into the ether spawned by this architecture.

It’s fun to point at the Net Gazette and chuckle at them for Not Getting It when we have so many cases of content providers that do get it. But the Net Gazette isn’t alone and I want to use this occasion to make a bigger point. The big multimedia content providers (read: Networks) still don’t understand the differentiating features that make the web an attractive medium for content delivery. They are trying to emulate TV and radio over the web. This is a mistake. The web isn’t a TV and it isn’t a radio and it isn’t a paper-book either. Trying to force it to be is like trying to cook filet mignon in the microwave.

I wonder how it’s possible that the news sites, radio stations, and book publishers have been doing this web thing for years now and have yet to ask the most basic question: “Why?” Why put audio/video/books on the web? The TV is a far superior device for delivering video the way it’s delivered today. The radio is far superior for delivering audio the way it’s delivered today. The book is far superior for delivering story the way it’s delivered today. What does the web bring to the table? Well, not much at the moment. Today, you can go watch a clip of some news when you feel like it instead of when it’s time-slotted. So, we've found that time-shifted media is one benefit of the web. But what of linking, excerpting, copying, and indexing? The multimedia content providers have erected barriers to making these things possible and I'm not sure why (actually, I do know why in cases like copying and it has to do with lawyers).

The really big erection right now is “Streaming”. So what of streaming? Streaming is one of the shittiest ideas ever. In almost all cases just plopping an MP3 or MPEG file out on a server somewhere is the better solution. Streaming is all about trying to turn the web into a TV or radio. The only reason you should be streaming is if you are pushing out live content. Otherwise, it doesn’t buy you anything and acts as a barrier to multimedia becoming a normal part of the web. Jon Udell has covered this for O'Reilly in Prime-Time Hypermedia and Marrying Hypertext and Hypermedia. He shows how audio and video could, with a few small tweaks, become proper citizens of the web.

People are starting to get it. There’s been a lot of chatter about Podcasting lately. This is a really simple concept that is being touted as, among other things, revolutionary. And it is revolutionary! But not for the reasons we usually call things revolutionary in IT, not because it’s using some really cool new buzzword technologies that are more complex than what we have today. It’s revolutionary because it takes advantage of the simple ideas the web is built on and uses them to provide distinct features that you cannot get with radios and TVs today. Podcasting is an important step because it represents people accepting the web’s natural advantages and building solutions for them instead of in spite of them.

I also want to dump on Ebooks for a second. I'm so sick of hearing how Ebooks are a tremendous failure. I mean, they are a failure, but not because digitizing books is a bad idea. The publishers just haven’t asked “Why?” yet. I’ll defer to Cory Doctorow to make my point here, as he does it so well in The Microsoft Research DRM Talk:

New media don’t succeed because they’re like the old media, only better: they succeed because they’re worse than the old media at the stuff the old media is good at, and better at the stuff the old media are bad at. Books are good at being paperwhite, high-resolution, low-infrastructure, cheap and disposable. Ebooks are good at being everywhere in the world at the same time for free in a form that is so malleable that you can just pastebomb it into your IM session or turn it into a page-a-day mailing list.

Doctorow gets the web.

Dynamic Superclassing in Python

Posted over 9 years back at Ryan Tomayko's Writings

What you're getting into:

Mucking with builtins is fun the way huffing dry erase markers is fun. Things are very pretty at first, but eventually the brain cell lossage will more than outweigh that cheap thrill.

Barry Warsaw, 23 Mar 2000

Now, let's muck with some builtins.

How should a Python library provide for extensions? I'm working on a little system that has a nice object model. I would like to make it possible for people to extend the base objects with custom methods for doing whatever weird stuff people like to do.

Let us have a module, biz.py, with the following class definition:

class A:
  def __init__(self):
    self.x = 5
    self.y = 10

  def foo(self):
    print self.x

Now, let’s say we want to add a special bar method that would be kind of like the foo method but would print x * 3.14. We want to be able to do this from outside the original biz module; let’s say nuge.py:

from biz import A
def bar(self):
  print self.x * 3.14
A.bar = bar

Let’s give it a try:

>>> import biz, nuge
>>> a = biz.A()
>>> a.foo()
5
>>> a.bar()
15.7

Note that we never actually use anything from nuge but we need to import it so that it can molest biz.A. Remember that module level statements are executed when the module is imported.

It’s also kind of interesting that you can modify the base classes of a class at any time. Instead of adding a single method, it’s possible to add an entire class (or set of classes) into the bases chain, effectively grafting two (or more) classes together. Or, more precisely, dynamically superclassing A with B.

Let us redefine nuge.py as follows:

from biz import A

class B:
  def bar(self):
    print self.x * 3.14

  def baz(self):
    print self.x ** 2

# this magic moment..
A.__bases__ = (B,)

This is equivalent to specifying B as a superclass when we define A:

class A(B):
  ...

The advantage is that we can do this at runtime without modifying A’s source. The effect is that B becomes a superclass of A and B’s methods are thus available on all A instances:

>>> import biz, nuge
>>> a = biz.A()
>>> a.foo()
5
>>> a.bar()
15.7
>>> a.baz()
25
>>> isinstance(a, nuge.B)
True

I should note that assigning the tuple (B,) to A.__bases__ overwrites the original, declared superclass(es). It is much wiser to combine the original __bases__ value with the new class as follows:

A.__bases__ += (B,)

This appends B to the existing set of bases instead of just destroying them.
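For anyone trying this on Python 3, where the snippets in this post do not run unchanged (print is a function and every class is new-style), here is a minimal sketch of the same grafting trick. It rests on one assumption worth stating loudly: as far as I know, CPython refuses to reassign __bases__ on a class whose only base is object, so the sketch gives A a placeholder base first. The name Extensible is hypothetical, and the methods return values rather than print them so the results are easy to check.

```python
class Extensible:
    """Hypothetical placeholder base, so A does not inherit directly from object."""


class A(Extensible):
    def __init__(self):
        self.x = 5

    def foo(self):
        return self.x


class B:
    def bar(self):
        return self.x * 3.14

    def baz(self):
        return self.x ** 2


# Append B to the existing bases instead of overwriting them.
A.__bases__ += (B,)

a = A()
print(a.foo(), a.baz(), isinstance(a, B))
```

The existing base stays first in the tuple, so anything A already inherited keeps precedence over whatever B brings along.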

Some Code

Here’s the code I used to figure this crap out. Note that you will never find documentation that tells you all this; you have to play around.

class A:
  def __init__(self):
    self.x = 5

  def foo(self):
    print self.x

def bling(self):
  print self.x - self.x

# attach bling to A; without this, a.bling() raises AttributeError
A.bling = bling

class B:
  def bar(self):
    print self.x * 3.14

  def baz(self):
    print self.x ** 2

A.__bases__ += (B,)

a = A()
a.foo()
a.bling()
a.bar()
a.baz()
print isinstance(a,B)

Notes

  1. This isn’t a general purpose prototyping feature of Python. Most core Python classes and types do not allow assignment to __bases__.

  2. This should never, ever even be considered when you can use the pure subclassing mechanisms of Python.

  3. And even when you think you need to apply this technique, it is probably better to not if there is some other way. For instance, you can very easily and cleanly write functions that take the instance as a parameter.
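The plain-function alternative from the last note might look like the sketch below. It is written for Python 3, the class A here is only a stand-in for biz.A, and the function names bar and baz are carried over from the examples above purely for illustration.

```python
class A:
    """Stand-in for biz.A; nothing about the class is modified."""

    def __init__(self):
        self.x = 5


# Ordinary module-level functions that take the instance as a parameter.
def bar(a):
    return a.x * 3.14


def baz(a):
    return a.x ** 2


a = A()
print(baz(a))
```

Callers write baz(a) instead of a.baz(), which is a small price for leaving the class hierarchy alone.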

Getting Rid of the Summary Field

Posted over 9 years back at Ryan Tomayko's Writings

I've found that the number of fields on a weblog post form is inversely proportional to how often I post. Given that, I've been searching for ways of getting rid of any fields that fall outside of the realm of absolutely necessary. One such field is the Summary field.

Movable Type was the first weblog I know of that provided separate fields for a post's Summary, Content, and Excerpt. They intentionally left the meaning of these fields vague because they were really just separate slots in the database and you could use them any way you pleased. You just needed to incorporate your definition into your template.

I thought this was a good idea initially but now I've come to dread the summary field. The truth is I'm never really sure what to put in there:

Should I excerpt something from the post? If so, how much? Maybe I'll just throw a two sentence description in there or whatever. Maybe I don't need a summary for this post since it's only a paragraph long.. Damn, then my index pages will look weird. Maybe I just won't post this because I can't figure out what I should put in this God-forsaken-summary-field. Wait.. which weblog is this? How are we using summaries? Maybe I'm supposed to use the excerpt field instead...

Alright, this is already enough auxiliary thinking for a week's worth of posts. We need this to stop. I decided to forsake a bit of control over what gets put in a summary and let the code figure it out. I've allowed for four levels of detail on index pages:

Full
The entire content of the post is extracted and displayed on the index page.
Excerpt
Extract an excerpt from the full content. The excerpt is determined by first looking for an <excerpt> tag and, if that's not found, grabbing the first paragraph of the entry. This seems like it hits a nice 80/20 for me. I usually just want to pull out the first paragraph as an excerpt.
Brief
The first sentence of the excerpt is extracted from the content of the post.
None
Nothing at all is extracted from the content of the post.
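
Roughly, the four levels could be implemented like the sketch below (Python 3). This is not the code running this weblog: the function names, the blank-line paragraph splitting, and the naive rule for where a sentence ends are all assumptions made for illustration.

```python
import re

# Assumed markup: an optional <excerpt>...</excerpt> element inside the post body.
EXCERPT_RE = re.compile(r"<excerpt>(.*?)</excerpt>", re.DOTALL)


def excerpt(content):
    """Return the tagged excerpt if present, else the first paragraph."""
    match = EXCERPT_RE.search(content)
    if match:
        return match.group(1).strip()
    # Assumption: paragraphs are separated by blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", content) if p.strip()]
    return paragraphs[0] if paragraphs else ""


def first_sentence(text):
    """Naive sentence split: up to the first '.', '!' or '?' followed by whitespace or the end."""
    match = re.search(r"(.+?[.!?])(\s|$)", text, re.DOTALL)
    return match.group(1) if match else text


def summarize(content, level):
    """Map a level name ('full', 'excerpt', 'brief', 'none') to index-page text."""
    if level == "full":
        return content
    if level == "excerpt":
        return excerpt(content)
    if level == "brief":
        return first_sentence(excerpt(content))
    if level == "none":
        return ""
    raise ValueError("unknown level: %r" % (level,))


post = "First sentence here. Second one.\n\nSecond paragraph."
for level in ("full", "excerpt", "brief", "none"):
    print(level, "->", repr(summarize(post, level)))
```

A real implementation would need something smarter than the sentence regex (abbreviations and decimals will fool it), but it shows how little needs to be typed into the post form.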

Should Linkblogs Trackback and/or Pingback?

Posted over 9 years back at Ryan Tomayko's Writings

Linklogs don’t generally add much discussion to the original post. I'm wondering how most bloggers think of trackbacks/pingbacks and whether there is any kind of etiquette around their use. Do people consider trackback/pingback as a sort of remote comment or are they useful purely for tracking what is linking to what? Services like Technorati and Feedster track what links to what and are becoming more reliable. I personally would like tracks/pings any time someone links to me but I could see how this could be considered linkspam.

Bosworth on WS-Mess

Posted over 9 years back at Ryan Tomayko's Writings

Adam Bosworth, usually a staunch supporter of SOAP and the rest of WS-Mess, makes an interesting sidebar statement in the last paragraph of his recent post about PC’s and Media Revamped:

I have a posted comment about just using XML over HTTP. Yes. I'm trying, right now to figure out if there is any real justification for the WS-* standards and even SOAP in the face of the complexity when XML over HTTP works so well. Reliable messaging would be such a justification, but it isn’t there. Eventing might be such a justification, but it isn’t there either and both specs are tied up in others in a sort of spec spaghetti. So, I'm kind of a skeptic of the value apart from the toolkits. They do deliver some value, (get a WSDL, instant code to talk to service), but what I'm really thinking about is whether there can’t be a much simpler kindler way to do this.

If you've followed Bosworth before, you’ll notice that this is a pretty big statement.

Guido's 10-line Python Scripts

Posted over 9 years back at Ryan Tomayko's Writings

My 10-line python scripts are just like everyone else’s except I wrote a script to interpret them. -- Guido van Rossum

Pulled from Cory Doctorow’s notes on ETCON 2004

How the other half lives

Posted over 9 years back at Ryan Tomayko's Writings

I can never remember the names of the meta-windows used in HTML (as in <form target="_somerandomnamepulledoutofahat">). Partially because they are named so poorly but also because I haven’t needed to use them in a really long time (pop-up windows are soooo 90s dontchaknow). Anyway, I'm googling for “target window _blank” when I stumble upon the following message board post: Pop-up prevention and target=_blank. The question is whether anyone had heard of these crazy pop-up blocker thingies and how they could be defeated.

Some of these posts just blew me away. I mean, I knew these people existed but I always assumed the “How to be evil” forums were private. Check this out:

Up until November I used the “target=_blank” and made 0 sales. Once I deleted the “target=_blank”, I immediately started getting sales.

It’s that easy!

This one is even better. Someone had asked for a list of names of these alleged “pop-up blocking hacker tools” and this guy responds:

The blocker on my pc at home blocks target=blank. I am not home to check what the name of the program is, however. I know that Google allows target=blank, but I fear I am losing sales to those other programs which do block them. I just removed the target=_blank, and we will see what happens to conversion.

So maybe I understood this wrong, but is he saying that he personally uses a pop-up blocker to get rid of annoying and intrusive adverts that make his computer unusable but he is, at the same time, fearful that others might be using a pop-up blocker to get rid of his annoying and intrusive adverts and, therefore, has decided to make his adverts more resilient so as to return them to their original level of annoyance and intrusiveness?

And just when I had lost that last scrap of respect for saleskind, this guy comes along with some actual fricking sense:

We've made a campaign about not using popups, and educatd users as to right clicking and how to block them – as I detest popups extensively, if I want a new window, I’ll right click and do it myself.

In fact, if an affiliate uses any popups, I’ll find another affiliate to buy that product from.

This has led to us selling a huge amount of pop up blocking software and privacy software even though the affiliate site has nothing to do with either.

Information is powerful, show people how to do stuff and they’ll buy…

Beautifully put.

Culture War

Posted over 9 years back at Ryan Tomayko's Writings

The Social Science Research Network has published a paper by Dan Hunter entitled Culture War. The first 13 pages deliver a tremendous account of the origins and current state of the Free Culture / Copyright Reform movement that started around 1999. I wanted to get a link out to this, primarily because it’s a great read, but also because it is laden with footnotes to the point where it could be used as a reference for major events and milestones in the movement.

The thirteenth page wraps up the background portion of the paper and ends with the following paragraph, which sheds a little light on the title:

This is the nature of the culture war which is currently being waged. Unlike the conflict between the left and right in US politics which is often called the “culture war,” this isn’t a war between cultures, but a war over our culture. Who owns it, who controls it, who can use it in the future, and how much will it cost? For the first time since intellectual property began its inexorable expansion there are signs of popular discontent at just what the private interests had taken from the public.

Once a thorough background is provided, the paper attempts to refute the oft-cited claim that the “Lessigist” Free Culture movement is fundamentally Marxist by showing that all efforts in this area have been around reinstating regulations on IP, not removing IP entirely:

[The premise of the FC movement] is the recognition that private property systems function better if some limits are placed upon property ownership and the market; otherwise the market will consume itself.

The last part of the paper is undoubtedly the most controversial. It states that unlike the “Lessigist” Free Culture movement, the Free/Open Source Software movements and their spawn in other creative forms like Wikipedia, South Korea’s Ohmynews, The Distributed Proofreader’s Project, and the blogging phenomenon are indeed fundamentally Marxist and can only be classified as revolutionary. All of these new forms of content creation and distribution remove the defining role of capitalistic systems – the dominant intermediary between producer and consumer that provides capital funding and controls/owns distribution and reproduction.

The standard justification of intellectual property, the reason that it’s supposed to exist at all, is that without intellectual property interests no-one would have any reason to produce cultural, creative content. Any creator would undertake a rational calculus, recognize they will get nothing without property rights in their intellectual activities, and go off to become a tax attorney. But the open source movement shows that this fundamental justification simply doesn’t hold: many people will produce creative content even outside what we can think of as the capitalist underpinnings of intellectual property. It’s a small step to go from this to a Marxian revolution: the open source movement promises to put the means of creative production back in the hands of the people, not in the hands of those with capital.

And again, I want to note that there are a total of 130 footnotes with links and citations to other works that together define the ongoing war on/for creative culture.

The paper is available as a PDF document at SSRN: http://ssrn.com/abstract=586463

Via boingboing.

Cleanest Python find-in-list function?

Posted over 9 years back at Ryan Tomayko's Writings

I'm trying to find a Python built-in that, given a function f and a sequence, returns the first item i for which f(i) is true. I need this about every 300 lines of code or so (that's actually quite a bit of code in Python). The general use-case is running through a list looking for an item matching some criteria and then returning it. This is more commonly dictionary-land (i.e. the items should be stored in a dict keyed by the criteria instead of a list) but that's not always practical/needed.

So here's a quick find function:

def find(f, seq):
  """Return first item in sequence where f(item) == True."""
  for item in seq:
    if f(item): 
      return item

To illustrate usage, let's say we had a list of People (peeps) and one of them has the name Fred. We want to get the Fred People instance. I would write this as:

for person in peeps:
  if person.name == 'Fred':
    fred = person

While I admit this isn't a ton of code and is really very descriptive, I find that quickly nesting code structures like this bother me (especially in Python). Here we have three lines of code and three blocks. I'm unaware of any reason why this might be bad other than that I think it looks ugly, but there it is.. Anyway, this three line nastiness is reduced down to a single line using find:

fred = find(lambda person: person.name == 'Fred', peeps)

I should note that the same can be achieved with the filter or reduce built-ins but they both require full list traversal where find requires traversal only until a True result is encountered. You will also need to have some kind of fallback for when none of the items match.

# using filter:
fred = filter(lambda person: person.name == 'Fred', peeps)[0]
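# using reduce (the initial None is what you get back when nothing matches;
# without it, reduce would hand back peeps[0] in that case):
f = lambda x, person: person.name == 'Fred' and person or x
fred = reduce(f, peeps, None)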
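
In later Python versions (2.6 and up, so not an option when this was written), the next built-in combined with a generator expression gives the same stop-at-first-match behaviour without a hand-written loop. A sketch in Python 3, where Person, the sample names, and the default parameter are made up for illustration:

```python
from collections import namedtuple

# Hypothetical stand-in for the People class used above.
Person = namedtuple("Person", "name")
peeps = [Person("Wilma"), Person("Fred"), Person("Barney")]


def find(f, seq, default=None):
    """Return the first item in seq where f(item) is true, else default."""
    return next((item for item in seq if f(item)), default)


# Record which items the predicate actually sees, to show the early exit.
seen = []

def is_fred(person):
    seen.append(person.name)
    return person.name == "Fred"


fred = find(is_fred, peeps)
print(fred, seen)
```

The default argument doubles as the fallback for the no-match case, so callers can test for None instead of catching an exception.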

A downside is that using find is less efficient than the traditional triple-nesting traversal approach due to the additional function call overhead of each iteration. However, this pattern seems to show up frequently in areas of code that don't get called very often. If lookup speed is a concern, you will likely have a dict available anyway. This seems to come up in fringe case lookups that are too rare to justify the extra space and insertion time of a dict.

Another critique might be that while the number of lines has been reduced, readability has not increased because of the nested lambda form. My personal opinion is that this is probably bordering on minor obfuscation but is worth being able to remove the triple nesting structure.