Saturday, January 1, 2011

Top ten predictions about software development in 2011

1. Java becomes irrelevant.
2. Objective-C continues to be more and more relevant.
3. JVM scripting languages get a lot of attention but nothing emerges as a clear winner.
4. The triumph of Postgres; MySql is already irrelevant, but MySql forks gain traction.
5. Solid State Disk hits the mainstream.
6. Cloud computing becomes mainstream.
7. iPhone on Verizon cannibalizes Verizon's Android Sales.
8. Touch interfaces are the hottest topic in UI development.
9. One or more major security incidents galvanizes a movement towards secure software development.
10. What replaces Java as the mainstream development language?


Java Becomes Irrelevant

I am not predicting that Java will disappear any time soon (or ever). There are still and will continue to be lots of people using Java (including me), and lots of energy will be put into libraries, IDEs, books, and all other things Java. But mindshare decreases before market share does, and that is what's happening in the Java world right now. You can see this most clearly by simply applying the Tim O'Reilly rule: What are the alpha geeks doing?

Answer: They're not doing Java. Oh, they're doing lots of things that are Java-related (NoSql/big data, Cloud computing, JVM languages like Scala and Clojure, widespread Agile adoption, Android development, etc.). But note that none of these things actually require Java. The only things happening in the language itself are incremental (e.g., Project Coin, OpenJDK), with significant controversy (and subsequent delay) around more radical changes (closures/various lambda proposals, etc.).

Oracle is and will continue playing a big part in accelerating this trend; I think Oracle sees Java as part of its super-platform strategy, in which Oracle locks customers in by providing the key elements of both the hardware and software stack. This strategy, taken to its logical conclusion (and you can bet Oracle will), is guaranteed to alienate everyone who doesn't have a stake in Oracle's success - like every alpha geek who doesn't work for Oracle.


Objective-C Continues To Be More and More Relevant

Objective-C is one of the fastest growing languages today, and the iPhone and the iPad are the obvious reasons why. Cocoa is arguably the best UI framework out there, in the same way that Michael Jordan is arguably the best basketball player ever. But I think what gives this trend legs is the combination of the language itself - it's everything C++ should be, and isn't - and Cocoa. Right now, it's limited to Apple's walled garden - but it doesn't necessarily have to be that way (see item #10 below).


JVM Languages Get a Lot of Attention, But Nothing Emerges as a Clear Winner

Clojure and Scala have the most buzz right now with Groovy and JRuby hanging in there and a host of others bringing up the rear. But nothing is going to emerge as the choice for a JVM scripting language. By their very nature, the audiences for Clojure and Scala are limited; your average Java programmer isn't ready for functionally-oriented languages based on Lisp or Scheme, and never will be. Joe average can handle Groovy or Javascript, but neither one seems to have taken off. And I don't think anything will, which may contribute to Java's growing irrelevance.


Solid State Disk Hits the Mainstream

I think this is true in two different scenarios. The first, obvious, one is that SSDs will get packaged in laptops, a la the MacBook Air. I keep hearing from people with SSDs in their laptops that "it's life-changing" and they'll "never go back", which sounds a little hype-y, but still seems like a ringing endorsement. The second, less obvious but more interesting scenario is that SSDs are used as super-caches in systems that have data that is too large to be kept in main memory but must be accessed quickly. The most obvious use is as a second-level database cache, but there are lots of other possibilities as well. I do not think people will replace conventional disks with SSDs on servers, because the price of disk continues to fall as fast as or faster than the price of SSDs; but we will increasingly see 4-tiered storage: main memory, SSD cache, disk, and archive (slow disk, tape, etc.).


Cloud Computing Hits the Mainstream

That's the developer mainstream. It hits the consumer mainstream in 2012 or 2013. But right now, a good percentage of those alpha geeks and an even bigger percentage of the entrepreneurs and corporate innovators are chasing their dreams by developing cloud-based apps and services. Most, of course, will fall short. But I'm willing to bet a few big winners are being created right this moment. In a few short years we'll look back and say "remember when you used to store all your stuff on your own personal computer?".


The Triumph of Postgres: MySql Is Already Irrelevant, But MySql Forks Gain Traction.

It's become clear in the last year or two that Postgres is increasingly the Open Source DBMS of choice. There are two good reasons for this. First, it now has 10+ years of continuous innovation and improvement. Second, it's still actually free, as in "I don't have to get a contract with MySql AB/Sun/Oracle to distribute my app like I now do with MySql". Especially now that Oracle's phalanxes of flesh-eating zombie lawyers are ready, willing, and able to enforce those contracts.

And isn't it interesting that so many of the interesting new DBMS engines have their roots in Postgres, either directly (GreenPlum, Aster Data, ParAccel) or indirectly (Vertica, VoltDB)? The only good news for MySql is that a number of forks or new DBMS engines based on MySql are emerging - most notably, Drizzle, Percona, and InfoBright.


iPhone on Verizon Cannibalizes Verizon's Android Sales
[Full disclosure: I stole this from Buzz Out Loud's 2011 predictions]

I think people are buying Android phones on Verizon simply because they don't want AT&T and are passing on the iPhone for now. But as soon as it's available on Verizon (and I and everyone else are predicting that it finally happens this year), there will be a flood from the release of pent-up demand. If Windows Phone 7 devices are at all compelling, Android will take a bit more of a hit. However, I think Android will still be #1 overall, just a little less so. And there will be a very healthy competitive situation in the smartphone market, which I think is all to the good - Apple at this point in their history is prone to complacency and arrogance (I don't know about you, but I find myself increasingly annoyed with them). The spur of competition is probably the best antidote to that unfortunate tendency. So here's hoping we also see some high-quality tablets based on WebOS, Chrome, or maybe even Windows...


Touch Interfaces Are The Hottest Topic In UI Development

OK, so this is more an observation than a prediction. The really interesting story to me is what happens with Flash this year. Adobe made a huge strategic mistake by not concentrating on mobile and is desperately trying to catch up. The early returns were not good. In the meantime, the majority of UI developers out there are learning how to create touch interfaces on the iPhone and iPad.


One or More Security Incidents Galvanizes a Movement Towards More Secure Software Development

There has been a steady stream of security incidents over the last few years, but nothing that has really grabbed the general public's attention. Yet. The WikiLeaks saga, while it is certainly not a computer security issue, has grabbed that attention - and it sets the table for a major incident that is actually a systems security issue. It could be something like a massive compromise of security on Facebook, or something like a Stuxnet (imagine if the systems for the power grid were compromised and we had rolling blackouts all across the US or Europe), or even a full-blown cyber war somewhere bigger than Estonia. Or maybe all of the above. Whatever it is, it's coming. And the result for our profession will be a demand that we demonstrate that our systems are secure. While we'll never be able to guarantee absolute security, we will increasingly adopt development practices, techniques, and tools that are more secure. We've been in an era of laziness and avoidance in the area of computer security; that era will come to an abrupt, jarring end very soon.


What Replaces Java As The Mainstream Development Language?

I've heard more than one guru/pundit say that it's a language that hasn't been developed yet. Maybe so. Maybe someone will invent a language that simplifies concurrent programming in a multi-core world, extends the imperative programming paradigm in some interesting way, is simple enough for the majority of programmers to migrate to, and has the marketing muscle and hype to gain mind- and eventually market-share. Maybe. I think it's more likely that we're going to extend the environments and tools we have now, at least in the short term. I'm guessing that there are two likely paths:

1) The JVM Ecosystem
- and/or -
2) The Objective-C Environment(s)

The JVM ecosystem is an extension of the Java environment we already know. But most of the extending is down non-Java paths, and systems are developed using less and less Java. In this scenario, you develop a code base on a Spring backbone (with the more sophisticated shops leveraging OSGi), using either Hibernate to talk to an RDBMS or a Thrift-based library to talk to a NoSql data store. Most of the code is written in the JVM scripting language or languages of your choice, with a heavy dependence on calls to the Java libraries and as little Java "glue" code as you can manage. The UI is HTML5/CSS with Javascript libraries, especially for touchscreen UIs. Now, this doesn't contradict what I said in item #1 at all ("Java becomes irrelevant"), because you're doing just about everything you can to avoid writing any actual Java code. Instead, you're leveraging the tools, techniques, and knowledge that already exist in the Java world and incrementally pushing away from Java itself.
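As a rough sketch of that "thin Java glue" idea - the class name and the embedded script below are my own made-up example, not anything from a real system - a few lines of Java can bootstrap a JSR-223 scripting engine (here the JavaScript engine that ships with the JDK) and let the script call the Java libraries directly:

    import javax.script.ScriptEngine;
    import javax.script.ScriptEngineManager;
    import javax.script.ScriptException;

    // Hypothetical glue class: the only Java here is the bootstrap; the logic
    // lives in a script that reaches straight into the Java libraries.
    public class ScriptGlue {
        public static void main(String[] args) throws ScriptException {
            // JDK 6 ships a JavaScript engine (Rhino) registered as "JavaScript"
            ScriptEngine js = new ScriptEngineManager().getEngineByName("JavaScript");

            // The script uses java.util directly rather than needing Java wrappers
            js.eval(
                "var modules = new java.util.ArrayList();" +
                "modules.add('spring');" +
                "modules.add('hibernate');" +
                "java.lang.System.out.println('wired up ' + modules.size() + ' modules');");
        }
    }

In practice you'd swap in Groovy or JRuby through their own JSR-223 engines (or their native APIs); the point is that the Java at the boundary stays this thin.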

The Objective-C environment is me going out on a limb. But bear with me - I think something very interesting is going to happen here. More and more, the systems most people interact with are not going to be traditional personal computers - they're going to be smart phones and tablets. Obvious, right? Equally obvious is that Apple dominates mindshare on these devices as well as possessing considerable market share. So there will be a lot of developers developing iPhone and iPad apps. And I suspect many will have the same epiphany I recently had - Hey, wouldn't it be great to do all my development in Objective-C?

Two huge forces would conspire against this. The first is Apple. Keeping iPhone/iPad/Mac development in their walled garden isn't just Apple's strategy, it's their DNA. There is no scenario, ever, in any of the multiverses, where Apple opens up Cocoa or any other part of their environment so that it can be used on other platforms or be extended by third parties or interoperate with anything else. No matter what else happens, Cocoa is off the table.

The second is the Java/JVM ecosystem, aggressively defended and extended by Oracle (And it's no accident that Apple and Oracle are now effectively partners. It's not just that they want to gang up against Google. It's also that they potentially complement each other - Apple wants to own the eyeballs and Oracle wants to own the data center - with very few areas of contention). The size and inertia of this environment argue against any radical change. There is certainly no equivalent in the non-Apple Objective-C world to the tools, environments, libraries, etc. available in the JVM ecosystem.

But.

Objective-C is right there in the gcc compiler we all know and love. Most developers don't know about GNUstep and even fewer have used it, but (not unlike Postgres) it's chugged along steadily for 10+ years. All it takes for an Objective-C revolution is that enough of the alpha geeks decide they want it and enough of the rest of us follow along behind them. I'll remind you that a big reason the Mac caught on post-2000 is that the alpha geeks adopted it as the cool machine to have, and the rest of us followed. Everything we need can be developed as open source. And there is no way Apple and Oracle can stop that.

Will it happen? Right now, it seems unlikely. The JVM ecosystem seems more likely to be where we're going. Most server and non-Apple UI development will be done in that ecosystem, and it will provide the back-office systems for iPhone/iPad applications that need access to corporate data. But it has one huge problem:

It's boring.

And an Objective-C revolution isn't. For that matter, a new language that shifts the imperative programming paradigm isn't either. So one of those things is going to happen. Stay tuned.

Thursday, September 16, 2010

Generating ID sequences for Postgres using JPA annotations

This started out as yet another too-long blog post, so I'm going to refactor it by cutting to the chase:
import javax.persistence.*;

@MappedSuperclass
public abstract class PostgresDomainObject {
    // BigSerial gives each concrete table its own sequence; IDENTITY tells
    // Hibernate to let the database default populate the id on insert.
    @Id
    @GeneratedValue(strategy=GenerationType.IDENTITY)
    @Column(insertable=false, updatable=false,
            columnDefinition="BigSerial not null")
    public long id;
    ...
}
The BigSerial data type in Postgres is shorthand for the following:

CREATE SEQUENCE tablename_colname_seq;
CREATE TABLE tablename (
    colname bigint NOT NULL DEFAULT nextval('tablename_colname_seq')
);
ALTER SEQUENCE tablename_colname_seq OWNED BY tablename.colname;

The key to making this work is GenerationType.IDENTITY. The insertable and updatable attributes don't seem to actually work - I've left them in as an indication of how the column behaves, but they don't seem to have any effect (at least not without IDENTITY). But using IDENTITY as the generation type causes Hibernate to leave the id column out of its insert statements, so the column default (the sequence) gets used.

The real beauty of this is that classes that extend PostgresDomainObject and declare @Entity will each get their own dedicated sequence. If I had declared the column as a bigint and used GenerationType.AUTO, Hibernate would have created a single sequence, called "hibernate_sequence", and that sequence would have been used by every table. If I had declared the base class as @Entity and used the Hibernate/JPA table inheritance model (an atrocity I may rant about in some future post), I would again have been limited to using a single shared sequence.
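To make that concrete, here's a minimal sketch of a subclass (the Order entity and orders table are hypothetical names for illustration, not part of the original example):

    import javax.persistence.Entity;
    import javax.persistence.Table;

    // Hypothetical entity: it inherits the BigSerial id column from
    // PostgresDomainObject, so Postgres creates a dedicated orders_id_seq
    // sequence owned by this table, and Hibernate's inserts omit the id
    // so that default fires.
    @Entity
    @Table(name = "orders")
    public class Order extends PostgresDomainObject {
        public String customerName;
    }

Every other entity that extends the superclass gets the same treatment with its own tablename_id_seq.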

There are two good reasons for wanting to have a dedicated sequence for each table instead of a single shared one:

1) Performance - having a sequence generator dedicated to each table helps prevent contention when there are multiple concurrent requests for new sequence numbers

2) Maintainability - sequences are often allocated in batches (say, 50 at a time) and some errors can cause the entire batch to be skipped. If there is a recurring error on a busy system, it's even possible to exhaust the sequence. It's also possible to exhaust the sequence on a very large table that has a lot of turnover. In those cases, unlikely as they may be, it's a lot easier to track down the problem and fix it if it's a dedicated sequence.

If the database in question is a simple, low-volume data store, none of this matters much. But the systems I care about are at the other end of the spectrum - either high-volume OLTP systems or data warehouses that have tables with millions of rows and high levels of concurrent access.

Monday, August 30, 2010

Innovation in the MySql market

The previous post might have made you think I don't care much for MySql, and you'd be right. But there is some interesting innovation going on in the MySql world - it's just not in MySql itself.

Drizzle looks like it could be to MySql what Firefox was to Mozilla - a cleaned-up, stripped-down version that improves all the good stuff and leaves the dreck behind. Here's a terrific talk by Brian Aker (former architect for MySql and one of the movers behind Drizzle) that made me want to go out and try Drizzle in spite of my aversion to MySql.

Infobright is one of the new breed of analytic DBMS engines. It's built on MySql, but uses its own column-oriented storage engine, data compression, and optimizer. It looks interesting, and I see that one of their technical advisors is Roger Bodamer, who I worked for at OuterBay Technologies - a startup that was bought by HP in 2006. Roger is also the SVP of Engineering at 10gen, the company behind MongoDB - so it looks like he's sitting on both sides of the new SQL/NoSql fence. Smart guy.

Big Changes in the Data Warehouse Space

It's been about 2 1/2 years since I looked really seriously at the data warehouse/BI space. At the time, things seemed very stagnant - there were the big 3 or 4 vendors (Oracle, IBM, Microsoft, maybe Teradata) and a host of smaller players who either were getting gobbled up (Hyperion, Business Objects, Cognos, etc.) or spiraling downward (Sybase, MicroStrategy).

In fact, this seemed to be true of the DBMS space as a whole. A market where the trendiest and most disruptive technology is MySql is not an interesting market.

Turns out that wasn't the case at all. Lots of interesting things were bubbling beneath the surface then and have been emerging ever since. The big news in the software development world is NoSQL/MapReduce/Hadoop, because it's been about a decade since Object-Oriented databases failed and so it's time for another round of "relational databases suck and here's what's going to replace them". (Actually, I like the whole NoSQL thing. But it's not a replacement for DBMS's - instead, it's a big part of the solution for a new class of data storage and management problems. But I digress).

What I think is more interesting is the new breed of analytic DBMS engines - Vertica, GreenPlum, AsterData, ParAccel, VoltDB, InfoBright, Netezza - and the attempts by the established vendors to keep up (Oracle Exadata, and Microsoft's acquisition of DATAllegro and subsequent release of SQL Server 2008 R2 Parallel Datawarehouse). Weave in threads like the re-birth of the database appliance, the emergence of Solid-State Disk as a mainstream technology, and the adoption/adaptation of MapReduce by some of the above vendors, and now we've got something really interesting.

What's interesting about the technology of the new breed is the way different architectural trends are being synthesized. Massively Parallel Processing (MPP) and shared-nothing architectures are nothing new in the database world, but here they've been strongly influenced by the scale-out-on-commodity-hardware approach that owes more to Google and the other big web sites. Columnar storage, compression, and tiered storage schemes aren't new ideas either, but combining them, especially in the context of a database appliance and/or the cloud, is a big step forward.

The other really interesting development is a new approach towards analytics, known as "MAD" - Magnetic, Agile, and Deep. I won't pretend that I understand what it really means yet, other than to note that this is another area of software development where dissatisfaction with the status quo has resulted in a new wave of people and products who have adopted an Agile world-view.

I think EMC's acquisition of GreenPlum is an inflection point that, along with the success of Vertica and Aster Data (and the large installations that all three of them now have), legitimizes the new breed to the mainstream datawarehouse customer.

As an aside, it's worth noting that a lot of the interesting stuff going on can be traced either directly or indirectly to Mike Stonebraker. Vertica and VoltDB are his latest projects, GreenPlum and AsterData are built on top of Postgres, ParAccel also uses some Postgres technology, and DATAllegro was originally built on top of Ingres.