"The technology that you teach to your students may be state-of-the-art — the cutting edge of state-of-the-art — but by the time they hit the job market, that technology is obsolete. That's the world that we live in; you just have to learn to deal with that. The attitudes that you inspire in your kids are going to be with them forever. If you teach them to adapt and to look beyond what everybody knows, then you have a kid who has the potential to do something major." —Jim Starkey
Database architect Jim Starkey, an innovator who was working in computer science right at the cusp of the adoption of the Internet, spoke at the STEM² Summit about pioneering special database technology through InterBase, lessons he learned along the way, and how educators can prepare their students for the next tech disruption.
In this transcript of his talk at the Summit, you'll read about:
- His career in database innovation
- How he utilized higher-order thinking throughout his career
- Four lessons about innovating in seemingly impossible circumstances
Scott Morrison: All right folks, we'll get started. Thank you so much for joining us today. Second presentation of the morning, we have James Starkey with us.
Jim is a pioneer in Internet database technology. Jim grew up in River Forest, Ill., where he began programming in high school through a project with the Illinois Institute of Technology, and he earned a degree in math from the University of Wisconsin with highest honors. His 40-year career in software engineering has focused on the intersection of networks, databases, and data-access languages. He specialized in discovering the flaws in conventional wisdom.
In 1977, he created Datatrieve, demonstrating that the IT department need not have a monopoly on creating applications that store, analyze, and report on data. His work in databases shows that relational systems are not inherently slow and that data consistency does not require readers and writers to block each other's work. More recently, he invented a new programming model for applications running on dozens of cooperating computers. He built a distributed relational database on that model which is consistent, partition-resistant, and available. Mr. Starkey holds several related patents. Let's welcome him with a round of applause.
Jim Starkey: Thank you. Thank you. It's a pleasure to be here. Two small things on a personal note.
The first: in the previous presentation, there was a picture of Ernest Hemingway and a big fish. Ernest Hemingway went to my high school, so that's one.
The other is that it's a real honor to be here in the Ken Olsen Science Center. The last time I was in something called the Ken Olsen, it was Ken Olsen's personal conference room in Maynard, at DEC. I'm an old DEC guy.
Twenty minutes is not much time. What I'm going to try to do is look at one innovation that I have done, of many. I'm going to skip through it very fast, because this is about two semesters’ worth of computer science which you are going to be forced to learn in five minutes. You can then forget it in the second five minutes. Then I'm going to talk a little about the process of innovation, at least from my experience.
This is what I do: I create technology, start a company, take a product to market, generally I get bought by a big company, go off, suffer through a non-compete, and then repeat. I'm on my third cycle right now.
Database systems: you've all heard of database systems. They exist to manage concurrent access to volatile data. Duh. Everybody knows that.
Take, as an example, an ATM. There are some problems which are kind of obvious. Everybody who looks at one probably has the same problems going through their mind. What happens when you're moving money from a savings account to a checking account and the machine crashes? Do I lose my money? The second problem: What happens when I'm transferring money and they're trying to print my statement? Does it come out wrong? Is it going to be confused?
The database solution is the concept of a transaction, where you batch a bunch of updates into one unit. It's atomic: it either happens or it doesn't happen. If you're moving money from one account to another and it crashes in the middle, and machines crash, we all know that, either it all happens or none of it happens. That's the concept of a transaction. A transaction also keeps the database consistent and sees a stable view of the database.
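The all-or-nothing behavior described here can be illustrated with a minimal sketch. It is not how any particular database implements transactions; the `Bank` class, the `transfer` method, and the `crash_midway` flag are hypothetical names invented for this illustration, and a real system would rely on durable logging rather than an in-memory copy.

```python
class Bank:
    """Toy in-memory 'database' of balances (hypothetical, for illustration)."""

    def __init__(self, balances):
        self.balances = dict(balances)

    def transfer(self, src, dst, amount, crash_midway=False):
        # Stage both updates on a private copy so that a partial
        # update is never visible in self.balances.
        staged = dict(self.balances)
        if staged[src] < amount:
            raise ValueError("insufficient funds in " + src)
        staged[src] -= amount
        if crash_midway:
            # Simulated crash between the debit and the credit.
            raise RuntimeError("machine crashed")
        staged[dst] += amount
        # Commit: both updates become visible in a single step.
        self.balances = staged


bank = Bank({"savings": 500, "checking": 100})

try:
    bank.transfer("savings", "checking", 200, crash_midway=True)
except RuntimeError:
    pass
after_crash = dict(bank.balances)    # nothing happened

bank.transfer("savings", "checking", 200)
after_commit = dict(bank.balances)   # everything happened
```

After the simulated crash the balances are unchanged, and after the successful call both the debit and the credit are present, which is the "either it all happens or none of it happens" property.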
Then of course this one, and everybody has thought of this: What keeps you from getting all of your friends at different ATMs around the city, and then somebody yells, “Go!” and you all take out the last $100 and head off to South America with your ill-gotten gains? Of course that doesn't work; the question is how do you make sure it doesn't work? How do you keep them from doing that?
The high-level view is that transactions control access to records, so more than one guy can't modify the same record without bumping into each other. The traditional implementation is what are called software locks. They're objects inside of a database system. A lock can be unlocked, you can have a shared lock, which any number of guys can hold, and you can have an exclusive lock, so only one guy has access to that record. The traditional implementation is that when a transaction is running, everything that that transaction touches gets locked: if it reads it, it gets a read lock; if it writes it, it gets a write lock. And this keeps two transactions from bumping into each other and doing inconsistent updates.
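The shared and exclusive lock rules described above can be sketched as a small lock table. This is a simplified illustration, not the design of any real database: `LockTable` and its methods are hypothetical names, and where a real lock manager would make a transaction wait, this sketch simply returns `False`.

```python
class LockTable:
    """Toy record-lock table: many shared holders or one exclusive holder."""

    def __init__(self):
        self.shared = {}      # record id -> set of transaction ids
        self.exclusive = {}   # record id -> transaction id

    def acquire_shared(self, record, txn):
        owner = self.exclusive.get(record)
        if owner is not None and owner != txn:
            return False  # a writer holds the record: the reader would block
        self.shared.setdefault(record, set()).add(txn)
        return True

    def acquire_exclusive(self, record, txn):
        owner = self.exclusive.get(record)
        if owner is not None and owner != txn:
            return False  # another writer holds the record
        other_readers = self.shared.get(record, set()) - {txn}
        if other_readers:
            return False  # readers hold the record: the writer would block
        self.exclusive[record] = txn
        return True

    def release_all(self, txn):
        # Called when a transaction finishes: drop every lock it holds.
        for holders in self.shared.values():
            holders.discard(txn)
        for record in [r for r, t in self.exclusive.items() if t == txn]:
            del self.exclusive[record]


locks = LockTable()
first_reader = locks.acquire_shared("employee:1", "report")
second_reader = locks.acquire_shared("employee:1", "audit")
blocked_writer = locks.acquire_exclusive("employee:1", "payroll")

locks.release_all("report")
locks.release_all("audit")
writer_after_release = locks.acquire_exclusive("employee:1", "payroll")
blocked_reader = locks.acquire_shared("employee:1", "report")
```

Both readers get the record at once, the writer is refused while either reader still holds its lock, and once the writer has the record a new reader is refused in turn.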
The problem with the traditional implementation is that, of course, people who are reading the database will block people who are writing the database. People who are writing the database will block people from reading the database. This was fine back when we had batch systems, but when we started to have things like VAXes and departmental stuff and interactive stuff, we started running into a really big problem. The ultimate nightmare is that somebody prints a list of all the employees in the company and goes to lunch, or worse, goes on vacation, leaving all of those records locked. Everything in the company grinds to a halt, because all of these records are locked and nobody else can run any transactions.
That was the problem. To get around it, and this was back around 1982, I came up with this idea. I was driving down Route 3 and said, "I know how to handle this," and that is, rather than using record locks, what we're going to do is create multiple versions of records. They're linked together, and we'll tag each one with the transaction that created that version, and then we'll do the bookkeeping to see who should see what. If you try to update a record, you have to make sure you're looking at the most recent version; otherwise you can't. This gives you a database system which is completely consistent and concurrent without having to tie up the database system. If two people try to take out the last $100, one of them will wait until the other one completes. Sorry, you missed it.
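The idea of versioned records tagged with their creating transaction can be sketched in a few lines. This is a toy illustration under stated assumptions, not the actual JRD or InterBase design: `VersionedStore` and every method on it are hypothetical, rollback is left out, and a losing update is refused immediately instead of waiting for the other transaction to complete.

```python
class VersionedStore:
    """Toy multi-version record store (all names are hypothetical)."""

    def __init__(self):
        self.next_txn = 1
        self.clock = 0          # advances on every begin and commit
        self.start_time = {}    # txn id -> clock value when it began
        self.commit_time = {}   # txn id -> clock value when it committed
        self.versions = {}      # key -> list of (txn id, value), newest first

    def begin(self):
        txn = self.next_txn
        self.next_txn += 1
        self.clock += 1
        self.start_time[txn] = self.clock
        return txn

    def commit(self, txn):
        self.clock += 1
        self.commit_time[txn] = self.clock

    def _visible(self, creator, reader):
        # A version is visible to its own creator, or to any transaction
        # that started after the creator committed.
        if creator == reader:
            return True
        committed_at = self.commit_time.get(creator)
        return committed_at is not None and committed_at < self.start_time[reader]

    def read(self, key, txn):
        # Walk the chain from newest to oldest; readers never block.
        for creator, value in self.versions.get(key, []):
            if self._visible(creator, txn):
                return value
        return None

    def write(self, key, value, txn):
        chain = self.versions.setdefault(key, [])
        if chain and not self._visible(chain[0][0], txn):
            return False  # a newer version exists that txn cannot see: it loses
        chain.insert(0, (txn, value))
        return True


store = VersionedStore()
setup = store.begin()
store.write("balance", 100, setup)
store.commit(setup)

alice = store.begin()
bob = store.begin()
alice_ok = store.write("balance", 0, alice)   # Alice takes the last $100
bob_ok = store.write("balance", 0, bob)       # Bob's update is refused
bob_sees = store.read("balance", bob)         # Bob still reads a stable view
store.commit(alice)

late = store.begin()
late_sees = store.read("balance", late)       # sees Alice's committed version
```

No reader is ever blocked in this sketch: Bob keeps reading the version that was committed before he started, while only one of the two competing updates to the same record succeeds.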
It's a really interesting technology. It has no record locks, so nobody's blocked. Two guys try to update the same record, one wins, one loses. I introduced it in 1984 in a product put out at DEC internally. It was called JRD, which is Jim's Relational Database. I'm Jim. And we shipped it. It worked fine. It worked fine.
The operating system that it ran on kind of went away eventually. When I introduced it, there was a lot of skepticism on the part of other database developers: This wasn't something they'd been taught. This wasn't something they used. This wasn't something that everybody else knew about. The academics were extremely skeptical. They said, "Ugh, that's silly; that's not the way that's done; this is all wrong."
That was a long time ago. Thirty years later, it's in virtually all major database systems. It's a different technique; it solves the problem; the solution is completely different from the solutions that preceded it.
Let me draw a couple of lessons from this. First, and this is really, really important: not everything that everybody knows is true. There's a lot of stuff that's folklore, common knowledge. It's just wrong. Much of innovation is sorting out which of the stuff that everybody knows is true and which isn't, and digging down the rat holes in the stuff that everybody knows is wrong to see if there are solutions to problems there that have been completely overlooked.
Thinking about this process, this goes back to a very similar experience I had when I was in kindergarten. William [Hatch] School, Oak Park, Illinois. I don't remember the name of my kindergarten teacher, but I wish I did. I remember the day very clearly: in the kindergarten room, she's sitting in the chair, all the kids are sitting on the floor around her, and she says, "Why do ships float?" We said, "Everybody knows why that is: Because they're made out of wood." She said, "Some ships are made out of steel,