Categories
Career

Lessons learned in 35 years of making software

Today (3rd July, 2024) is my 35th anniversary working in and around the software industry. All but 18 months of that have been spent on software development teams; those errant 18 months were spent editing technology books. I figured out fast that it wasn’t for me, and came right back home to the software industry.

Me, working infinitely hard at one of my career stops

In 35 years, I’ve learned some strong lessons. Maybe they’re ones you’ve learned yourself already. We all learn at different paces and at different times!

Do things in the most straightforward way possible. It’s easy to fall into the trap of clever solutions, or clever applications of technology, or overbuilding something because you’re anticipating the future. Don’t do it. You will hate yourself for it later when you have to maintain it. Build the thing in the simplest way you can, as fast as you can. You can improve it over time as needs demand.

There is no substitute for working software in Production. I can’t believe now that I have been part of 18-month release projects. This was back in the bad old waterfall days, but even then it was possible to release a lot more frequently than that. The software we build is valuable. It builds the value of the company. When you hold it until it’s perfect, or everything you think it needs to be, you are holding back on building the company’s value. Find the fastest, shortest path to getting the smallest increment of the thing that will work into the customer’s hands. You can keep making it better from there.

Relationships matter if you want to advance. It took me until about ten years ago to start to understand how building relationships across any company I work for is critical if I want to move up, and even remain employed when times are tough. I’ve found that being relentlessly helpful to others, even in things that aren’t strictly your responsibility, keeps you as someone everybody wants on the team. And when you push for a promotion, you have a base of people across the company who think you’re awesome. It greases the skids.

Relationships matter if you want to see your vision come to life. You may have the best vision ever for the product and how it should be built. You may see something that isn’t working well and know just how to fix it. It’s a 90-degree uphill climb to make your vision reality when you aren’t well connected to people with the power to help you. Even if you’re at the lowest level of the organization, try to build relationships with key leaders.

Never be invisible. When I was still testing software for my supper, I’d just quietly work all day and then go home. I’m going to toot my own horn a little: I was fast and good at finding the most critical bugs. Nobody could touch me. But, critically, nobody in leadership knew it. When I wanted to move up, I had a hard time showing that I was ready. I needed to make sure my work was visible, so that leaders could see what I was capable of.

Build and maintain a network of people in our field, outside the company you currently work for. As I write this, I realize I have let my network fall off since the pandemic. I need to rebuild it. My network has saved my bacon more than once when I unexpectedly lost a job. I wouldn’t be at my current company without my network. I got introduced to a particular fellow about ten years ago and we had coffee. We really enjoyed the conversation and committed to coffee once a quarter, a habit we kept for many years. It fell off in time as his life took him in a different direction. He was connected to the CTO at this company, who was trying to figure out how to build out the QA team and test process. My colleague connected me to him, and I gave him some free advice that he used. He thought I was sharp enough that when his Director of Engineering left to start his own venture, he called me and asked me if I would be interested in the job.

Be willing. There have been plenty of things in my career that I didn’t know how to do, but when my boss asked, I simply said I’d figure it out. Doing this is good for you in two ways. First, bosses like people who accept assignments and figure them out. Second, you then learn how to do the thing and can do it again; you’ve built skill. In my time at my current company, my CTO has asked me to do plenty of things I didn’t know how to do. I simply said yes and got on with figuring it out. Someone who already knew how to do it could have done it faster and better, for sure, but I got it done. We all have to start somewhere when we learn how to do something.

Chase adventure and interestingness, not salary and title. This might be controversial with some, for whom salary and title are critical. So be it. We’re ridiculously well paid in this industry, even at the entry level. I’ve had a varied career, from technical writing to QA to engineering. I’ve said yes to, or chased after, things that sounded like they would be challenging and interesting, and therefore fun. The money and positions have come to me in time. I might have gotten them a lot faster had I chased after them instead. What I got was a fascinating career and a great deal of personal and professional growth. I regret a couple choices, and a couple other choices totally didn’t work out for me. But there’s so much opportunity in this industry (even though the downturn we’re living through has tightened that up for now) that you can recover from that.

Challenge yourself to stretch past your natural tendencies. I loved it back in the waterfall days that I could pick up a big assignment and go work on it alone for weeks. Frankly, I still would love to do that. Those days were always so comfortable – come to the office, get some coffee, put on headphones, and chip away at my assignment all day. What I didn’t understand was that it kept me off everybody’s radar. I’m not a highly competitive person. I mostly want everyone to get along and work together. But unfortunately for me, there are times when you need to be competitive to keep your job, such as when a company has a rough time financially. I had to figure out how to show visible value all the time so that the sentiment was, “well, we can’t let Jim Grey go, he gets so much good stuff done.”

Understand that different social classes have different ideas about how the world works. I grew up working class, blue collar. We work in a white-collar industry. Because of my background, I thought that if I worked hard and did good work, I would rise. That was true for my dad, who rose from factory floor, to quality control, to quality manager, and finally plant manager. But in a white-collar world, it’s far more about relationships and power than about your work. I still struggle with this lesson. I still want it to be about my work. But it wasn’t until I built relationships and power that I was able to reach the Director level. If I ever want to make the VP level, I have even more work to do there.

When you deliver work you’re really proud of, you’ve almost certainly done too much and taken too long. I have a bit of a perfectionist streak. I want to do my work well and thoroughly. It took me a long time to learn that when I do that, it’s for me, not for the company. When I’ve reached 60-80% of the thing being as good as I want, I’ve probably done enough.

The software we are building right now will one day be decommissioned and not be used anymore, probably before your career is over. I’ve lost track of all of the software I’ve delivered that isn’t running anywhere anymore. Some of that is because my career is 35 years long. But even some of the software I worked on five or ten years ago isn’t running anymore. This is another good reason to build small increments, just good enough, get them out there, and iterate from there. Anything more and you’ve overbuilt some software that isn’t going to run forever anyway.

Categories
Project Management

The Donald Rumsfeld School of Agile Software Delivery

By Jim Grey (about)

I often quip that when you plan and manage a sprint, as a leader you have to channel your inner Donald Rumsfeld.

You remember ol’ Don, don’t you? He was twice the Secretary of Defense in the United States, first in the Gerald Ford administration and later in the George W. Bush administration.

Rumsfeld is famous for having said, “There are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns; the ones we don’t know we don’t know.”

He said this in answer to a reporter’s question about a lack of evidence linking Iraq’s government with supplying weapons of mass destruction to terrorists. It sounded so absurd at the time that the media had a field day ridiculing him.

But over the years, we’ve come to realize how much sense his statement makes. In anything you set out to do, you know what you know, you might know or be able to guess some things that you don’t know, and you certainly don’t know many things that you don’t know.

In agile software delivery — heck, in life:

  • You plan for the known knowns.
  • You make contingency plans for the known unknowns.
  • When unknown unknowns happen, all you can do is manage through them.

On a team I once managed, we wanted to build a feature to email end users the day after they abandoned a transaction on our site. We hoped to reengage them to finish their transaction.

We wrote and groomed the stories. Most of them had a clear implementation — plenty of known knowns. They were easy to estimate.

A few stories had known unknowns. One was, “It’s been a while since we’ve been in that repo, and it was written in our wild-west days. I might find some cruft in there that I’d need to straighten out to be able to finish this story.”

Another known unknown was, “I’d have to do a deep dive into that module to be sure, but if it does this certain thing in this certain way, I’d have to do considerable extra work to stitch in the code I need to write there.”

In each case we discussed whether that work, were it necessary, would add meaningful complexity to the story and inflate its estimate. Sometimes it was yes, sometimes it was no. When it was yes, sometimes we wrote a deep-dive story to find out, and other times we thought the impact would be small enough that we could just accept the risk.

As we planned sprints with those stories, we believed we had accounted for everything we could think of, and were confident in our sprint plans. But two things we didn’t foresee went wrong; unknown unknowns bit us hard.

One story involved writing a job that would check overnight for abandoned transactions. The engineer who picked up that story encountered five or six serious unexpected difficulties with writing the job and with getting the job runner to pick it up properly. A story that we thought he would wrap up in two days spilled well into the next sprint.

Then it turned out that our email service was never built to handle the volume we threw at it, and it failed immediately in Production. One engineer devoted all of his time, and another devoted part of his, to solving the problem. A lot of our tester’s time went to troubleshooting with the engineers. The situation was surprisingly thorny. The engineers implemented several successive fixes over several days before finally getting past the problem.

Thanks to these unknown unknowns, we missed a couple sprint plans by a mile. Here’s how we managed through them: I worked with the team to prioritize stories and figure out which ones we were likely not to deliver. The engineers watched for stories ready to be tested, and helped our tester by picking some of them up themselves to keep the revised plan on track. Finally, I reset expectations with my management about what we could actually deliver, and how that would affect delivery of other work that depended on it.

You could argue that these should not have been unknown unknowns. Writing a job should have been cut and dried. We should have thought about volume in our planning, and tested for it.

Hindsight is always 20/20. Retrospectives are where you explore that hindsight and make improvements for the future. After you experience unknown unknowns, improvements generally take two forms:

They become known knowns, or maybe known unknowns. We learned a lot about the complexities of jobs in our world. We knew we needed to write a few more in upcoming sprints, so we broke them down into several smaller stories that we could test incrementally. Also, we now knew to expect challenges in writing jobs that we hadn’t encountered yet, and to lay in contingency plans for them.

You can lay in plans to neutralize them as unknowns in the future. We found out that other teams had been having trouble with the job runner. We met with the team that owned that code to figure out how to make the job runner more reliable. They put tech-debt stories in their backlog to handle it. We also researched how we might make the email service more robust and investigated third-party email services. We ended up going with a third-party service.

Categories
Quality, The Business of Software

If you want your software to keep producing, be prepared to do some dirty work

By Jim Grey (about)

A woman named Verna built the house I live in. She landscaped it nicely; a sprawling flowerbed stretches in front of my front door and picture windows. Every spring, I eagerly await Verna’s spring color: yellow daffodils, purple hyacinths, red tulips, and finally the giant pink peonies.

Home

I’ve added a few things: lilies, mums, lavender, coreopsis, phlox. I love phlox! But my eagerness to keep adding color petered out pretty quickly because it turns out I hate digging in the dirt.

I don’t much enjoy any of the other routine garden maintenance, either. Mulching. Deadheading. Dividing overgrown plants. Weeding – oh god, the weeding. Does it make me lazy that I just spray my weeds with Roundup and move on?

I just want to enjoy the flowers. But this spring, the ninth I’ve lived in my home, a few of my plants didn’t come back as strong. A couple didn’t come back at all.

So I asked my mom. She’s the gardener in our family. “When was the last time you fertilized?” she said. “Um, never,” I said. “Ah,” she said.

It turns out that you can’t just ignore the soil, or the plants themselves for that matter. Things growing in it year after year use up all the nutrients, and crowded plants compete with each other for what little is there. “I’m surprised your flowers didn’t stop coming back a few years ago,” Mom said.

Daffodils

I did some serious fertilizing this season. I also separated some overgrown hostas and moved some of Verna’s plants so they had some elbow room. Not fun, but necessary.

♦ ♦ ♦

I once worked for a software company whose flagship product sold briskly. Version 1.0 was five years in the past, and since then we’d added lots of new features so the product could continue to lead the market. And now here came the head of Product Management asking for more new features.

Dan, a quiet fellow, graying at the temples, led Development. “Well, yes, we can add all those features,” Dan said, adjusting his glasses. “This one will take six months. That one will take four. This other one, well, I think that’ll take a year.”

The Product Manager was dumbfounded. “Features of similar scope took far less time in the past, and you had fewer developers then. What gives?”

Dan looked up at the Product Manager kindly, and drew a breath. “Well, we’ve been under such pressure to quickly add features to this product that we’ve not focused enough on its overall design. We’ve also made no time to keep our underlying architecture up to date. These are things I’ve been pointing out all along the way. But we’ve just bolted features on wherever we thought we could get away with it. Now, to add any one of the features you’ve requested, we basically have to unbolt three or four other features, and blend the code all together. And we have to write complicated bridge code to do modern things with our aging architecture, and when that doesn’t work we will have to upgrade some parts of it and test the product well to make sure everything still works. It’s a slow process. And it’s just going to get slower and slower the longer we keep going like this.”

That the product’s design had become cancerous and the underlying architecture had gone out of date were not considered a crisis, but not being able to rapidly add new features sure was. It focused the company’s entire attention. Their response was to code up a “next generation” product from scratch, which was a disastrous idea for a whole bunch of reasons beyond the point of this story. When the dot-com bubble burst in 2001-2002, they had not yet successfully launched the next-generation product, and they still couldn’t add features to the old product fast enough. Revenue fell precipitously. Quarterly layoffs began, but it was not enough to keep the wolves from the door. That once-promising company was sold; the company that bought it outsourced development to China.

More recently I went to work for another promising software company. They had been in business for about a decade and had sold their software to a number of very large companies. But in the couple years before I’d been hired, the pace of new feature delivery had slowed to a crawl. Adding new features had become increasingly difficult and always broke existing features. As a result, it took longer and longer to test the product, but even then, major bugs were still being delivered to customers. Meanwhile, younger, more nimble competitors were stealing business away from us. As the rate of new revenue decreased, support costs skyrocketed. It was unsustainable, and that company, too, had to sell itself to another company to avoid collapse.

It was much the same story: the company had focused entirely on rapid new-feature delivery and not enough on ongoing design and architecture. After a decade, their soil had gone infertile and the code had become tangled. Nothing new would grow.

♦ ♦ ♦

Software as a garden: to be able to grow more software, to be able to grow revenue with it, you have to keep the soil fertile and give the roots room. The problem is, gardening projects are a hard sell. These are things like refactoring older parts of the code that no longer serve efficiently, or upgrading or replacing outdated parts of the architecture, or redesigning subsystems that work fine today but can’t adapt to things the company wants to do in the future. When you tell executives you need to do these things, what they hear is that they can’t have new features while you do it. New features fuel growing companies.

But if you don’t tend your garden, sooner or later it will stop producing.

Categories
Quality, The Business of Software

Don’t piss off your users by suddenly changing your UI

By Jim Grey (about)

Delivering software on the Web is great. Especially with continuous delivery, we can deliver changes large and small anytime we want. And then we can get quick feedback from our users and the market, adjust the software accordingly, and push those updates fast, too. It’s utopia and the Holy Grail rolled into one!

Except that users are not very excited when we change things. They want software to stay as it is. Well, mostly: they want us to fix the bugs that affect them, of course, or even to add this or that little feature. But please, they plead, don’t make it work differently than it does now.

Meanwhile, we face various pressures. Markets shift; new needs emerge and old needs become less important. Technologies shift; old frameworks become outdated, new frameworks enable us to keep pace. Today everything has to not only work on mobile, but feel native to mobile — and all run on a single codebase. This is shifting product direction across our industry.

That’s the backdrop against which WordPress, the content engine behind one out of every four Web sites, rolled out a new editor last week. It was part of a complete rewrite of all of WordPress.com. Their old technologies just couldn’t stretch to where the world was moving. So they threw it out and started from scratch. Their new editor is fast — fast! — and works fluidly, while looking great both in my browser and on my phone.

Spanking new editor in my browser…

But boy, were users pissed. Pissed! Check the WordPress.com forum: 19 pages of complaints and counting. Sometimes, I swear, users wouldn’t be happy if you sent them gold bars, because they preferred the silver bars you used to send them. But very often users have a point: they’ve gotten into the swing of your software, and now you’ve changed it and they have to learn it all again. Worse, maybe now something they used to be able to do isn’t there anymore, or if it is, they can’t find it.

…and on my phone

For the record, I was the first commenter on that thread, because I experienced some of those frustrations. I tried to be kind, but several features I use either went missing or were accessed in a way that I couldn’t easily discover. Argh! And I wished the editing space were wider; it felt awfully cramped. I wasn’t alone in any of these complaints.

I wanted to just edit a post. I didn’t want to learn a new interface. But I found that there’s no way to just revert to the previous editor. It is simply gone.

I understand what drives changes like this, and I know what a monumental achievement this is for WordPress. Still, because I’m a heavy WordPress user, more than anything else I feel frustrated. The new editor breaks all of my usage flows, and I’m having to rediscover everything. I didn’t want this.

It’s the same, by the way, with Microsoft Office’s ribbon, which replaced an older menu structure way back in Office 2007. That’s forever ago in software terms. Yet there are still features I can tell you exactly how to find in those old menus, but I have to Google where they are on the ribbon.

Users don’t give a rip about your business or the future of technology. They use your product to accomplish a thing. As long as they can consistently and easily accomplish that thing, they stay happy. Many users learn your product’s nuances and become quite adept with them. When you suddenly change the UI and all of their flows are interrupted, of course they’re frustrated.

So what would happen if you followed Basecamp’s model? Their software helps companies small and large manage projects. Last month, they released Basecamp 3, a ground-up rewrite — yet they received not a single complaint from existing users. That’s because Basecamp 2, and for that matter Basecamp 1, remain fully active. Existing users can upgrade if they want, or stay put if they don’t. There are compelling reasons to move to Basecamp 3. But if you’re a happy Basecamp 1 or 2 user, those products will be there, fully supported, for as long as you want to use them.

Maybe your company can’t do that. But what can you do to ease the transition for your users, so they can stay fully productive? Think this through. It’s more important than any technology or implementation decision you make.

Fortunately, WordPress does, for some reason, still provide back-door access to an even older editor.

Outdated but highly functional classic editor

I don’t care that this is based on outdated technology: it’s fully featured, and I know how to make it sing. I cut my blogging teeth on this editor when I started my personal blog in 2007. I’ve written over a thousand posts in it. I hope it never goes away.

Categories
Quality, Testing

Fast failure recovery lets you take more risk and increase speed

By Jim Grey (about)

I can just imagine the “oh no” moment that rippled through Feedly last week when they learned that their big new release for iOS 9 crashed on launch for iOS 7 and 8 users. Their first response: add a hastily-written warning to the App Store description.


Their second response: fix the bug, fast. It was ready the next day.


Apparently, Apple has a way for developers to rush the very occasional critical fix into the App Store, sidestepping the normal, weeks-long approval queue.

Apple’s App Store fast track provides a safety net, and I don’t blame them for using it. Because Feedly could recover fast, hardly anybody will remember this gaffe, and it won’t cost the company very much.

I actually feel slightly bad talking about it, because it keeps the memory of this bug alive. But it illustrates such a simple equation: the faster you can recover from failure, the less perfect your product has to be when you ship it, and therefore the less expensive it is to build and support, and the faster your company can move.

I’m not advocating for delivering buggy product on purpose. Follow good development practices and test for important risks to deliver the best software you can as fast as you can. But when you inevitably deliver a bad bug, being set up to recover fast means you can deliver without worry.

I remember when the best way to get software to users was to mail it to them on CDs. A bug of this magnitude was a much bigger deal then. It was harder to warn users of the problem, and lots slower and more expensive to correct it. In that world, Feedly’s bug could have damaged their reputation for a long time. So back then it made sense to test more thoroughly — and therefore spend more time and money before releasing.

But today, in a world of Web software and 24-hour emergency App Store turnaround, you’ll deliver faster and with less expense when you set yourself up for fast failure recovery. Continuous integration and continuous delivery are usually a part of that strategy.

Categories
Quality, Testing

Myths of test automation – debunked!

By Jim Grey (about)

I wrote a post last year criticizing test automation when it’s used to cover for piles of technical debt and poor development practices. But I still think there’s a place for automation in post-development testing. There are two keys to using it well: knowing what it’s good at, and counting the costs. Without those keys it’s easy to fall prey to several myths of test automation. I aim to debunk them here.

Myth: Automation is cheap and easy

It is seductive to think that just by recording your manual tests you can build a comprehensive regression-test suite. But it never seems to really work that way. Every time I’ve used record and playback, the resulting scripts wouldn’t perfectly execute the test, and I’ve had to write custom code to make it work.

St. Paul's Episcopal Church

What I’ve found is that it takes 3 to 10 times longer to automate one test than to execute it manually. And then, especially for automation that exercises the UI, the tests can be brittle: you have to keep modifying scripts to keep them running as the system under test changes.

I’ve done straight record and playback. I’ve created automated modules that can be arranged into specific checks. I’ve led a team that created tests on a keyword-driven framework. And I currently lead a team that writes code that directly exercises a product’s API. The amount of maintenance has decreased with each successive approach.

A side note: given the cost of automating one test, it makes sense to automate only what you are going to run over and over again; otherwise the investment doesn’t pay.

Myth: Automation can test anything, and is as good as human testing

Automation is really good at repeating sets of actions, performing calculations, iterating over many data sets, addressing APIs, and doing database reads and writes. I love to automate these things, because humans executing them over and over is a waste of their potential.

This gets at a whole philosophical discussion about what testing is. I think that running predetermined scripts, whether automated or not, is just checking, as in, “Let me check whether clicking Save actually saves the record.” This subset of testing just evaluates the software based on predefined criteria that were determined in the past, presumably based on the state of the software and/or its specification or set of user stories as they were then.

The rest of testing involves human testers experimenting and learning, evaluating the software in its context now. This is critical work if for no other reason than the software and its context (environment, hardware, related software, customer needs, business needs, and so on) change. An exploring human can find critical problems that no automated test can.

I want human testers to be free to test creatively and deeply. I love automated checks because they take this boring, repetitive work away from humans so they have more time to explore.

Myth: When the automation passes, you can ship!

It’s seductive to think that if testing is automated, passing automation is some sort of Seal of Approval that takes out all the risk. It’s as if “tested” is a final destination, an assurance that all bets are covered, a promise that nothing will go wrong with the software.

But automation is only as good as its coverage. And if nobody outside your automation team understands what the automation covers, saying “the automation passed” has no fixed meaning.

It’s hard to overcome this myth, but to the extent I have, it’s because as an automation lead and manager I’ve required engineers to write detailed coverage statements into each test. I’ve then aggregated them into broad, brief coverage statements over all of the parts of the software under test. Then I’ve shared that information — sometimes in meetings with PowerPoint decks, always in a central repository that others can access and to which I can link in an email when I inevitably need to explain why passing automation isn’t enough. Keeping this myth at bay takes constant upkeep and frequent reminders.
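
To make those coverage statements concrete, here is a minimal, hypothetical sketch (Python, pytest-style, against a made-up service URL). The endpoint, fields, and data are invented for illustration; the important part is the docstring that spells out what a passing run does and does not prove.

    # Hypothetical illustration: the coverage statement lives in the check itself,
    # so anyone reading it knows exactly what "the automation passed" means here.
    import requests

    BASE_URL = "https://example.test/api"  # made-up service for illustration

    def test_save_persists_customer_record():
        """Coverage: POST /customers with valid required fields returns 201 and the
        record can be read back. Not covered: field validation errors, concurrent
        edits, deletes, or any downstream billing sync."""
        created = requests.post(f"{BASE_URL}/customers",
                                json={"name": "Ada", "email": "ada@example.test"})
        assert created.status_code == 201
        fetched = requests.get(f"{BASE_URL}/customers/{created.json()['id']}")
        assert fetched.status_code == 200

Statements like that docstring are easy to scrape and roll up into the broader coverage summaries described above.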

Myth: Automation is always ready to go

Hope Rescue Mission

“Hey, we want to upgrade to the next version of the database in the sandbox environment. Can you run the automation against that and see what happens?”

My answer: “Let’s assume I can even run the automation in sandbox. If it passes, what do you think you will know about the software?” The answer almost always involves feelings: “Well, I’ll feel like things are basically okay.” See “When the automation passes, you can ship!” above.

Automation is software, full of tradeoffs aimed at meeting a set of implicit and explicit goals. Unless one of those goals was “must be able to run against any environment,” it probably won’t run in sandbox. The automation might count on particular test data existing (or not existing). It might not clean up after itself, leaving lots of data behind, and that might not be welcome in the target environment. It might depend on a particular configuration of the product and its environment that isn’t present.

Even in the environment the automation usually runs in, it might not be ready to go at a moment’s notice. Another goal would need to be, “must be able to run at any time.” There are often setup tasks to perform before the automation can run: a reset of the database the automation uses, or the execution of scripts that seed data that the automation needs.

Myth: Just running the automation is enough

When I run automated tests, part of me secretly hopes they all pass. That’s because when there’s a failure, I have to comb through the automation logs to find what happened, figure out what the automation was doing when it failed, and log into the software myself and try to recreate the problem manually. Sometimes the automation finds just the tip of a bug iceberg and I spend hours exploring to fully understand the problem. Some portion of the time, the failure is a bug in the automation that must be fixed. When it’s a legitimate product bug, then I have to write the bug in the bug tracker.

I am endlessly amused by how often I’ve had to explain that just running the automation isn’t the end of it: that if there are any failures, the automation doesn’t automatically generate bug reports. The standard response is some variation of “What? …ohhhhhh,” as it dawns on them. So far, thankfully, it has always dawned on them.

Myth: Automated tests can make up for years of bad development practices

I’ve just got to restate my point from my older post on this subject. If your development team doesn’t follow good practices such as writing lots of automated unit tests (to achieve about 80% code coverage), code reviews, paired testing, or test-driven development, automation from QA is not going to fix it. You can’t test in quality — you have to build it in.

If you’re sitting on a messy legacy codebase, one where your test team plays whack-a-mole with bugs every time you make changes to it, you are far, far better served investing in the code itself. Refactor, and write piles of automated unit tests.

You want on the order of thousands of automated unit tests, hundreds of automated business-rule tests (which hopefully exercise an API directly, rather than a UI, for resiliency and maintainability), and tens of automated checks to make sure the UI is functioning.

I’ll belabor this point: Invest in better code and better development practices first. When you deliver better quality to QA, you’ll keep the cost of testing as low as possible and more easily and reliably deliver better quality to your customers and users.

Categories
Career

Twenty-five years in the software salt mines

By Jim Grey (about)

Tomorrow it will have been 25 years since I started my career in the software industry.

It might seem odd that I remember the exact day, but only until you know that I started work on Monday, July 3, 1989, making my second day a paid holiday. The office was nearly deserted on my first day. My boss regretted not having me start on July 5 so he could have had an extra-long weekend too.

I was 21 years old when I joined that little software company in Terre Haute. I’m 46 now. I have worked more than half my life in and around the software industry.

I taught myself how to write computer programs when I was 15. When I was 16, my math teacher saw some of my programs and praised my work. He encouraged me to pursue software development as a career. He began to tell me about this tough engineering school in Terre Haute.

I graduated from that tough engineering school hoping to find work as a programmer. Jobs were hard to come by that year, so when a software company wanted to hire me as a technical writer I was thrilled just to work. And then it turned out I had a real knack for explaining software to people. I did it for twelve years, including a brief stint in technology publishing and five years managing writers.

I then returned to my technical roots, testing software and managing software testers. I learned to write automated functional and performance tests – code that tests code – and it has taken me places in my career that I could never have imagined.

My office at one of my career stops

I’ve worked for eight companies in 25 years. The longest I’ve stayed anywhere is five years. I left one company in which I was a poor fit after just 14 months. I’ve moved on voluntarily seven times, was laid off once, and was fired and un-fired once (which is quite a story; read it here). Changing jobs this often isn’t unusual in this industry and has given me rich experience I couldn’t have gained by staying with one company all this time.

I’ve worked on software that managed telephone networks, helped media buyers place advertising, helped manufacturers manage their business, ran Medicare call centers, helped small banks make more money, enabled very large companies to more effectively market their products, and gave various medical verticals insight so they could improve their operations and their business.

Some of these companies were private and others were public; so far, I’ve liked private companies better. Some of them made lots of money, some of them had good and bad years, and one of them folded. Some of them were well run and others had cheats and liars at the helm. Some were very difficult places to work, but those were crucibles in which I learned the most. Others have brought successes beyond anything I could have hoped for a quarter century ago.

I did, however, hope for a good, long run in this industry, and I got it. But I’m also having a hard time envisioning another 25 years. It’s not just because I’d be 71 then. I really like to work, and – right now at least – I plan to do so for as long as I am able. But I’m starting to have trouble imagining what mountains I might yet climb in this career. Maybe that’s part of reaching middle age – indeed, many of my similarly aged colleagues, some with careers far beyond mine, have gone into other lines of work. I’m still having a lot of fun making software, though. I currently manage six software testers, one test-automation and performance-test developer, and one technical writer. I get to bring all of my experience to bear, and encourage my teams to reach and grow. I don’t want to stop just yet.


If this sounds familiar, it’s because it’s an update of an earlier post. Cross-posted to my personal blog, Down the Road.

Categories
Quality, Testing

When test automation is nothing more than turdpolishing

By Jim Grey (about)

I used to think that writing a fat suite of automated regression tests was the way to hold the line on software quality release over release. But after 12 years of pursuing that goal at various companies, I’ve given up. It was always doomed to fail.

In part, it’s because I’ve always had to automate tests through a UI. When I did straight record-and-playback automation, the tests were enormously fragile. Even when I designed the tests as reusable modules, and even when I worked with a keyword-driven framework, the tests were still pretty fragile. My automation teams always ended up spending more time maintaining the test suite than building new tests. It’s tedious and expensive to keep UI-level test automation running.

But the bigger reason is that I’ve made a fundamental shift in how I think about software quality. Namely, you can’t test in quality – you have to build it in. Once code reaches the test team, it’s garbage in, garbage out. The test team can’t polish a turd.

Writing an enormous pile of automated tests through the UI? Turdpolishing.

I’ve worked in some places where turdpolishing was the best that could be done. Company leadership couldn’t bear the thought of spending the time and money necessary to pay down years of technical debt, and hoped that building out a big pile of automated tests would hold the line on quality well enough. I’ve led the effort at a couple companies to do just that. We never developed the breadth and depth of coverage necessary to prevent every critical bug from reaching customers, but the automation did find some bugs and that made company leadership feel better. So I guess the automation had some value.

But if you want to deliver real value, you have to improve the quality of the code that reaches your test team. Even if the software you’re building is sitting on a mountain of technical debt, better new code can be delivered to the test team starting today. I’m a big believer in unit testing. If your software development team writes meaningful unit tests for all new code that cover 60, 70, 80 percent of the code, you will see initial code quality skyrocket. Other practices such as continuous integration, pair programming, test-driven development, and even good old code reviews can really help, too.

But whatever you do, don’t expect your software test team to be a magic filter through which working software passes. You will always be disappointed.

Categories
Project Management

Three-point estimation is a critical skill you should build right now

By Jim Grey (about)

Maybe you’re one of the lucky ones.

Madison Bank clock

Maybe you work in a shop that executes scrum well. Your team keeps the backlog well groomed, understands its velocity, and delivers working software sprint after sprint. Or maybe you work in an open research-and-development shop where the journey is as important as the result, and projects are always open-ended.

If, like me, you have to deliver software on deadline using a less-than-perfectly executed methodology, this post is for you. It’s going to give you a technique for knowing where you stand against your deadlines – a very useful skill that will reduce the chaos you experience and help your boss steer projects more successfully, both of which are good for you.

It is a method for implementing continuous reestimation, which I wrote about in this post. Read it to know why this will make you and your boss more effective.

Teams that have worked for me and have estimated using this technique have seen their projects come in within estimate about 80% of the time. The other 20% of the time, they learned important things about their work that let them estimate more accurately the next time.

Here’s the technique:

  1. Break down your work into tasks.
  2. Estimate the tasks using a three-point formula.
  3. As you work, reestimate every day.

Break down your work into tasks

You might resist this step because in your core you know you will miss some tasks. It’s okay. This process honors that you don’t know what you don’t know. Just outline the tasks as best you can.

But think small, because you’ll be more accurate estimating many small things than a few big things. Break things down as small as you reasonably can.

Estimate the tasks using a three-point formula

For each task, then, think of the following:

  • If everything goes right on that task, how long will it take? This is the best case, B.
  • If everything goes wrong on that task, how long will it take? This is the worst case, W.
  • If a reasonable amount of things go wrong on that task, how long will it take? This is the likely case, L.

Think for a minute or two about each task, but any more than that is probably overthinking it.

Important: Calculate your estimates in hours. Not days, not weeks, hours. There’s too much slush in days and weeks. Besides, in an 8-hour day you probably work just 6 hours on project tasks. You spend the other two hours in meetings, having hallway conversations, using the restroom, and being interrupted.

Here is where some magic comes in. Just go with me on this. For each task, drop B, W, and L into this formula and calculate the answer.

Estimate = (B + 4L + W) / 6

You’re weighting the likely case, but honoring the best and worst cases – and in one deft motion you take fear out of your estimates. What makes estimates bad is either unexpected things happening or overpadding for unexpected things happening. Thinking best/likely/worst helps you be more objective. And when you add up these three-point estimates for all your tasks, you get a reasonable cushion for the things that could go wrong.

Divide the total hours by the number of hours per day you will work on the project. Lay that many working days out on a calendar; tell your boss that’s when you’ll be done.
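
To make the arithmetic concrete, here is a minimal sketch in Python. The tasks and hours are invented, and the six productive hours per day comes from the assumption above.

    # Three-point (best / likely / worst) estimation summed across a task list.
    # Task names and hours are made up for illustration.

    def three_point(best, likely, worst):
        """Weighted estimate in hours; the likely case counts four times."""
        return (best + 4 * likely + worst) / 6

    tasks = [
        # (task, best, likely, worst) -- all in hours
        ("Write nightly cleanup job", 4, 8, 24),
        ("Add export endpoint", 2, 4, 8),
        ("Update email template", 1, 2, 6),
    ]

    total_hours = sum(three_point(b, l, w) for _, b, l, w in tasks)

    HOURS_PER_DAY = 6  # of an 8-hour day, per the paragraph above
    working_days = total_hours / HOURS_PER_DAY

    print(f"Estimate: {total_hours:.1f} hours, about {working_days:.1f} working days")

When a task you missed surfaces, add it to the list and recompute; that is the daily reestimation described next.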

As you work, reestimate every day

You are going to discover tasks you didn’t think of at first. And some tasks are going to take longer than you originally thought. When this happens, estimate the new tasks and/or reestimate the task that’s taking longer. Recalculate your end date and break the news to your boss.

Your boss ought to kiss you right then and there, by the way, because you gave him or her information early enough to do something about it – adjust scope, add resources, move the date, or tell you that unfortunately you are looking at some late nights. Hopefully the answer will be one of the first three more often than the last.

And every time you have to reestimate, you’ve learned something about your work that helps you estimate more accurately in the future.

Categories
Process, Project Management, Quality

If you want to ship software, stay in touch with how much you suck

By Jim Grey (about)

My colleague Matt Block recently posted on his blog a link to an article about how a software shop’s business model affects how well agile scrum works for them. It breaks business models down into emergent, essentially meaning that the company builds product to meet goals such as selling ads or driving traffic, and convergent, essentially meaning that the company builds product that directly serves a target market. The article argues that agile is made for emergent and is a poor fit for convergent. That’s just a sketch of the article; go read it to get the full flavor.

Eminence says: Monrovia Sucks!
Graffiti found in the town neighboring Monrovia

I’ve always worked for companies following convergent business models. We’ve made our money by selling the software we created, which always made it important to deliver a certain scope by a certain time. When those companies implemented agile scrum, they could never fully adopt a key principle of it: when it’s time to ship, you ship whatever is built. In a convergent world, scope is king; you ship when everything specified is built.

I e-mailed my brother, Rick Grey, a link to this article. It’s great to have a brother who does the same thing I do for a living as we can talk endlessly about it. I thought we’d have a conversation about how to scope an agile project, but instead he had a brilliant insight: What if agile is good for convergent-model companies because it tells you sooner how much your project is off track? He gave me permission to share his e-mailed reply, which I’ve edited.

– – –

What if the companies we’ve worked for and all the other convergent-model teams of the world are doing agile just fine? By “just fine” I mean “as good as they do waterfall,” which may not be “just fine,” but we’ll get to that in a minute. Meanwhile, consider:

Long waterfall project:

  • No one pays real attention to progress (there’s always next month to catch up)
  • Engineers go dark, checking out huge sections of the codebase and not merging them back for long periods
  • Engineers (who are notoriously poor estimators) claim 50% done when it’s really about 25% – and then, as the code-complete milestone nears, they (usually innocently) claim 90% done when it’s really 70%
  • A couple of days before the code-complete milestone, engineering finally acknowledges they won’t hit the milestone and delays delivery to QA – “but we’re 95% done, for sure”
  • Under the pressure of already having missed a deadline, developers quietly take shortcuts to make it possible to hit the new QA delivery date
  • Weeks and months of unmerged changes come crashing in, creating conflicts and compile/deploy problems, further delaying delivery to QA
  • QA, now staring with a multiple-week handicap on an already-too-aggressive schedule, quietly takes its own shortcuts
  • QA finds hairy showstopper bugs and so the ship date gets moved
  • Management is livid, so QA goes into confirmatory testing mode just to get it out the door

Agile project of the same size:

  • Much of the above happens at a smaller scale, one iteration at a time
  • You fail to deliver everything planned starting with the first sprint
  • Instead of spending 80% of the project thinking you don’t suck as an organization and the last 20% realizing that you do, agile lets you feel like you suck every step of the way
  • Takeaway for management: “agile sucks” and/or “we suck at agile”

I assert that most teams are bad at delivering under a convergent business model. The hallmark pathologies of software delivery under a convergent model are too numerous and powerful for most teams to overcome, but their struggles are masked by waterfall until the end. Agile surfaces the problems every iteration. You feel like a loser by week 4 instead of week 40.

But this is actually a win. You get better project visibility and a tighter feedback loop, meaning you’ve got a better chance to make adjustments earlier to get the most out of the team you have. Embrace the feedback loop as a chance to make things better, and learn not to view it as proof of how much you (collectively) suck.

– – –

I will add that agile also helps you keep resetting expectations within your organization, because it makes it standard practice to keep reestimating what it will take to finish everything. This is just what I was talking about in my last post (read it here).