Wednesday, May 30, 2018

What being a rubber duck taught me about problem solving

Long ago, before I discovered the web (yes, there was a time before the Internet had entered everyone's lives), I was working as a Mechanical Engineer in training.

One day, I get called to the boss's desk (open plan happened before the internet), and he's got this problem. Thousands of steering wheels we manufactured at a plant 600 km away weren't working.
"Not working, how can a steering wheel… not work?" I asked, puzzled.
"Well, they're making a noise" says the boss. "Something about a squeaking noise when turning the wheel. They want to send the whole batch back to the plant, scrap the lot, get another batch and/or fine us".

When you're building cars, steering wheels are serious business, I guess. The whole car plant would stop, wait for the new batch, and lose money, because every 11 minutes, a car was rolling out of the doors, to fulfil an order. Everyone up the entire chain was going to get hit.
I had to go to the customer's plant, about 50 km away, and solve the problem. In retrospect, sending a junior wanna-be engineer in training to the plant was probably just an exercise in demonstrating action to the client, whilst they flew the REAL experts in. This, in itself, is a clever move, but I digress…

I end up at the plant. Everyone's tense. The customer has decided to keep producing vehicles, in the hope that at least some will be OK, and/or that the problem, when found, could be fixed relatively easily. In the meantime, they've got cars, filling up their buffer parking-lot, out there in the hot African sun, not earning any money. The Lead Engineer in charge of QA takes me out to the lot, and I see something like 200 cars there. All are the luxury, high-end model. Popular seller. Reputation builders. And now, something is wrong…

Giving me a bunch of keys, he tells me to check out any cars I want. They all behave the same. There's a number on a tag attached to each key, and the corresponding number, written in big, angry digits in white chinagraph pencil on the windscreen. He turns, shoulders slumped, toward the plant. "I'll be in the QA office" he shouts over his shoulder. He's aware of the theatre here, and doesn't have the energy or time to participate. Likely he's going back to the office to work on his CV or cry or something.

Swearing mentally in all the languages I know, fighting nausea and the urge to run away, I climb in the first car. I've never even been in one before. To this day, I still don't actually own one. Starting it up, it purrs like a kitten. All I can hear is the faint wheeze as I move around in the fancy leather seats, looking for controls and levers. Turning the wheel, I hear a faint (very faint) scratching noise. "If this is the problem, rich folks need to get over themselves" I think. Driving the car forward, starting to do laps around the parking lot, I come up to a left turn.

As I start turning, as the wheel is turning, a noise like fingernails scraping down a blackboard hits me. I'm not going to lie: If I got this car for free, I'd rather rip out the steering wheel and use some pliers to turn the wheel than have to deal with this. It's that bad. "Jesus" I think. "We're hosed".

Every car, same thing. Same level of noise. Whether I'm turning left or right, same awful noise. I can't find any binding between the wheel and the case around the shaft - there could be a defect in either the casing or the wheel, and maybe things are touching and rubbing. This is not the case. Maybe (inconceivably, but still checking) there's a piece of wire or metal shavings or something that haven't been cleaned off the wheel or housing that are rattling around or something in there. Not happening. Absolutely no inspection I make from any angle (on the outside) shows anything that doesn't make sense.

Sweating, I close the door on the tenth or fifteenth car I check. It's been about 2 hours now. Time to head over to the QA office.

Getting back to the office, I find the engineer, not enjoying lunch in the canteen, but with schematics for the steering wheel spread out over his desk, micrometer in hand, checking a handful of wheels.
"Definitely a noise, only when I turn, no obvious cause" I say. "Yup - we knew that 3 days ago, junior" he says with his eyes.

Nothing to do but scrap the batch and try again.

In desperation, I say: "Could you take me through the part of the line where the steering wheel is attached?". A flicker of "Really, are we REALLY going to do this" over the guy's face, and "Sure. Why not" he says, shrugging. It's not like anyone's going anywhere.

We head over to the line. The car is kind of assembled, dashboard and steering housing in place. There's a guy there, taking the wheel, running it through a series of different sub-stations, and then attaching it to the steering column with a great big lock nut. The QA guy starts his patter: School kids come through here on tours every couple of months, and he begins to settle into the rhythm of it.
"First" he says, "the technician attaches the contact plate. This is part of the circuit for the hooter." The tech attaches a galvanised plate using some screws.

"Then" he continues, "the technician attaches the wires going to the other hooter buttons". There's a big central push point, and then two auxiliary buttons on the arms of the wheel. "This allows the driver to hoot using any one of the three places on the wheel" he says, pointing to the places on the wheel where this is possible.

"He then attaches the wires for the contact plate to the pressure plate". The QA guy points to a sort of plate thing with springs on it. It allows the driver to aggressively hit the main, big hoot area in the center of the wheel, forcing the contact plate down, but not crushing it.
The tech then takes the wheel to the half-car, and puts the wheel in place, ready to use the pneumatic tool to attach the lock nut.

The QA guy pauses.

"Peter" he says to the tech. "What's going on? You're supposed to grease the contact pin".
Peter, half talking over his shoulder, says "We ran out about a week ago, but it's fine. The hooter still works". He attaches the wheel, presses the cover for the bolt in place, so the wheel looks like it would when you or I are using it, attaches some crocodile clips to some wires dangling out of the dashboard assembly, and presses the central, then each of the auxiliary buttons. A big red light on a console off to the side of the assembly line blinks, indicating a circuit closing.

As if to cut off any questions about blackboard-nail-sounding-scraping, Peter rocks the wheel back and forth, looking the QA guy in the eye. "No noise" he says, satisfied.

QA guy isn't happy. This is clearly wrong. He tells Peter to come with us, and bring a screwdriver and manual socket wrench. We're going to the yard. "But, the line…" starts Peter. If he leaves, it stops. QA guy smiles. "It's OK, I think we can let things go a bit".

In the yard, we go to one of the cars I've just driven. Peter pulls off the cover, takes the bolt off, and pulls the wheel out. The contact pin has no grease. QA guy pulls a stick of lip balm out of his pocket, gives the pin a few dabs, and Peter reattaches the whole thing. We all climb in the car, and go for a spin. No more noise. Peter is shocked. Somewhere between the floor and the lot, whatever grease was left over from the production process has worn off, leaving metal to rub on metal, causing that awful sound.
QA guy turns to me a bit embarrassed. "Thanks. We've got it from here. Looks like we found the problem. Nice work." He doesn't look too thrilled about it, but his body language is telling me that there's hope: people won't be out of a job anytime soon, they'll get it sorted in a day, and things can carry on as usual.

Years later, in the software development world, with "rubber duck debugging" being a thing, the lesson is simple: rubber duck debugging works, because it forces you to pay very close attention to everything. You have to reexamine your assumptions. You have to start from a place of "rethink everything".

I believe that is probably the single most useful idea to hold on to when trying to figure out what is going on in a piece of code that is misbehaving.