Computers Are Hard: hardware with Greg Kroah-Hartman

By Wojtek Borowicz. Published in Computers Are Hard, Sep 27, 2020. 10 min read.


A printer is a very complex thing.

Illustration showing a microphone, printer, and a computer mouse ‘speaking’ in binary code.
Hardware with Greg Kroah-Hartman. Illustration by Gabi Krakowska.

Once, I was troubleshooting with a customer adamant desktop notifications from our app weren’t firing for him. Standard stuff. Usually, the answer would be misconfigured settings or the operating system interfering. Except I ran a diagnostic and the output was crystal clear: notifications were showing up fine. No errors, all green across the board.

I threw everything but the kitchen sink at that case. After a few days of back and forth with the customer (bless your patience, sir) I finally got it. His laptop monitor and secondary display had different resolutions and scaling settings. Notifications were supposed to show up on the external monitor but because of a bug, the app was rendering them based on the laptop’s screen settings. The logs told us everything was fine because we were, indeed, showing the notification. A few centimeters beyond the edge of the screen.

When software meets the messy reality of physical devices, fuckups like this are bound to happen. But at the end of the day, hardware mostly works. Webcams mostly work. Mice and keyboards mostly work. Notifications mostly display in a part of the screen that actually exists. Printers… fine, these are fifty-fifty.

I asked Greg Kroah-Hartman to tell me about the work that goes into making computer peripherals do — mostly — what we ask them to. Greg is the maintainer of the Linux kernel’s stable releases and an author of books about writing Linux drivers. He took me on a journey from a tiny processor embedded in a mouse to deep inside the guts of an operating system.

Oh, and he explained printers to me, too.

Wojtek Borowicz: Let’s start with the very basics. I slide a mouse across the desk and it makes the cursor on my monitor move. How does that even happen?

Greg Kroah-Hartman: Oh wow, this is an operating system interview question if I have ever seen one. To be up front, this is not basics. A simple task like this shows a lot about how systems work these days. I’ll make some assumptions to make it easier: mouse is a USB mouse, not serial, Bluetooth, PS/2, or whatever. Your system is running Linux. I will not go above the operating system level in much detail, as that’s where my knowledge gets blurry.

First off, there is a tiny processor in your USB mouse. The code it’s running is very small and compact. It’s responsible for two things: reading the state of the mouse movements and button clicks, and responding to the computer when it’s asked if it has done anything different since the last time it was asked. One of the main goals when the USB protocol was created was so that a mouse could be made for less than $1. Because of this, a processor that controls a mouse can be made very cheaply and all of the harder computations involved in dealing with mice are done by the operating system: Linux in our case.

Before the kernel can ask the mouse for data, it has to know that a device is even plugged into it. In the old days objects that were plugged into systems had to be configured so that the system knew what type of device they were, where they were plugged into, and what type of protocol the device ‘spoke’. The goal of USB was to try to unify all of that and create a standard that grouped common types of devices together to speak the same way, and create a way for the host system to ask ‘what type of device are you?’ As part of the USB specification process, a huge number of common devices were defined, such as mice, keyboards, disk devices, video cameras, foot pedals, electronic scales, and so on, such that any manufacturer could create a device that spoke the same type of protocol and no custom code would have to be written on the host system. Once the code was written for the operating system to talk to one USB mouse, all USB mice that followed the specification would instantly work. That was a huge step forward in standardization of devices and has done more to make systems easier to use than almost anything else in the past few decades.

Anyway, back to our mouse. The host system now knows a mouse is plugged into it, so it will go and ask the mouse every few milliseconds or so: ‘do you have any more data for me?’ If the mouse does, the kernel converts that data into a standard form that can be used by programs, and then exposes it to user space. In the case of a mouse, the data is usually a simple ‘I have moved in the X direction so many units, and in the Y direction so many units and button N is now pressed (or released)’.
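To make that report concrete, here is a minimal Python sketch. It assumes the simplest layout a USB mouse can use: the three-byte HID boot protocol report, with one byte of button bits followed by signed X and Y deltas. Many real mice describe richer formats, so treat this as an illustration, not as what the Linux kernel actually runs. The names MouseReport and decode_boot_mouse_report are made up for this example.

```python
import struct
from typing import NamedTuple


class MouseReport(NamedTuple):
    """One decoded report: button states plus relative movement."""
    left: bool
    right: bool
    middle: bool
    dx: int
    dy: int


def decode_boot_mouse_report(report: bytes) -> MouseReport:
    """Decode a 3-byte boot-protocol mouse report (assumed layout)."""
    if len(report) < 3:
        raise ValueError("a boot-protocol mouse report needs at least 3 bytes")
    # B = unsigned byte of button bits, b = signed byte for each axis delta.
    buttons, dx, dy = struct.unpack_from("<Bbb", report)
    return MouseReport(
        left=bool(buttons & 0x01),
        right=bool(buttons & 0x02),
        middle=bool(buttons & 0x04),
        dx=dx,
        dy=dy,
    )
```

For example, decode_boot_mouse_report(bytes([0x01, 0x05, 0xFB])) reports the left button held down, a movement of 5 units in X, and -5 units in Y.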

A user space program (a program that lives outside the system’s kernel; every app you use runs in user space) is running and has either told the operating system ‘wake me up when a mouse has sent you data’, or asks at regular intervals ‘do you have any more mouse data for me?’ The operating system replies to the program, which then converts the data into another unified standard and provides it to the program that wants to represent the mouse pointer on the screen. Drawing that mouse pointer on the screen is a whole other sequence of steps, much more complex than the mouse data pipeline, because the hardware protocols involved are not standardized in places.
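On Linux, the user space side of this can be sketched as parsing the fixed-size input_event records that the kernel exposes through its evdev device files (/dev/input/event*). The Python sketch below works on bytes already in memory, so it runs without a real device. It assumes the record layout of a 64-bit little-endian system (two 8-byte timestamp fields, then type, code, value), and track_pointer is a hypothetical helper name. Real programs would normally go through a library instead of parsing these records by hand.

```python
import struct

# Assumed 64-bit little-endian layout of struct input_event:
# tv_sec, tv_usec (8 bytes each), type (u16), code (u16), value (s32).
EVENT = struct.Struct("<qqHHi")

# Event type and code values from linux/input-event-codes.h.
EV_KEY, EV_REL = 0x01, 0x02
REL_X, REL_Y = 0x00, 0x01
BTN_LEFT = 0x110


def track_pointer(raw: bytes, x: int = 0, y: int = 0):
    """Apply a stream of input events to a pointer position.

    Returns the new (x, y) and the set of button codes still held down.
    """
    pressed = set()
    for _sec, _usec, etype, code, value in EVENT.iter_unpack(raw):
        if etype == EV_REL and code == REL_X:
            x += value          # relative movement along X
        elif etype == EV_REL and code == REL_Y:
            y += value          # relative movement along Y
        elif etype == EV_KEY:
            if value:
                pressed.add(code)       # press (or auto-repeat)
            else:
                pressed.discard(code)   # release
    return x, y, pressed
```

Feeding it three events (X moves by 7, Y moves by -3, left button pressed) with a starting position of (100, 100) returns (107, 97, {BTN_LEFT}).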

KERNEL

One of the most important parts of an operating system is its kernel. It manages the communication between hardware and software and allocates memory to other software running in the system.

So when you have different computer peripherals, be it keyboards, mice, printers and whatnot, they run their own software too?

It is very rare that any peripheral made in the past 10 years would not have a processor running software written for it. See the USB mouse example above. A keyboard has to have code written in it to scan all of the different keys to determine what is currently pressed, and be able to send that data to the host computer when it is asked for it.

A printer is a very complex thing. My first job out of college in the early 1990s was writing software that was embedded inside printers that printed airline tickets and different types of packing labels. The software had to handle the data that described what text and barcodes needed to be printed and where on the page, as well as control the motors that fed the paper to the printer, monitor the sensors to verify that the paper was present and where it needed to be at that moment in time, drive the print head so it did not burn the paper incorrectly, talk to different chips that handled button pressing, time of day, different font cartridges, persistent memory, and much much more. There was an internal operating system controlling all of these tasks running at the needed speed in order to keep it all moving smoothly. Modern printers are even more complex, having to talk to wireless networks, handle scanning, and different apps written on the printer itself. Usually, Linux runs inside printers in order to make it easier for printer developers to focus on the things they need to do to make a printer work, instead of having to rewrite the basic things like ‘talk to USB’ or ‘talk to a network’.

A wireless printer from Brother.
A wireless printer. The most feared enemy of every office worker.

Does building your own device require writing a lot of custom code or is everything already baked into existing operating systems?

It all depends on what type of device you want to make. It is pretty simple to create your own keyboard these days such that it will ‘just work’, running open source code that talks the standard USB keyboard protocol. That is due to the standardization of many common types of devices.

But if you want to create something that has never been done before, yes, you will have to write custom code for the operating system to be able to control and talk to your device.

Speaking of operating systems, how different is writing hardware drivers between Linux, Mac, and Windows? If I developed Windows drivers for a printer, can I just convert them to Linux and OS X?

Hardware drivers are very different between different operating systems. Traditionally, writing a driver for Linux results in about one third less code than for other operating systems, due to the huge amount of common code that Linux provides for you to use. The fact that all drivers for Linux are contained directly within the main source tree of Linux has allowed us to see common features that multiple drivers use, and consolidate that code into functions that are outside of the driver, and provided by the operating system, making the driver much simpler and easier to write and maintain over time.

What devices are the most difficult to make to talk to a computer?

Custom ones that have never been done before, as no one has written the code for them yet.

Is building support for wireless devices any different than for wired ones?

In some ways yes, and in other ways no. Take a mouse again. The USB protocol for mice is called HID, which stands for Human Interface Device. Manufacturers realized that once this protocol was made and operating systems supported it, they could use the same communication protocol across other transport mediums. So if an operating system could add support for a new transport method, then it could instantly start talking to a device for which it already knew a different transport method. So while there is some plumbing involved in turning a USB mouse into a Bluetooth mouse, the data sent to the operating system to describe how the mouse is moving is the same.

Can different peripherals interfere with each other? Is it possible that for example my microphone isn’t working because of the webcam drivers? Or my headphones don’t connect because of the Wi-Fi adapter?

Hopefully not. For most hardware protocols these days, devices cannot even see that there is any other device in the system at all. All they have the ability to do is to answer the simple question from the host system: ‘do you have any data for me?’

Drivers for specific classes of devices, or different types of custom devices, should not be able to see other devices in the system either, as they are not controlling them. It’s different in cases where some drivers control multiple things at once in order to get the device to work properly, but those are the exception, not the rule.

Similarly, is it likely apps would trip over individual devices? Imagine you built support for back and forward mouse buttons into an app and there’s one model of a mouse where it won’t work. How do you debug that?

The job of an operating system is to provide a unified view of all hardware to programs, so you should not have to worry about a different type of device. All you should need to focus on is: ‘did the mouse change position?’ But of course, hardware being hardware, there are loads of exceptions and ways that hardware designers can mess things up and do things differently either on purpose, or by accident.

Because of this, there are huge tables of hardware quirks that an operating system accumulates to smooth things over. But sometimes, for more complex devices, the operating system cannot handle these differences, and so a user space library needs to get involved in order to figure things out and fix the data up. That is why there are common libraries that all programs have come to use in order to talk to devices like mice, so that they do not have to duplicate that logic in their own code. Those libraries do not live in the operating system, but are part of the low level plumbing that has been created around it. The specific library for mice and input devices that does this is called libinput.
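The idea of a quirks table can be sketched in a few lines of Python: a lookup keyed by the device's vendor and product IDs that lists the fix-ups to apply to its data. Everything here is hypothetical. The IDs are invented, the fix-ups are simplified, and neither the kernel's quirk tables nor libinput are written this way. The point is only to show the shape of the technique.

```python
from typing import Callable, Dict, List, Tuple

Motion = Tuple[int, int]           # (dx, dy) as reported by the device
Fixup = Callable[[Motion], Motion]


def invert_y(motion: Motion) -> Motion:
    """Fix a device that reports vertical movement upside down."""
    dx, dy = motion
    return dx, -dy


def swap_axes(motion: Motion) -> Motion:
    """Fix a device that reports X and Y in the wrong order."""
    dx, dy = motion
    return dy, dx


# Hypothetical (vendor_id, product_id) pairs, not real devices.
QUIRKS: Dict[Tuple[int, int], List[Fixup]] = {
    (0x1234, 0x0001): [invert_y],
    (0x1234, 0x0002): [swap_axes, invert_y],
}


def apply_quirks(vendor_id: int, product_id: int, motion: Motion) -> Motion:
    """Run every fix-up registered for this device, in order.

    Devices that follow the specification are not in the table
    and their data passes through unchanged.
    """
    for fixup in QUIRKS.get((vendor_id, product_id), []):
        motion = fixup(motion)
    return motion
```

Programs above this layer only ever see the corrected motion, which is what lets them keep asking the single question ‘did the mouse change position?’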

LIBRARIES

In computing, a library is a tool in the form of pre-written code that handles a specific task. Engineers use libraries to avoid reinventing the wheel every time they build an app. Greg shared an example of libinput: a Linux library for interacting with input devices like mice, touchpads, and graphic tablets.

If you find a bug in device drivers, how do you get the fix to the users?

For Linux, you fix the driver and send a change to the owner of the driver and the development community for that subsystem. The change is reviewed by the developers and accepted by the maintainers of that subsystem, and then sent on to the kernel maintainer for inclusion in the next release. When the fix shows up in a public release, it can be backported to older stable releases of Linux at the same time.

Fixes like this happen all the time. We are averaging about 20–40 fixes a day for Linux at the moment that are being backported to the stable kernels. This is in contrast to the main development cycle of Linux, which is averaging about 9 changes an hour every day, adding new features and functionality for new things that people come up with.

We connect devices to computers and to each other through USB, USB-C, Lightning, HDMI… why do we have so many interfaces? Is there much difference between them?

There are lots of differences between them at the hardware level, but in other ways they all seem to work the same way at the physical layer: they use differential signaling to transmit data across two wires at very high speeds. The data that is sent can be in a standard format (like the one that describes mice), or in a lower-level format that emulates another type of device, making it look like it is directly connected to the main system over an older style of connection (e.g. PCI).

New form factors are created all the time in order to provide for higher speeds, increased distances, lower power consumption, and different design goals. HDMI fits a very different specific need than USB-C does. Interfaces are not just created for fun but to solve real issues.
