Raspberry Pi GPIO Programming in C

The Raspberry Pi’s 40-pin GPIO connector often gets overlooked. Typical Pi projects use the hardware as a very small desktop PC (RetroPie, Pi-hole, media center, print server, etc), and don’t make any use of general-purpose IO pins. That’s too bad, because with a little bit of work, the Raspberry Pi can make a powerful physical computing device for many applications.

 
Raspberry Pi vs Arduino (and other microcontrollers)

Why would you want to use a Raspberry Pi instead of an Arduino or other microcontroller (STM32, ATSAM, PIC, Propeller)? There are loads of “Raspberry Pi vs Arduino” articles on the web, and in my view almost all of them miss the mark. The Pi is not a better, more powerful Arduino. It’s a completely different type of device, better at some tasks, but markedly worse at others.

The Pi is vastly more powerful than something like an Arduino Uno. The latest Pi 3 Model B+ has an 88x faster CPU clock and 500,000x more RAM than the Uno. It also runs a full-fledged Linux operating system, so it’s much easier to create projects involving high-level functions like networking or video processing. And you can connect a standard keyboard, mouse, and monitor, and use it as a normal computer.

But the Pi operating system is also a huge weakness in many applications. There’s no “instant on”, because it takes nearly a minute for the device to boot up. There’s no appliance-like shutoff either – the Pi must be cleanly shut down before power is turned off, or else the operating system files may get corrupted. And real-time bit twiddling of GPIO is mostly impossible, because the kernel may preempt your process at any moment, making precise timing unpredictable.

In theory it’s possible to do bare-metal programming on the Raspberry Pi, eliminating Linux and its related drawbacks for real-time applications. Unfortunately this doesn’t seem to be a common practice, and there’s not much information available about how to do it. So the Pi is probably best for those applications where you need some major CPU horsepower and some kind of GPIO connection to other sensors or equipment, but don’t need precise real-time behavior or microsecond-level accuracy.

 
GPIO and Python?

If you start Googling for “Raspberry Pi GPIO programming”, you’ll quickly discover that most of the examples use the Python language. In fact, this seems to be the most popular way by far to use GPIO on the Pi.

I have nothing against Python, but I’m a C programmer through and through, and the idea of using a high-level language for low-level digital interfaces is unappealing. By one measure, Python is over 300x slower at Raspberry Pi GPIO manipulation than plain C. I’m sure there are applications where it’s OK to throw away 99.7% of potential performance, but I’ll be sticking with C, thank you very much.

I spent a little time researching four different methods of Raspberry Pi GPIO manipulation in C. This involved reading documentation and data sheets, and examining the source code of various libraries. I haven’t yet tried writing any code using these methods, so take my impressions accordingly.

If any of the authors of these C libraries happen to read this – thank you for your work, and please don’t be offended by any criticisms I may make. I understand that creating an IO library necessarily involves many tradeoffs between simplicity, speed, flexibility, and ease of use, and not everyone will agree on the best path.

 
Direct Register Control – No Library

The GPIO pins on the Raspberry Pi can be directly accessed from C code, similarly to how it’s done on the ATMEGA or other microcontrollers. A few different memory-mapped control registers are used to configure the pins, and to read input and set output values. The only big difference is that the code must first call mmap() on /dev/mem or /dev/gpiomem, to ask the kernel to map the appropriate region of physical memory into the process’s virtual address space. If that means nothing to you, don’t worry about it. Just copy a couple of dozen lines of code into your program’s startup routine to do the mmap, and the rest is fairly easy.
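
As a rough idea of what that startup code might look like, here's a minimal sketch. The function name gpio_map() and the constant names are my own placeholders, not from any library. It assumes the device node (normally /dev/gpiomem) exposes the GPIO register block at offset 0, and that one 4096-byte page is enough to cover it. The word offsets come from the GPIO register table in the BCM2835 datasheet.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumption: a single page covers the whole GPIO register block. */
#define GPIO_BLOCK_SIZE 4096

/* Register positions as 32-bit word indexes from the start of the block
   (byte offsets 0x00, 0x1C, 0x28 and 0x34 in the datasheet). */
enum { GPFSEL0 = 0x00 / 4, GPSET0 = 0x1C / 4, GPCLR0 = 0x28 / 4, GPLEV0 = 0x34 / 4 };

/* Hypothetical helper: map the GPIO registers into this process.
   Returns a pointer to the first register, or NULL on failure. */
volatile uint32_t *gpio_map(const char *dev_path)
{
    assert(dev_path != NULL);

    int fd = open(dev_path, O_RDWR | O_SYNC);
    if (fd < 0)
        return NULL;

    void *p = mmap(NULL, GPIO_BLOCK_SIZE, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */

    return (p == MAP_FAILED) ? NULL : (volatile uint32_t *)p;
}
```

A program would call gpio_map("/dev/gpiomem") once at startup, check for NULL, and then index the returned pointer with the register offsets above.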

Here’s an example of reading the current value of GPIO 7:

if (gpio_lev & (1<<7)) {
  // pin is high
} else {
  // pin is low
}

Just test a bit at a particular memory address - that's it. This looks more-or-less exactly like reading GPIO values on any other microcontroller. gpio_lev is a memory-mapped register whose address was previously determined using the mmap() call during program initialization. See section 6 of the BCM2835 Peripherals Datasheet for details about the GPIO control registers.

Setting the output value of GPIO 7 is similarly easy:

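gpio_set = (1<<7); // sets pin high (write-only register: 1 bits set a pin, 0 bits are ignored)

gpio_clr = (1<<7); // sets pin low (write-only register: 1 bits clear a pin, 0 bits are ignored)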
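
To make the read and write cases concrete for any pin, here's a small sketch. The helper names gpio_read() and gpio_write() are hypothetical, and the gpio argument is assumed to point at the start of the mapped GPIO register block. Per the datasheet, the set and clear registers act only on the bits written as 1, so a plain store is enough and no read-modify-write is needed.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit word indexes of the set, clear and level registers (bank 0). */
enum { GPSET0 = 0x1C / 4, GPCLR0 = 0x28 / 4, GPLEV0 = 0x34 / 4 };

/* The BCM2835 has 54 GPIO lines, split across two 32-bit banks. */
#define GPIO_PIN_COUNT 54

/* Return 1 if the pin is high, 0 if it is low. */
static inline int gpio_read(volatile uint32_t *gpio, unsigned pin)
{
    assert(pin < GPIO_PIN_COUNT);
    return (gpio[GPLEV0 + pin / 32] >> (pin % 32)) & 1u;
}

/* Drive the pin high (level != 0) or low (level == 0).
   Only the single bit written as 1 has any effect on the hardware. */
static inline void gpio_write(volatile uint32_t *gpio, unsigned pin, int level)
{
    assert(pin < GPIO_PIN_COUNT);
    gpio[(level ? GPSET0 : GPCLR0) + pin / 32] = 1u << (pin % 32);
}
```

Because the helpers take the register base as a parameter, they can be exercised against an ordinary array on a desktop machine before being pointed at real hardware.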

Using other control registers, it's possible to enable pull-up and pull-down resistors, turn on special pin functions like SPI, and change the output drive strength.
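
Pin function selection is a good example of one of those other registers. In the datasheet's layout, each function select register holds ten pins with three bits per pin, where 000 means input, 001 means output, and the remaining codes choose alternate functions. Here's a sketch with a hypothetical helper name, again assuming gpio points at the start of the mapped register block. Unlike set and clear, this one really is a read-modify-write.

```c
#include <assert.h>
#include <stdint.h>

/* Function select codes from the datasheet (3 bits per pin). */
enum { FSEL_INPUT = 0, FSEL_OUTPUT = 1, FSEL_ALT0 = 4 };

/* Hypothetical helper: choose the function of one pin.
   The function select registers start at word 0 of the GPIO block. */
static inline void gpio_set_function(volatile uint32_t *gpio, unsigned pin, unsigned fsel)
{
    assert(pin < 54 && fsel < 8);

    unsigned reg = pin / 10;          /* ten pins per register */
    unsigned shift = (pin % 10) * 3;  /* three bits per pin */

    uint32_t v = gpio[reg];
    v &= ~(7u << shift);              /* clear the pin's old function */
    v |= (uint32_t)fsel << shift;     /* insert the new function */
    gpio[reg] = v;
}
```

A pin has to be switched to FSEL_OUTPUT this way before writes to the set and clear registers will drive it.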

Watch out for out-of-order memory accesses! The datasheet warns that the system doesn't always return data in order. This requires special precautions and the use of memory barrier instructions. For example:

a_status = *pointer_to_peripheral_a; 
b_status = *pointer_to_peripheral_b;

Without precautions, the values ending up in the variables a_status and b_status can be swapped. If I've understood the datasheet correctly, a similar risk exists for GPIO writes. Although data always arrives in order at a single destination, two different updates to two different peripherals may not be performed in the same order as the code. These out-of-order concerns were enough to discourage me from trying direct register IO with my programs.
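
For illustration, here's roughly what the precaution could look like: a pair of hypothetical wrappers that put a full memory barrier before and after each peripheral access, using the GCC/Clang builtin __sync_synchronize(). Whether a barrier around every access is actually required, or only when switching between peripherals, is something I'm assuming on the conservative side.

```c
#include <assert.h>
#include <stdint.h>

/* Read one peripheral register with a barrier on each side. */
static inline uint32_t peri_read(volatile uint32_t *addr)
{
    assert(addr != 0);
    __sync_synchronize();   /* finish earlier accesses first */
    uint32_t value = *addr;
    __sync_synchronize();   /* finish this read before later accesses */
    return value;
}

/* Write one peripheral register with a barrier on each side. */
static inline void peri_write(volatile uint32_t *addr, uint32_t value)
{
    assert(addr != 0);
    __sync_synchronize();
    *addr = value;
    __sync_synchronize();
}
```

With these, the two-peripheral example becomes a_status = peri_read(pointer_to_peripheral_a); followed by b_status = peri_read(pointer_to_peripheral_b);, at the cost of some extra cycles per access.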

 
WiringPi

WiringPi wraps the Raspberry Pi GPIO registers with an API that will look very familiar to Arduino users: digitalRead(pin), digitalWrite(pin, value). It's a C library, but third parties have added wrappers for Python and other high-level languages. From a casual search of the web, it looks like the most popular way to do Raspberry Pi GPIO programming in C.

WiringPi appears to be designed with flexibility in mind, at the expense of raw performance. Here's the implementation of digitalRead():

int digitalRead (int pin)
{
  char c ;
  struct wiringPiNodeStruct *node = wiringPiNodes ;

  if ((pin & PI_GPIO_MASK) == 0)		// On-Board Pin
  {
    /**/ if (wiringPiMode == WPI_MODE_GPIO_SYS)	// Sys mode
    {
      if (sysFds [pin] == -1)
	return LOW ;

      lseek  (sysFds [pin], 0L, SEEK_SET) ;
      read   (sysFds [pin], &c, 1) ;
      return (c == '0') ? LOW : HIGH ;
    }
    else if (wiringPiMode == WPI_MODE_PINS)
      pin = pinToGpio [pin] ;
    else if (wiringPiMode == WPI_MODE_PHYS)
      pin = physToGpio [pin] ;
    else if (wiringPiMode != WPI_MODE_GPIO)
      return LOW ;

    if ((*(gpio + gpioToGPLEV [pin]) & (1 << (pin & 31))) != 0)
      return HIGH ;
    else
      return LOW ;
  }
  else
  {
    if ((node = wiringPiFindNode (pin)) == NULL)
      return LOW ;
    return node->digitalRead (node, pin) ;
  }
}

That's a lot of code to accomplish what could be done by testing a bit at an address. To be fair, this code does a lot more, such as an option to access GPIO using sysfs (doesn't require root?) instead of memory-mapped registers, and pin number remapping. It also adds a concept of on-board and off-board pins, so that pins connected to external GPIO expanders can be controlled identically to pins on the Raspberry Pi board itself.
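
The sysfs path in that function boils down to rewinding a file descriptor and reading one character. Here's a standalone sketch of the same idea with hypothetical helper names. It assumes the pin has already been exported, so that a file such as /sys/class/gpio/gpio7/value exists under the kernel's legacy sysfs GPIO interface.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Open a sysfs GPIO value file, e.g. "/sys/class/gpio/gpio7/value".
   Returns a file descriptor, or -1 on failure. */
static inline int sysfs_gpio_open(const char *value_path)
{
    assert(value_path != NULL);
    return open(value_path, O_RDONLY);
}

/* Read the current level through an already-open descriptor.
   Returns 1 for high, 0 for low, or -1 on error. */
static inline int sysfs_gpio_read(int fd)
{
    char c;

    /* Rewind first: the file yields the pin state from offset 0. */
    if (lseek(fd, 0, SEEK_SET) < 0)
        return -1;
    if (read(fd, &c, 1) != 1)
        return -1;

    return (c == '0') ? 0 : 1;
}
```

Every read costs two system calls, which is why this route is far slower than testing a bit in a memory-mapped register.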

From a brief glance through the source code, I couldn't find any use of memory barriers. I'm not sure if the author determined that they're not necessary somehow, or if out-of-order reads and writes are a risk.

WiringPi also includes a command line program called gpio that can be used from scripts (or interactively). It won't be high-performance, but it looks like a great tool for testing, or for when you just need to switch on an LED or something else simple.

 
pigpio

pigpio is another GPIO library, and appears more geared towards simplicity and speed. And yes, it was quite a while before I recognized the name was Pi GPIO, and not Pig Pio. 🙂

Here's pigpio's implementation of gpioRead():

#define BANK (gpio>>5)
#define BIT  (1<<(gpio&0x1F))

int gpioRead(unsigned gpio)
{
   DBG(DBG_USER, "gpio=%d", gpio);

   CHECK_INITED;

   if (gpio > PI_MAX_GPIO)
      SOFT_ERROR(PI_BAD_GPIO, "bad gpio (%d)", gpio);

   if ((*(gpioReg + GPLEV0 + BANK) & BIT) != 0) return PI_ON;
   else                                         return PI_OFF;
}

Here there's no pin number remapping or other options. The function does some error checking to ensure the library is initialized and the pin number is valid, but otherwise it's just a direct test of the underlying register.

As with WiringPi, I did not see any use of memory barriers in the source code of pigpio.

 
bcm2835

bcm2835 is a third option for C programmers looking for a Raspberry Pi GPIO library. It appears to have the most thorough and well-written documentation, but also seems to be the least commonly used library of the three that I examined. This may be a result of its name, which is the name of the SoC used on the Raspberry Pi. It's somewhat difficult to find web discussion about this library, as opposed to the chip with the same name.

Like pigpio, bcm2835 appears more focused on providing a thin and fast interface to the Pi GPIO, without any extra options. Here's the implementation of bcm2835_gpio_lev(), the oddly-named read function:

uint32_t bcm2835_peri_read(volatile uint32_t* paddr)
{
    uint32_t ret;
    if (debug)
    {
		printf("bcm2835_peri_read  paddr %p\n", (void *) paddr);
		return 0;
    }
    else
    {
       __sync_synchronize();
       ret = *paddr;
       __sync_synchronize();
       return ret;
    }
}

uint8_t bcm2835_gpio_lev(uint8_t pin)
{
    volatile uint32_t* paddr = bcm2835_gpio + BCM2835_GPLEV0/4 + pin/32;
    uint8_t shift = pin % 32;
    uint32_t value = bcm2835_peri_read(paddr);
    return (value & (1 << shift)) ? HIGH : LOW;
}

The bit position is reduced to the range 0-31 with pin % 32, and pin / 32 selects which GPLEV register is read, but otherwise there's no error checking on the pin number. The actual read of the GPIO register is performed by a helper function that includes memory barriers before and after the read.

 
Impressions

For my purposes, I would probably choose pigpio or bcm2835, since I prefer a thin API over one with extra features I don't use. Of those two options, I'd tentatively choose bcm2835 due to the format of its documentation and its use of memory barriers. I wish I understood the out-of-order risk better, so I could evaluate whether the apparent absence of memory barriers in the other libraries is a bug or a feature.

Any analysis that looks at just a single API function is clearly incomplete - if you're planning to do Raspberry Pi GPIO programming, it's certainly worth a deeper look at the many other capabilities of these three libraries. For example, they differ in their support for handling interrupts, or byte-wide reads and writes, or special functions like SPI and hardware PWM.

Did I miss any other C programming options for Raspberry Pi GPIO, or overlook something else obvious? Leave a note in the comments.

20 Comments so far

  1.   - May 27th, 2018 8:49 pm

    About your first example of memory access order (“Watch out for out-of-order memory accesses!”): Are you sure that the contents of a_status and b_status can be swapped? Sure, the order of the actual accesses may be swapped, so peripheral_b may be read before peripheral_a, but if the contents got swapped, that would be completely broken IMHO.

    Another thing to watch out for is the ordering between read and write accesses, which is also not always guaranteed.

  2. Steve - May 27th, 2018 9:04 pm

    That example with a_status and b_status was copy-pasted directly from the Broadcom datasheet, page 7. I agree it seems very strange.

  3. Tim - May 28th, 2018 5:03 am

    Haven’t run across the swap problem as my projects tend to be small. I do like the “no library” or “custom library” approach as it minimizes what needs to be changed when porting code from (example) the RPi to a Cubietruck or BBB.

    One big annoyance is with the spec sheets themselves. Some are only available in Chinese or German. Some assume that you already know the shortcomings of the chip. If you’re working with someone’s breakout board, you have to dig to find out what parts of the spec sheet are N/A because the builder specifically hardwired them out (e.g., most of the available FM receiver breakout boards only support I2C (no SPI)). Etc.

  4. Steve - May 28th, 2018 8:00 am

    Yes, and speaking of spec sheets, I was surprised by the number of typos and general informality of the Broadcom datasheet for the BCM2835 peripherals (see link in the original post). Every other datasheet I’ve ever read was a very professional document, but this one is full of grammar errors like “each bank has its’ own interrupt line” and “it is theoretical possible”, as well as chatty side-comments like “Not a good idea!” I could understand if the grammar issues were English translations problems, but Broadcom is a US company, and the datasheet reads a bit like a hastily-written college term paper.

    As for datasheets that are only available in other languages, I’ve found that Google Translate does a surprisingly good job translating technical datasheets from Chinese. I was recently reading one such translated datasheet that kept referring to the “caterpillar effect”, which I assumed was some kind of amusing translation error. But in reality it’s a visual artifact of LED matrix refresh, and “caterpillar effect” is exactly what it’s called in English. Score Google Translate 1, me 0.

  5. asdf - May 28th, 2018 11:35 pm

    If peripherals are mmap’ed as regular memory (eg. via /dev/mem), you will get all the joys of caching and weak ordering. So memory accesses can indeed be reordered, you may end up reading/writing more memory locations than intended and so forth. There’s unfortunately no flag for mmap to map the memory as I/O (uncacheable, strongly ordered), you need a kernel driver for that. If there’s no existing driver, the Linux UIO driver lets you define a generic device in the device tree, without writing any code.

  6. John - May 29th, 2018 3:10 am

    “if the contents got swapped, that would be completely broken IMHO”

    I agree.

    Normally, out-of-order memory access just means that the example code might get peripheral a’s data from a later time than peripheral b’s, despite reading from it first. Most of the time that’s not a problem, so you don’t bother with barriers.

    But not in this case (if I’m reading the datasheet correctly, and it’s insane enough that I still have a small hope that it has just been badly translated). Peripheral access goes over some special bus, and accesses of different peripherals aren’t guaranteed to be in order. In the example, a_status might end up with peripheral b’s data.

    Data for a single peripheral stays in order, so as long as you stick with a single peripheral you’re OK. But access another, and you must have a barrier. And hope that the person who wrote any interrupt handlers that might be active at the time put barriers in too.

  7. Steve - May 29th, 2018 7:49 am

    This out-of-order GPIO stuff for RPi is fascinating and bewildering, and I’ve been reading more about it. Thanks to asdf and John May as well. Some tentative conclusions:

    1. The GPIO memory is not cached (at least not for reads). If it were, correct use of the GPIO pins would be impossible. All subsequent reads of a GPIO pin’s state would return the cached value from the first read, instead of the current pin state.

    2. The potential for out-of-order reads getting assigned to the wrong registers is real (the a_status and b_status example above), but exists only when reading from two different peripherals (like GPIO and a hardware Timer/Counter or UART). GPIO is a single peripheral, so out-of-order reads aren’t a problem and memory barriers aren’t generally necessary for code that only uses GPIO. Similarly, out-of-order writes aren’t a problem either when writing strictly to GPIO.

    My mental model (which could be totally wrong) is that each peripheral maintains a FIFO of read requests and write requests from the ARM core, and the FIFO entries are always handled in order. But the different peripherals have their own independent FIFOs, and there’s no guarantee of ordering across them. The ARM core might issue a read request to peripheral A, then a different read request to peripheral B, and eventually a result will be returned to the core, but it won’t know if the result came from A or B.

  8. Loïc - June 2nd, 2018 6:59 am

    A good intro to memory barriers and other troubles with compiler and concurrency is the following paper:
    What every systems programmer should know about concurrency, Matt Kline
    https://bitbashing.io/concurrency-primer.html

  9. Steve - June 2nd, 2018 11:10 am

    I must admit I still don’t understand it. That paper talks almost exclusively about problems arising from multi-threaded code. What I don’t understand is how such concurrency problems can occur in a single thread, running on a single core. Section 7 of the paper does mention “weakly ordered hardware” but does nothing to explain it at the hardware level, or to describe what problems might arise if memory barriers aren’t used.

    I can understand that if I write some single-threaded code like:

    int foo = *pFoo;
    int bar = *pBar;

    those two assignments might not happen in program order. Bar might actually get loaded from memory before Foo, but that doesn’t really matter. The same goes for writes, where Bar might be written before Foo, but single-threaded code doesn’t care.

    What I can’t understand is how the above code could produce the *wrong result*, with the value at pBar somehow getting assigned to the variable Foo. That’s what the Broadcom datasheet seems to say could happen, but I can’t visualize what kind of hardware design would make that possible. It just seems completely broken. Programming in such an environment seems like it would be virtually impossible, without having to surround every single statement with a memory barrier.

  10. Steven Clark - June 3rd, 2018 8:18 pm

    If you really need speed and access to memory mapped IO registers you should probably just build an upper portion of your application into a kernel module.
    mach/platform.h provides the GPIO_BASE address for whichever version of the SoC you’re using. And the high resolution timer interface should let you get by without figuring out how the interrupt vector’s being used for some applications. It’s certainly better than running on busy loops or jiffy-precision timers (unless you can get a DMA system driving your peripheral registers, I haven’t yet, maybe in the future).

    linux/miscdevice.h makes it easy to get one or more character devices in /dev

    Breaking the whole permission system to give a user space process access to all of memory kinda voids the whole idea of having an OS or security of any sort.

  11. Plank György - October 12th, 2018 6:17 am

    Last sentence at the end of the 7. page of the Broadcom datasheet:
    “The only time write data can arrive out-of-order is if two different peripherals are connected to the same external equipment.”

    That means to me that the case should be something like reading a sensor both by GPIO and USB. For me it doesn’t look like a risk I wouldn’t deal with in a non-industrial project.

  12. Jakob Stark - December 26th, 2018 5:25 am

    All these libraries (except wiringpi in the sys mode) use direct access to the memory mapped hardware registers of the soc, from USERSPACE. That’s not how things normally should work. The reason an operating system has got a kernel is to hide hardware specific details from userspace and provide (via device drivers) an api for programs and libraries running in userspace. While all the three libraries mentioned above may work, I would strongly recommend to use a gpio device driver (probably with a library). Fortunately there is a gpio driver for the BCM2835 in the Linux Kernel. The corresponding character device is located at /dev/gpiochip0. There is also a user space library in the debian package sources named libgpiod. This library simplifies the use of /dev/gpiochip0 and has a doxygen documentation available. The biggest advantage over the libraries mentioned above is, that it is platform and device independent. No matter if you use gpio pins of the bcm2835 or another gpio controller wired to the pi via e.g. i2c, as long as there is a device driver for your gpio chip it works out of the box.

  13. val - July 16th, 2019 1:27 pm

    Thank you for this excellent article. Got here trying to find something more efficient and thin to access RPi GPIO, more along the lines of the preferences you describe. However we often forget what was the main purpose of Raspberry Pi and why it is still left hanging in between worlds as a mainly educational/experimental/builder gadget only. If we want to experiment with it then everything goes. From direct access to Pi’s regs to canonical kernel drivers, everything is fine to play with as long as it fits your use case. I for one would still go with bcm2835, even now in the days of libgpiod.

    As for memory reordering, things are not that bad. In a few words the compiler will take care of “most” of the things for you. Your single threaded binary will be guaranteed to run as you expect. But if we want more efficiency and multithreading, and we get closer to the hardware, we will have to deal with reordering, be that the one the CPU does at runtime or the one that the compiler does during code optimization.

  14. Stéphane - August 24th, 2019 9:57 am

    Thank you for this article, and the links provided by everyone. Very useful.

  15. Petros - August 20th, 2020 1:44 am

    Hey, nice review. I understand that I come here a little late, but I want to know if these libraries can be used with gpio connected to pc with usb.

    I purchased an adafruit adapter usb to gpio/spi and I don’t know how to use it. None of these libraries seem to work since the programs return with errors from initializing functions

  16. Jakob Stark - August 20th, 2020 2:43 am

    @Petros all of the libraries described above are specifically tailored to use the raspberry pi hardware. Also the raspberry pi gpios can be controlled directly through memory registers. On your PC this will not work for sure as you are on a totally different system and your gpio adapter is connected via usb. If you want to use a usb gpio adapter (i suppose on linux) you probably have to install a usb driver for your specific adapter. If you are lucky such a driver may already be enabled in your kernel. Then you can use a userspace library like e.g. libgpiod to control your gpios.

  17. salty - October 7th, 2021 4:50 am

    I found your analysis of Arduino / RPi to be interesting. I’m working on an application where timing in the double or even low triple digit ms range is sufficient, but loss of control is unacceptable. I’m occasionally seeing an Arduino go out to lunch and I’m concerned that due to its single threaded nature there’s no way to mitigate that risk. While the linux kernel and the baggage it brings certainly adds latency concerns, the ability to have threads running watchdog functions in my use case is leading me to use RPi instead of Arduino for this project.

  18. Steve - October 7th, 2021 8:00 am

    The original blog post is from 2018, so the RPi GPIO options have probably changed since then. Whatever tool gets the job done is good enough, whether that’s RPi or Arduino or STM32 or something else. But I’d be surprised if your application “going out to lunch” is caused by its single thread; it should be the opposite. On a bare-metal platform like Arduino there’s no other code running besides your own application, so if it becomes bogged down or unresponsive then it must be due to an issue with your own code that you can troubleshoot and fix. Most microcontrollers also have a hardware watchdog peripheral that you can configure to reset the mcu if the code accidentally gets stuck in an infinite loop somewhere.

  19. Ian Bowden - February 9th, 2022 4:28 pm

    I just want to let you know that WiringPi is no more.
    I’ve used it in the past and found it to be a really well organized and adaptable interface. There were examples helping its use in C, Perl, Python and a host of other common languages.

    But the author found himself deluged with support requests from people who hadn’t taken the time to RTFM.

  20. Samuel A - March 8th, 2023 6:02 am

    I found a 100-line gist that’s faster and easier to use than all these options on average. It’s amazing how it’s even physically possible to overcomplicate 20 bits.

    https://gist.github.com/llbit/f030a6300cca746ef0777b31c6a231bc

    And half the file is a demo…
