Archive for the '3D Graphics Thingy' Category
An old project of mine may return from the dead. A few years ago, I started an ambitious project to build a 3D graphics processor using an FPGA. The goal was to create a simple version of the GPU you might find in your PC or video game console – something able to work in tandem with a standard CPU, handling the 3D transform math and pixel rasterization needed to draw pictures of spinning cubes and teapots and things. At the time I was doing a lot of 3D software programming for my job as a game developer, so the idea of building 3D hardware was exciting.
Unfortunately, 3DGT quickly turned into How to Shrink Memory Bandwidth Requirements, How to Build a High Performance DDR2 Memory Controller, and then How to Debug the Xilinx Development Tools, none of which were any fun. I eventually gave up, without ever getting to any of the interesting graphics-related stuff.
Yesterday I happened to re-read my summary of the project, which concluded “Lesson learned: start with a small project and add to it incrementally, instead of beginning with grandiose plans.” That started me thinking about what kind of small and simple graphics system I could build quickly, in order to get something working I could iterate on and improve. Almost all my difficulties with 3DGT were related to DDR2 RAM and the RAM bandwidth requirements, so if I could avoid those problems, I’d be in good shape. The solution seemed simple: use standard SRAM, and shrink the frame buffer size and color bits per pixel, until the memory bandwidth requirements are reduced to an acceptable level.
For this “Son of 3D Graphics Thingy”, I’m envisioning something like this:
- 512KB of SRAM used for video memory, with a 16-bit wide memory interface
- 20 MHz system clock rate
- 8 bits per pixel, indexed color
- One 640 x 480 frame buffer, or two 546 x 480 frame buffers with double-buffering
- VGA output
- No depth buffer (Z buffer)
- Rasterization only; no 3D transform math
This is a much more modest goal than the original 3D Graphics Thingy. Without a depth buffer or 3D transform support, it’s really more of a 2D triangle rasterizer coprocessor than a 3D GPU. The CPU will be responsible for doing the 3D matrix transformations in software, and drawing objects in back-to-front order to ensure proper depth sorting. It won’t compete with the GeForce, but if I recall correctly, it’s very similar to how the original 1995 PlayStation worked.
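Here is a rough sketch of what that back-to-front ordering could look like on the CPU side. It's only an illustration: the `Triangle` class, the `average_depth` sort key, and the convention that larger z means farther away are all my own assumptions, not part of any existing design. Sorting by average depth is the simplest possible approach, and it can get the order wrong when triangles intersect or overlap cyclically.

```python
from dataclasses import dataclass

@dataclass
class Triangle:
    # Hypothetical screen-space triangle: three (x, y, z) vertices,
    # where a larger z is assumed to mean farther from the viewer.
    vertices: tuple
    color_index: int  # 8-bit palette index

def average_depth(tri):
    """Mean z of the three vertices, used as the sort key."""
    return sum(v[2] for v in tri.vertices) / 3.0

def back_to_front(triangles):
    """Return the triangles farthest first, so nearer ones overwrite them."""
    return sorted(triangles, key=average_depth, reverse=True)

near = Triangle(((0, 0, 1.0), (10, 0, 1.0), (0, 10, 1.0)), color_index=3)
middle = Triangle(((0, 0, 5.0), (10, 0, 5.0), (0, 10, 5.0)), color_index=5)
far = Triangle(((0, 0, 9.0), (10, 0, 9.0), (0, 10, 9.0)), color_index=7)

draw_order = back_to_front([near, far, middle])
print([t.color_index for t in draw_order])
```

The CPU would then hand the triangles to the rasterizer in `draw_order`, with no depth buffer involved.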
A 16-bit memory interface running at 20 MHz has a max theoretical throughput of 40 MB/s. So what can we do with that? Let’s assume each pixel is cleared to black at the start of each video frame. Then the pixel is written to four times, by four overlapping triangles (the scene’s depth complexity is 4). Finally the pixel is read by the VGA circuit, to generate the video signal output. That’s 6 memory operations per pixel, at 8 bits per pixel (one byte), so 6 bytes per pixel per frame.
Assuming a 640 x 480 frame buffer, each frame will involve 640 x 480 x 6 = 1.84 MB of memory I/O. Dividing that into the 40 MB/s of available video memory bandwidth results in a top speed of 40 / 1.84 = 21.7 frames per second. With only a single buffer, you’ll see screen tearing while objects are animating, which isn’t ideal, but it works.
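The same budget can be written as a few lines of code, which makes it easy to try other clock speeds or depth complexities. This is just a sketch: the function name and its parameters are made up for illustration, and it assumes the memory bus is perfectly utilized, with every 16-bit transfer carrying two useful bytes.

```python
def max_frames_per_second(width, height, bus_bits=16, clock_hz=20_000_000,
                          depth_complexity=4, bytes_per_pixel=1):
    """Upper bound on frame rate when memory bandwidth is the only limit."""
    # Peak bandwidth: one full-width transfer per clock, in bytes per second.
    bandwidth = (bus_bits // 8) * clock_hz
    # One clear, depth_complexity triangle writes, and one VGA scan-out read.
    accesses_per_pixel = 1 + depth_complexity + 1
    bytes_per_frame = width * height * accesses_per_pixel * bytes_per_pixel
    return bandwidth / bytes_per_frame

fps_640 = max_frames_per_second(640, 480)
print(f"640 x 480: {fps_640:.1f} frames per second")
```

With both the bandwidth and the per-frame traffic counted in decimal megabytes, this comes to about 21.7 frames per second. Counting the per-frame traffic in binary megabytes (2^20 bytes) while leaving the bandwidth in decimal ones nudges the figure up by roughly 5%, so it pays to keep the units consistent.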
Plan B is to use two 546 x 480 buffers, and draw objects into one buffer while the VGA circuit generates a video signal from the other buffer. This rather strange frame buffer size was chosen because two buffers fit almost exactly into 512 KB. Probably the VGA circuit will add black bars on the left and right of the image, pillarboxing the 546 x 480 image inside a standard 640 x 480 video signal. With a 546 x 480 frame buffer, each frame will involve 546 x 480 x 6 = 1.57 MB of memory I/O, resulting in a top speed of 40 / 1.57 = 25.4 frames per second. 25.4 FPS isn’t exactly speedy, but it’s fast enough to draw animated objects. And thanks to double-buffering, you won’t see any screen tearing.
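For anyone wondering where 546 comes from, here's a quick sketch of the sizing arithmetic. The constant and function names are mine, and splitting the black bars evenly between the left and right sides is an assumption.

```python
SRAM_BYTES = 512 * 1024   # 512 KB of video memory
HEIGHT = 480              # visible lines
VGA_WIDTH = 640           # pixels per line in the video signal

def widest_buffer(num_buffers, sram_bytes=SRAM_BYTES, height=HEIGHT):
    """Widest frame buffer, at one byte per pixel, such that num_buffers fit."""
    return sram_bytes // (num_buffers * height)

width = widest_buffer(2)
spare_bytes = SRAM_BYTES - 2 * width * HEIGHT
bar_width = (VGA_WIDTH - width) // 2   # black bar on each side, in pixels

print(width, spare_bytes, bar_width)
```

Two 546 x 480 buffers use all but 128 bytes of the SRAM, and centering the image leaves a 47-pixel black bar on each side.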
Now I need to design a board with a CPU, an FPGA, and some SRAM, right? Actually, no I don’t. The Altera DE1 board that I used for Plus Too already has everything I need. Its FPGA is large enough to implement both a soft-core CPU and the custom graphics core, and it’s got 512 KB of SRAM with a 16-bit wide interface. The SRAM has a 10 ns access time, so better performance than I described above is possible if I can boost the clock speed above 20 MHz. And the board also has 8 MB of SDRAM, if I ever get brave enough to make another attempt at writing a memory controller. It looks like other people already have working examples of SDRAM controllers for the DE1, so maybe it wouldn’t be that bad.
So that’s the plan. I’m not expecting to start building this tomorrow – I still have my Nibbler CPU project to finish, and other projects I’d like to pursue – but it’s an interesting idea. My problem is too many ideas, too little time!
Maybe 3D Graphics Thingy isn’t dead after all. One of my goals for the creation of Tiny CPU was to get more comfortable with programmable logic design, and custom circuit board development, and I’ve certainly done that. Once Tiny CPU is done, I think I’m going to revisit graphics in the form of “2D Graphics Thingy”. My plan would be to work through the memory interface problems that stumped me the first time, and create something like a programmable 2D blitter for a VGA frame buffer. That seems like a manageable project I could expect to succeed at, while also being a step along the way toward the ultimate goal of 3DGT.
OK, it’s time to admit defeat. 3D Graphics Thingy is not going to happen. It’s been six months since I worked on it. Heck, I even let my web hosting account expire due to neglect.
So what happened? I ran hard into the memory interface wall. Getting a decent DRAM controller working proved to be far, far more difficult than I’d expected, even with the assistance of Xilinx wizards and prebuilt controller packages. And since getting a working memory interface is a precondition to actually doing any of the 3D stuff, well, that sure put a damper on things.
A second reason for failure is that I found working with FPGAs to be abstract and unsatisfying, and the tool software to be a nightmare. When I built BMOW, I was constantly wiring things, debugging with the oscilloscope, buying new chips, soldering switches, and generally being hands-on. In contrast, 3DGT development ended up being nothing but writing Verilog in a text editor, and wondering why the Xilinx synthesis tools never did what I expected them to. The FPGA hardware itself just sat, untouched.
So what’s next? Since last summer, I haven’t done any electronics work at all, except building a light saber from a string of Christmas lights and a fluorescent tube cover. I’ve gotten pretty involved in remote control vehicles, primarily RC planes, which give a few excuses to solder and build simple circuits. I have half an idea to use an Arduino with my Slow Stick somehow, to collect acceleration data in flight, or automate aerial photography or something.
Maybe I’ll come back to the CPU design thing again at some point. I still have a 68008 and some other parts I bought last year that I never got to use, so those are still waiting for me. For all those who contacted me asking if they could build something like BMOW or 3DGT, or asking for advice, send me a note and let me know how your projects are progressing now.
Happy hacking wishes to you all!
I think I’m making life more difficult than it needs to be, trying to get this DDR2 SDRAM interface to work. It’s not that the logical interface is so complicated, really… you set your row and column addresses, do a burst transaction, check for refresh… not trivial, but not rocket science either. And the Xilinx MIG or other vendor-specific wizard will generate a memory interface for you to use as a starting point.
No, what seems to be difficult is that the margin for error with DDR2 SDRAM is much smaller than with SRAM or plain (single data rate) SDRAM. The voltages are lower, the timing tolerances are tighter, and much more care must be given to compensating for things like possible skew, process variation between different FPGAs, power supply tolerances, and a host of other worries.
I’ve been reading a LOT on this topic in the past couple of weeks, and I’ve been struck by one thing. Except for my Xilinx Spartan 3A starter board, and Altera’s comparable Cyclone III board, I’ve seen zero boards that use DDR or DDR2 memory. The others all use plain SDR SDRAM, also known as PC100 or PC133 depending on the speed. I looked at boards in the $150 to $300 range from Opal Kelly, KNJN, XESS, and others, and they all use plain SDR SDRAM. Maybe I should take a hint?
Meanwhile, I’ve been digesting as much FPGA documentation as I can. So far I’ve chewed through about 1500 pages of the Xilinx MIG user manual, Spartan 3 series user manual, and Spartan 3A addendum, and I’m midway through the comprehensive book FPGA Prototyping by Verilog Examples: Xilinx Spartan-3 Version. It’s the best “getting started” reference I’ve seen yet, with good coverage of Verilog, FPGA hardware, and the Xilinx software tools.
Finally, some small progress on the memory interface. After banging my head every which way against the Xilinx tools, and reading everything I could find on the subject, I came across Leo Silvestri’s page on modifying the Xilinx MIG memory controller design for a Spartan 3E board. It’s for a different kit and an older version of the software, but with his help I was finally able to build the reference design and testbench for the Spartan 3A board, program it to the FPGA, and see the LED that indicates success. It’s not very exciting, but it’s progress.
I still can’t believe all the steps I went through, and the whole process has made me quite bitter about Xilinx’s software tools. I’m sure it would be easier if I had better general knowledge of this field, but the last few weeks of this project have been like being lost at sea, and totally disoriented. It still feels more like a series of disconnected guesses than a genuine understanding, but here’s what I’ve managed to piece together on the topic of using the DDR2 SDRAM that’s on the Spartan 3A kit board.
- The Xilinx MIG can’t be used to generate a new memory controller design for the Spartan 3A board. This is because the way the SDRAM on the board is connected to the FPGA pins violates some of the MIG design rules. The only solution is to use the pre-built Spartan 3A board reference controller design, which then locks you into a specific burst length and CAS latency, or to hand-modify the code generated by the MIG, which is way beyond the skills of a noob like me.
- Using the newest version of the Xilinx ISE and MIG, attempting to add the Spartan 3A reference design to your project will cause a crash. No answer from Xilinx support on this.
- You can also get the Spartan 3A reference design as a zip file. But if you unzip it, add all the files to a new ISE project, and try to build it, you’ll get lots of errors about nonexistent nets that I couldn’t resolve.
- There’s also a batch file in the zip file that will create a new ISE project for you. But try to build it, and you’ll be told that the design requires a license for ChipScopePro, which is Xilinx’s software logic analyzer. I found a discussion of this on the Xilinx forums, but no resolution other than to create a new controller design that omits ChipScopePro support, which is impossible for this board due to the first issue in this list.
- What finally worked was to hand-edit the reference design, deleting parts of it semi-randomly until the ChipScopePro error disappeared. It turned out that required removing three modules called icon, ila, and vio, none of which seemed obviously related to debugging to me.
So there you have it. The next step will be to begin to actually use this interface for something more interesting than lighting up an LED. I’m just now realizing that the interface created by the MIG is just the first, small step towards what the 3DGT memory controller must eventually become. It’s not enough to simply have an interface that permits reading and writing. To achieve half-way decent performance, much care will be required to manage and coordinate those reads and writes, minimizing waiting and wasted time, and maximizing throughput. And to top it off, it’s going to need a bus master to arbitrate memory access between the display circuit, pixel processors, vertex processors, and any other consumers of memory. All this is a substantial project in itself, that will need to be at least partially completed before any real progress can begin on the 3D part of 3DGT. Looks like a long, slow climb, but I’m moving ahead.
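To make the arbitration idea a little more concrete, here's a small behavioral model of one possible scheme, a fixed-priority arbiter. It's a sketch, not a design: the client names and the priority order are assumptions, and the real thing would be written in Verilog rather than Python. I've put the display first on the theory that a late scan-out read produces a visible glitch, while the other clients can simply wait.

```python
# Hypothetical memory clients, highest priority first.
PRIORITY = ("display", "pixel", "vertex")

def arbitrate(requests):
    """Return the highest-priority client that is requesting, or None."""
    for client in PRIORITY:
        if client in requests:
            return client
    return None

def simulate(requests_per_cycle):
    """Grant the memory to one client per cycle, given each cycle's requests."""
    return [arbitrate(requests) for requests in requests_per_cycle]

grants = simulate([
    {"pixel", "vertex"},    # display idle: pixel processor wins
    {"display", "pixel"},   # display always wins when it asks
    {"vertex"},             # only one requester
    set(),                  # nobody wants the bus
])
print(grants)
```

A fixed priority order is simple, but a busy high-priority client can starve the ones below it. A round-robin or time-sliced scheme would trade some of that simplicity for fairness.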
I think I’m about ready to crush this Xilinx starter kit under my boot, and use the pulverized component dust to scrub my toilet. That’s not quite fair, though, as my frustration isn’t really with the hardware, but with the inexplicable Xilinx software. At this point, I’ve spent about 20 hours over a couple of weeks, just trying to instantiate the sample Xilinx SDRAM memory controller. I’m amazed that something so central to the use of a Xilinx FPGA or starter kit could be so obtuse. Or maybe it’s me that’s obtuse, but regardless, I was never so exasperated in all the time I was working on BMOW. Back then, at least each piece of hardware was small and understandable, and any errors were of my own making. Now I’m spending hour upon hour attempting to decode the error messages from Xilinx’s software, and trying to guess at how they intended this process to work. I expected something like:
- Create new project
- Run “memory interface generator” wizard (which Xilinx calls the M.I.G.)
- Choose memory type, speed, etc.
- The wizard adds some auto-generated .v and .ucf (user constraints) files to my project
- Optionally, wizard also adds a test bench, or some kind of example
- Synthesize the example, program it to the starter kit, and blink some LEDs to show that it worked.
That was the theory anyway. The reality has been a long series of software errors and omissions too dull to recount in detail. The short version is that when I use the MIG to generate an interface specifically for the Spartan 3A starter kit, the MIG crashes. If I follow some hazy instructions for manually adding the reference design to the project without using the MIG, then I get something that fails the “translate” step. If I use the MIG to generate a new interface design for a board that just happens to have the same hardware as the Spartan 3A starter kit, I also get something that fails the “translate” step. In either case, before the fatal errors, there are many warnings saying that dozens of flip-flops were determined to have a constant 0 or 1 value, and so were optimized away, as well as copious other warnings. Clearly I’m doing something very wrong, but creating a sample design using the reference memory interface on the reference board seems like it should be about as simple a case as it’s possible to get.
I would have given up on it a while ago, except that with no memory interface, there can be no 3D Graphics Thingy. This simply must be made to work in order for the project to progress any further. Unfortunately I’m about out of ideas. I need to find a simple walk-through tutorial that starts with “open ISE, press the New Project button” and finishes with happy green checkmarks next to all the steps in the processes window for an example design using the MIG controller. There are only about 10 mouse clicks needed between that start and finish, so it would seem hard to mess it up much. Either I’m doing something basic wrong, or omitting something, or my computer is haunted. With luck, it will become clear tomorrow.