Add "rendering overview" document #501

Merged
merged 8 commits into from
Sep 24, 2023
Changes from 7 commits
85 changes: 85 additions & 0 deletions src/Graphics.md
@@ -0,0 +1,85 @@
# Graphics Overview

The Game Boy outputs graphics to a 160×144 pixel LCD, using a quite complex
mechanism to facilitate rendering.

::: warning Terminology

Sprites/graphics terminology can vary a lot among different platforms, consoles,
users and communities. You may be familiar with slightly different definitions.
Also keep in mind that some definitions refer to lower-level (hardware) tools,
and others to higher-level abstractions.

:::

## Tiles

Similarly to other retro systems, pixels are not manipulated
individually, as this would be expensive CPU-wise. Instead, pixels are grouped
in 8×8 squares called _tiles_ (or sometimes "patterns" or "characters"), often considered
the base unit in Game Boy graphics.

A tile does not encode color information. Instead, a tile assigns a
_color ID_ to each of its pixels, ranging from 0 to 3. For this reason,
Game Boy graphics are also called _2bpp_ (2 bits per pixel). When a tile is used
in the Background or Window, these color IDs are associated with a _palette_. When
a tile is used in an object, the IDs 1 to 3 are associated with a palette, but
ID 0 means transparent.
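
As a rough illustration of what "2 bits per pixel" means in practice, the sketch below decodes one 8-pixel tile row into color IDs. It assumes the layout described on the Tile Data page (two bytes per row, the first holding the low bit of each color ID and the second the high bit, with bit 7 as the leftmost pixel); the function name `decode_tile_row` is made up for this example.

```python
def decode_tile_row(low_byte: int, high_byte: int) -> list[int]:
    """Return the 8 color IDs (0-3) of one tile row, leftmost pixel first."""
    color_ids = []
    for bit in range(7, -1, -1):  # bit 7 is the leftmost pixel
        low = (low_byte >> bit) & 1
        high = (high_byte >> bit) & 1
        color_ids.append((high << 1) | low)
    return color_ids


print(decode_tile_row(0x3C, 0x7E))  # [0, 2, 3, 3, 3, 3, 2, 0]
```

A full tile is 8 such rows, so 16 bytes in total.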

## Palettes

A palette consists of an array of colors, 4 in the Game Boy's case.
Palettes are stored differently in monochrome and color versions of the console.

Modifying palettes enables graphical effects such as quickly flashing some graphics (damage,
invulnerability, thunderstorm, etc.), fading the screen, "palette swaps", and more.
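
To make the palette lookup concrete, here is a minimal sketch for the monochrome case. It assumes the `BGP` register layout (four 2-bit fields, with bits 1-0 giving the shade for color ID 0 and bits 7-6 the shade for color ID 3); `apply_bgp` is a hypothetical helper, and the color models store their palettes differently.

```python
def apply_bgp(bgp: int, color_id: int) -> int:
    """Map a color ID (0-3) to a shade (0 = lightest, 3 = darkest) through BGP."""
    return (bgp >> (color_id * 2)) & 0b11


identity = 0xE4  # 0b11_10_01_00: each color ID maps to the same-numbered shade
inverted = 0x1B  # 0b00_01_10_11: shades reversed, e.g. for a "flash" effect

print([apply_bgp(identity, i) for i in range(4)])  # [0, 1, 2, 3]
print([apply_bgp(inverted, i) for i in range(4)])  # [3, 2, 1, 0]
```

Changing one register value recolors everything drawn through that palette without touching any tile data, which is what makes the effects above cheap.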

## Layers

The Game Boy has three "layers", from back to front: the Background, the Window,
and the Objects. Some features and behaviors break this abstraction,
but it works for the most part.

### Background

The background is composed of a _tilemap_. A tilemap is a
large grid of tiles. However, tiles aren't written directly to tilemaps;
tilemaps merely contain references to the tiles.
This makes reusing tiles very cheap, both in CPU time and in
required memory space, and it is the main mechanism that helps work around the
paltry 8 KiB of video RAM.

The background can be made to scroll as a whole by writing to two
hardware registers. This makes scrolling very cheap.
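
The following sketch shows how a screen pixel could be traced back to a tilemap entry under scrolling. It assumes a 32×32-tile (256×256-pixel) tilemap that wraps around, with the two scroll registers being `SCX` and `SCY`; `tilemap_index` is a name invented for this example, not a hardware concept.

```python
TILE_SIZE_PX = 8
TILEMAP_SIZE_TILES = 32
MAP_SIZE_PX = TILE_SIZE_PX * TILEMAP_SIZE_TILES  # 256


def tilemap_index(screen_x: int, screen_y: int, scx: int, scy: int) -> int:
    """Index of the tilemap entry shown at a given screen pixel."""
    bg_x = (screen_x + scx) % MAP_SIZE_PX  # the background wraps horizontally
    bg_y = (screen_y + scy) % MAP_SIZE_PX  # ...and vertically
    return (bg_y // TILE_SIZE_PX) * TILEMAP_SIZE_TILES + bg_x // TILE_SIZE_PX


print(tilemap_index(0, 0, scx=0, scy=0))      # 0
print(tilemap_index(159, 143, scx=0, scy=0))  # 563
print(tilemap_index(8, 0, scx=250, scy=0))    # 0 (wrapped around)
```

Under these assumptions the tilemap holds 1024 one-byte references, whereas a 256×256 bitmap at 2 bits per pixel would need 16 KiB, more than all of video RAM.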

### Window

The window is sort of a second background layer on top of the background.
It is fairly limited: it has no transparency, it's always a
rectangle and only the position of the top-left pixel can be controlled.

Possible uses include a fixed status bar in an otherwise scrolling game (e.g.
_Super Mario Land 2_).

### Objects

The background layer is useful for elements scrolling as a whole, but
it's impractical for objects that need to move separately, such as the player.

The _objects_ layer is designed to fill this gap: _objects_ are made of 1 or 2 stacked tiles (8×8 or 8×16 pixels)
and can be displayed anywhere on the screen.

::: tip NOTE

Several objects can be combined (such combinations can be called _metasprites_) to draw
a larger graphical element, usually called a "sprite". Originally, the term "sprites"
referred to fixed-sized objects composited together, by hardware, with a background.
Use of the term has since become more general.

:::

To summarise:

- **Tile**, an 8×8-pixel chunk of graphics.
- **Object**, an entry in object attribute memory, composed of 1 or 2
tiles. Can be moved independently of the background.
105 changes: 46 additions & 59 deletions src/Rendering.md
@@ -1,85 +1,72 @@
# Rendering Overview
# Rendering overview

The Game Boy outputs graphics to a 160×144 pixel LCD, using a quite complex
mechanism to facilitate rendering.
## Terminology

::: warning Terminology
The entire frame is not drawn atomically; instead, the image is drawn by the **<abbr>PPU</abbr>** (Pixel-Processing Unit) progressively, **directly to the screen**.
A frame consists of 154 **scanlines**; during the first 144, the screen is drawn top to bottom, left to right.

Sprites/graphics terminology can vary a lot among different platforms, consoles,
users and communities. You may be familiar with slightly different definitions.
Keep also in mind that some definitions refer to lower (hardware) tools
and some others to higher abstractions concepts.
The main implication of this rendering process is the existence of **raster effects**: modifying some rendering parameters in the middle of rendering.
The most famous raster effect is modifying the [scrolling registers](<#LCD Position and Scrolling>) between scanlines to create a ["wavy" effect](https://gbdev.io/guides/deadcscroll#effects).

:::
A "**dot**" = one 2<sup>22</sup> Hz (≅ 4.194 MHz) time unit.
Dots remain the same regardless of whether the CPU is in [double speed](<#FF4D — KEY1 (CGB Mode only): Prepare speed switch>), so there are 4 dots per single-speed CPU cycle, and 2 per double-speed CPU cycle.
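
A small sketch of the dot/CPU-cycle relationship stated above; the helper names are hypothetical, and "CPU cycle" is taken in the same sense as in the sentence above (4 dots at single speed, 2 at double speed).

```python
DOT_CLOCK_HZ = 2 ** 22  # 4_194_304 dots per second, in either speed mode


def dots_per_cpu_cycle(double_speed: bool) -> int:
    """The dot clock is fixed; only the CPU gets faster in double speed."""
    return 2 if double_speed else 4


def cpu_cycles_in(dots: int, double_speed: bool = False) -> int:
    """Whole CPU cycles that fit in a given number of dots."""
    return dots // dots_per_cpu_cycle(double_speed)


print(cpu_cycles_in(80))                     # 20
print(cpu_cycles_in(80, double_speed=True))  # 40
```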

## Tiles
::: tip

Similarly to other retro systems, pixels are not manipulated
individually, as this would be expensive CPU-wise. Instead, pixels are grouped
in 8×8 squares, called _tiles_ (or sometimes "patterns" or "characters"), often considered as
the base unit in Game Boy graphics.
Note that a frame is not exactly one 60<sup>th</sup> of a second: the Game Boy runs slightly slower than 60 Hz, as one frame takes ~16.74 ms instead of ~16.67 (the error is 0.45%).

A tile does not encode color information. Instead, a tile assigns a
_color ID_ to each of its pixels, ranging from 0 to 3. For this reason,
Game Boy graphics are also called _2bpp_ (2 bits per pixel). When a tile is used
in the Background or Window, these color IDs are associated with a _palette_. When
a tile is used in an object, the IDs 1 to 3 are associated with a palette, but
ID 0 means transparent.
:::
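
The figures in this note can be re-derived from the dot clock. The sketch below assumes 456 dots per scanline (the mode table counts 10 scanlines as 4560 dots) and 154 scanlines per frame; the constant names are chosen for this example only.

```python
DOT_CLOCK_HZ = 2 ** 22
DOTS_PER_SCANLINE = 456
SCANLINES_PER_FRAME = 154

DOTS_PER_FRAME = DOTS_PER_SCANLINE * SCANLINES_PER_FRAME  # 70224
frame_ms = DOTS_PER_FRAME / DOT_CLOCK_HZ * 1000
frames_per_second = DOT_CLOCK_HZ / DOTS_PER_FRAME
relative_error = frame_ms / (1000 / 60) - 1  # compared with an exact 60 Hz frame

print(f"{frame_ms:.2f} ms per frame")   # 16.74 ms per frame
print(f"{frames_per_second:.2f} Hz")    # 59.73 Hz
print(f"{relative_error:.3%} slower")   # 0.456% slower
```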

## Palettes
## PPU modes

A palette consists of an array of colors, 4 in the Game Boy's case.
Palettes are stored differently in monochrome and color versions of the console.
<figure><figcaption>

Modifying palettes enables graphical effects such as quickly flashing some graphics (damage,
invulnerability, thunderstorm, etc.), fading the screen, "palette swaps", and more.
During a frame, the Game Boy's PPU cycles between four modes as follows:

## Layers
</figcaption>

The Game Boy has three "layers", from back to front: the Background, the Window,
and the Objects. Some features and behaviors break this abstraction,
but it works for the most part.
{{#include imgs/ppu_modes_timing.svg:2:}}

### Background
</figure>

The background is composed of a _tilemap_. A tilemap is a
large grid of tiles. However, tiles aren't directly written to tilemaps,
they merely contain references to the tiles.
This makes reusing tiles very cheap, both in CPU time and in
required memory space, and it is the main mechanism that helps work around the
paltry 8 KiB of video RAM.
While the PPU is accessing some video-related memory, [that memory is inaccessible to the CPU](<#Accessing VRAM and OAM>) (writes are ignored, and reads return garbage values, usually $FF).

The background can be made to scroll as a whole, writing to two
hardware registers. This makes scrolling very cheap.
Mode | Action | Duration | Accessible video memory
----:|--------------------------------------------|--------------------------------------|-------------------------
2 | Searching for OBJs which overlap this line | 80 dots | VRAM, CGB palettes
3 | Sending pixels to the LCD | Between 172 and 289 dots, see below | None
0 | Waiting until the end of the scanline | 376 - mode 3's duration | VRAM, OAM, CGB palettes
1 | Waiting until the next frame | 4560 dots (10 scanlines) | VRAM, OAM, CGB palettes
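
As a cross-check of the durations in the table, the sketch below derives the Mode 0 length from a given Mode 3 length and confirms that the modes add up to a whole frame. It assumes a 456-dot scanline (4560 dots / 10 scanlines); `mode0_dots` is a hypothetical helper.

```python
SCANLINE_DOTS = 456
MODE2_DOTS = 80
MODE3_MIN_DOTS, MODE3_MAX_DOTS = 172, 289
VISIBLE_SCANLINES = 144
MODE1_DOTS = 4560


def mode0_dots(mode3_dots: int) -> int:
    """Dots left for Mode 0 once Modes 2 and 3 have run on a scanline."""
    if not MODE3_MIN_DOTS <= mode3_dots <= MODE3_MAX_DOTS:
        raise ValueError("Mode 3 length outside the range given in the table")
    return SCANLINE_DOTS - MODE2_DOTS - mode3_dots  # i.e. 376 - Mode 3


print(mode0_dots(172))  # 204
print(mode0_dots(289))  # 87
print(VISIBLE_SCANLINES * SCANLINE_DOTS + MODE1_DOTS)  # 70224 dots per frame
```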

### Window
## Mode 3 length

The window is sort of a second background layer on top of the background.
It is fairly limited: it has no transparency, it's always a
rectangle and only the position of the top-left pixel can be controlled.
During Mode 3, by default the PPU outputs one pixel to the screen per dot, from left to right; the screen is 160 pixels wide, so the minimum Mode 3 length is 160 + 12[^first12] = 172 dots.

Possible usage include a fixed status bar in an otherwise scrolling game (e.g.
_Super Mario Land 2_).
Unlike most game consoles, the Game Boy does not always output pixels steadily[^crt]: some features cause the rendering process to stall for a couple of dots.
Any extra time spent stalling *lengthens* Mode 3; since scanlines last for a fixed number of dots, Mode 0 is shortened by that same amount of time.

### Objects
Three things can cause Mode 3 "penalties":

The background layer is useful for elements scrolling as a whole, but
it's impractical for objects that need to move separately, such as the player.
- **Background scrolling**: At the very beginning of Mode 3, rendering is paused for [`SCX`](<#FF42–FF43 — SCY, SCX: Viewport Y position, X position>) % 8 dots while the same number of pixels are discarded from the leftmost tile.
- **Window**: After the last non-window pixel is emitted, a 6-dot penalty is incurred while the BG fetcher is being set up for the window.
- **Objects**: Each object drawn during the scanline (even partially) incurs a 6- to 11-dot penalty ([see below](<#OBJ penalty algorithm>)).

The _objects_ layer is designed to fill this gap: _objects_ are made of 1 or 2 stacked tiles (8×8 or 8×16 pixels)
and can be displayed anywhere on the screen.
On DMG and GBC in DMG mode, mid-scanline writes to [`BGP`](<#FF47 — BGP (Non-CGB Mode only): BG palette data>) allow observing this behavior precisely, as any delay shifts the write's effect to the left by that many dots.

::: tip NOTE
### OBJ penalty algorithm

Several objects can be combined (they can be called _metasprites_) to draw
a larger graphical element, usually called "sprite". Originally, the term "sprites"
referred to fixed-sized objects composited together, by hardware, with a background.
Use of the term has since become more general.
Only the OBJ's leftmost pixel matters here, transparent or not; it is designated as "The Pixel" in the following.

1. Determine the tile (background or window) that The Pixel is within. (This is affected by horizontal scrolling and/or the window!)
2. If that tile has **not** been considered by a previous OBJ yet:

I don't believe the OBJ rendering order has been discussed at this point, so "previous OBJ" may be confusing.

Member Author

Good catch!

Member Author

Maybe a footnote attached to "previous"?

Since pixels are emitted from left to right, OBJs overlapping the scanline are considered from leftmost <link to OAM page, "X position" anchor> to rightmost, with ties broken by the OAM order (lowest address first).

   1. Count how many of that tile's pixels are to the right of The Pixel.

The wording "to the right" sounds exclusive, but I believe the logic here is inclusive.

Member Author

@ISSOtm ISSOtm Aug 17, 2023

The subtraction below should have been adjusted to account for that. Is it correct?

Member Author

I guess it's not, because then the maximum is 7 pixels; minus 3 = 4; plus the "flat" 6 = 10, but the maximum stated above is 11. So the line below should probably be "Subtract 2." instead?

   2. Subtract 3.
   3. Incur this many dots of penalty, or zero if negative (from waiting for the BG fetch to finish).
3. Incur a flat, 6-dot penalty (from fetching the OBJ's tile).

**Exception**: an OBJ with an OAM X position of 0 (thus, completely off the left side of the screen) always incurs an 11-dot penalty, regardless of `SCX`.
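
The steps above can be sketched as follows. This is only one reading of the algorithm, under several assumptions that are not settled by the text: the count of pixels "to the right" includes The Pixel itself (the reading that reproduces the 6- to 11-dot range stated earlier), OBJs are processed from leftmost to rightmost with ties broken by OAM order, the window is ignored, an OBJ at OAM X 0 does not mark any tile as considered, and the caller passes only OBJs that are actually drawn on the scanline. `obj_penalty_dots` is a hypothetical name.

```python
def obj_penalty_dots(oam_x_positions: list[int], scx: int = 0) -> int:
    """Total Mode 3 penalty, in dots, for the OBJs drawn on one scanline."""
    considered_tiles = set()
    total = 0
    # Assumed order: leftmost OBJ first, ties broken by OAM order (list index).
    ordered = sorted(enumerate(oam_x_positions), key=lambda item: (item[1], item[0]))
    for _oam_index, oam_x in ordered:
        if oam_x == 0:
            total += 11  # the exception: always 11, regardless of SCX
            continue
        # OAM X is the screen X plus 8; add scrolling to get the BG position.
        pixel_x = (oam_x - 8 + scx) % 256
        tile, offset = divmod(pixel_x, 8)
        if tile not in considered_tiles:
            considered_tiles.add(tile)
            remaining = 8 - offset  # assumed inclusive of The Pixel
            total += max(0, remaining - 3)  # waiting for the BG fetch
        total += 6  # flat cost of fetching the OBJ's tile
    return total


print(obj_penalty_dots([8]))         # 11: aligned with a background tile
print(obj_penalty_dots([13]))        # 6: five pixels into the tile
print(obj_penalty_dots([8, 9]))      # 17: the second OBJ shares the tile
print(obj_penalty_dots([8], scx=5))  # 6: scrolling changes the alignment
```

With these assumptions, a single OBJ costs `11 - min(5, offset)` dots, where `offset` is how far The Pixel sits into its background tile.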

:::

To summarise:
[^first12]: The 12 extra dots come from two tile fetches at the beginning of Mode 3. One is the first tile in the scanline (the one that gets shifted by `SCX` % 8 pixels); the other is simply discarded.

- **Tile**, an 8×8-pixel chunk of graphics.
- **Object**, an entry in object attribute memory, composed of 1 or 2
tiles. Can be moved independently of the background.
[^crt]: The Game Boy can afford to "take pauses", because it writes to an LCD it fully controls; by contrast, home consoles like the NES or SNES are on a schedule imposed by the screen they are hooked up to. Taking pauses arguably simplified the PPU's design while allowing greater flexibility to game developers.
57 changes: 2 additions & 55 deletions src/STAT.md
@@ -33,67 +33,14 @@ Bit 1-0 - Mode Flag (Mode 0-3, see below) (Read Only)
3: Transferring Data to LCD Controller
```

The two lower STAT bits show the current status of the PPU.
The two lower STAT bits show the current [status of the PPU](<#STAT modes>).

Bit 2 is set when [LY](<#FF44 — LY: LCD Y coordinate \[read-only\]>) contains the same value as [LYC](<#FF45 — LYC: LY compare>).
It is constantly updated.

Bits 3-6 select which sources are used for [the STAT interrupt](<#INT $48 — STAT interrupt>).

## STAT modes

The LCD controller operates on a 2^22 Hz = 4.194 MHz dot clock. An
entire frame is 154 scanlines = 70224 dots = 16.74 ms. On scanlines 0
through 143, the PPU cycles through modes 2, 3, and 0 once
every 456 dots. Scanlines 144 through 153 are mode 1.

The following sequence is typical when the display is enabled:

```
Mode 2 2_____2_____2_____2_____2_____2___________________2____
Mode 3 _33____33____33____33____33____33__________________3___
Mode 0 ___000___000___000___000___000___000________________000
Mode 1 ____________________________________11111111111111_____
```

When the PPU is accessing some video-related memory, that memory is inaccessible
to the CPU: writes are ignored, and reads return garbage values (usually $FF).

- During modes 2 and 3, the CPU cannot access [OAM](<#VRAM Sprite Attribute Table (OAM)>) ($FE00-FE9F).
- During mode 3, the CPU cannot access VRAM or [CGB palette data registers](<#LCD Color Palettes (CGB only)>)
($FF69,$FF6B).

Mode | Action | Duration | Accessible video memory
-----|------------------------------------------------------------------|--------------------------------------------------------------------|-------------------------
2 | Searching OAM for OBJs whose Y coordinate overlap this line | 80 dots | VRAM, CGB palettes
3 | Reading OAM and VRAM to generate the picture | 168 to 291 dots, depending on object count | None
0 | Nothing (HBlank) | 85 to 208 dots, depending on previous mode 3 duration | VRAM, OAM, CGB palettes
1 | Nothing (VBlank) | 4560 dots (10 scanlines) | VRAM, OAM, CGB palettes

## Properties of STAT modes

Unlike most game consoles, the Game Boy can pause the dot clock briefly,
making Mode 3 longer and Mode 0 shorter. It routinely takes a 6 to 11 dot
break to fetch an OBJ's tile between background tile pattern fetches.
On DMG and GBC in DMG mode, mid-scanline writes to [`BGP`](<#FF47 — BGP (Non-CGB Mode only): BG palette data>)
allow observing this behavior, as the delay from drawing an OBJ shifts the
write's effect to the left by that many dots.

Three things are known to pause the dot clock:

- Background scrolling: If `SCX % 8` is not zero at the start of the scanline, rendering is paused for that many dots while the shifter discards that many pixels from the leftmost tile.
- Window: An active window pauses for at least 6 dots, as the background fetching mechanism starts over at the left side of the window.
- Objects: Each object usually pauses for `11 - min(5, (x + SCX) % 8)` dots.
Because object fetch waits for background fetch to finish, an object's cost depends on its position relative to the left side of the background tile under it. It's greater if an object is directly aligned over the background tile, less if the object is to the right. If the object's left side is over the window, use `255 - WX` instead of `SCX` in this formula.

::: warning TO BE VERIFIED

The exact pause duration for window start is
not confirmed; it may have the same background fetch finish delay as
an object. If two objects' left sides are over the same background or
window tile, the second may pause for fewer dots.

:::
### Spurious STAT interrupts

A hardware quirk in the monochrome Game Boy makes the LCD interrupt
sometimes trigger when writing to STAT (including writing \$00) during
5 changes: 3 additions & 2 deletions src/SUMMARY.md
@@ -15,7 +15,7 @@
# I/O Ports

- [Summary](./Hardware_Reg_List.md)
- [Rendering](./Rendering.md)
- [Graphics](./Graphics.md)
- [Tile Data](./Tile_Data.md)
- [Tile Maps](./Tile_Maps.md)
- [OAM](./OAM.md)
@@ -24,7 +24,8 @@
- [LCD Status Registers](./STAT.md)
- [Scrolling](./Scrolling.md)
- [Palettes](./Palettes.md)
- [Pixel FIFO](./pixel_fifo.md)
- [Rendering](./Rendering.md)
- [Pixel FIFO](./pixel_fifo.md)
- [Audio](./Audio.md)
- [Audio Registers](./Audio_Registers.md)
- [Audio Details](./Audio_details.md)
11 changes: 0 additions & 11 deletions src/pixel_fifo.md
@@ -1,16 +1,5 @@
# Pixel FIFO

::: tip TERMINOLOGY

All references to a dot are meant as dots (4.19 MHz). Dots remain the same regardless of
CGB double speed.
When it is stated that a certain action *lengthens mode 3* it means that mode 0 (HBlank) is
shortened to make up for the additional time in mode 3, as shown in the following diagram.

:::

{{#include imgs/ppu_modes_timing.svg:2:}}

## Introduction

FIFO stands for *First In, First Out*. The first pixel to be pushed to the