```articles/(re)sources: Start preformatted blocks on the same line.

This avoids an extra line of white space.```
```15 files changed, 86 insertions(+), 169 deletions(-)

M articles/fast_loops.php
M articles/interrupts.php
M articles/keymatrix.php
M articles/mult_div_shifts.php
M articles/split_guide.php
M articles/taste_of_vhdl.php
M articles/vdp_commands_speed.php
M articles/vdp_guide.php
M articles/vdp_tut.php
M resources/dos2_environment.php
M resources/msx_io_ports.php
M sources/docopy.php
M sources/load_screen.php
M sources/raminpage1.php
M sources/vdp_rout.php
```
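A change of this shape can be applied mechanically. The following is only a minimal sketch of one way to do it, not the tool used for this commit: the function name `join_pre_with_first_line` and the sample string are hypothetical, and it only handles the common case where the opening `<pre>` tag is followed directly by a line break.

```python
import re


def join_pre_with_first_line(html: str) -> str:
    """Remove the line break that directly follows an opening <pre> tag."""
    # Match <pre> (with optional attributes) followed by a single LF or CRLF,
    # and keep only the tag itself.
    return re.sub(r"(<pre\b[^>]*>)\r?\n", r"\1", html)


# Hypothetical sample modelled on the first hunk of articles/fast_loops.php.
sample = "<p>Often 16-bit loops are done like this:</p>\n\n<pre>\n    ld de,nnnn\nLoop:\n</pre>\n"
print(join_pre_with_first_line(sample))
```

Hunks that also add or remove a blank line around the tags, as in resources/dos2_environment.php and sources/raminpage1.php, would still need manual attention.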
`M articles/fast_loops.php +9 -18`
```@@ 33,8 33,7 @@ of different kinds of loops in assembly.

<p>Often 16-bit loops are done like this:</p>

-<pre>
-    ld de,nnnn
+<pre>    ld de,nnnn
Loop:
; ... do something here
dec de

@@ 57,8 56,7 @@ higher 8 bits) and an LSB (the lower 8 b
amount of times given by the LSB, and when B reaches 0 you do it again MSB
times, each of which will loop it 256 more times. An example:</p>

-<pre>
-    ld b,10         ; The LSB of the loop is 10
+<pre>    ld b,10         ; The LSB of the loop is 10
ld d,3          ; The MSB of the loop + the first loop is 3
Loop:
; ... do something here

@@ 104,8 102,7 @@ fastest way to do go about it. In stead,
which takes the counter value in DE and puts the resulting separated counters in
D and B:</p>

-<pre>
-    ld b,e          ; Mystery fast loop calculus
+<pre>    ld b,e          ; Mystery fast loop calculus
dec de
inc d
</pre>

@@ 115,8 112,7 @@ slow 16-bit loop you’ll regain those after a loop or two.</p>

<p>So, to summarize, a full-fledged fast 16-bit loop looks like this:</p>

-<pre>
-    ld b,e          ; Number of loops is in DE
+<pre>    ld b,e          ; Number of loops is in DE
dec de          ; Calculate DB value (destroys B, D and E)
inc d
Loop:

@@ 134,8 130,7 @@ the OTIR instruction. With this instruct
to output. This is for example used in routines which execute VDP commands,
where the part which sends the command usually looks like this:</p>

-<pre>
-    ld hl,command   ; the address where the VDP command is stored
+<pre>    ld hl,command   ; the address where the VDP command is stored
ld c,9BH        ; the VDP port to write to
ld b,15         ; the number of loops (yes, ld bc,0F9BH is faster)
otir

@@ 150,8 145,7 @@ above using 15 OUTIs, it saves you 5 T-s
one). That adds up to a grand total of 70 T-states out of 340, about 21% faster.
Here’s how that looks:</p>

-<pre>
-    ld hl,command
+<pre>    ld hl,command
ld c,9BH
outi            ; 15x OUTI
outi

@@ 173,8 167,7 @@ Here’s how that looks:</p>
<p>To make this look a little more compact, many assemblers supports a repeat
directive, something like:</p>

-<pre>
-    REPEAT 15
+<pre>    REPEAT 15
outi
ENDR
</pre>

@@ 197,8 190,7 @@ use an LDI that number of times. That wo
can do instead is to unroll only part of the loop. Say, we need to LDIR
something 256 (100H) times. Instead of LDIR we could write:</p>

-<pre>
-    ld bc,256
+<pre>    ld bc,256
Loop:
ldi  ; 16x LDI
ldi

@@ 241,8 233,7 @@ that isn’t too hard, it can be handled using a few compares.</p>
based on the count modulo the number of unrolled instructions. This is put into
practice in the following example:</p>

-<pre>
-;
+<pre>;
; Up to 19% faster alternative for large LDIRs (break-even at 21 loops)
; hl = source (“home location”)
; de = destination

```
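The "Mystery fast loop calculus" touched by the hunks above (`ld b,e` / `dec de` / `inc d`) can be checked with a small simulation. This is a sketch in Python rather than Z80 assembly, and it assumes the loop tail is an inner `djnz` nested in an outer `dec d` / `jp nz`, which the hunks do not show in full. The names `split_counter` and `count_iterations` are made up for illustration.

```python
def split_counter(de: int) -> tuple[int, int]:
    """Mimic `ld b,e / dec de / inc d` and return the resulting (D, B)."""
    b = de & 0xFF                 # ld b,e
    de = (de - 1) & 0xFFFF        # dec de (16-bit wrap-around)
    d = ((de >> 8) + 1) & 0xFF    # inc d (8-bit wrap-around)
    return d, b


def count_iterations(d: int, b: int) -> int:
    """Count loop-body executions for the assumed djnz / dec d / jp nz tail."""
    total = 0
    while True:
        total += b if b else 256  # djnz runs B times; B = 0 means 256 times
        b = 0                     # B is 0 once djnz falls through
        d = (d - 1) & 0xFF        # dec d
        if d == 0:                # jp nz not taken: the loop ends
            return total


for n in (1, 10, 256, 257, 0x020A):
    d, b = split_counter(n)
    print(n, (d, b), count_iterations(d, b))
```

For a counter of 0x020A this yields B = 10 and D = 3, matching the `ld b,10` / `ld d,3` example in the second hunk.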
`M articles/interrupts.php +1 -2`
```@@ 312,8 312,7 @@ Issues with compatibility and ever-spinn
<div class="box">
<p><strong>Example program application of Interrupt Mode 2</strong></p>

-<pre>
-;
+<pre>;
; im2.gen - RWi
;
; Example of how to use Interrupt Mode 2

```
`M articles/keymatrix.php +3 -6`
```@@ 156,8 156,7 @@
</tr>
</table>

-<pre>
-    in a,(#AA)
+<pre>    in a,(#AA)
and #F0         ; only change bits 0-3
or b            ; take row number from B
out (#AA),a

@@ 169,8 168,7 @@

<p>In the following example I will show you how to read out the space key its status. The space key is located in bit 0 of row 8:</p>

-<pre>
-keys: EQU #FBE5
+<pre>keys: EQU #FBE5

;
; Check whether space is pressed

<p>Finally, a small Basic program which you can use for keyboard matrix testing purposes:</p>

-<pre>
-10 DEFINT A-Z:K=&amp;HFBE5:CLS
+<pre>10 DEFINT A-Z:K=&amp;HFBE5:CLS
20 FOR I=0 TO 10:PRINT RIGHT\$("0000000"+BIN\$(PEEK(K+I)),8):NEXT
30 PRINT CHR\$(11):GOTO 20
</pre>

```
`M articles/mult_div_shifts.php +19 -38`
```@@ 40,16 40,14 @@

<p>When you shift a register 1 bit to the left, you multiply the value of the register with 2. This shifting can be done using the SLA r instruction. By doing several shifts in sequence you can very easily multiply by any power of 2. For example:</p>

-<pre>
-    ld b,3          ; Multiply 3 with 4
+<pre>    ld b,3          ; Multiply 3 with 4
sla b           ; x4
sla b           ; result: b = 12
</pre>

<p>If you use register A you can multiply faster by using the ADD A,A instruction, which is 5 T-states per instruction instead of 8. So ADD A,A is exactly the same as SLA A, or a multiplication by two. On a sidenote, instead of using ADD A,A, you can also use RLCA, which effectively behaves the same.</p>

-<pre>
-    ld a,15         ; Multiply 15 with 8
+<pre>    ld a,15         ; Multiply 15 with 8
add a,a         ; result: a = 120

@@ 59,8 57,7 @@

<p>If you want to multiply by another value than a power of two, you can almost always achieve the desired result by storing inbetween values during the shifting and adding or subtracting them up afterwards. A few examples:</p>

-<pre>
-    ld a,5          ; Multiply 5 with 20 (= A x 16 + A x 4)
+<pre>    ld a,5          ; Multiply 5 with 20 (= A x 16 + A x 4)
ld b,a          ; Store value of A x 4 in B

@@ 73,8 70,7 @@

<p>Sometimes, you can also use substractions to achieve your goals faster. For example, the multiplication of A with 15. This can be done by using the method described above, however, in that case you’ll need 4 temporary registers, and four additional adds afterwards. This could better be done as follows, which requires 1 more multiplication but only uses 1 temporary register and 1 subtraction afterwards:</p>

-<pre>
-    ld a,3          ; Multiply 3 with 15 (= A x 16 - A x 1)
+<pre>    ld a,3          ; Multiply 3 with 15 (= A x 16 - A x 1)
ld b,a          ; Store value of A x 1 in B

@@ 88,16 84,14 @@

<p>Now, divisions are very much like multiplications. If a multiplication routine is complex, a division routine is even more so. However, using shifts it’s all too easy to divide in Assembly. It is done by simply shifting the other way, to the right. For this, you should use the SRL r instruction. An example:</p>

-<pre>
-    ld b,3          ; Divide 18 by 4
+<pre>    ld b,3          ; Divide 18 by 4
srl b           ; x4
srl b           ; result: b = 4 (rest 2 is lost)
</pre>

<p>There is no real fast alternative to a shift right when using register A. As an alternative, you can use RRCA for that. However, when using RRCA you must make sure there will be no rest, otherwise the result won’t be correct. This can be achieved by ANDing the original value with a value clearing the lower bits (which would otherwise be rotated out), or by making sure you are using values which are always a multiple of the divider.</p>

-<pre>
-    ld a,153        ; Divide 153 by 8
+<pre>    ld a,153        ; Divide 153 by 8
and a,%11111000 ; Clear bits 0-2 (equals 256 - 8)
rrca            ; /8
rrca

@@ 113,22 107,19 @@

<p>There are also ways to shift 16-bit registers. This is done using a 8-bit shift in combination with an 8-bit rotate and the carry bit. To shift a register DE one bit to the left you should use:</p>

-<pre>
-    sla e
+<pre>    sla e
rl d
</pre>

<p>To shift it one bit to the right (now with BC as example), use:</p>

-<pre>
-    srl b
+<pre>    srl b
rr c
</pre>

<p>Unfortunately, generally those 16-bit shifts are rather slow compared to 8-bit shifts (which will most often take place in the fast A register, which makes them almost 4 times as fast). However, just as with the 8-bit shifts, there is also the possibility to do faster 16-bit shifts to the left using the ADD instruction.</p>

-<pre>
-    add hl,hl       ; shift HL 1 bit left... hl = hl x 2
+<pre>    add hl,hl       ; shift HL 1 bit left... hl = hl x 2
</pre>

<p>So, the best way to multiply 16-bit values is by using register HL in combination with ADD HL,HL instructions.</p>

@@ 153,8 144,7 @@

<p>These are left-rotating multiplication routines. Their speed is basically quite constant, although depending on the number of 1’s in the primary multiplier there may be a slight difference in speed (a 1 usually takes 7 states longer to process than a 0).</p>

-<pre>
-;
+<pre>;
; Multiply 8-bit values
; In:  Multiply H with E
; Out: HL = result

ret
</pre>

-<pre>
-;
+<pre>;
; Multiply 8-bit value with a 16-bit value
; In: Multiply A with DE
; Out: HL = result

ret
</pre>

-<pre>
-;
+<pre>;
; Multiply 16-bit values (with 16-bit result)
; In: Multiply BC with DE
; Out: HL = result

ret
</pre>

-<pre>
-;
+<pre>;
; Multiply 16-bit values (with 32-bit result)
; In: Multiply BC with DE
; Out: BCHL = result

@@ 278,8 265,7 @@ 8-bit: 509 T-states<br />

<p>Well, here is the actual routine. Note that it can easily be converted to a Mult8R-routine by inserting an ld d,0 instruction right after the ld hl,0, and also that unrolling this routine will not give you additional speed. Oh, and by the way, although I’ve been talking about right-rotating all the time, this implementation is actually right-shifting ^_^.</p>

-<pre>
-;
+<pre>;
; Multiply 8-bit value with a 16-bit value (right rotating)
; In: Multiply A with DE
;      Put lowest value in A for most efficient calculation

<p>By the way, I looked at the possibility to use the same technique as with the right-rotating multiplication on this routine, it can fairly easy be done by putting jumps inbetween which jump into a list of add hl,hl’s. However, loss in speed the additional jumps cause don’t outweigh the gain, and it is hardly practical anyway since it would apply to the nr. of bits used from the left (the number 128 would use 1 bit, and 64 two).</p>

-<pre>
-;
+<pre>;
; Multiply 8-bit value with a 16-bit value (unrolled)
; In: Multiply A with DE
; Out: HL = result

@@ 376,8 361,7 @@ Output: #528 (#5.28 or 5,15625 decimal)<

<p>Anyways. These are the general division routines. Slower than the multiplication routines but still as fast as possible, and probably very useful.</p>

-<pre>
-;
+<pre>;
; Divide 8-bit values
; In: Divide E by divider C
; Out: A = result, B = rest

ret
</pre>

-<pre>
-;
+<pre>;
; Divide 16-bit values (with 16-bit result)
; In: Divide BC by divider DE
; Out: BC = result, HL = rest

<p>Ricardo Bittencourt provided us with a fast routine for division by 9. It is built for
.dsk or Disk ROM routines. It’s very fast, but only works in the range 0-1440.</p>

-<pre>
-;
+<pre>;
; division by nine
; enter     HL = number from 0 to 1440
; exit      A = HL/9

@@ 484,8 466,7 @@ DIV9:   INC     HL              ;  7

<p>This is a faster square root routine than the one that was previously here, test results say that it’s 26% faster. It is written by Ricardo Bittencourt, so many thanks to him :).</p>

-<pre>
-;
+<pre>;
; Square root of 16-bit value
; In:  HL = value
; Out:  D = result (rounded down)

```
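The shift-based arithmetic described in the context lines above can be sketched outside assembly as well. The following Python fragment is an illustration only: the helper names are hypothetical, results are masked to 8 bits to mimic a Z80 register, and overflow is simply truncated.

```python
def times20(a: int) -> int:
    """A x 20 as A x 16 + A x 4, using shifts only."""
    x4 = (a << 2) & 0xFF          # two left shifts: A x 4, set aside like register B
    x16 = (x4 << 2) & 0xFF        # two more shifts: A x 16
    return (x16 + x4) & 0xFF


def times15(a: int) -> int:
    """A x 15 as A x 16 - A x 1: one temporary and one subtraction."""
    return ((a << 4) - a) & 0xFF


def rrca(a: int) -> int:
    """Rotate an 8-bit value right by one; bit 0 wraps around to bit 7."""
    return ((a >> 1) | ((a & 1) << 7)) & 0xFF


def div8_rrca(a: int) -> int:
    """Divide by 8 with three rotates, clearing bits 0-2 first."""
    a &= 0b11111000               # without this, the low bits rotate into the top
    for _ in range(3):
        a = rrca(a)
    return a


print(times20(5), times15(3), div8_rrca(153))
```

Skipping the mask in `div8_rrca` shows why the article insists on it: the remainder bits would reappear in bits 5 to 7 of the result.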
`M articles/split_guide.php +3 -6`
```@@ 73,8 73,7 @@ A: I tested the statements made in this

<p>Example code to poll for the end of HBLANK, assumes disabled interrupts and s#2 to be set:</p>

-<pre>
-Poll_1:
+<pre>Poll_1:
in a,(#99)      ; wait until start of HBLANK
and %00100000
jp nz,Poll_1

@@ 96,8 95,7 @@ A: The VDP lineinterrupt is linked to the FH bit, and the interrupt occurs when the FH bit is set. The FH bit will be set at the <em>exact</em> beginning of the line in r#19 + 1, so that’s the line <em>after</em> the line set. If register 19 contains the value 99, the lineinterrupt will occur at the utter left of line 100, inside the left border (which is about halfway the horizontal blanking period).</p>

<p>Example code for polling FH, assumes disabled interrupts and s#1 to be set:</p>

-<pre>
-    ld a,(VDP+23)   ; set split line
+<pre>    ld a,(VDP+23)   ; set split line
out (#99),a
ld a,19+128

@@ 135,8 133,7 @@ A: This is in my opinion the best way to have a screensplit. It looks very tidy, it is easy to program and doesn’t require any ‘special’ (read: difficult or processor-dependant) timing. Basically, there will just be a black line at the spot of the split, but you won’t have to put that line in your images, although you ofcourse have to keep it into account when drawing them and designing your screen’s layout.</p>

<p>Some example code, which assumes the interrupts are disabled, and doesn’t switch back the status register afterwards:</p>

-<pre>
-;
+<pre>;
; A macro definition which waits until the end of the next/current HBLANK...
;
Wait_HBLANK:

```
`M articles/taste_of_vhdl.php +3 -6`
```@@ 140,8 140,7 @@ while another one would need less space
The <code>&lt;=</code> symbol denotes signal assignement.</p>

-<pre>
-Library IEEE;
+<pre>Library IEEE;
use IEEE.std_logic_1164.all;

@@ 245,8 244,7 @@ The declarations made there provide our
and stimuli values.</p>

-<pre>
-Library IEEE;
+<pre>Library IEEE;
use IEEE.std_logic_1164.all;

@@ 319,8 317,7 @@ creating the netlist.
This is accomplished by configuration statements.</p>

-<pre>
-CONFIGURATION nand_test_dataflow OF SimBox IS
+<pre>CONFIGURATION nand_test_dataflow OF SimBox IS
FOR test_nand
FOR my_nand_gate : nand_gate
use entity work.nand_gate(dataflow);

```
`M articles/vdp_commands_speed.php +2 -4`
```@@ 99,8 99,7 @@

<p>Speed is indicated in bytes per interrupt.</p>

-<pre>
-     LMMM  accuracy: 16              HMMM  accuracy: 32              YMMM  accuracy: 32
+<pre>     LMMM  accuracy: 16              HMMM  accuracy: 32              YMMM  accuracy: 32

Spr / Lin  - Speed 50/60Hz      Spr / Lin  - Speed 50/60Hz      Spr / Lin  - Speed 50/60Hz

@@ 124,8 123,7 @@

<h2 id="fillresults">The tables for the fills</h2>

-<pre>
-     LMMV  accuracy: 16              HMMV  accuracy: 64
+<pre>     LMMV  accuracy: 16              HMMV  accuracy: 64

Spr / Lin  - Speed 50/60Hz      Spr / Lin  - Speed 50/60Hz

```
`M articles/vdp_guide.php +1 -2`
```@@ 49,8 49,7 @@

<p>Many who designed or converted graphics on the PC have probably not considered these two ‘complications’, and instead used a gliding blue scale and no gamma correction, which would yield somewhat inaccurate results. I have however put a <a href="downloads/articles/sc8_palette_srgb.zip">proper palette</a> online for your convenience. It used to be a ‘wrong’ one too, so if you downloaded it before, do it again.</p>

-<pre>
-10 SCREEN 8:SETPAGE 1,1:LINE (0,0)-(255,211),&amp;B11111111,BF
+<pre>10 SCREEN 8:SETPAGE 1,1:LINE (0,0)-(255,211),&amp;B11111111,BF
20 SCREEN 1:KEY OFF:COLOR 15,4,0:COLOR=(4,5,5,5)  'change to (4,4,4,4), etc
30 VDP(0)=0:VDP(2)=6
40 IF INKEY\$="" GOTO 40

```
`M articles/vdp_tut.php +11 -22`
```@@ 83,8 83,7 @@

<p>Anyways, let’s talk about how to actually write to them registers ;). The VDP registers can be addressed in two ways, direct and indirect. Usually the direct way is used, but the indirect method is also practical in some situations. For direct register access, what you have to do is write the value to port #99 first, and then write the register number with bit 8 set (in other words, +128). Here is the method definition from the v9938 application manual:</p>

-<pre>
-                     MSB  7   6   5   4   3   2   1   0  LSB
+<pre>                     MSB  7   6   5   4   3   2   1   0  LSB
+---+---+---+---+---+---+---+---+
Port #1 First byte   |D7 |D6 |D5 |D4 |D3 |D2 |D1 |D0 | DATA
+===+===+===+===+===+===+===+===+

@@ 94,8 93,7 @@

<p>So the actual code with which you change a register’s value will look something like this:</p>

-<pre>
-    ld a,value
+<pre>    ld a,value
di
out (#99),a
ld a,regnr + 128

@@ 116,8 114,7 @@ consecutive <code>OUTI</code> or <code>O

<p>There is also the other method of addressing the registers, which is, as said before, the indirect method. This means that you can specify the register to write to once, and then repeatedly write values, which is about twice as fast. However the register needs to be the same for all values, or it has to be a successive range of registers (indirect register writing supports auto incrementing). Indirect register writing is done by writing the register number to r#17, also specifying whether to auto-increment, and then writing the desired values to port #9B:</p>

-<pre>
-                  MSB  7   6   5   4   3   2   1   0  LSB
+<pre>                  MSB  7   6   5   4   3   2   1   0  LSB
+---+---+---+---+---+---+---+---+
Register #17       |AII| 0 |R5 |R4 |R3 |R2 |R1 |R0 | REGISTER #
+-+-+---+---+---+---+---+---+---+

@@ 131,8 128,7 @@ consecutive <code>OUTI</code> or <code>O

<p>Code example:</p>

-<pre>
-    ld a,regnr      ; add +128 for no auto increment
+<pre>    ld a,regnr      ; add +128 for no auto increment
di
out (#99),a
ld a,17 + 128

@@ 153,8 149,7 @@ consecutive <code>OUTI</code> or <code>O

<p>In order to read a status register one needs to write the number of the status register in r#15, and after that the status register’s value can be read from port #99:</p>

-<pre>
-                  MSB  7   6   5   4   3   2   1   0  LSB
+<pre>                  MSB  7   6   5   4   3   2   1   0  LSB
+---+---+---+---+---+---+---+---+
Register #15       | 0 | 0 | 0 | 0 |S3 |S2 |S1 |S0 | Status register
+===+===+===+===+===+===+===+===+

@@ 166,8 161,7 @@ consecutive <code>OUTI</code> or <code>O

<p>Some example code to read out a status register:</p>

-<pre>
-    ld a,statusregnr
+<pre>    ld a,statusregnr
di
out (#99),a
ld a,15 + 128

@@ 203,8 197,7 @@ consecutive <code>OUTI</code> or <code>O
<p>The setting of the upper three bits in register 14 was added in the v9938 VDP (as opposed to the MSX1 TMS9918A) because of the larger amount of VRAM, 128kb instead of 16kb, and hence the larger addressing space. Anyways, those bits need to be set first, and then the bits 0-13 have to be written using two consecutive <code>OUT</code>s to port #99. To clarify a bit more:</p>

-<pre>
-                   MSB  7   6   5   4   3   2   1   0  LSB
+<pre>                   MSB  7   6   5   4   3   2   1   0  LSB
+---+---+---+---+---+---+---+---+
Register #14        | 0 | 0 | 0 | 0 | 0 |A16|A15|A14| VRAM access base

@@ 291,8 284,7 @@ Finally, note that all V9938 timings als

<p>Here are example routines to set the VDP for reading/writing the VRAM:</p>

-<pre>
-;
+<pre>;
; Enables the interrupts
;

<p>Here is the <code>DoCopy</code> routine, read the small source code article <a href="/sources/docopy.php">about <code>DoCopy</code></a> on how to speed it up a little more.</p>

-<pre>
-;
+<pre>;
; Fast DoCopy, by Grauw
; In:  HL = pointer to 15-byte VDP command data
; Out: HL = updated

<p>Here is an example SetPalette routine. The <code>OTIR</code> could be unrolled to <code>OUTI</code>s if you really need the additional speed (for example on a screensplit).</p>

-<pre>
-;
+<pre>;
; Set the palette to the one HL points to...
; Modifies: AF, BC, HL (=updated)
; Enables the interrupts.

@@ 436,8 426,7 @@ SetPalette:

<p>This is a small example of a short program which combines most techniques. Do with it whatever you want, look at it, try it out, ignore it... ^_^. It isn’t exactly the summum of speed and optimized code, but ahwell... It will do.</p>

-<pre>
-;
+<pre>;
; Is supposed to run in screen 5, so you should make a small BASIC loader,
; or call the CHMOD BIOS routine.
;

```
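The two register access methods described in the context lines of articles/vdp_tut.php come down to short sequences of port writes. As a rough model, and not as real hardware access, the sketch below returns the `(port, byte)` pairs that would be sent. The function names and the 6-bit register mask are assumptions made for this illustration.

```python
VDP_PORT_CONTROL = 0x99    # port #99: value first, then register number + 128
VDP_PORT_INDIRECT = 0x9B   # port #9B: data for indirect register writes


def direct_write(regnr: int, value: int) -> list[tuple[int, int]]:
    """Port writes for a direct register write."""
    return [
        (VDP_PORT_CONTROL, value & 0xFF),
        (VDP_PORT_CONTROL, (regnr & 0x3F) | 0x80),   # register number + 128
    ]


def indirect_write(first_reg: int, values: list[int],
                   auto_increment: bool = True) -> list[tuple[int, int]]:
    """Port writes for an indirect write through r#17."""
    selector = first_reg & 0x3F
    if not auto_increment:
        selector |= 0x80                             # +128 disables auto increment
    writes = direct_write(17, selector)              # select the target via r#17
    writes += [(VDP_PORT_INDIRECT, v & 0xFF) for v in values]
    return writes


for port, byte in indirect_write(32, [0x00, 0x00, 0x10]):
    print(f"out (#{port:02X}),#{byte:02X}")
```

On real hardware the two writes to port #99 must not be interrupted, which is why the listings in the diff wrap them in `di`.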
`M resources/dos2_environment.php +16 -28`
```@@ 19,8 19,7 @@
<p style="text-align:center">for MSX 2 computers</p>

-<pre>
-     CONTENTS                                         Page
+<pre>     CONTENTS                                         Page

1.   <a href="#c1">Introduction</a> ...............................  3

@@ 149,8 148,7 @@ 6.   <a href="#c6">Errors</a> ..........

<p>On entry, various parameter areas are set up for the transient program in the first 256 bytes of RAM.  The layout of this area is as below and is compatible with MSX-DOS 1 and with CP/M apart from the area used for MSX slot switching calls.</p>

-<pre>
-      +-------+-----+------+------+------+------+------+------+
+<pre>      +-------+-----+------+------+------+------+------+------+
0000h |    Reboot entry    |  Reserved   |   MSX-DOS entry    |
+------+------+------+------+------+------+------+------+
0008h |  RST 08h not used         | RDSLT routine entry point |

@@ 244,8 242,7 @@ 00F8h |

<p>The entries in the BIOS jump vector are as below:</p>

-<pre>
-   xx00h - JMP  WBOOT     ;Warm boot
+<pre>   xx00h - JMP  WBOOT     ;Warm boot
xx03h - JMP  WBOOT     ;Warm boot
xx06h - JMP  CONST     ;Console status
xx09h - JMP  CONIN     ;Console input

@@ 418,8 415,7 @@ 00F8h |

<p>A FIB is a 64 byte area of the user's memory which contains information about the directory entry on disk for a particular file or sub-directory. The information in a FIB is filled in by the new MSX-DOS "find" functions ("find first entry" (function 40H), "find new entry" (function 42H) and "find next entry" (function 41H)). The format of a File Info Block is as follows:</p>

-<pre>
-     0 - Always 0FFh
+<pre>     0 - Always 0FFh
1..13 - Filename as an ASCIIZ string
14 - File attributes byte
15..16 - Time of last modification

@@ 467,8 463,7 @@ 26..63 - Internal information, must not

<p>The time of last modification is encoded into two bytes as follows:</p>

-<pre>
-     Bits 15..11 - HOURS (0..23)
+<pre>     Bits 15..11 - HOURS (0..23)
Bits 10...5 - MINUTES (0..59)
Bits  4...0 - SECONDS/2 (0..29)
</pre>

@@ 476,8 471,7 @@ 26..63 - Internal information, must not

<p>The date of last modification is encoded into two bytes as follows.  If all bits are zero then there is no valid date set.</p>

-<pre>
-     Bits 15...9 - YEAR (0..99 corresponding to 1980..2079)
+<pre>     Bits 15...9 - YEAR (0..99 corresponding to 1980..2079)
Bits  8...5 - MONTH (1..12 corresponding to Jan..Dec)
Bits  4...0 - DAY (1..31)
</pre>

@@ 738,8 732,7 @@ 26..63 - Internal information, must not

<p>The mapper support routines use some variables in the DOS system area.  These tables may be referred and used by the user programs for the various purposes, but must not be altered.  The contents of the tables are as follows:</p>

-<pre>
+0		Slot address of the mapper slot.
+1		Total number of 16k RAM segments. 1...255 (8...255 for the primary)
+2		Number of free 16k RAM segments.

@@ 751,8 744,7 @@ address			function

<p>A program uses the mapper support code by calling various subroutines.  These are accessed through a jump table which is located in the MSX-DOS system area.  The contents of the jump table are as follows:</p>

-<pre>
+0H	ALL_SEG     Allocate a 16k segment.
+3H	FRE_SEG     Free a 16k segment.

@@ 780,8 772,8 @@ address	entry name	function
<p>The functions available in the mapper support extended BIOS are:</p>

<p>* Get mapper variable table</p>
-<pre>
-	Parameter:	A = 0
+
+<pre>	Parameter:	A = 0
D = 4 (device number of mapper support)
E = 1
Result:		A = slot address of primary mapper

@@ 790,8 782,8 @@ address	entry name	function
</pre>

<p>* Get mapper support routine address</p>
-<pre>
-	Parameter:	A = 0
+
+<pre>	Parameter:	A = 0
D = 4
E = 2
Result:		A = total number of memory mapper segments

@@ 843,8 835,7 @@ address	entry name	function
<p>An error from "allocate segment" usually indicates that there are no free segments, although it can also mean that an invalid parameter was passed in register A and B.  An error from "free segment" indicates that the specified segment number does not exist or is already free.</p>

-<pre>
-ALL_SEG - Parameters:   A=0 => allocate user segment
+<pre>ALL_SEG - Parameters:   A=0 => allocate user segment
A=1 => allocate system segment
B=0 => allocate primary mapper
B!=0 => allocate

@@ 876,8 867,7 @@ FRE_SEG - Parameters:   A=segment number
<p>The top two bits of the address are ignored and the data will be always read or written via page-2, since the segment number specifies a 16k segment which could appear in any of the four pages.  The data will be read or written from the correct segment regardless of the current paging or slot selection in page-0 or page-1, but note that the mapper RAM slot must be selected in page-2 when either of these routines are called.  This is so that the routines do not have to do any slot switching and so can be fast.  Also the stack must not be in page-2.  These routines will return disabling interrupts.</p>

-<pre>
-RD_SEG -  Parameters:   A = segment number to read from
+<pre>RD_SEG -  Parameters:   A = segment number to read from
HL = address within this segment
Results:      A = value of byte at that address
All other registers preserved

@@ 909,8 899,7 @@ WR_SEG -  Parameters:   A = segment numb
<p>Parameters cannot be passed in registers IX, IY, AF', BC', DE' or HL' since these are used internally in the routine.  These registers will be corrupted by the inter-segment call and may also be corrupted by the called routine.  All other registers (AF, BC, DE and HL) will be passed intact to the called routine and returned from it to the caller.</p>

-<pre>
-CAL_SEG - Parameters: IY = segment number to be called
+<pre>CAL_SEG - Parameters: IY = segment number to be called
AF, BC, DE, HL passed to called routine
Other registers corrupted

@@ 944,8 933,7 @@ CALLS -   Parameters:  AF, BC, DE, HL pa
<p>Another pair of routines ("GET_PH" and "PUT_PH") is provided which are identical in function except that the page is specified by the top two bits of register H.  This is useful when register HL contains an address, and these routines do not corrupt register HL.  "PUT_PH" will never alter the page-3 register.</p>

-<pre>
-PUT_Pn -  Parameters:   n = 0,1,2 or 3 to select page
+<pre>PUT_Pn -  Parameters:   n = 0,1,2 or 3 to select page
A = segment number
Results:      None
All registers preserved

```
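The time and date bit layouts shown in the resources/dos2_environment.php hunks can be decoded with a few shifts and masks. This is a minimal sketch that takes each field as an already assembled 16-bit word; how the two bytes of the FIB are combined (little-endian is the usual Z80 convention) is an assumption here, and the function names are hypothetical.

```python
from typing import Optional


def decode_fib_time(word: int) -> tuple[int, int, int]:
    """Return (hours, minutes, seconds) from the 16-bit time word."""
    hours = (word >> 11) & 0x1F       # bits 15..11
    minutes = (word >> 5) & 0x3F      # bits 10..5
    seconds = (word & 0x1F) * 2       # bits 4..0 hold seconds / 2
    return hours, minutes, seconds


def decode_fib_date(word: int) -> Optional[tuple[int, int, int]]:
    """Return (year, month, day), or None when no valid date is set."""
    if word == 0:                     # all bits zero: no valid date
        return None
    year = 1980 + ((word >> 9) & 0x7F)   # bits 15..9, counted from 1980
    month = (word >> 5) & 0x0F           # bits 8..5
    day = word & 0x1F                    # bits 4..0
    return year, month, day


time_word = (13 << 11) | (45 << 5) | (30 // 2)
date_word = ((2001 - 1980) << 9) | (2 << 5) | 3
print(decode_fib_time(time_word), decode_fib_date(date_word))
```

Because seconds are stored halved, odd second values cannot be represented and timestamps have a two-second resolution.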
`M resources/msx_io_ports.php +1 -2`
```@@ 514,8 514,7 @@ computers:</p>
</table>

-<pre>
-    in a,(40H)
+<pre>    in a,(40H)
cpl
push af
ld a,8

```
`M sources/docopy.php +3 -6`
```@@ 14,8 14,7 @@

<p>The first time I wrote a routine to send a command to the VDP, it was the routine DoCopy, by Stefan Boer, published in Sunrise Special. Later on, as my programming skills improved I step by step improved the routine. Now, this is pretty much the end result, as fast as I could think of. Well, actually, when I really want to do a lot of copies, I usually use an even faster variant, with a faster response time. More about that after the first docopy routine.</p>

-<pre>
-;
+<pre>;
; Fast DoCopy, by Grauw
; In:  HL = pointer to 15-byte VDP command data
; Out: HL = updated

<p>I sometimes find this useful too, a separate routine to wait for the VDP to finish its current command.</p>

-<pre>
-;
+<pre>;
; This lil' routine waits until the VDP is done copying.
;

<p>And now, the faster variant of this routine. This one doesn’t switch between status register 0 and status register 2 anymore, which greatly improves the response time, very useful when you need to execute lots and lots of VDP commands consecutively. I believe Cas Cremer’s Core Dump w.i.p. gave me the idea. In any case, this one does not switch status registers anymore, however that will require status register 2 to be set all the time. Fortunately this is not much of a problem when you have your own interrupt handler set-up.</p>

-<pre>
-;
+<pre>;
; Faster again!!! DoCopy, by Grauw
; In:  HL = pointer to 15-byte VDP command data
; Out: HL = updated

```
`M sources/load_screen.php +10 -20`
```@@ 16,8 16,7 @@

<p>First, I’ll give an example of how the LoadScreen routine should be called:</p>

-<pre>
-;
+<pre>;
; Load screen FILENAME at #18000 (page 3 in screen 5)
;
Start:        ld      de,FILENAME

@@ 36,8 35,7 @@ FILENAME:     DB      "IMAGE.SC5",0

<p>First of all, some definitions are made. I go by the book, and the book is called MSX-DOS 2 in this case. These are the official naming conventions, so ah, why not just use them. Also, the temporary area is defined here. You can place it anywhere you want, and change the size to whatever value you want (clearly, larger values give better results) by modifying the given constants.</p>

-<pre>
-;
+<pre>;
;=================================================================
;=================================================================

@@ 60,8 58,7 @@ TEMP1_SIZE:   equ     #4000         ;siz

<p>Following is the routine that loads the entire file into the VRAM. This means it keeps on reading bytes and sending it to the VRAM until DOS returns a .EOF ‘End Of File’ error. The first part is the initializer, which a. sets the VDP’s VRAM Write start address by calling the SetVDP_Write routine (for details about that, see below), b. opens the file, and c. (optionally) skips the first 7 bytes of the file. Here I say optionally, because I actually left that part commented out, since I myself don’t use files in the BSAVE format (I convert all files to the pure VRAM contents) (so I’ve also got a separate palette).</p>

-<pre>
-;
+<pre>;
;Load the entire file into the VRAM
;
;DE  = filenamenaam

@@ 85,8 82,7 @@ Load_Screen:  call    SetVdp_Write

<p>This next part actually loads the file from disk. The file handle was put directly inside the LD B,n instruction register by the previous piece of code, which is a little faster. Note that in this case, when loading files from a physical medium, a few extra T-states don’t *really* matter, and it’s actually better to use a more ‘tidy’ solution. But, I like self-modifying code, so alas, I couldn’t resist.</p>

-<pre>
ld      de,TEMP1
ld      b,0           ;self-modifying
LS_FHANDLE:   equ     \$-1

@@ 100,8 96,7 @@ LS_FHANDLE:   equ     \$-1

<p>Now comes the (somewhat) neat part. This is the part that actually sends the data from the buffer to the VRAM. Now, in this case even single T-States *DO* matter, because this concerns a loop which is called numerous times. So, this part is highly optimized using techniques from my ‘<a href="?p=articles/fast_loops.html">Fast Loops</a>’ article.</p>

-<pre>
-              dec     hl            ;Use "Mystery Fast Loop Calculus"
+<pre>              dec     hl            ;Use "Mystery Fast Loop Calculus"
ld      b,l           ;(result: loop HL (=nr of bytes read) times)
inc     b
ld      a,h

@@ 116,8 111,7 @@ LoadScr_LdLp: otir

<p>And finally, the end of the routine. This part is jumped to when an .EOF is returned by the disk read routines, meaning the end of the file has been reached. Commented out is a part which also sets the stored palette (for BSAVE format files, see the separate palette file loader for more info on that), and after that the file is closed.</p>

-<pre>
;             ld      de,0          ;Move file handle pointer
;             ld      hl,#7680 + 7  ; to palette start address
;             ld      a,(LS_FHANDLE)

<p>Last, there’s also the error handling routine, which exits to dos with the appropriate error message.</p>

-<pre>
-;
+<pre>;
;Handle an error...
;Returns to DOS 2 which will show the error.
;

@@ 162,8 155,7 @@ Error:

<p>This routine loads a file from disk, which should be 32 bytes in size, and contains the palette, stored in the same format as the VDP uses (also the same format Basic stores at address #7680). Basically it doesn’t do much interesting - just loads the data into the temporary buffer, then calls a function which sets that palette.</p>

-<pre>
-;
+<pre>;
;
;DE = filename

@@ 191,8 183,7 @@ Load_Palet:   ld      c,_OPEN   ;Open Fi

<p>The following routines are the VDP routines which are called from the Load functions. The first sets the start address of the VDP VRAM I/O port (port #98) to the specified value in registers A:HL (where bit 0 of register A contains bit 17 of the address).</p>

-<pre>
-;
+<pre>;
;
;=================================================================
;= VDP-Routines ==================================================

@@ 223,8 214,7 @@ SetVdp_Write: rlc     h

<p>And second, there’s the routine which sends a new palette to the VDP, which is read from address HL. You might notice the huge lot of DW #A3EDs... Those are actually 32 OUTI instructions. I use this notition because it is much more compact than writing 32 lines with OUTIs. Ofcouse you could also use an OTIR instruction, however OUTI is about 25% faster than OTIR.</p>

-<pre>
-;
+<pre>;
;Set the VDP's palette to the palette HL points to
;Changes: AF, BC, HL (=updated)
;

```
`M sources/raminpage1.php +3 -7`
```@@ 18,8 18,7 @@

<p>You can use this method to switch RAM:</p>

-<pre>
-ENASLT: EQU #0024
+<pre>ENASLT: EQU #0024
EXPTBL: EQU #FCC1

@@ 30,8 29,7 @@ Enable_RAM:  ld     a,(RAMAD1)

<p>There is a slightly better method to switch RAM, by selecting the same slot in page 1 as you have in page 3, which will always be the system RAM. This will work on any MSX, even without DiskROM, provided there is at least 48kB of RAM available ofcourse ^_^. It can be done as follows:</p>

-<pre>
-Enable_RAM2: ld     a,(EXPTBL+3)
+<pre>Enable_RAM2: ld     a,(EXPTBL+3)
ld     b,a                 ;check if slot is expanded
and    a
jp     z,Ena_RAM2_jp

@@ 55,11 53,9 @@ Ena_RAM2_jp: in     a,(#A8)

<p>And, before returning to Basic, don’t forget to switch back the Basic ROM:</p>

-<pre>
-Enable_ROM:  ld     a,(EXPTBL)
+<pre>Enable_ROM:  ld     a,(EXPTBL)
ld     h,#40
call   ENASLT
-
</pre>

<p>~Grauw</p>

```
`M sources/vdp_rout.php +1 -2`
```@@ 14,8 14,7 @@

<p>These routines are just provided as-is, since I don’t really feel like explaining them all. Most of them are either a. already explained somewhere else, b. obvious, or c. may give you a little challenge to try and understand them. They are designed for use from a DOS environment, but most of them can also be used from Basic...</p>

-<pre>
-;
+<pre>;
;==============================================================================
;= VDP-Routines ===============================================================
;==============================================================================

```