The DVID3B2 subroutine - Elite on the BBC Micro and NES

       Name: DVID3B2
       Type: Subroutine
   Category: Maths (Arithmetic)
    Summary: Calculate K(3 2 1 0) = (A P+1 P) / (z_sign z_hi z_lo)
  Deep dive: Shift-and-subtract division
 References: This subroutine is called as follows:
             * PLANET calls DVID3B2
             * PLS1 calls DVID3B2
             * PLS6 calls DVID3B2


 Calculate the following:

   K(3 2 1 0) = (A P+1 P) / (z_sign z_hi z_lo)

 The division itself is done as an 8-bit calculation using the code from LL31,
 but this routine first shifts both the numerator (the top part of the
 division) and the denominator (the bottom part of the division) to get the
 multi-byte result we want.

 It shifts each of them to the left as far as possible, keeping a tally in Y of
 the difference between the number of shifts applied to the top and to the
 bottom (shifting both of them by the same amount doesn't change the result).
 It then divides the two highest bytes with the simple 8-bit routine from LL31,
 and shifts the result by that difference, which acts as a scale factor to get
 the correct result.
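
 The scaling steps described above can be sketched in Python. This is a minimal
 model for illustration only, not part of the Elite source: the function name
 dvid3b2_model and its arguments are assumptions, the 24-bit values are held in
 ordinary integers, and the 8-bit LL31 division is replaced by an integer
 division that is assumed to give the same result. In this model, reading
 K(3 2 1 0) as a plain integer gives roughly 256 times the quotient, because
 the 8-bit step returns 256 * A / Q.

```python
def dvid3b2_model(a, p1, p0, z_sign, z_hi, z_lo):
    """Illustrative model only: returns K(3 2 1 0) as one 32-bit integer."""
    t = (a ^ z_sign) & 0x80                       # T = sign of the result
    # Work with magnitudes, forcing the low bytes to be at least 1
    num = ((a & 0x7F) << 16) | (p1 << 8) | (p0 | 1)
    den = ((z_sign & 0x7F) << 16) | (z_hi << 8) | (z_lo | 1)

    y = 0                                         # Y = scale factor
    while (num >> 16) < 64:                       # DVL9: until top byte >= 64
        num <<= 1
        y += 1
    while True:                                   # DVL6: always shifts once
        y -= 1
        den <<= 1
        if den & 0x800000:                        # bit 7 of the top byte is set
            break

    r = (256 * (num >> 16)) // (den >> 16)        # stands in for LL31

    k = r << -y if y < 0 else r >> y              # undo the net shift
    return (t << 24) | k                          # sign goes in bit 7 of K+3
```

 For example, dvid3b2_model(1, 0, 0, 1, 0, 0) divides &010001 by &010001 and
 returns &100 in this model.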


 Returns:

   K(3 2 1 0)           The result of the division

   X                    X is preserved
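
 The result is stored as a sign-magnitude number, with the sign of the result
 in bit 7 of K+3. As a rough illustration, the four bytes could be decoded like
 this in Python (the function name k_to_int is an assumption, not something
 from the source):

```python
def k_to_int(k3, k2, k1, k0):
    """Decode sign-magnitude bytes K+3, K+2, K+1, K into a Python integer."""
    magnitude = ((k3 & 0x7F) << 24) | (k2 << 16) | (k1 << 8) | k0
    return -magnitude if k3 & 0x80 else magnitude
```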


.DVID3B2

 STA P+2                \ Set P+2 = A

 LDA INWK+6             \ Set Q = z_lo, making sure Q is at least 1
 ORA #1
 STA Q

 LDA INWK+7             \ Set R = z_hi
 STA R

 LDA INWK+8             \ Set S = z_sign
 STA S

.DVID3B

                        \ Given the above assignments, we now want to calculate
                        \ the following to get the result we want:
                        \
                        \   K(3 2 1 0) = P(2 1 0) / (S R Q)

 LDA P                  \ Make sure P(2 1 0) is at least 1
 ORA #1
 STA P

 LDA P+2                \ Set T to the sign of P+2 * S (i.e. the sign of the
 EOR S                  \ result), held in bit 7 with bits 0-6 clear
 AND #%10000000
 STA T

 LDY #0                 \ Set Y = 0 to store the scale factor

 LDA P+2                \ Clear the sign bit of P+2, so the division can be done
 AND #%01111111         \ with positive numbers and we'll set the correct sign
                        \ below, once all the maths is done
                        \
                        \ This leaves A = |P+2|, which we shift in DVL9 below

.DVL9

                        \ We now shift (A P+1 P) left until A >= 64, counting
                        \ the number of shifts in Y. This makes the top part of
                        \ the division as large as possible, thus retaining as
                        \ much accuracy as we can. When we come to return the
                        \ final result, we shift the result by the number of
                        \ places in Y, and in the correct direction

 CMP #64                \ If A >= 64, jump down to DV14
 BCS DV14

 ASL P                  \ Shift (A P+1 P) to the left
 ROL P+1
 ROL A

 INY                    \ Increment the scale factor in Y

 BNE DVL9               \ Loop up to DVL9 (this BNE is effectively a JMP, as Y
                        \ will never be zero)

.DV14

                        \ If we get here, A >= 64 and contains the highest byte
                        \ of the numerator, scaled up by the number of left
                        \ shifts in Y

 STA P+2                \ Store A in P+2, so we now have the scaled value of
                        \ the numerator in P(2 1 0)

 LDA S                  \ Set A = |S|
 AND #%01111111

\BMI DV9                \ This instruction is commented out in the original
                        \ source

.DVL6

                        \ We now shift (S R Q) left until bit 7 of S is set,
                        \ reducing Y by the number of shifts. This makes the
                        \ bottom part of the division as large as possible, thus
                        \ retaining as much accuracy as we can. When we come to
                        \ return the final result, we shift the result by the
                        \ total number of places in Y, and in the correct
                        \ direction, to give us the correct result
                        \
                        \ We set A to |S| above, so the following actually
                        \ shifts (A R Q)

 DEY                    \ Decrement the scale factor in Y

 ASL Q                  \ Shift (A R Q) to the left
 ROL R
 ROL A

 BPL DVL6               \ Loop up to DVL6 to do another shift, until bit 7 of A
                        \ is set and we can't shift left any further

.DV9

                        \ We have now shifted both the numerator and denominator
                        \ left as far as they will go, keeping a tally of the
                        \ overall scale factor of the various shifts in Y. We
                        \ can now divide just the two highest bytes to get our
                        \ result

 STA Q                  \ Set Q = A, the highest byte of the denominator

 LDA #254               \ Set R to have bits 1-7 set, so it can act as the bit
 STA R                  \ counter in the inlined LL31 division below

 LDA P+2                \ Set A to the highest byte of the numerator

.LL31new

 ASL A                  \ This contains the code from the LL31 routine, so
 BCS LL29new            \ this section is exactly equivalent to a JSR LL31
 CMP Q                  \ call, but is slightly faster as it's been inlined,
 BCC P%+4               \ so it calculates:
 SBC Q                  \
 ROL R                  \   R = 256 * A / Q
 BCS LL31new            \     = 256 * numerator / denominator
 JMP LL312new

.LL29new

 SBC Q                  \ This is also part of the inline LL31 routine
 SEC
 ROL R
 BCS LL31new
 LDA R

.LL312new

                        \ The result of our division is now in R, so we just
                        \ need to shift it back by the scale factor in Y

 LDA #0                 \ Set K(3 2 1) = 0 to hold the result (we populate K
 STA K+1                \ next)
 STA K+2
 STA K+3

 TYA                    \ If Y is zero or positive, jump to DV12
 BPL DV12

                        \ If we get here then Y is negative, so we need to shift
                        \ the result R to the left by |Y| places, and then set
                        \ the correct sign for the result

 LDA R                  \ Set A = R

.DVL8

 ASL A                  \ Shift (K+3 K+2 K+1 A) left
 ROL K+1
 ROL K+2
 ROL K+3

 INY                    \ Increment the scale factor in Y

 BNE DVL8               \ Loop back to DVL8 until Y reaches zero, by which point
                        \ we have shifted left by the original |Y| places

 STA K                  \ Store A in K so the result is now in K(3 2 1 0)

 LDA K+3                \ Set K+3 to the sign in T, which we set above to the
 ORA T                  \ correct sign for the result
 STA K+3

 RTS                    \ Return from the subroutine

.DV13

                        \ If we get here then Y is zero, so we don't need to
                        \ shift the result R, we just need to set the correct
                        \ sign for the result

 LDA R                  \ Store R in K so the result is now in K(3 2 1 0)
 STA K

 LDA T                  \ Set K+3 to the sign in T, which we set above to the
 STA K+3                \ correct sign for the result

 RTS                    \ Return from the subroutine

.DV12

 BEQ DV13               \ We jumped here having set A to the scale factor in Y,
                        \ so this jumps up to DV13 if Y = 0

                        \ If we get here then Y is positive and non-zero, so we
                        \ need to shift the result R to the right by Y places
                        \ and then set the correct sign for the result. We also
                        \ know that K(3 2 1) will stay 0, as we are shifting the
                        \ lowest byte to the right, so no set bits will make
                        \ their way into the top three bytes

 LDA R                  \ Set A = R

.DVL10

 LSR A                  \ Shift A right

 DEY                    \ Decrement the scale factor in Y

 BNE DVL10              \ Loop back to DVL10 until we have shifted right by Y
                        \ places

 STA K                  \ Store the shifted A in K so the result is now in
                        \ K(3 2 1 0)

 LDA T                  \ Set K+3 to the sign in T, which we set above to the
 STA K+3                \ correct sign for the result

 RTS                    \ Return from the subroutine
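
 The inlined LL31 loop in the routine above uses R both as the result and,
 through its seven set bits, as a loop counter. A minimal Python sketch of that
 loop follows, assuming A < Q on entry (which holds here, as A is at most 127
 and Q is at least 128). The name ll31_model is hypothetical and the carry flag
 is modelled by keeping A as a 9-bit value after the shift:

```python
def ll31_model(a, q):
    """Illustrative model of the inlined LL31 loop: R = 256 * a / q, a < q."""
    r = 0b11111110                    # LDA #254, STA R: seven 1s count the loop
    while True:
        a <<= 1                       # ASL A (bit 8 plays the part of the carry)
        bit = 1 if a >= q else 0      # covers both BCS LL29new and CMP Q
        if bit:
            a -= q                    # SBC Q
        more = r & 0x80               # bit that ROL R shifts out into the carry
        r = ((r << 1) | bit) & 0xFF   # ROL R shifts the new result bit in
        if not more:                  # the 0 from bit 0 of 254 ends the loop
            return r
```

 In this model ll31_model(64, 128) returns 128, and the loop always runs eight
 times, once for each bit of the result.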

Maths (Arithmetic): DVID3B2

[BBC Master version]