## ../ -- 1d_design.html

updated 2005-04-08

contents:

This is just my quick reference for series approximations and mathematical functions.

Similar information can be found in:

You'll probably find the above references more understandable.

Lots of technical mathematical things here: bit sequences, DSP, calculus, infinite series, etc.

"Whatever Happened to Numbers?" by Robert W. Lucky _IEEE Spectrum_ March 1994 http://www.argreenhouse.com/papers/rlucky/spectrum/numbers.shtml says that numbers are over-emphasized in school. [FIXME: does this fit better over at bignums.html ?]

"Not everything that can be counted counts, and not everything that counts can be counted."

-- Albert Einstein

function approximation

DAV is most interested in "simple" approximations: very quick methods that get you within an order-of-magnitude.

I'm also interested in "functional software compression", so occasionally I start with a program that calculates things to very high precision, and mangle it until it *still* calculates things to just as high a precision, but it takes much less ROM space (or, in some cases, the source code is much shorter).

[FIXME: cross-link to appropriate places in data_compression.html and machine_vision.html] [FIXME: search for ``approx'' and ``math'' and gather appropriate stuff here]

## paper sizes (aspect ratio)

"Postcard Dimensions: Each card (each stamped card or postcard or each half of a double stamped card or postcard) claimed at a card rate must be:

• a. Rectangular.
• b. Not less than 3-1/2 inches high, 5 inches long, and 0.007 inch thick.
• c. Not more than 4-1/4 inches high, or more than 6 inches long, or greater than 0.016 inch thick."
-- http://pe.usps.com/text/dmm300/101.htm#6_3_1

local.html#pmail mentions that First-Class envelopes weighing 1 ounce or less require an additional \$0.12 nonmachinable surcharge if any one of the following apply ... The length divided by height is less than 1.3 or more than 2.5 ... -- http://postcalc.usps.gov/MailpieceDimensions.asp (That's about height 89 mm to 108 mm; length 127 to 152 mm, right?)

DAV: It is good that the "logical" European standard paper and envelope sizes, based on 2^(1/2) = 1.414..., fall in this "1.3 to 2.5" range of aspect ratios. Hm... which of my favorite numbers can be coerced into that aspect ratio ? Note that the normal TV / computer monitor ratio (4:3 = 1.333) is acceptable, as is movie theatre and HDTV widescreen (16:9 = 1.78).

The only ISO 216 size in the postcard range is A6 (105 x 148) (close to the 4x6 inch index card, 101 x 152). B7 (88 x 125) is a couple of mm too small; B6 (125 x 176) is too large. C7 (81 x 114) is too small; C6 (114 x 162) is too large. See 43folders:Hipster_Variants#Postcard_PDA.

Some related simple fractions and their inverses:

• 2.5 = 5/2 : least square envelope
• 2.45... = 6^(1/2)
• 2.333... = 7/3
• 2.25 = 9/4
• 2.236... = 5^(1/2)
• 2.228... = 7/π
• 2.207... = 6/e
• 2 = 2/1
• 1.9099... = 6/π
• 1.839... = 5/e
• 1.8 = 9/5
• 1.778... = 16/9 widescreen
• 1.75 = 7/4
• 1.75: U.S. business card: 3.5 x 2 inch (89 x 51 mm)
• 1.732... = 3^(1/2)
• 1.714: least square postcard: 6 inches (max) , 3.5 inches (min)
• 1.667... = 5/3, such as 3x5 inch index card (alas, not quite high enough to be a postcard)
• 1.6 = 8/5 , such as 5x8 inch index card (alas, a little too large in both directions to be a postcard -- but cut in half 4x5 it works -- then the "5" is the minimum possible length)
• 1.592... = 5/π
• 1.586... international business card (or credit card): 86 by 54 mm ( 3.37 x 2.13 inches)
• 1.571... = π/2
• 1.571: (was: smallest postcard: 5.5 inches (min) , 3.5 inches (min) )
• 1.5 = 3/2 , such as 4x6 inch index card (which can be a postcard; the "6" is the largest possible length)
• 1.472... = 4/e
• 1.429... = 10/7 : smallest postcard: 5 inches (min) x 3.5 inches (min)
• 1.414... = 2^(1/2) euro: all ISO 216 paper sizes
• 1.412: largest postcard: 6 inches (max) , 4.25 inches (max)
• 1.4 = 7/5
• 1.359... = e/2
• 1.333... = 4/3 TV
• 1.3 = 13/10 : most square envelope
• 1.294 (was: most square postcard: 5.5 inches (min) , 4.25 inches (max) )
• 1.176: most square postcard: 5 inches (min) , 4.25 inches (max)
• 1
• 0.769... = 10/13 : most square envelope
• 0.750 = 3/4 TV
• 0.735... = 2/e
• 0.714... = 5/7
• 0.707... = 2^(-1/2) euro
• 0.7 = 7/10
• 0.6796... = e/4
• 0.667... = 2/3
• 0.636... = 2/π
• 0.628... = π/5
• 0.625 = 5/8
• 0.600 = 3/5
• 0.577... = 3^(-1/2)
• 0.571... = 4/7
• 0.5625 = 9/16 widescreen
• 0.555... = 5/9
• 0.544... = e/5
• 0.524... = π/6
• 0.5 = 1/2
• 0.453... = e/6
• 0.449... = π/7
• 0.447... = 5^(-1/2)
• 0.444... = 4/9
• 0.429... = 3/7
• 0.408... = 6^(-1/2)
• 0.400 = 2/5 : least square envelope

## digital codes

See

[FIXME: move to massmind ?]

• A digital code is some meanings that someone has placed on a certain arrangement of ones and zeros. In past columns, we've looked at pseudorandom codes, short portions of which appear as random noise. The Gray codes whose bits are only allowed to change one at a time. Or Huffman codes that compress data. CRC codes which can fix many of their own communication errors. Or those rare Barker codes that perfectly autocorrelate. And, of course, that ASCII coding that lets us store and transmit letters and numbers. Plus many others. I've recently run across an odd class of digital codes called binary chain codes.

-- "Hardware Hacker: New shaft encoder designs: Binary chain code secrets" by Don Lancaster, September, 1994 http://63.140.207.28/musev.pdf/hack80.pdf
• CRC codes
• In a vector map data file, we often have lots of very short distances.

Let's look at the special case where the uncompressed file represents distances as signed 32 bit quantities,

Naturally a compression algorithm tuned for 32 bit quantities will get better compression than a byte-oriented compressor. However, with a little pre-compression filtering, I think we can get almost as good compression with a lot less programming effort by using a simple pre-compression filter in front of an off-the-shelf byte-oriented compression algorithm.

Let's say the original file has length N bytes (i.e., N/4 distances, each one a 32-bit quantity),

Let's say that nearly all of the distances can be represented in 1 byte or less. (As a very crude estimate, let's say distances from -255 to +255 are evenly distributed, and larger distances are extremely rare -- so there are roughly N/8 positive values and N/8 negative values).

If we use standard signed binary numbers, then we end up with lots of 0x00 and 0xff bytes (in the most-significant bytes of each distance). With standard Huffman compression, one of those bytes compresses to 1 bit, the other compresses to 2 bits, and all other bytes compress to more bits (roughly 8 or 9 bits each). Then the Huffman-compressed file has length

• ~N/8 * 3 bits (to represent the top 3 zero bytes of positive small values)
• ~N/8 * 6 bits (to represent the top 3 0xff bytes of negative small values)
• ~N/4 * 8.5 bits (to represent the bottom byte)
• ~8*8 bits (large values, we're assuming only a couple in the entire file)
• ---
• N*13/4 + 64 bits. (So we've reduced the 32 bit quantities to an average length of 13 bits).
However, if we first run this through a width-preserving transform (for example, one of the 2 listed below) that forces the hi bytes of small numbers to zero (for both positive and negative numbers), then we end up with lots of 0x00 bytes (in the most-significant bytes of each distance), giving
• ~N/8 * 3 bits (to represent the top 3 zero bytes of positive small values)
• ~N/8 * 3 bits (to represent the top 3 zero bytes of negative small values)
• ~N/4 * 8.5 bits (to represent the bottom byte)
• ~8*8 bits (large values, we're assuming only a couple in the entire file)
• ---
• N*23/8 + 64 bits. (So we've reduced the 32 bit quantities to an average length of 11.5 bits -- an improvement of over 10 percent).

similar to sign+magnitude, but lossless on 2's complement machines.

Here's something similar to sign magnitude (except it puts the sign bit at the least significant bit) that forces all the most-significant bytes to zero for all small values, both positive and negative. Note that this is completely lossless and width preserving -- each possible 32 bit value is mapped to a unique 32 bit code. [FIXME: is there a name for lossless width preserving transforms ?]

// assumes 2's complement machine.
// Should work no matter how wide int is -- 16 bits, 32 bits, 64 bits, whatever.

```	#include <stdbool.h> // for bool

unsigned int signed_to_modified_signed_magnitude( signed int x ){
    bool negative = ( x < 0 );
    x <<= 1; // delete sign bit, shift in 0.
    if(negative){
        x = ~x;
    }
    return( (unsigned)x );
}

signed int modified_signed_magnitude_to_signed( unsigned int x ){
    bool negative = ( x & 1 );
    x >>= 1; // delete sign bit, shift in zero.
    if(negative){
        x = ~x;
    }
    return( (signed)x );
}
```
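A self-contained round-trip sketch of the transform above (function names shortened; same logic). As an aside, this sign-folding mapping is what the Protocol Buffers documentation calls ``zigzag'' encoding, which may be a partial answer to the "is there a name" FIXME.

```c
#include <assert.h>
#include <stdbool.h>

/* Move the sign to the least-significant bit, so small magnitudes
   (positive or negative) get all-zero high bytes.
   Assumes a 2's complement machine. */
static unsigned int to_msm(signed int x) {
    bool negative = (x < 0);
    unsigned int u = ((unsigned int)x) << 1; /* drop sign bit, shift in 0 */
    if (negative) {
        u = ~u;
    }
    return u;
}

static signed int from_msm(unsigned int u) {
    bool negative = (u & 1u);
    u >>= 1; /* drop the sign bit, shift in zero */
    if (negative) {
        u = ~u;
    }
    return (signed int)u;
}
```

Note how the codes interleave: 0, -1, 1, -2, 2, ... map to 0, 1, 2, 3, 4, ..., so a small magnitude always yields a small unsigned code.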

A more well-known width-preserving transform is the binary-to-Gray-code converter. Here is a transform based on it that also forces the most-significant bytes to zero for all small values (positive or negative).

```unsigned int
code_from_binary( signed int b){
return( rotate_left( b2g( b ) ) );
}

signed int
binary_from_code( unsigned int c ){
return( g2b( rotate_right( c ) ) );
}
```

(The b2g() leaves the sign bit alone, but transforms all the other leading ones to zeros ... then the rotate moves the sign to the end, similar to the above alg. For small negative and also small positive numbers, the result has leading zero bytes.)

```/* find gray code from corresponding binary code */
unsigned int
b2g( unsigned int b )
{ /*  well-known algorithm, as implemented by David Cary */
    return( b ^ ( b >> 1 ) );
}
```

Gray Code Conversion "The well-known algorithm for conversion from Gray to binary is a linear sequence of XORs that makes it seem each bit must be dealt with separately. Fortunately, ... For 32-bit Gray code values produced as described above, the conversion from Gray code back to unsigned binary is:

```/* find binary code from corresponding gray code */
unsigned int
g2b(unsigned int gray)
{ /* algorithm from http://www.aggregate.org/MAGIC/#Gray%20Code%20Conversion */
    gray ^= (gray >> 16);
    gray ^= (gray >> 8);
    gray ^= (gray >> 4);
    gray ^= (gray >> 2);
    gray ^= (gray >> 1);
    return(gray);
}
```
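Here's a self-contained 32-bit sketch of the Gray-code-based transform, with my own rotate helpers (the snippets above assume rotate_left() / rotate_right() exist somewhere); it checks the round trip and the leading-zero-bytes property:

```c
#include <assert.h>
#include <stdint.h>

/* 1-bit rotates, fixed at 32 bits for this sketch */
static uint32_t rot_l(uint32_t x) { return (x << 1) | (x >> 31); }
static uint32_t rot_r(uint32_t x) { return (x >> 1) | (x << 31); }

/* binary -> Gray: the sign bit is left alone, and the leading
   ones of a small negative number become leading zeros */
static uint32_t b2g32(uint32_t b) { return b ^ (b >> 1); }

/* Gray -> binary, by the parallel-prefix XOR fold */
static uint32_t g2b32(uint32_t g) {
    g ^= g >> 16; g ^= g >> 8; g ^= g >> 4; g ^= g >> 2; g ^= g >> 1;
    return g;
}

static uint32_t code_from_binary32(int32_t b) {
    return rot_l(b2g32((uint32_t)b)); /* rotate moves the sign to the LSB */
}

static int32_t binary_from_code32(uint32_t c) {
    return (int32_t)g2b32(rot_r(c));
}
```

For example, -1 encodes to 1 and -2 encodes to 3, so small negatives as well as small positives end up with all-zero high bytes, which is the property the pre-compression filter wants.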

Further transforms: If we don't use Huffman, but instead use a compressor that is efficient at compressing long repeated strings, perhaps "rotate" (blocks of) the file, so the first 1/4 of the file (block) holds the top byte, the next 1/4 of the file (block) holds the second byte ... and the last 1/4 of the file (block) holds the least-significant byte. A really smart byte-oriented compressor might even notice the statistics changing in each section and tune compression differently for each quarter.

[map compression][data compression][width preserving transform ?]

The Tiger database [FIXME: tiger ] stores latitude and longitude in decimal degrees (6 decimal digits after the decimal point), although the features don't have nearly that accuracy -- DAV's tests seem to indicate that the data is rounded to the nearest 10 meters. [FIXME: has this changed ?]

U.S.G.S. topo maps say "at best +-167 feet" which is roughly +- 50 meters. [FIXME: has this changed ?]

... interpolation ?

## circular numbers

[FIXME: move to massmind]

## curve fitting tools

• Mifit: Code generator to approximate a data set by Salvador E. Tropea (SET) http://mifit.utic.com.ar/

Mifit is a program used to generate the assembler code needed to approximate a data set using simple functions (like lines and quadratics). Currently Mifit generates code only for Microchip's PIC family of microcontrollers, but the program can be adapted to generate code for other processors. We are looking for volunteers to add support for other chips.

[FIXME: install and play with this program.] [FIXME: volunteer ?]
• Least Squares Fitting http://mathworld.wolfram.com/LeastSquaresFitting.html has a detailed explanation of fitting a set of (possibly noisy) points to a straight line. Also explains generalizing it to fitting a set of (possibly noisy) points to a polynomial, to an exponential, to a logarithm, and "nonlinear least squares fitting" to a Gaussian.

## interpolation

See vlsi.html#arithmetic for more on interpolating sin() and cos().

[FIXME: aren't there better explanations elsewhere on the web ?]

Basically from the ``Educated Guessing Games'' article by Don Morgan in _Embedded Systems Programming_ 2002-03, which talks about linear interpolation, the Lagrange interpolator, and briefly mentions Chebyshev polynomials. It promises future articles with other ``ways to increase the speed and accuracy of interpolation using relatively small tables''.

(fixing a few typographic glitches in my paper copy of the magazine)

Many times we know the value of a function at a few selected points, and we want to interpolate.

The polynomial interpolation theorem (not to be confused with the Weierstrass approximation theorem): if x0, x1, x2, ... xn are distinct, then for any y0, y1, y2, ... yn there exists a unique polynomial F of degree n or less such that

F(x[i]) == y[i], for i = 0,1,2,...,n.

(i.e., the unique polynomial of degree n or less that passes through all n+1 data points).

We break that function down into parts

F(x) = F0(x) + F1(x) + F2(x) + ... + Fn(x)

where each part accounts for exactly one sample point -- F0(x[0]) = y[0], Fn(x[n]) = y[n], etc. -- but each part is exactly zero at all the other sample points x[j].

First define a polynomial that is zero for all sample points except one. In other words,

fi(x[j]) == 0, for j = 0,1,2,...,n, except for j=i.

That polynomial is

fi(x) = Π(j = 0 to n; except for j=i)( x - x[j] );

At the point where x = x[i], that polynomial equals the value

fi(x[i]) = Π(j = 0 to n; except for j=i)( x[i] - x[j] );

We use this value to normalize the polynomial to give us exactly y[i] at x = x[i]:

Fi(x) = fi(x) / fi(x[i]) = Π(j = 0 to n; except for j=i)( ( x - x[j] ) / ( x[i] - x[j] ) );

```
#define n 10
// WARNING: untested code

float x[n] = { .... }; // sample locations
float y[n] = { .... }; // sample values

float evaluate_lagrange( float t ){
    float sum = 0;
    int i, j;
    for( i = 0; i < n; i++){
        float product = 1.0;
        for( j = 0; j < n; j++){
            if( i != j){
                product *= (t - x[j]) / (x[i] - x[j]);
            }
        }
        sum += product * y[i];
    }
    return(sum);
}
```

If we pre-calculate the intermediate products, this can run very fast.

```
#define n 10
// WARNING: untested code

float x[n] = { .... }; // sample locations
float g[n] = { .... }; // pre-computed intermediate products

float evaluate_lagrange( float t ){
    float sum = 0;
    int i, j;
    for( i = 0; i < n; i++){
        float product = 1.0;
        for( j = 0; j < n; j++){
            if( i != j){
                product *= t - x[j];
            }
        }
        sum += product * g[i];
    }
    return(sum);
}
```

This gives n*n multiplies per evaluation.
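A concrete sketch of the pre-computed version: lg[i] folds y[i] and the constant denominator product together, so the evaluation loop only multiplies (t - x[j]) factors. The sample points here are arbitrary made-up values, and the test just checks that the polynomial passes through every sample.

```c
#include <assert.h>
#include <math.h>

#define NPTS 4

static double lx[NPTS] = { 0.0, 1.0, 2.0, 3.0 }; /* sample locations (made up) */
static double ly[NPTS] = { 1.0, 3.0, 2.0, 5.0 }; /* sample values (made up) */
static double lg[NPTS];                          /* pre-computed products */

/* lg[i] = y[i] / product(j != i) (x[i] - x[j]) */
static void precompute_g(void) {
    for (int i = 0; i < NPTS; i++) {
        double denom = 1.0;
        for (int j = 0; j < NPTS; j++) {
            if (i != j) denom *= lx[i] - lx[j];
        }
        lg[i] = ly[i] / denom;
    }
}

static double eval_lagrange(double t) {
    double sum = 0.0;
    for (int i = 0; i < NPTS; i++) {
        double product = 1.0;
        for (int j = 0; j < NPTS; j++) {
            if (i != j) product *= t - lx[j]; /* no divides at run time */
        }
        sum += product * lg[i];
    }
    return sum;
}
```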

## fixed point function library; floating point function library; trig library

How to implement floating point multiplies, the sin() function, etc. Also has some information on "fixed point", which runs much faster than "floating point" on *some* hardware.

• "Floating Point Approximations" paper by Jack Ganssle http://ganssle.com/approx.htm This paper ... gives a number of simple floating point trig approximations. The code includes test cases to check for maximum error. free source code for sin(), cos(), tan(), atan(), and a test stub, in low precision ("3.2 digits of accuracy") and high precision ("12.1 digits of accuracy") versions.

DAV: the test only checks for integer degrees. Sometimes DAV also wants high precision for something less than 1 degree.

• http://wehner.org/fpoint/ has tips for implementing square_root(), cosine(), sine(), arctangent(), logarithm(), antilogarithm() == exp(), finding the Chebyshev polynomial to approximate any function, gamma function, "Wehner's Harmonic Function", with examples in 80x86 assembly-language code.

### logarithm

• Fixed-point logarithm for PIC. Input: xhi:xlo unsigned Q0.16 (modified). Output: x2hi:x2lo unsigned Q6.10 (implicit minus), i.e., 0 <= x < 1. http://www.piclist.com/techref/microchip/math/power/16lr-ng.htm Reduces the range to 1/2 <= x < 1, then uses an interpolated lookup table (17 values, 10-bit numbers).
• 8-bit logarithm algorithm, PIC assembly source code http://www.dattalo.com/technical/software/pic/piclog.html "based on a table look-up plus first order interpolation"

Logarithms by Scott Dattalo Here's a collection of different ways to compute logarithms. http://www.dattalo.com/technical/theory/logs.html

Here's a completely different approach I've been thinking about for approximating ln(x) for large integers: since ln(x)-ln(x-1) is very roughly equal to 1/x for large x (... perhaps closer to 1/(x-1/2) = 2/(2x-1) ...) and ln(2*x) = ln(2) + ln(x), maybe it would be simple to do this:

```  float ln( unsigned int x ){

const float ln2 = log(2);
int powers_of_2 = 0;
float sum = 0;

while( 2 <= x ){

if( x & 1 ){
sum += 1.0 / (float)x ;
};
x = x >> 1;
powers_of_2++;
};
return( ln2 * powers_of_2 + sum );
}
```

or perhaps

```  float ln( unsigned int x ){

const float ln2 = log(2);
int powers_of_2 = 0;
float sum = 0;

while( 2 <= x ){

if( 1 == (x & 3) ){
sum += 1.0 / (float)x ;
x-- ;
};

if( x & 1 ){
sum -= 1.0 / (float)x ;
x++;
};

x = x >> 1;
powers_of_2++;
};
return( ln2 * powers_of_2 + sum );
}
```

Hey, it gives exactly the correct answer for x = 1 and x = 2 -- and in fact for all x = 2^n.
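A quick numerical check of the first sketch above (renamed approx_ln here to avoid colliding with math.h). It is exact at powers of 2, and empirically the worst cases are x = 2^k - 1, where the accumulated error stays under roughly 0.09:

```c
#include <assert.h>
#include <math.h>

/* The rough "add 1/x at each odd step, then halve" logarithm. */
static float approx_ln(unsigned int x) {
    const float ln2 = 0.69314718f;
    int powers_of_2 = 0;
    float sum = 0.0f;
    while (2 <= x) {
        if (x & 1) {
            sum += 1.0f / (float)x; /* ln(x) - ln(x-1) is roughly 1/x */
        }
        x >>= 1;
        powers_of_2++;
    }
    return ln2 * (float)powers_of_2 + sum;
}
```

The dominant error comes from the small odd values hit near the end of the halving (1/3 versus the true ln 3 - ln 2 contributes about 0.072 all by itself), which is why the absolute error plateaus instead of growing with x.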

• "Calculate exp() and log() Without Multiplications" http://www.quinapalus.com/efunc.html "The methods are particularly suitable for use when shifting is cheap: for example in ARM assembler code" ... "Please contact me at the e-mail address on [http://www.quinapalus.com/] if you are interested in tested ARM assembler implementations of these algorithms for a particular application."
• "Fast log2" by Michael Herf http://stereopsis.com/log2.html To take the integer logarithm of an integer, uses 2 functions built into most IEEE floating-point hardware: Convert int to floating-point, then strip off the exponent (discarding the mantissa).
• Robert Lee 1999 http://homepages.borland.com/efg2lab/Library/UseNet/1999/1214a.txt fast exponentiation routine ... fast, low-precision exp function. ``comments and tweaks encouraged''
• http://numbers.computation.free.fr/Constants/Log2/log2.html mentions the series expansion for the natural logarithm function:
ln( 1 + x ) = x - x^2/2 + x^3/3 - x^4/4 + x^5/5 ... = sum( k=1, inf, (-1)^(k-1)*x^k / k ) \ -1 < x <= 1
(very similar to the series expansion for arctan)

inverse hyperbolic tangent
arctanh(x) = (1/2)*ln( (1+x)/(1-x) ) = sum( k=0, inf, x^(2*k+1)/(2*k+1) ).

or

t = x^2 / (1-x^2);
arctanh(x) = ( x / (1-x^2) ) * ( 1 - t*(2/3) + t^2*(2*4)/(3*5) - t^3*(2*4*6)/(3*5*7) + ... )
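The first series above translates directly to code; a minimal sketch (converges for |x| < 1, and it is usefully fast only when |x| is well below 1):

```c
#include <assert.h>
#include <math.h>

/* arctanh(x) = x + x^3/3 + x^5/5 + ... , truncated after `terms` terms */
static double arctanh_series(double x, int terms) {
    double power = x; /* current odd power x^(2k+1) */
    double sum = x;
    for (int k = 1; k < terms; k++) {
        power *= x * x;
        sum += power / (double)(2 * k + 1);
    }
    return sum;
}
```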

• f(x) = ln(x+1) = x - x^2/2 + x^3/3 - x^4/4 + x^5/5 - x^6/6 ... - (-x)^n/n ...

The classic Taylor series for the logarithm (ln(u) expanded around u = 1, with x = u - 1).

It's only convergent for -1 < x <= 1 (i.e., 0 < x+1 <= 2). (quickly convergent (better than 1 bit per term) for -1/2 < x < 1/2)

("classic" meaning that people learning calculus -- in particular, learning about Taylor polynomials and the "absolute ratio test to determine the interval of convergence" -- often re-derive this. It's good for that learning experience, but maybe not the best for actually computing ln(x). ). More details: http://www.sosmath.com/calculus/tayser/tayser01/tayser01.html and http://spot.pcc.edu/~ssimonds/200202/m253/taylor/

• Phil Hagan (probably quoting someone else ?) used the variable substitution

z = (x-1)/x;

i.e.,

x = 1/(1-z)

and then found the Taylor series for

ln(x) = ln( 1/(1-z) )

around z = 0. This converges for all x >= 1/2 (see below).

x ~ epsilon: z ~ -infinity
x ~ +1: z ~ 0
x ~ +infinity: z ~ +1. (??)

ln(x) = f( (x-1) / x )

f(z) = ln( 1/(1-z) ) = z + z^2/2 + z^3/3 + z^4/4 + z^5/5 + ... + z^n/n + ....

of course this is only (slowly) convergent for -1 <= z < 1, but that covers

1/2 <= x < +inf

This is more quickly convergent (1 bit per term) for -1/2 < z < 1/2, which covers 0.67 < x < 2.0.

• approximating ln(x)

DAV thinks the taylor series (around 1 ?) for

x*ln(x)

is *much* easier to approximate with a polynomial. It doesn't have that pole at 0 anymore --

lim{x-->+0}(x*ln(x)) = 0.

(of course, x < 0 is still undefined/complex).

z = x-1; ln(x) = f(z)/(z+1), where f(z) = (z+1)*ln(z+1) = x*ln(x).

f(z) = (z+1)*ln(z+1) = z + z^2/2 - z^3/6 + z^4/12 - z^5/20 + z^6/30 - z^7/42 ... + z^n*(-1)^n/(n*(n-1)) ...

this looks like it's only convergent for -1 < z < +1, which covers 0 < x < 2.

Terms in this series alternate, so the error is less than the last term added. Wait a minute -- since the first derivative of this is f'(z) = ln(z+1) + 1, the derivative of the whole series is essentially the "classic" series. So the range of convergence is the same. Maybe it's not as much better as I thought.

DAV:

Perhaps the combo f(z) = (1/(1-z)) * ln( 1/(1-z) ) would be even better ? I think we want to map the input x=0 to z=-infinity, and map x=+infinity to +infinity, in order to take advantage of ``smooth'' polynomial approximation. One function that does this is z = (x-1)^2 / x. Is there a better mapping ? The output range (-inf at the left edge to +inf at the right edge, without any poles; smooth) is already good for polynomial approximation ... although the fact that d/dx ln(x) goes to 0 as x goes to +inf is somewhat difficult to approximate. (Use a range-reduction algorithm to handle large x, and x very close to zero ? Since ln(x) = ln(x*k/k) = ln(x*k) - ln(k) ...)

• newton approximation ?
• [FIXME: approximating ln(x)]
• Cephes Mathematical Library http://www.netlib.org/cephes/ [FIXME: also has many other mathematical algorithms] http://www.netlib.org/cephes/singldoc.html#log
• Floating point math http://www.rocketaware.com/math/float/ http://www.rocketaware.com/man/math/float/ points to the ever-popular FAQ, "When I set a float variable to 3.1, why is printf printing it as 3.0999999?" at the comp.lang.c FAQ http://www.eskimo.com/~scs/C-faq/q14.1.html . [FIXME: is this general-purpose enough to put under search#FAQs ?] [c_language ???]
• Incompatibilities Between ISO C and ISO C++ by David R. Tribble http://david.tribble.com/text/cdiffs.htm
• Fixed Point (16.16) Math http://www.cstone.net/~kyoung/fix1FAQ.html includes: how to do trigonometric functions in fixed point. He uses an integer power of 2 for 1 complete revolution: #define MAX_TRIG 1024. Then range reduction is easy ( theta &= (MAX_TRIG - 1); ). Because he wants blazing-fast speed, he then just looks up the result in a table of fixed-point values ( y = SinTab[theta]; ). FIXME: the author says "However, I have no clue on how to calculate a natural logarithm. If anyone can offer any ideas, I'll be glad to write one." Todo: suggest one.
• General Math Functions http://www.efg2.com/Lab/Library/Delphi/MathFunctions/General.htm ??? anything useful here ?
• CodeWarrior for Palm Pilot Math Tutorial http://godot.urol.uic.edu/pilotfloat.html
• The GNU C Library -- Arithmetic Functions ???
• http://www.rocketaware.com/man/man3/ieee.3.htm Functions for IEEE arithmetic
• http://www.rocketaware.com/man/man3/exp.3.htm exponential, logarithm, power functions
• http://www.eskimo.com/~scs/C-faq/q14.4.html refers to David Goldberg, ``What Every Computer Scientist Should Know about Floating-Point Arithmetic''. Isn't that online ? If so, email Steve Summit and ask him to linkify that reference.
• http://www.eskimo.com/~scs/C-faq/q14.8.html I suspect the sentence

If you need pi, you'll have to #define it yourself.

should really say

If you need pi, you'll have to #define it yourself as #define PI (4*atan(1)) or const double PI = (4*atan(1)); .

### trig functions

• [given a representation for floating point numbers, how to compute trig functions using addition, multiplication, and division of a series of integers ... all the divides can be pre-computed, so that we only need to do addition and multiplication at runtime ...] ``Deriving Trig Functions; Taylor Series'' http://mathforum.org/dr.math/problems/harold.05.01.01.html ``I think you are asking how one computes quantities such as sin(32 degrees) without using a calculator.'' gives brief introduction and a couple of specific methods.

DAV: What I find odd is the popularity of ``degrees''. Most software uses either radians or "binary radians" (brads). Angles in binary radians are in units of 1/256 of a turn -- or some other power of 2, 1/(2^n).
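A tiny sketch of why binary radians are convenient: with a uint8_t angle (256 brads per full turn), range reduction is just unsigned wraparound. The type alias and function names here are my own:

```c
#include <assert.h>
#include <stdint.h>
#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

typedef uint8_t brad8; /* 256 brads = 1 complete turn */

static double brads_to_radians(brad8 a) {
    return (double)a * (2.0 * M_PI / 256.0);
}

static brad8 add_angles(brad8 a, brad8 b) {
    return (brad8)(a + b); /* wraps modulo 256, i.e. modulo one full turn */
}
```

There is no explicit `theta mod 2*pi` step anywhere; the cast back to the 8-bit type does the range reduction for free.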

• 1 complete rotation = 360 degrees

• ``Implementing the Sine function on the PIC'' http://www.dattalo.com/technical/theory/sinewave.html describes ``computing sines from scratch'' (i.e., without a magic table of pre-computed values). I think the chain of events goes:

(1) Pick some smallish number ``b'' (perhaps b=1/256 of a turn = 1 brad, or b=1/120 of a turn = 3 degrees, or b = 1/(64π) of a turn = 1/32 radians.). Use Taylor series or Chebyshev series to compute sd = sin(b) and cd = cos(b) to an adequate number of bits.

(2) Use Trig Identity 1:

• sin(a+b) = sin(a)*cos(b) + cos(a)*sin(b)
• cos(a+b) = cos(a)*cos(b) - sin(a)*sin(b)
to compute an entire series of sin() and cos() values.
```	sin[0] = 0; cos[0] = 1;
sd = sin(b); cd = cos(b); // pre-calculated using Taylor series.
for(i=0;i<N;i++){
    sin[i+1] = sin[i]*cd + cos[i]*sd;
    cos[i+1] = cos[i]*cd - sin[i]*sd;
}
// Now you have a lookup table. If you want to know
// sin(17*b) (perhaps 17/32 radians or 17*3 degrees), then
// look it up at sin[17].
```
This can be used at run-time as-is. Or you could do this off-line to generate tables of numbers for use later.

(3). Using a table of values, use table look up plus 1st order interpolation to look up any random value.

Scott Dattalo claims ``if we have 8 equally spaced samples of the first quadrant of a sine wave, we can use interpolation to find the exact value of the sine wave to within 1%.''

Brad Eckert claims that his code (using a table of 19 equally-spaced samples of the sine wave) "gives better than 16-bit precision". http://www.tinyboot.com/cubic.txt . It uses no divides, and uses a cubic spline to approximate sin(x) from the nearest 4 points in the table.

• A completely different method of calculating sin(x) and cos(x) relies on these identities:
```cos(2a) = 2*( cos a )^2 - 1
or
cos(3a) = 4*( cos a )^3 - 3*( cos a ) = ( cos a )*( 4*( cos a )^2 - 3 ).

sin(3a) = 3*( sin a ) - 4*( sin a )^3 = ( sin a )*( 3 - 4*( sin a )^2 ).

and for a "small enough",
sin(a) =~= a
cos(a) =~= 1 - (a^2)/2
```

(I suppose I should mention De Moivre's formula http://en.wikipedia.org/wiki/De_Moivre%27s_formula here.)

This method doesn't require any quadrant-detection code. It just repeatedly divides the angle by 3 until it is "small enough", uses the "small angle" approximation, then expands back to the final result.

It's pretty simple to write recursively ... but even for tiny PIC processors that can't handle recursive code, it looks like it would work. (I suspect that, for the same accuracy, it is slower, but requires less ROM than table-lookup code).

```// Untested code:
#include <math.h>         // for fabs()
#define epsilon (1.0e-4)  // "small enough" threshold

// note that this is non-recursive, doesn't require any tables, and doesn't use divide except for divide-by-2.
double cos( double x ){
    int depth = 0;
    while( epsilon < fabs(x) ){ // while( x+2 != 2 ) ?
        depth++;
        x = x/2;
    }
    // now find F(x) = cos(x) - 1 -- this gives more accuracy than finding cos(x) directly
    x = -(x*x)/2; // use cos(x) =~= 1 - (x^2)/2 approximation.
    while( 0 < depth ){
        depth--;
        x = 2*x*(x+2); // F(2a) = 2*F(a)*(F(a)+2), from cos(2a) = 2*( cos a )^2 - 1
    }
    // return cos(x) = F(x) + 1
    return( x+1 );
}

// note that this is non-recursive, doesn't require any tables, and doesn't use divide except for divide-by-3.
double sin( double x ){
    int depth = 0;
    while( epsilon < fabs(x) ){ // while( (3 - 4*(x*x)) != 3 ) ?
        depth++;
        x = x/3; // (perhaps x = x * 0.010101010101010... ?)
    }
    // x itself is now the answer: sin(x) =~= x for small x.
    while( 0 < depth ){
        depth--;
        x = x*( 3 - 4*(x*x) ); // use sin(3a) = 3*( sin a ) - 4*( sin a )^3.
    }
    return( x );
}
```

(I think I first saw this divide-by-3 idea in HAKMEM ITEM 14 -- that code is shorter and recursive. )

see #interpolation for more on interpolation.

• ``Series Approximations for Functions in Forth'' by David L. Jaffe http://guide.stanford.edu/People/jaffe/SVFIG99/trig.html

In embedded environments it is sometimes necessary to calculate a trig function. The options for doing this are:

• 1. use software floating point
• 2. use hardware floating point (requires extra hardware)
• 3. use a table (fastest; lowest precision)
• 4. use series approximation (potentially highest precision; lowest ROM requirements; slowest; )

(Uses decimal fixed-point (i.e., the result of calculating sin is the integer 10,000 * sin(x)), rather than binary fixed-point or floating-point ... heavily oriented towards degrees rather than radians ... (i.e., angles are in integer multiples of 1/100 of a degree; 36000 is a complete circle ...))

algorithm summary: break the circle into 8 octants; use an 8-way `case` statement that calls the special-case routines ``(sin)'' or ``(arctan)'', then fixes up the result for each octant.

pi = 355/113 ; approximation

arctan accepts 2 integers n and d; internally it uses the fraction x = n/d = tangent(theta). Uses the trig identity arctan(x) = π/2 - arctan(1/x) to swap n and d to make sure n <= d, or in other words, to make sure -1 < x < 1. Calls ``(arctan)'' with n_new = min( abs(n), abs(d) ); d_new = max( abs(n), abs(d) ), then fixes up the 8 cases (8 octants): is n positive ? is d positive ? is abs(n) < abs(d) ?

special case routine for arctan: (arctan) accepts 2 integers n and d, assumes that n <= d and that both are positive. returns angle arctan(n/d) in tenths-of-degree.

arctan(x) = sum(n=0, inf, (-1)^n * x^(2*n+1) / (2*n+1) ) = x - x^3/3 + x^5/5 - x^7/7 + x^9/9 - ... \ (-1 < x < 1)

scale = internal scale factor \ to allow integer arithmetic

\ 1st term = scale*(n/d)
\ 2nd term = (1st term)*(1/3)*(n/d)*(n/d)*(-1)
\ 3rd term = (2nd term)*(3/5)*(n/d)*(n/d)*(-1)
\ nth term = (n-1 term)*((2n-1)/(2n+1))*(n/d)*(n/d)*(-1)

general sin: given a 16-bit signed integer ( -32768 to +32767 ) representing 100ths of a degree, the angle range is about -328 degrees to +328 degrees, so when we do ``/mod'' we get 8 quadrants (each quadrant repeated twice). Once we've reduced the value to -90 < x < +90 (???), we call ``(sin)''. No more manipulation is needed after that.

special case (sin) for (-90 degrees < x < 90 degrees)

pi = 355/113 ; approximation

degree scale factor = (10000/180)*pi = (500/9)*(355/113) = 174.533

sin(x) = sum( n=0, inf, (-1)^n * (x^(2*n+1))/ (2*n+1)! ) = x - x^3/6 + x^5/120 - ... \ (-inf < x < +inf)

\ 1st term = x*(10000/180)*(355/113) = x*(500/9)*(355/113)
\ 2nd term = (1st term)*(1/6)*(x^2)*(-1)
\ 3rd term = (2nd term)*(1/20)*(x^2)*(-1)
\ nth term = (n-1 term)*(x/(2n+1))*(x/(2n))*(-1)

cos(x) = sum( n=0, inf, (-1)^n * (x^(2*n))/ (2*n)! ) = 1 - x^2/2 + x^4/24 - ... \ (-inf < x < +inf)
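Both series can be summed with the running-term recurrences above; here's a sketch for sin(x) in radians (the term count of 10 is an arbitrary choice, plenty for |x| < 2):

```c
#include <assert.h>
#include <math.h>

/* Each term of the sine series is the previous term
   times -x^2 / ((2n)*(2n+1)): no factorials ever computed. */
static double sin_series(double x) {
    double term = x; /* 1st term */
    double sum = x;
    for (int n = 1; n <= 10; n++) {
        term *= -(x * x) / (double)((2 * n) * (2 * n + 1));
        sum += term;
    }
    return sum;
}
```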

• (apparently from Isaac Newton 1676):
arcsin(x) = x + (x^3/3)*(1/2) + (x^5/5)*(1*3)/(2*4) + ... \ (-1 < x < 1)
• tan(x) = x + x^3/3 + x^5*2/15 + x^7*17/315 + x^9*62/2835 + ... \ (-π/2 < x < π/2 )
but a simpler formula (non-series) is given below:
• _Numerical Recipes in C: the art of scientific computing_ book http://www.nr.com/ free download: the C and Fortran versions of the book. or you can buy a printed version of the C, Fortran, or C++ version. Unfortunately, the source code cannot be redistributed ... but the editors are nice enough to point to freely redistributable numerical libraries Netlib http://www.netlib.org/ and SLATEC . [FIXME: ... support ?]

includes a chapter:
Evaluation of continued fractions http://www.ulib.org/webRoot/Books/Numerical_Recipes/bookcpdf/c5-2.pdf gives this formula:

tan(x) = x / ( 1 - x^2 / ( 3 - x^2 / ( 5 - x^2 / ( 7 - x^2 / 9 .... ) ) ) )

which DAV mangles into

t2 = x^2
tan(x) = x / ( 1 + t2 / ( t2 / ( 5 + t2 / ( t2 / ( 9 .... ) - 7 ) ) - 3 ) )
= 9 \ t2 - 7 \ t2 + 5 \ t2 - 3 \ t2 + 1 \ x ; (estimate tangent of t)

or, in RPN,
t2 9 / 7 - t2 swap / 5 + t2 swap / 3 - t2 swap / 1 + t swap / ; (estimate tangent of t)

or
t2 9 / 7 + t2 swap / 5 - t2 swap / 3 + t2 swap / 1 - t swap / neg ; (estimate tangent of t)
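In C, the truncated continued fraction is easiest to evaluate from the innermost denominator outward. A sketch of the same idea (mine, not from Numerical Recipes; the depth parameter and name are made up):

```c
/* tan(x) =~= x / (1 - x^2/(3 - x^2/(5 - ... - x^2/(2*depth-1)))) */
double tan_cf(double x, int depth)
{
    double t2 = x * x;
    double d  = 2.0 * depth - 1.0;      /* innermost denominator */
    int k;
    for (k = depth - 1; k >= 1; k--)
        d = (2.0 * k - 1.0) - t2 / d;   /* wrap one more level */
    return x / d;
}
```

depth = 5 reproduces the ``... - x^2/9'' truncation shown above; a handful more levels gets full double precision for |x| well inside (-pi/2, +pi/2).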

and claims that ``Continued fractions frequently converge more rapidly than power series expansions, and in a much larger domain in the complex plane ...''. and gives hints on how far you have to go to get any desired accuracy. Also gives a pretty piece of C code that, given the coefficients of a polynomial P(x), and x, simultaneously evaluates the polynomial P(x) and its derivative dP(x)/dx. [c_programming.html]

http://mathworld.wolfram.com/Tangent.html gives that same continued fraction formula for tangent, and also gives this equivalent continued fraction (can be derived from the above by multiplying above and below each fraction bar by 1/x):

```	tan(x) = 1 / ( (1/x) - 1 / ( (3/x) - 1 / ( (5/x) - 1 / ( (7/x) - ... ) ) ) )
= the continued fraction [0, 1/x, 3/x, 5/x, 7/x, ...]
```

which DAV mangles into

z = 1/x

...

y = 1 / ( z - 1 / ( 3*z - 1 / ( 5*z - 1 / ( 7*z - ... ) ) ) ) ; estimate y = tan(x)
= 9*z \ 1 - 7*z \ 1 + 5*z \ 1 - 3*z \ 1 + z \ 1

or, in RPN,

9 z * INV NEG 7 z * + INV NEG 5 z * + INV NEG 3 z * + INV NEG z + INV

or

9 z * INV 7 z * - INV 5 z * + INV 3 z * - INV z + INV ;

• cosine: http://mathworld.wolfram.com/Cosine.html mentions Hardy's formula (G. H. Hardy 1959):

For 0 <= x <= 1 (so the angle π*x/2 runs from 0 to π/2),

cos( π*x / 2) =~= 1 - x^2 / ( x + (1-x)*sqrt( (2-x)/3 ) )

with a worst-case error of about 0.003 in that range.

#### arctan

ways to compute arctan, a trig function

There are a few more methods over at #trig .

• Arctan(x) = x - x^3/3 + x^5/5 - x^7/7 + ...
• [FIXME: how does that arctan formula in http://home.earthlink.net/~mrob/pub/math/numbers-5.html work ?]
• algorithms to calculate atan() http://www.math.niu.edu/~rusin/known-math/99/arctan
• arctan() on a 8 bit Microchip PIC

```From: Ron < cyberron_1998 at hotmail.com >
Newsgroups: comp.arch.embedded.piclist
Subject: Arctan math function
Date: 20 Feb 2003 14:03:14 -0800

Hello erveryone,

I would if someone have any how to implement a function to compute the
arctan of variable x who is a 16-bit integer ?? I want to make this
function to fit on my PIC16F877A from Microchip.  I can probably do
this with Taylor series, Tchebytchev or with CORDIC but I want to
which is best,fast and more accurate than the other.  I have found
this code but it's for 8-bit only:

"""
;----------------------------------------------------------
;
;arctan (as adapted from the similar arcsin function)
;
;  The purpose of this routine is to take the arctan of an
;8-bit number that ranges from 0 < x < 255/256. In other
;words, the input, x, is an unsigned fraction whose implicit
;divisor is 256.
;  The output is in a conveniently awkward format of binary
;to pi/4 for the normal arctan function. Specifically, this
;algorithm computes:
;
; arctan(x) = real_arctan(x/256) * 256 / (pi/4)
;  for 0 <= x <= 255
;
;  where, real_arctan returns the real arctan of its argument
;
;  The algorithm is a table look-up algorithm plus first order
;linear interpolation. The psuedo code is:
;
;unsigned char arctan(unsigned char x)
;{
;  unsigned char i;
;
;  i = x >> 4;
;  return(arctan[i] + ((arctan[i+1] - arctan[i]) * (x & 0xf))/16);
;}
;
;
arctan
SWAPF   x,W
ANDLW   0xf
MOVWF   temp                    ;Temporarily store the index
CALL    arc_tan_table           ;Get a2=atan( (x>>4) + 1)
MOVWF   result                  ;Store temporarily in result
DECF    temp,W                  ;Get the saved index
CALL    arc_tan_table           ;Get a1=atan( (x>>4) )
SUBWF   result,W                ;W=a2-a1, always positive
SUBWF   result,F                ;a1 = a1 - (a1-W) = W
CLRF    temp                    ;Clear the product
CLRC
BTFSC   x,0
RRF     temp,F
CLRC
BTFSC   x,1
RRF     temp,F
CLRC
BTFSC   x,2
RRF     temp,F
CLRC
BTFSC   x,3
RRF     temp,W
RETURN
arc_tan_table
RETLW   0
RETLW   20     ;atan(1/16) = 3.576deg * 256/45
RETLW   41
RETLW   60
RETLW   80
RETLW   99
RETLW   117
RETLW   134
RETLW   151
RETLW   167
RETLW   182
RETLW   196
RETLW   210
RETLW   222
RETLW   234
RETLW   245
RETLW   0       ;atan(32/32) = 45deg * 256/45

The other part of the problem is implementing the division

FRAC_DIV:
;-------------------
;Fractional division
;
; Given x,y this routine finds:
;  a = 256 * y / x
;

movlw  8    ;number of bits in the result
movwf  cnt
clrf   a    ; the result
movf   x,w

L1:

clrc
rlf    y,f   ;if msb of y is set we know x<y
rlf    a,f   ;and that the lsb of 'a' should be set
subwf  y,f   ;But we still need to subtract the
;divisor from the dividend just in
;case y is less than 256.
skpnc        ;If y>x, but y<256
bsf   a,0   ; we still need to set a:0

btfss  a,0   ;If y<x then we shouldn't have

decfsz cnt,f
goto  L1

return

It's easy enough to combine these two routines to obtain a 4-quadrant
arctan(y/x) routine. However, you do need to keep in mind that the
arctangent routine posted above is only valid over 1/8th of the unit
circle. To obtain the other 7/8th's you'll need to apply the
appropriate trig identities.
"""

```
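The same lookup-plus-interpolation idea in plain C (my reconstruction of the pseudocode in the posted routine, with the table extended by one entry so the interpolation works at the top of the range; in the 8-bit original the final 256 wraps to 0):

```c
/* arctan(x) = real_arctan(x/256) * 256 / (pi/4), for 0 <= x <= 255 */
static const int atan_tab[17] = {
      0,  20,  41,  60,  80,  99, 117, 134,
    151, 167, 182, 196, 210, 222, 234, 245, 256
};

int arctan8(int x)
{
    int i = x >> 4;                       /* table index: top 4 bits */
    return atan_tab[i]
         + ((atan_tab[i + 1] - atan_tab[i]) * (x & 0xf)) / 16;
}
```

Output 256 corresponds to 45 degrees, matching the posted table's scale.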

## integer math libraries

• Motorola application note AN1219: M68HC08 Integer Math Routines,
• [???] HC08EXSW HC08 Software Example: Library containing software examples in assembly for 68HC08 http://e-www.motorola.com/collateral/HC08EXSW.zip [FIXME: does this really have math libraries, or something completely different ? move to Motorola ?]

## correlation

[FIXME: I had a brief description of the difference between simple correlation, normalized correlation, and the dot product ... Put it online here. Or, if the Mathworld description is better, just point there. ]

• ``Fast Normalized Cross-Correlation'' paper by J. P. Lewis http://www.idiom.com/~zilla/Work/nvisionInterface/nip.html

``Although it is well known that cross correlation can be efficiently implemented in the transform domain, the normalized form of cross correlation preferred for feature matching applications does not have a simple frequency domain expression.

... The algorithm described in this paper was developed for the movie Forrest Gump (1994) ''

then describes one technique for efficiently calculating the normalized cross correlation. (also briefly describes ``Phase Correlation'')

## continued fractions

[FIXME: gather stuff I have scattered around here on this topic ...]

• (news and discussion) http://c2.com/cgi/wiki?ContinuedFractions
• http://numbers.computation.free.fr/Constants/constants.html tries to explain continued fractions ...

For any simple continued fraction of a number x,
x = [a0; a1, a2, a3, ... ak, ...]

(paraphrased by DAV) If you only want to compute 1 convergent, pk/qk, approximating a number, simply start at the kth item in the continued fraction and use the definition of continued fraction expansion to compute from right to left. If you want to compute many convergents, or you want to keep calculating convergents until the approximation is ``good enough'', you can compute from left to right:

p(-1) = 1, p0 = a0
q(-1) = 0, q0 = 1

pk = ak*p(k-1) + p(k-2)
qk = ak*q(k-1) + q(k-2)

lim(k --> +inf)( pk/qk ) = x

[FIXME: is this right ?]
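That recurrence is the standard left-to-right convergent computation. A small C check (mine; names and the loop bound are arbitrary), run on sqrt(2) = [1; 2, 2, 2, ...]:

```c
/* p(k)/q(k) convergents of sqrt(2), computed left to right:
   p(k) = a(k)*p(k-1) + p(k-2),  q(k) = a(k)*q(k-1) + q(k-2) */
long cf_sqrt2(int k, long *q_out)
{
    long p_prev = 1, q_prev = 0;   /* p(-1), q(-1) */
    long p = 1, q = 1;             /* p(0) = a0 = 1, q(0) = 1 */
    int i;
    for (i = 1; i <= k; i++) {     /* every a(i) = 2 for sqrt(2) */
        long pn = 2 * p + p_prev;
        long qn = 2 * q + q_prev;
        p_prev = p; q_prev = q;
        p = pn; q = qn;
    }
    *q_out = q;
    return p;
}
```

The convergents come out 1/1, 3/2, 7/5, 17/12, 41/29, ...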

http://www.bloodworth.org | mirror http://members.tripod.com/careybloodworth/contfrac.htm gives a much better explanation of evaluating continued fractions from left to right. [FIXME: ... could I help Carey Bloodworth complete this page ? ]

• HAKMEM has a section on continued fraction arithmetic and (in ITEM 97 (Schroeppel)) mentioned that ``sqrt(7) is hairy''.

## HAKMEM

HAKMEM

lots of fun mathematical tips and tricks ... mixed in with obsolete information. [FIXME: link to "processor agnostic"] [FIXME: fix links from other pages to #hakmem]

## cubic curves (Bezier, etc.)

```

[1d design ... or Visual:graphics_algorithms]
Bezier splines (cubic curves)
t ranges from 0 to +1.
x(t) = a3*t^3 + a2*t^2 + a1*t + a0
y(t) = b3*t^3 + b2*t^2 + b1*t + b0
u ranges from -1 to +1 ("symmetric cubic spline").
x(u) = c3*u^3 + c2*u^2 + c1*u + c0
y(u) = d3*u^3 + d2*u^2 + d1*u + d0
(so dx(u)/du = x'(u) = 3*c3*u^2 + 2*c2*u + c1).

The curve's initial direction from point x0,y0
starts heading directly towards the first "handle" x1,y1 ... and final direction as it hits the last point
x3,y3 is directly along the line from the second "handle" x2,y2 to x3,y3.
In equations,
we want endpoints x0, x3:
x(-1) = x0
x(+1) = x3
and we want the curve from x0 to initially head directly to x1 (in 2d), and the final directly as it hits x3 to come
from x2 (see above):
x'(-1) = ( x1 - x0 )*3/2 = K*(x7-x0)*3/2
x'(+1) = ( x3 - x2 )*3/2 = K*(x3-x8)*3/2
It seems I could *arbitrarily* choose some constant K ...
or in other words, I can *arbitrarily* pick guide points along the line
x7 = x0 + (x1-x0)/K.
(where x7 is my "new" guide point, x1 is PostScript / Lancaster's "traditional" guide point).
(which implies
x1 = K*x7 + (1 - K)*x0
x2 = K*x8 + (1 - K)*x3
).
Some possible ways to pick K:
* pick K so that the midpoint of the curve (t=1/2, u=0) is centered on the line between the new guide points. (I think
this makes recursive bezier drawing a bit faster).
* pick K so that the slope at the midpoint (c1) is exactly the same as the slope between x7 and x8. (I think this makes
recursive bezier drawing a bit faster).
* pick K so that when guide points coincide, I get a parabola.
* pick K to make math "simple" ...
* pick K "small enough" so the curve lives inside the (convex hull) bounding box of the 4 points. (If K is too large,
x7 is so close to x0 that the curve pokes outside the x0,x7,x8,x3 bounding box).
* ... ?

Choosing K:

....
If we then add the constraint that
force x(u=0) = c0 to equal (x8 + x7)/2,
then we find
c0 = ( x3 + 3*x2 + 3*x1 + x0 )/8
= ( x3 + 3*(K*x8 + (1 - K)*x3) + 3*(K*x7 + (1 - K)*x0) + x0 )/8
= ( (4-3*K)*x3 + 3*K*x8 + 3*K*x7 + (4 - 3*K)*x0 )/8
...
so we force (4-3*K) = 0 by setting
K = 4/3
so now
c0 = ( x8 + x7 ) / 2.
This curve is not contained in the (convex hull) bounding box of x0,x7,x8,x3. But maybe that's OK.

Choosing K:

If choose a different constraint,
the constraint that the *slope* at the midpoint
depends only on the guide points x7 and x8 (i.e., the curve, at the midpoint,
is exactly tangent to the line between the guide points ...
the curve midpoint touches that line, but often *not*
exactly centered between x7 and x8), then
dx/du at (u=0) = c1 = ( x3 +   x2 -   x1 - x0 )*3/8 ; slope at midpoint
....
K = 2
x7 = x0 + (x1-x0)/K.
This seems to imply that if those x7 and x8 guide points *touch*,
i.e., if the traditional guide points make a perfect rectangle,
x2   x1

x0   x3
, then there will be "no" slope at the midpoint,
i.e., that's how to make a singularity (cusp) at the midpoint.

Choosing K:
I think
that the *closest* the bezier curve *ever* gets to the line
between x1 and x2
is 1/4 the distance between x0 and x1,
or 1/4 the distance between x2 and x3, whichever is smaller,
assuming that x0 and x3 are both on the same side of the x1-x2 line.
So I could choose
K=4/3
x7 = x0 + (x1-x0)/K.
and the bezier curve would still always fall
in the x0,x7,x8,x3 bounding box. I think.

What are the max and min values of x(u) ?
Since it is a polynomial, it is *either*
* at the endpoints of the curve, *or*
* where x'(u) = 3*c3*u^2 + 2*c2*u + c1 = 0,
The second case occurs
where (if 0 == c3) u = -c1/(2*c2), or
where (if 0 != c3) u = ( -2*c2 +- ( (2*c2)^2 - 4*3*c3*c1 )^1/2 ) / (6*c3),
and -1 < u < +1.

t equation space to graph space:
x0 = a0
x1 = a0 + a1/3
midpoint = a3/8 + a2/4 + a1/2 + a0
x2 = a0 + a1*2/3 + a2/3
x3 = a0 + a1 + a2 + a3.

u equation space to graph space:
x0 = -c3 + c2 -   c1   + c0
x1 =  c3 - c2/3 - c1/3 + c0
midpoint = c0
x2 = ...
x3 =  c3 + c2 +   c1   + c0

graph space to t equation space:
a0 = x0  ; starting point
a1 = 3*( x1 - x0 ) ; initial slope. clearly this must be some K*( x1 - x0 ) to get the proper initial trajectory, but
why K=3 ?
a2 = (x2 - 2*x1 + x0)*3 ; initial curvature
a3 = x3 - 3*x2 + 3*x1 - x0.

graph space to u equation space:
c0 = ( x3 + 3*x2 + 3*x1 + x0 )/8 ; midpoint  = (x3+x0)/2 - a2
c1 = ( x3 +   x2 -   x1 - x0 )*3/8 ; curvature at midpoint = (x3-x0)/2 - a3
c2 = ( x3 -   x2 -   x1 + x0 )*3/8
c3 = ( x3 - 3*x2 + 3*x1 - x0 )/8

u equation space to t equation space:
t = (u+1)/2
a3 =   8*c3
a2 = -12*c3 + 4*c2
a1 =   6*c3 - 4*c2 + 2*c1
a0 =    -c3 +   c2 -   c1 + c0

t equation space to u equation space:
u = 2*t - 1
c3 = a3/8
c2 = a3*3/8 + a2*1/4
c1 = a3*3/8 + a2*1/2 + a1*1/2 ; slope at midpoint
c0 = a3*1/8 + a2*1/4 + a1*1/2 + a0 ; location of midpoint

In terms of the control points,
x(t) = (x3 - 3*x2 + 3*x1 - x0)*t^3 + (3*x2 - 6*x1 + 3*x0)*t^2 + (3*x1 - 3*x0)*t + x0

By inspection, we get a perfect "vertical" parabola by making
x = a1*t + a0
y = b2*t^2 + b1*t + b0
i.e., we force all the high-order terms to zero.
In the case of y, we force the cubic term (b3) to zero: 0 = y3 - 3*y2 + 3*y1 - y0, or in other words,
the length
y3-y0 is exactly three times the length of y2-y1.
To force the parabolic term (a2) to zero: 0 = x2 - 2*x1 + x0,
or in other words,
the length x2-x1 exactly equals the length x1-x0.
To force the parabolic term (c2) to zero: 0 = x3 -   x2 -   x1 + x0,
the length x1-x0 exactly equals the length x3-x2.
In the case of x, we set them both to zero, or in other words,
x3, x2, x1, x0 are precisely evenly spaced apart.
Clearly every point of this "bezier" curve exactly touches the parabola -- but
is the reverse true: if we force the t=0, t=1/3, t=2/3, t=1 points to go through
randomly selected points on some parabola
(even if they are not precisely evenly spaced apart in the x direction), will
that bezier curve still be parabolic ?
I'm pretty sure that if we take 4 randomly selected points on some straight line
(not necessarily horizontal or vertical), the bezier curve *will* stay on that straight line.

Another form of the same equation:
x(t) = t*(x3) + (1-t)*(x0) + t*(1-t)*(...) = t*x3 + (1-t)*( t*(...) + x0 )
x(t) = t*(x3) + (1-t)*(x0) + t*(1-t)*( a3*t + U4 ) = t*x3 + (1-t)*( t*( a3*t + U4 ) + x0 )
This has exactly the same number of multiplies ... might even be faster
on architectures where t*a + (1-t)*b ("twist") is faster than 2 independent multiplies.

x(t) = t*(x3) + (1-t)*(x0) + t*(1-t)*(p(t)) = t*x3 + (1-t)*( t*(p(t)) + x0 )
forces the curve to hit the endpoints, while p(t) gives the shape of the curve between the endpoints.
various expressions for p(t):
k3*t + k2 ...
k3*(t-1/2) + k2 : ~height of midpoint (from the x0-x3 line) and slope at midpoint
k3*t + (1-t)*k2 : ~slopes at t=0 and t=1.

At the endpoints of the curve,
x(t=0) = x(u=-1) = x0
x(t=1) = x(u=+1) = x3.
dx/dt at (t=0) = a1 = ( x1 - x0 )*3
dx/dt at (t=1) =      ( x3 - x2 )*3
dx/du at (u=-1) = 3*c3-2*c2+c1 = ( x1 - x0 )*3/2
dx/du at (u=+1) = 3*c3+2*c2+c1 = ( x3 - x2 )*3/2

At the "midpoint" of the curve, where t=1/2, u=0,
x(t=1/2) = x(u=0) = c0 = ( x3 + 3*x2 + 3*x1 + x0)/8.
dx/dt at (t=1/2) = ( x3 + x2 - x1 - x0 )*3/4  (slope at midpoint)
dx/du at (u=0) = c1 = ( x3 +   x2 -   x1 - x0 )*3/8  (slope at midpoint)

(At every point in the curve,
the slope in term of u is only half the slope in terms of t, because it must climb the
entire curve in only half the time (1 unit) with the t formulation,
compared to the (2 units) u formulation).

We get a "cusp" when dx/dt = 0 *and* dy/dt = 0.
Most splines do not have a cusp.
The cusp may be located anywhere along the spline, not just at the midpoint.

The 1/3 points:
x(t=1/3) = x(u=-1/3) = ( x3 + 6*x2 + 12*x1 + 8*x0 )/27
dx/dt at (t=1/3) = ( x3 + 3*x2 + 0*x1 - 4*x0 )/3 = (x3-x0)/3 + (x2-x0).
x(t=2/3) = x(u=+1/3) = ( 8*x3 + 12*x2 + 6*x1 + x0 )/27
dx/dt at (t=2/3) = ( 4*x3 + 0*x2 - 3*x1 - x0 )/3 = (x3-x0)/3 + (x3-x1).

Are the quarter-points interesting ?

to fully define a cubic curve (just the x(u) of the bezier curve) requires 4 numbers:
- a0, a1, a2, a3 of the formula (one endpoint and the velocity, acceleration, and jerk).
- endpoints x0, x3, and slope at endpoints (x1-x0), (x3-x2).
- go through 4 given points x0, x7, x8, x3.
- endpoints x0, x3, midpoint xm, and ... something else.

Is it possible to adjust K so that x4, x5 uniquely guide the curve,
*and* have the midpoint of the curve exactly hit (x4+x5)/2 ???

formula for length of the curve (or at least upper and lower bounds) ???

What "interesting" points lie on or near the Bezier curve ?

How to approximate circle/ellipse with bezier curves ? What is worst-case error ?
circles:
x^2 + y^2 = r^2
y = ( r^2 - x^2)^(1/2)
or
x = sin(A), y=cos(A).

How to approximate parabola with bezier curve ? (there is zero error ...
focus ... line ...)

How to approximate hyperbola with bezier curve ?

How to approximate exponential / logarithm curves with bezier ?

To approximate *any* curve given only the points ... given the points and slopes ...

How to recursively draw bezier ???

To force bezier *through* a bunch of points,
collect in groups of 4:
0th curve goes through x0, x1, x2, x3
1st curve goes through x3, x4, x5, x6
2nd curve goes through x6, x7, x8, x9
... etc.
(This method gives the odd effect of a cusp at x0, x3, x6, x9, etc. but smooth curves through the other points ...
is there a better way ?)
(yes -- see tinyboot article above)

To force a bezier *through* 4 points:
x(t=0) = x(u=-1) = x6
x(t=1/3) = x(u=-1/3) = x7
x(t=2/3) = x(u=+1/3) = x8
x(t=1) = x(u=+1) = x9
In terms of t:
a0 = x6
a1 = (-11*x6 + 18*x7 - 9*x8 + 2*x9 )/2
a2 = (  2*x6 -  5*x7 + 4*x8 -   x9 )*9/2
a3 = (   -x6 +  3*x7 - 3*x8 +   x9 )*9/2
In terms of u:
c3 = ( -x6 +  3*x7 -  3*x8 + x9 )*9/16
c2 = ( +x6 -    x7 -    x8 + x9 )*9/16
c1 = ( +x6 - 27*x7 + 27*x8 - x9 )/16
c0 = ( -x6 +  9*x7 +  9*x8 - x9 )/16

To approximate circle ... can't be done exactly.
1. break up curve into pieces.
2. approximate each piece with bezier, matching endpoints and slopes.
For any small piece of the circle (arc),
we can set the endpoints of the bezier to the endpoints of the arc,
and the guide points along a line tangent to the circle...
by symmetry, it seems best for both tangent lines to have the same length -- but how long ?
Some ways to pick a length:
* adjust so midpoint of bezier actually does intersect/touch/tangent the midpoint of the arc
* adjust so *curvature* at endpoints is correct (this is more-or-less the "smooth" option of the tinyboot cubic approximation to sin)
* adjust so *curvature* at midpoint is correct
* adjust so -1,-1/3,+1/3,+1 points of curve all touch the circle (is it possible to do this *and*
keep tangent at endpoints ?
Alas, no -- this is the "more accurate, less smooth" option of the tinyboot cubic approximation to sin)
....
adjust so midpoint of bezier actually does intersect/touch/tangent the midpoint of the arc:
to approximate upper-right quarter-circle with center of circle at 0,0,
use points
p0 = r,0
p1 = r, r*(4/3)*( 2^(1/2) - 1 ) =~= r, r*(0.55228)
p2 = r*(4/3)*( 2^(1/2) - 1 ), r
p3 = 0,r

to approximate bottom quarter-circle with center of circle at 0,0,
use points
p0 = -r / 2^(1/2), -r / 2^(1/2)
p1 =  , -r*( 4 - 1/2^(1/2) )/3 =~= , -r*(1.0976)
...

recursive spline drawing:
// draw spline ending at points p0 and p3, using guide points p1 and p2.
drawspline( p0, p1, p2, p3){
// this works for p0 in 1D, 2D, 3D, etc.

Are points "close enough together" ? then draw pixels / lines / parabolas.
(Checking just p0 and p3 isn't enough -- sometimes p0 and p3 are on the *same* pixel,
yet the guide points stretch the spline out into a big loop.)
... and return.

Optional: Are p0, p1, p2, p3 close enough to a straight line ? Then draw a straight line and return.

else:

Calculate new midpoint pm:
pm = p(t=1/2) = p(u=0) = ( p3 + 3*p2 + 3*p1 + p0 )/8.
/*  with different K, we could just do (p2 + p1)/2. */
Calculate offset of new guide points from new midpoint:
d = ( p3 + p2 - p1 - p0 )/8    /*  x'(t=1/2) = ( x3 + x2 - x1 - x0 )*3/4  */
calculate new guide point for right curve:
p_right = pm + d               /*  x'(t=0) = ( x1 - x0 )*3, so x1 = x0 + x'(t=0)/3   */
Calculate new guide point for left curve:
p_left = pm - d
recursion:
/* to limit recursion depth, do the *smallest* one first. */
drawspline( p0, (p0+p1)/2, p_left, pm);
drawspline( pm, p_right, (p2+p3)/2, p3);
}

(How does this compare to the
deCasteljau Algorithm
?
Cox de Boor recursive algorithm
?)

----
```
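The recursive drawing pseudocode above can be sketched in C. This is my own 1-D version using the ordinary control points x0,x1,x2,x3 and the standard de Casteljau split at t=1/2 (equivalent to the pm/d formulas above); the flatness test and 0.05 threshold are arbitrary choices:

```c
#include <math.h>

/* emit points along a 1-D cubic bezier by recursive subdivision */
void draw_bezier(double x0, double x1, double x2, double x3,
                 double *out, int *n, int maxn)
{
    /* "close enough": the control polygon is short, so emit a point */
    if (*n >= maxn ||
        fabs(x1 - x0) + fabs(x2 - x1) + fabs(x3 - x2) < 0.05) {
        out[(*n)++] = x0;
        return;
    }
    /* de Casteljau split at t = 1/2 */
    double x01 = (x0 + x1) / 2, x12 = (x1 + x2) / 2, x23 = (x2 + x3) / 2;
    double a = (x01 + x12) / 2, b = (x12 + x23) / 2;
    double m = (a + b) / 2;   /* = (x0 + 3*x1 + 3*x2 + x3)/8, the midpoint */
    draw_bezier(x0, x01, a, m, out, n, maxn);
    draw_bezier(m, b, x23, x3, out, n, maxn);
}
```

For 2-D, subdivide the x and y control points together, using one combined flatness test, so both coordinates split at the same t values.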

Part I: Splines http://www.cg.inf.ethz.ch/~pauly/CGC/PartI.pdf instead defines cubic curves this way:

... x(t) = b0*(1-t)^3 + 3*b1*t*(1-t)^2 + 3*b2*t^2*(1-t) + b3*t^3 Coefficients b0,...,bn are called Bezier points or control points. Set of control points defines the so-called control polygon.

Properties of Bezier-Curves:

* Affine invariance: Affine transform of all points on the curve can be computed by the affine transform of its control points.

* Convex hull property: The curve lies in the convex hull of its control polygon. ...

Keeps mentioning a "Java Applet". Where is it ?

```----

"Explorations in
Geometric Modeling:
Parametric Curves and Surfaces"
by Kenny Hoff
COMP236 Final Project
Spring 1996
http://www.cs.unc.edu/~hoff/projects/comp236/curves/
...
includes
"Derivation of Incremental Forward-Difference Algorithm for Cubic Bezier Curves using the Taylor Series Expansion for
Polynomial Approximation",
"Derivation of 3D Rational Curves", and
"C++ Source Code"

(# Bezier curves

* Direct evaluation using Bernstein polynomials (with coefficient lookup tables)
* Recursive subdivision
* Forward-differencing (using matrix simplification)
)
(Cox de Boor recursive algorithm)

----

"De Boor's algorithm is a generalization of de Casteljau's algorithm.
It provides a fast and numerically stable way for
finding a point on a B-spline curve given a u in the domain."
http://www.cs.mtu.edu/~shene/COURSES/cs3621/NOTES/spline/B-spline/de-Boor.html

"B-spline curves share many important properties with Bézier curves, because the former is a generalization of the
later."
http://www.cs.mtu.edu/~shene/COURSES/cs3621/NOTES/spline/bspline-curve-prop.html
http://www.cs.mtu.edu/~shene/COURSES/cs3621/NOTES/spline/

```

## unsorted

• http://abbeyclock.com/antiqueclockrepair.html shows a bunch of curves related to clock gears:
• Gearing: The Cycloidal Curve
• Gearing: The Epicycloidal Curve
• Gearing: The Involute Curve
• Topics in Mathematics http://archives.math.utk.edu/topics/ nice search tool ... flagged by age level ...
• University of Toronto Mathematics Network: Question Corner and Discussion Area http://www.math.toronto.edu/mathnet/questionCorner/qc.html ``a place for high-school students and others to ask questions about mathematical topics and receive accurate and informative answers.''
• Angle Between Vertices of a Tetrahedron http://www.math.toronto.edu/mathnet/questionCorner/tetangle.html [FIXME: reread ...]
• Journal of Integer Sequences http://www.math.uwaterloo.ca/JIS/ [FIXME: read; email link to __ about binary patterns for small processors]
• Applications of the Geometric Mean \ http://www.math.toronto.edu/mathnet/questionCorner/geomean.html has a pretty simple explaination of the geometric mean and how it differs from (the geometric mean is always less than or equal to the arithmetic mean) the arithmetic mean.
• [logarithms] Natural Logs in the Real World http://www.math.toronto.edu/mathnet/questionCorner/logsinlife.html
• [FIXME: http://www.math.toronto.edu/mathnet/questionCorner/sqrootcalc.html Yet another way to calculate square roots. also arbitrary roots: x = a^(m/n). Put a summary on massmind.org, if that algorithm isn't already listed. Is this simply Newtonian approximation ?

How to Draw a Circle on a Computer http://www.math.toronto.edu/mathnet/questionCorner/drawcircle.html ]

• Why is e^(i*pi) = -1? http://www.math.toronto.edu/mathnet/questionCorner/epii.html explains the more-general question ``why is e^(i*x) = cos(x) + i*sin(x) the "right" thing to define what e raised to an imaginary power means ?'' (includes the infinite series for e^(x), sin(x), cos(x) )
• Mathematical Association of America http://www.maa.org/ sponsors the William Lowell Putnam Competition Mathematical Association of America and The Elizabeth Lowell Putnam Prize was established in 1992 to be "awarded periodically to a woman whose performance on the Competition has been deemed particularly meritorious". [FIXME:]
• ``Anyone Have Interesting Graphing Calculator Problems?'' -- David Lederman, "David Lederman's Math Club" http://www.math.toronto.edu/mathnet/questionCorner/graphcalc.html
• Classic Fallacies: All People in Canada are the Same Age http://www.math.toronto.edu/mathnet/falseProofs/fallacies.html
• Solving the Quintic http://www.wolfram-research.com/posters/quintic/

"the formula for finding the roots of any quadratic, cubic or quartic polynomial." http://forum.swarthmore.edu/dr.math/problems/rootsform.html

• What are the general solutions to cubic and quartic polynomial equations? http://forum.swarthmore.edu/dr.math/faq/faq.cubic.equations.html
• http://mathforum.org/
• ``A career goal for me would be to see discovery programs used by mathematicians as much as computer algebra packages. There is a long way to go. ... we would like to try HR in plane geometry'' -- Simon Colton, author of HR: Automatic Theory Formation in Pure Mathematics http://www.dai.ed.ac.uk/homes/simonco/research/hr/ [FIXME: to_program ?] [FIXME: check out the graph drawing tools here, under ``links'' -- anything useful for psd.html ? ]
• ``Adaptive Noise Removal from Biomedical Signals using Warped Polynomials'' Dr. Ir. W. Philips http://citeseer.nj.nec.com/69760.html ??? [FIXME: read]
• "Lies my calculator told me" http://www.math-atlas.org/99/calc_errors | mirror http://www.math.niu.edu/~rusin/known-math/99/calc_errors
• 2002-12-04:DAV: Today I was looking for values of sin( 2 * pi / n )^2 that are rational for integer values of n. (Since sin(x)^2 = ( 1 - cos( 2*x ) ) / 2, these are also the values of cos( 4 * pi / n ) that are rational for integer values of n).

Here are all the rational results I found so far -- plus a few of the irrational ones.
I suspect these are the only rational values of sin( 2 * pi / n )^2 for any rational value of n > 1.

```  n  (sin( 2 * pi / n ))^2
1  0
2  0
3  3/4
4  1
5  (2 + phi)/4
6  3/4
7  228/373 + ...
8  1/2
9  69/167 + ...
10  (3 - phi)/4
12  1/4
15  22/133 + ...
16  (2 - sqrt( 2 )) / 4
20  (2 - phi)/4
24  (2 - sqrt( 3 )) / 4
```

where
phi = (1+sqrt(5))/2 = the golden ratio = [ 1; 1 1 1 1 1 1 1 ....]

-- David Cary

• Floating Point Emulation Package for ANS Forth by Brad Eckert http://www.tinyboot.com/float.txt ``basic floating point number support and functions such as +, -, *, / and SQRT.'' notes that ``If you need transcendental functions, they can often be done in integer arithmetic'' Uses 3 CELLS per floating point number (2 mantissa, 1 exponent). That's 6 bytes in common 16 bit FORTHs ... ... vaguely reminiscent of the 3 byte floating point format DAV has seen re-implemented by many assembly language programmers.
• Alternative Number Formats by Robert P. Munafo http://home.earthlink.net/~mrob/pub/math/altnum.html including ``Logarithmic Number System (LNS)'' [FIXME: email about factorial decimal expansion ... continued fraction notation ... and the funky system I mention in vlsi.html]
• approximation software:
• QPI by Andre Schoorl runs on the HP48. ``QPI approximates any floating point numbers by a rational number, square root, multiple of PI, exponential or a logarithm depending on which approximation seems best.'' (i.e., it searches for expressions of the form

a/b
-a/b
PI*a/b
-PI*a/b
(a/b)*sqrt(a/b)
-(a/b)*sqrt(a/b)
ln(a/b)
exp(a/b) // DAV thinks this form can represent all complex numbers
-exp(a/b)

where a and b are positive integers. (works with complex numbers ?)
• You can type in *any* (floating-point) number, and ``RIES -- Find Algebraic Equations, Given Their Solution Copyright (C) 2000 Robert P. Munafo'' http://home.earthlink.net/~mrob/pub/ries/ will find many, many equations that give an approximation to that number (including all the rational approximations that you can find with continuing fractions, as well as many others). Complete source code (in C) available for download under GPL. [internally ... takes derivatives of equations in a FORTH-like syntax ...] [FIXME:]
• "The Joy of Math, or Fermat's Revenge" essay by Charles Krauthammer Time Magazine April 18, 1988 http://holyjoe.org/Krauthammer.htm

the romance that is mathematics. ... We think of a math whiz as someone who can do in his head what a calculator can do on silicon. But that is not math. That is accounting. Real math is not crunching numbers but contemplating them and the mystery of their connections. ...

... Mathematicians do not like to admit that, because when they do, ... they are suspected of just playing around, which of course they are.

... We have our means and ends reversed. What could be more important than divining the Absolute? ...

What higher calling can there be than searching for useless and beautiful truths? ...

... mathematics ... operates very close to religion, which is why numerology is important to so many faiths and why a sense of the transcendent is so keenly developed in many mathematicians, even the most irreligious. ... Says Erdös: “You don't have to believe in God, but you should believe in the Book.”

• Mathematical Constants and computation http://numbers.computation.free.fr/Constants/constants.html

This site is dedicated to mathematical, historical and algorithmic aspects of some classical mathematical constants (like π, e, the Euler constant γ, ζ(3), ...). A few results on prime numbers are added. Easy and fast programs are also included and can be downloaded.

...

e = [2;1,2,1,1,4,1,1,6,1,1,8,1,1,10,1,1,12,1,1,14,...] (continued fraction expansion)
√e = [1;1,1,1,5,1,1,9,1,1,13,1,1,17,1,1,21,1,1,25,...],
(e-1)/2 = [0;1,6,10,14,18,22,26,30,34,38,42,46,50,...]

exp(z) = 1 + z/1! + z^2/2! + z^3/3! + ... z^n/n! + ... \ converges for -inf < z < +inf

for small values of x,
exp(x) =~= (12 + 6*x + x^2)/(12 - 6*x + x^2) =~= (2+x)/(2-x)
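A quick numeric check (mine) of that rational approximation:

```c
/* exp(x) =~= (12 + 6*x + x^2) / (12 - 6*x + x^2), good for small |x| */
double exp_pade(double x)
{
    return (12.0 + 6.0 * x + x * x) / (12.0 - 6.0 * x + x * x);
}
```

The error grows roughly like x^5, so combine it with range reduction (exp(x) = exp(x/2)^2) for larger arguments.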

• square root

Newton's method: xnew = x - f(x)/f'(x).

for finding x=a^(1/2):
f(x) = x^2 - a.
f'(x) = 2*x.
xnew = x - (x^2 - a)/(2*x) = x - x/2 + a/(2*x) = ( x + a/x ) / 2.

this is taking the average of x and a/x, and since either
x < a^(1/2) < a/x, or
a/x < a^(1/2) < x,
the average is closer ...
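As C (a sketch; the iteration count is arbitrary, and quadratic convergence means it settles long before the loop ends):

```c
/* sqrt(a) by Newton's method: repeatedly average x and a/x */
double sqrt_newton(double a)
{
    double x = a > 1.0 ? a : 1.0;   /* any positive starting guess;
                                       for huge a, start nearer sqrt(a) */
    int i;
    for (i = 0; i < 40; i++)
        x = (x + a / x) / 2.0;
    return x;
}
```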

http://numbers.computation.free.fr/Constants/Sqrt2/sqrt2.html and http://numbers.computation.free.fr/Constants/Algorithms/inverse.html mentions an even simpler method (avoids division):

for finding x = a^(-1/2):
f(x) = x^(-2) - a
f'(x) = (-2)*x^(-3)
xnew = x - (x^(-2) - a) / ( (-2)*x^(-3) ) =
= x + (x^(-2) - a) * (x^3/2) =
= x + (x - a*x^3)/2 =
xnew = x + x*( 1 - a*x^2 )/2.

To find y = b^(1/2) using this method, either

• plug in a = 1/b, then calculate x(final) = a^(-1/2) = b^(1/2), or
• plug in a = b, calculate x(final) = b^(-1/2), then multiply by b to get y = b*x = b^1*b^(-1/2) = b^(1/2).
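A division-free C sketch of the second option (mine; the crude halving of the starting guess is my own addition, to keep the guess inside the iteration's basin of convergence, x < sqrt(3/b)):

```c
/* sqrt(b) computed as b * (1/sqrt(b)), using only *, +, - in the loop */
double sqrt_via_rsqrt(double b)
{
    double x = 1.0;                 /* guess for 1/sqrt(b) */
    while (b * x * x > 2.0)         /* crude range reduction: */
        x *= 0.5;                   /*   ensures x < sqrt(3/b) */
    int i;
    for (i = 0; i < 60; i++)
        x = x + x * (1.0 - b * x * x) / 2.0;   /* the iteration above */
    return b * x;                   /* b * b^(-1/2) = b^(1/2) */
}
```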
• division (inverse)

Newton's method: xnew = x - f(x)/f'(x).

for finding x = 1/a,
f(x) = 1/(a*x) - 1
f'(x) = (1/a)*(-1)*x^(-2) = -1/(a*x^2)
xnew = x - ( 1 / (a*x ) - 1 ) / ( -1/(a*x^2) ) =
= x + ( 1/(a*x) - 1 ) * ( a * x^2 ) =
= x + x - a*x^2 =
= 2*x - a*x^2 =
xnew = x + x*( 1 - a*x )

Normally Newton's method requires a division, but in this special case it only requires multiplication and addition. (no divides).

I've heard some early mainframe computers didn't have division built into the hardware, and used Newton's method to approximate it in software.

DAV: I remember deriving this in Calculus class ...
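A C sketch (mine: the power-of-two range reduction and the 2.88 - 2*a first guess are arbitrary choices, not from any reference; requires a > 0):

```c
/* 1/a using only multiply and add in the Newton loop */
double reciprocal(double a)
{
    int shifts = 0;
    while (a >= 1.0) { a *= 0.5; shifts++; }   /* scale a into [0.5, 1) */
    while (a < 0.5)  { a *= 2.0; shifts--; }
    double x = 2.88 - 2.0 * a;                 /* rough linear first guess */
    int i;
    for (i = 0; i < 6; i++)
        x = x + x * (1.0 - a * x);             /* xnew = 2*x - a*x^2 */
    while (shifts > 0) { x *= 0.5; shifts--; } /* undo the scaling */
    while (shifts < 0) { x *= 2.0; shifts++; }
    return x;
}
```

Each pass squares the relative error, so six passes take the ~12% worst-case error of the first guess down below double precision.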

• 1.414213... = sqrt(2) = [1; 2 2 2 2 2 2 2...]
1.4167..... = 17/12

1.7320508... = sqrt(3) = [1; 1 2 1 2 1 2 1 2...]
1.732057.... = 362/209
1.73214..... = 97/56
1.7333...... = 26/15

• http://numbers.computation.free.fr/Constants/Sqrt2/sqrt2.html says that one way to approximate the square root of a rational number A/B is

p(n+1) = (A+B)*p(n) + 2*A*q(n)
q(n+1) = 2*B*p(n) + (A+B)*q(n)

then
lim(n --> inf) p(n)/q(n) = ( A / B ) ^ (1/2).

The closer A/B is to 1, the faster the speed of convergence.

DAV: range reduction: x^(1/2) = 2*(x/4)^(1/2) ...

(1+x)^(1/2) = 1 + x/2 - x^2/(2*4) + x^3*(1*3)/(2*4*6) - ...

• Arbitrary precision computation http://numbers.computation.free.fr/Constants/Algorithms/representation.html DAV: has a good description of the tradeoffs involved in developing a bignums package. Naturally this is also a nice brief description on how to handle ``reasonably sized'' numbers on machines that have inadequate word size (DAV has worked with many machines that can only handle bytes). [FIXME: gather all bignums packages I know about in one place]. For really huge integers (more than a few hundred digits), it discusses ways to speed up multiplication and points to ``FFT based multiplication of large numbers'' http://numbers.computation.free.fr/Constants/Algorithms/fft.html ``Multiplication of large numbers of n digits can be done in time O(n*log(n)) (instead of O(n^2) with the classic algorithm) thanks to the Fast Fourier Transform (FFT).'' briefly name-drops NTT ( Number Theoretic Transform ) and Hartley transform ...
• http://numbers.computation.free.fr/Constants/Miscellaneous/classification.html explains the terminology: integers, a subset of rational numbers, a subset of algebraic numbers, a subset of real numbers. (non-integral rationals are called fractions.) (non-rational reals are called irrational). (non-algebraic reals are called transcendental). (Georg Cantor proved that nearly all real numbers are transcendental). mentions ``It is not known whether π+e, γ (Euler's constant), B (Brun's constant), ζ(3), ... are transcendental. We don't even know if γ, ζ(5), or G (Catalan's constant) are irrational.''
• Source code for low level binary functions collected by Jörg Arndt http://www.jjj.de/bitwizardry/bitwizardrypage.html includes
```	bitcount.h (population count: counting the bits in a word)
bit2pow.h (functions like floor(log2()))
bitrotate.h (rotating words)
revbin.h (reversing the order of bits in a word)
graycode.h (graycode and parity)
tinyfactors.h (determining whether d divides x for d,x<BITS_PER_LONG)
greencode.h (gray code towards high bits)
branchless.h (code that avoids branches)
```
and some code dealing with shift register sequences (LFSR). [FIXME: ... binary cleverness ... video_game.html ?] [FIXME: many of these in branchless are almost identical to tricks on PIC ...]
• [FIXME: move to BIGNUMS packages] http://www.jjj.de/hfloat/hfloatpage.html points to many bignums packages, and also the source code to ``The shortest programs for pi and e you have ever seen''
• ```From: Rob Warnock (rpw3@rigden.wpd.sgi.com)
Subject: Re: Sound compression
Newsgroups: comp.compression
Date: 1991-05-15 14:01:22 PST

In article <19524@crdgw1.crd.ge.com> davidsen@crdos1.UUCP
(bill davidsen) writes:
+---------------
|   Sorry I don't have source any more, but the way to quickly find the
| high one bit (on a 2's complement machine) is:
|
|   h1bit = n & (-n)
+---------------

Sorry, this is quite incorrect! This will find the *lowest*-order one bit set!

To find the *highest*-order bit set (a.k.a. a "priority encoder") you need a
find-highest-one function, or a loop, or a "comb". The find-highest function
exists in some CPU instruction sets (e.g., the PDP-10's "JFFO", the Am29000's
"CLZ") but not in others (e.g., the VAX's "FFS" finds the *lowest* bit).

Many software implementations use some version of a binary tree search:

bit_num_of_highest_one(unsigned int x)
{
	int answer, i;
	unsigned int t;

	if (x == 0)
		return -1;
	for (answer = 0, i = NBITSPERWORD/2; i; i >>= 1) {
		if ((t = x >> i) != 0)
			answer += i, x = t;
	}
	return answer;
}

Often the loop is unrolled, since the wordlength is usually fixed for a given
machine. A co-worker (Brendan Eich) first showed me the "comb" approach:

bit_num_of_highest_one(unsigned int x)
{
	int answer = 0;

	if (x == 0)
		return -1;
	if (x & 0xffff0000)
		answer += 16, x &= 0xffff0000;
	if (x & 0xff00ff00)
		answer += 8, x &= 0xff00ff00;
	if (x & 0xf0f0f0f0)
		answer += 4, x &= 0xf0f0f0f0;
	if (x & 0xcccccccc)
		answer += 2, x &= 0xcccccccc;
	if (x & 0xaaaaaaaa)
		answer += 1;
	return answer;
}

While this is cute, it's not necessarily any faster than an unrolled version
of the "search" code above, since the 32-bit constants needed for the "comb"
version tend to take two instructions to generate on current RISCs.

By the way, there was a paper in CACM (??? sorry, forgot the ref) a number
of years ago which proved that there is no closed functional form without
conditionals which will compute "the highest-order one bit".

-Rob

-----
Rob Warnock, MS-1L/515		rpw3@sgi.com		rpw3@pei.com
Silicon Graphics, Inc.		(415)335-1673		Protocol Engines, Inc.
2011 N. Shoreline Blvd.
Mountain View, CA  94039-7311

...
the "leftmost bit" a.k.a. "binary logarithm" of an integer
...

From: Steve Peltz (peltz@cerl.uiuc.edu)
Subject: Re: Sound compression
Newsgroups: comp.compression
Date: 1991-05-22 16:24:19 PST

In article <105539@sgi.sgi.com> rpw3@sgi.com (Rob Warnock) writes:
>Let's re-open the question to others: What *other* fast, practical methods
>are there to find the "leftmost bit" a.k.a. "binary logarithm" of an integer,
>besides:

Here's what is used on our system (Cyber 60 bit word one's complement; I'm
translating to C). It requires a fast population count and shifting.

#define blur(i,o) d = d | ((d = d | ((o) >> i)) >> (2 * i))
#define hibit(x)  bitcnt(blur(16, blur(4, blur(1, d = x))))

This also obviously depends on order of operation; let's see if I can unroll it
a bit.

/* blur(1, d=x) */
d = x;
d |= d >> 1;
d |= d >> 2;
/* blur(4, ...) */
d |= d >> 4;
d |= d >> 8;
/* blur(16, ...) */
d |= d >> 16;
d |= d >> 32;
return bitcnt(d);

d is a signed 60-bit value, so right shifting does shift in the sign bit, but
that doesn't matter for the algorithm.

I don't know if there is another origin for this algorithm, but it is listed
in our library as having been written by Mike Huybenz, Feb '77.

For a 32 bit value, the last shift isn't necessary.
--
Steve Peltz
Internet: peltz@cerl.uiuc.edu	PLATO/NovaNET: peltz/s/cerl

From: Doug Gwyn (gwyn@smoke.brl.mil)
Subject: Re: Sound compression
Newsgroups: comp.compression
Date: 1991-05-16 11:57:46 PST

In article <104045@sgi.sgi.com> rpw3@sgi.com (Rob Warnock) writes:
>By the way, there was a paper in CACM (??? sorry, forgot the ref) a number
>of years ago which proved that there is no closed functional form without
>conditionals which will compute "the highest-order one bit".

And of course that "theorem" is wrong:  using the operand to index
into a table can immediately produce the appropriate answer.

For large word sizes or small machines, that will often be impractical,
but it can serve as a basis for an implementation using smaller tables
(which does require use of conditionals):

int ffo( register unsigned long x )
{
static int	bitnum[256] =
{      -1, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7
};

if ( (x & 0xFF000000) != 0 )
return 24 + bitnum[x >> 24];
else if ( (x & 0x00FF0000) != 0 )
return 16 + bitnum[x >> 16];
else if ( (x & 0x0000FF00) != 0 )
return	8 + bitnum[x >>  8];
else
return	    bitnum[x      ];
}

From: William E Davidsen (davidsen@rdsunx.crd.GE.COM)
Subject: Re: Sound compression
Newsgroups: comp.compression
Date: 1991-05-17 10:21:56 PST

In article <104045@sgi.sgi.com>, rpw3@rigden.wpd.sgi.com (Rob Warnock) writes:
|> In article <19524@crdgw1.crd.ge.com> davidsen@crdos1.UUCP
|> (bill davidsen) writes:
|> +---------------
|> |   Sorry I don't have source any more, but the way to quickly find the
|> | high one bit (on a 2's complement machine) is:
|> |
|> |   h1bit = n & (-n)
|> +---------------
|>
|> Sorry, this is quite incorrect! This will find the *lowest*-order one
bit set!

You obviously read what I said, not what I meant ;-)

Yes, my note was woefully incorrect. It turned out that for the data I
had it was quicker to find the low bit and discard repeatedly than to
shift looking for the high. I should have said more or nothing.

There was a discussion of this in comp.lang.c some few years ago, and
for random data a binary search is faster. Sound data isn't random, at
least not in the programming sense.
```
• ```	One way to implement max(a,b) from (Tony Kennedy) Oct 1993:
max(x,limit) == max(x-limit,0) + limit
and if we define
"z > 0" as a word of all ones if z
is greater than zero and a word of all zeros if it is not, then
max(z,0) = z bitand (z > 0)

In C terms this leads to the following function
(for 32-bit long ints and floats with the sign bit at the left end)

float max(float x, float y) {
union fl {float f; long l;} z;
z.f = x - y;
z.l &= ~z.l >> 31;
return z.f + y;}

(Caveat: I have not directly checked the version given here).
```

[FIXME: convert to PIC assembly code -- or has someone already done this ? If so, point to that code.]

• http://forth.sourceforge.net/algorithm/ has some low-level algorithms: First Set Bit; power-of-2? CRC32; Fast Bit Counting; Square Root [SQRT]; ( data_compression.html#ecc )
• http://www.mathcats.com/ ???
• Wise old turtle (from Logo country) http://library.thinkquest.org/18222/?tqskip=1&tqskip1=1&tqtime=0203 ???
• Another clever snippet from http://www.pobox.com/~qed/optimize.html [FIXME: binary bit fiddling]

before

```	#define abs(x) (((x)>0)?(x):-(x))
```

after

```	static long abs(long x) {
long y;
y = x>>31;    /* Not portable */
return (x^y)-y;
}
...
static int int_abs(int a) {
return a - ((a+a) & (a>>31));
}
```
```	static inline ulong upos_abs_diff(ulong a, ulong b)
// Return abs(a-b)
// Both a and b must not have the most significant bit set
{
long d1 = b - a;
long d2 = (d1 & (d1>>(BITS_PER_LONG-1)))<<1;
return  d1 - d2; // == (b - d) - (a + d);
}
```
• [bignum] hfloat (for 'huge float') is a library package (coming as C++ code) for calculations with floating point numbers of extreme precision. It is optimised for computations with 1000 to several million digits. hfloat utilises FFT multiplication http://discuss.foresight.org/critmail/sci_nano/5914.html Maintained-by: Joerg Arndt http://www.jjj.de/
```	sin  0° = sqrt(0)/2 = 0
sin 30° = sqrt(1)/2 = 1/2
sin 45° = sqrt(2)/2
sin 60° = sqrt(3)/2
sin 90° = sqrt(4)/2 = 1
```
http://www.everything2.com/index.pl?node_id=916568
• Doubling Time and Population Growth http://geography.about.com/library/weekly/aa051800a.htm "the ...doubling time... is determined by dividing the growth rate into 70. The number 70 comes from the natural log of 2, which is .70." [FIXME: check math. Huh ? What about the "rule of 72" ?]
• is there a simple geometric proof that obviously shows that (3/4) * (5/8) == (5/4) * (3/8) ?
• Huxley quotes the admirably yclept philosopher C.D.Broad: "Each person is at each moment capable of remembering all that has ever happened to him and of perceiving everything that is happening everywhere in the universe. The function of the brain and nervous system is to protect us from being overwhelmed and confused by this mass of largely useless and irrelevant knowledge, by shutting out most of what we should otherwise perceive or remember at any moment, and leaving only that very small and special selection which is likely to be practically useful." http://www.lewiscarroll.org/bander.html#f8 [filter]
• DTACK GROUNDED, The Journal of Simple 68000 Systems Issue # 25 November 1983 Copyright Digital Acoustics, Inc http://www.amigau.com/68K/dg/dg25.htm Issue # 24 http://www.amigau.com/68K/dg/dg24.htm ... this magazine talks a lot about trig function implementations in 68000 assembly language; high-level languages (HLLs) vs. assembly languages; ... firmly of the opinion that assembly language is best for many things; and really likes the 68000 because it's easier to program in assembly than the 6502 or Intel 80286 ... doesn't like virtual memory or Unix ... [FIXME: where can I find other issues ?] [FIXME: copy to computer_architecture#compiler ?]
• "A root finder's roots" by Jack Crenshaw 2003-05-08 http://www.embedded.com/story/OEG20030508S0030 discusses "The World's Best Root Finder (TWBRF)" which uses "inverse parabolic interpolation" ( TWBRF interpolates an arbitrary function by a *horizontal* parabola; Mueller's method uses the usual vertical parabola. This article explains why.). [FIXME: read] includes source code.
• ...

If an angle is an integral multiple of one second, then its sine and cosine are obtainable by simple multiplication from the formulas

sin(a + Δa) = sin a cos Δa + cos a sin Δa
sin(a - Δa) = sin a cos Δa - cos a sin Δa

and

cos(a + Δa) = cos a cos Δa - sin a sin Δa
cos(a - Δa) = cos a cos Δa + sin a sin Δa

[DAV: these are exactly true for any 2 angles a and Δa]

If the angle also contains fractions of a second, one must first interpolate with respect to the fractional part ... and only afterwards combine ... in accordance with the relevant trigonometric addition theorem.

...

The tangent of any angle can be obtained easily by division, by use of one or the other of the two formulas

tan( β + Δβ ) = ( cos 2Δβ - cos 2β ) / ( sin 2β - sin 2Δβ ) = ( sin 2β + sin 2Δβ ) / ( cos 2β + cos 2Δβ )

and

tan( β - Δβ ) = ( cos 2Δβ - cos 2β ) / ( sin 2β + sin 2Δβ ) = ( sin 2β - sin 2Δβ ) / ( cos 2β + cos 2Δβ )

[DAV: Are these exactly true for any 2 angles β and Δβ ? Yes: each quotient reduces via the sum-to-product identities, e.g. cos 2Δβ - cos 2β = 2 sin(β+Δβ) sin(β-Δβ) and sin 2β - sin 2Δβ = 2 cos(β+Δβ) sin(β-Δβ), so the first pair both equal tan(β+Δβ) and the second pair both equal tan(β-Δβ).]

.

For computing the logarithm of a trigonometric function to sixteen or more decimal places, the most suitable procedure is as follows. First, the value of the trigonometric function is computed as above. Next, a value as close to it as possible and with at least six places is determined and its twenty-one place logarithm read off from the tables of Steinhauser or Callet ... Finally, this last is used in conjunction with the series

log(n+Δn) = log n + 2 M { Δn/( 2n + Δn ) + (1/3)*( Δn / (2n + Δn) )^3 + .... }

to obtain the desired logarithm. The convergence of the series, in this application, is extremely rapid. M denotes the modulus of the Briggsian (common) logarithm ...

M = log_10( e ) =~= 0.43429_44819_03251_827651_13

1/M = log_e( 10 ) =~= 2.30258_50929_94045_684017_99

...

For the converse case -- to determine the angle, a trigonometric function of the angle being given -- ... Whatever the given function may be, the first step is to find as good an approximation to the desired angle as possible with the aid of seven-place... tables of logarithms. If y is this approximation and if y + Δy is the desired angle, then Δy is to be determined from the relations

Δy = ( sin(y+Δy) - sin y ) / ( arci*( cos y - (1/2)*Δy*arci*(sin y) ) )

Δy = ( cos y - cos(y+Δy) ) / ( arci*( sin y + (1/2)*Δy*arci*(cos y) ) )

Δy = ( tan(y+Δy) - tan y ) / ( arci*( 1 + tan(y)*tan(y+Δy) ) )

Δy = ( cot y - cot(y+Δy) ) / ( arci*( 1 + cot(y)*cot(y+Δy) ) )

[DAV: arci = 0°0'1" * π/180 = π/(180 * 60 * 60) =~= 0.00000_48481_36811_095359_94, which is the conversion factor between radians and *seconds* of angle. ]

[DAV: Δy is in units of *seconds* of a degree of angle.]

[DAV: these are not exact equivalences, but an approximation for when Δy is small]

where the appropriate relation is to be chosen, depending on whether the given number, from which y + Δy is to be determined, was sin(y + Δy), cos(y + Δy), tan(y + Δy), or cot(y + Δy). Since the unknown Δy occurs in the denominators of the right-hand sides of the first two relations, Δy can be determined from these only by successive approximations; the term involving Δy in the denominator is taken to be zero to begin with, and a first approximation Δy_1 is computed under this assumption; the equation is then solved with this value substituted in the denominator; and so forth, until the approximations remain stable.

...

For Table II (Twenty-One Place Tables of Sine and Cosine), all 540 entries were checked not only by means of the formulas

sin a = cos( 30° - a ) - cos( 30° + a )

but also, to 24 places, by means of differences up to and including those of order 9. ... When rounding off to 21 decimal places, I undertook fresh, and completely independent, computations of all those values that end in 499, 500, or 501 in the 24th decimal place, thus insuring the accuracy to within half a unit in the 21st decimal place of all entries in the two tables.

...

-- _Eight-place tables of trigonometric functions for every second of arc_ book by Jean Peters 1939 (William F. Cary bought a 1968 printing in 1972 Sept. 5 for \$27.50)

DAV: since the slope = rise/run = derivative,

then the difference (rise) between 2 consecutive entries in the table should be the derivative times the run.

In other words, given a table of values

x | sin(x) | 1st difference | 2nd difference | 3rd difference
0
Δx
2*Δx
...

(Peters prints the difference in-between the 2 numbers that were subtracted, so the 1st difference is offset ... the 2nd difference is lined up with x and sin(x) ... the 3rd difference is offset, but lined up with the 1st difference).

then the 1st difference should be about

(d/dx)*sin(x)*Δx = cos(x)*Δx

and the 2nd difference should be about

(d/dx)^2*sin(x)*(Δx)^2 = -sin(x)*(Δx)^2

... If the table had been printed so that Delta was easy to read off (say we pick Delta of 100 (decimal) micro-radians, or print the table in hex and pick a Delta of 0.01(hex) radians, so there's 402 (dec) entries to cover the first quadrant (from 0 to π/2). ) then the digits would be *exactly* the same (just the decimal point shifted) in the 2nd difference column as in the original column for both the sin() and cos(); and the 1st difference column of the sin would be the same as the cos() column.

• ...

A Web- and CD-based successor to A&S, the Digital Library of Mathematical Functions, is under development at the National Institute of Standards and Technology, see http://dlmf.nist.gov/

Much of A&S is devoted to tables that we don't need anymore, except to check the results computed by MATLAB and other mathematical software. But the algorithms we use in modern software for special mathematical functions are derived from techniques used in hand computation over 50 years ago. There are too few people today who know how to write robust software for computing these functions.

...

To appreciate the subtleties involved in the computation of special functions, consider a more basic and familiar function, sin(x). ...

The resulting algorithm for approximating sin(x), as well as other trig functions, exp(x) and log(x), is carefully implemented in fdlibm, a Freely Distributable Math Library, developed at Sun Microsystems by K. C. Ng and others, (see www.netlib.org/fdlibm ). We use fdlibm in MATLAB for all the machines we support. This ensures uniform behavior, particularly for exceptional conditions.

...

This approach to computing sin(x)... carries over to other mathematical functions. The general outline is:

1. Transform the argument to one or more intervals where the function can be accurately approximated.
2. Evaluate a polynomial or rational approximation for the transformed argument.
3. If necessary, adjust the result to account for the initial argument transformation.

...

The algorithms that MATLAB uses for gamma functions, Bessel functions, error functions, Airy functions, and the like are based on Fortran codes written 20 or 30 years ago by W. J. Cody at Argonne National Labs and Don Amos of Sandia National Labs.

...

-- "The Tetragamma Function and Numerical Craftsmanship: MATLAB's special mathematical functions rely on skills from another era" article by Cleve Moler http://www.mathworks.com/company/newsletter/clevescorner/winter02_cleve.shtml ... discusses "argument reduction" and why it is necessary ... giving sin(), exp(), and ln() as examples.
• what DAV calls "binary bit fiddling", Don Lancaster calls "insider snippets." [FIXME: don't I have a section for this ?]

... each and every micro family has its insider snippets. Short and sneaky code sequences that do utterly amazing things. In ways previously unthunk of. As another insider snippet, we looked at a PIC generating a high quality sinewave in an astonishing six bytes of code back in HACK85.PDF.

• "General Guide to Implement Logarithmic and Exponential Operations on a Fixed-Point DSP" by Jason Jiang http://focus.ti.com/lit/an/spra619/spra619.pdf [#log; #exp]
• Famous Curves Applet Index ... experiment with ... associated curves. (Your browser must be capable of running Java) http://www-groups.dcs.st-and.ac.uk/~history/Java/
• Archaic Math http://ugcs.caltech.edu/~shulman/pub/writings/archaic/ claims that the common base-10 representation is the worst possible kinds of base-10 representation (S-10 is better because it can represent negative numbers without the "-" notation and makes addition and subtraction easier -- avoiding most carries and subtracts).
• n! =~= sqrt(2*pi)*n^(n+1/2)*exp(-n) (Stirling's formula)
• Number Patterns Fun with Curves & Topology http://ccins.camosun.bc.ca/~jbritton/jbfunpatt.htm "Clock (Modular) Arithmetic Pages" ... Java animations, QuickTime animations ... "NASA-designed classroom activity in which students learn about digital images and how satellites send information and pictures to earth using the binary system." ... Pascal's Triangle ... Conic Sections Animation ... "The Area of a Circle" ... interactive Lissajous Patterns ... Movie clip of the Tacoma Bridge Disaster ... fractals ... points to "The Legend of the Chess Board" http://ccins.camosun.bc.ca/~jbritton/jbchessgrain.htm ... points to "... Symbolic Secrets contains examples of popular symbols and corporate logos, and how they can be constructed. Symbols range from the Star of David and the Yin-Yang symbol, to the triangular construct of the Mitsubishi corporation and the CBS EYE." by Chris Hardaker http://earthmeasure.com/Designs/
• "Platonic Realms" http://mathacademy.com/pr/ math humor, math quotes,
• integral of e^x from -∞ to 0 is 1. In fact, integral of e^x from -∞ to n is e^n.
• spherical geometry
```	      /|
	     / |
	  H /  | Op
	   /   |
	  /____|
	 T   A

```

Given a right triangle (one right angle, another angle T) with 3 sides (hypotenuse H opposite the right angle, opposite side Op opposite angle T, and adjacent side A between T and the right angle). Then we have:

cos T = A / H (plane geometry)

cos T = sin( A/r ) / sin( H/r ) (spherical geometry, where r = radius of earth)

Today 2004-03-20 I derived this from vector calculus ... In particular, if A is the length from the equator to the tropic of cancer ( A/r =~= 24° ), and T is the current latitude (for Tulsa, T =~= 36°), then at the top of a spherical or cylindrical sundial, to completely catch the full range of sun positions at sunset, we need angle H/r from directly East (sunset at equinox) north-ish to sunset at winter solstice, and angle H/r from directly East south-ish to sunset at summer solstice (and symmetrically the same at sunrise on the W side of the sundial).

that angle H/r is

H/r = arcsin( sin( A/r) / cos T ).

In Tulsa, that angle is H/r =~= arcsin( sin( 24° ) / cos( 36° ) ) =~= 30.2°.

What an astonishing coincidence that it's almost exactly π/6 (30°). So I can cut the southern 1/3 out of my cylindrical sundial; the sun never casts a shadow in that direction (in Tulsa). [FIXME: sundial]

• Polynomial Fitting http://library.wolfram.com/webMathematica/Mathematics/Fitting.jsp online: fit data to linear, quadratic, or cubic equation.
• (some pretty curves and their mathematical formulas) http://www.2dcurves.com/sextic/sexticd.html
• "Ben and 2 Divides" http://stereopsis.com/2div.html clever method of doing 2 more-or-less independent divides using 1 divide and 5 multiplies. (Herf found that a Pentium 2 in single precision mode can do the divide and 5 multiplies in 18 cycles, nearly twice as fast as simply doing 2 divides in 34 cycles, with a loss of about 1.5 bits of precision).

```Suppose you want to compute:

a/b    AND    c/d

In the real number system, this is the same as:

(a * d) / (b * d)    AND    (c * b) / (b * d)

In case this isn't getting through, that's:

inv := 1 / (b * d)
inv * a * d    AND    inv * c * b
```
(Herf gives an example of point-by-point dividing 2 arrays, c[] := a[] ./ b[], and mentions that in double-precision it's even better to do 3 at a time)
• "Computer Math 101" by Ben Weiss 2001-08 http://stereopsis.com/computermath101/ has some clever tricks for rapidly calculating :

... Normalizing a vector invariably requires dividing by the square root of the magnitude, which is equivalent to multiplying by the inverse sqrt: y = inverse_sqrt( x ) = 1/(sqrt(x)) = x^(-1/2) . With that in mind, here is some fast code to compute the inverse square root ... function that returns the sine and cosine of a single value at once. ... sincos(theta) ... In a nutshell, the most useful application for the arctangent function is to convert between rectangular and polar coordinates. To be more specific, starting with the location (x, y), we want to find (theta, r) in polar coordinates. ... cycles are being thrown away here. The calculation of the arctangent automatically generates the square root as a by-product, so why not use it? atan2r() ... given (x,y), calculates (theta, r) all at once.

• "Radix Tricks" by Michael Herf December 2001 http://stereopsis.com/radix.html very fast sorting of a list of floating point numbers, by (1) using a bijective mapping to convert 32 bit floating point numbers into 32 bit integers that sort "correctly", and (2) radix sort.
• "strcmp4humans" 2002-03 http://stereopsis.com/strcmp4humans.html a compare routine that lets you sort file names the "right" way, a.jpg A.jpg z.jpg Z.jpg 1.jpg 2.jpg 3.jpg ... 9.jpg 10.jpg 11.jpg
```	        1209 Hz	1336 Hz	1477 Hz	1633 Hz
697 Hz	1	2	3	A
770 Hz	4	5	6	B
852 Hz	7	8	9	C
941 Hz	*	0	#	D
```
-- http://en.wikipedia.org/wiki/DTMF
• The Microchip PIC16F688 has little non-contiguous blocks of RAM.
16 bytes: 0x70 to 0x7F: "shared" in all banks.
80 bytes: 0x20 to 0x6F: bank 0
80 bytes: 0xA0 to 0xEF: bank 1
80 bytes: 0x120 to 0x16F: bank 2
(80 bytes: 0x1A0 to 0x1EF: unimplemented, read as "0" in PIC16F688 ... implemented as bank 3 in PIC16F88 and PIC16F877 and PIC16F876).

In each bank, we have
0x00 ... : SFRs
0x10 ... : SFRs
0x20 ... : RAM
0x30 ... : RAM
0x40 ... : RAM
0x50 ... : RAM
0x60 ... : RAM
0x70 ... 0x7F : shared RAM

The biggest contiguously-addressable block is 96 bytes.

```; offset of byte (0..95) in array is in "offset".
; Set up FSR so we can read and write that byte through INDF.
; array runs from array_start to array_start+95 in a single bank.
; array_start must be at least 0x20 to avoid hitting SFRs.
; maximum contiguous array starts at either 0x20, 0xA0, or 0x120.
array_start EQU 0x120
; set up bank select bit IRP
BANKISEL array_start
; copy offset, into W
movf offset,w
; tweak to keep it in general RAM, avoid pointing to SFRs
addlw low array_start ; low 8 bits only; IRP supplies address bit 8. If offset is too big, might overflow buffer into SFR space -- oops!
movwf FSR
; now ready to read or write the byte at that offset using INDF.
; 4 instructions, 4 cycles ...
```

What if we want an even bigger array?

With a little bit of bit-twiddling, we can split up an array into all 3 banks. The largest "simple to address" RAM array seems to be 3*64 bytes = 192 bytes (on chips that implement "bank 3", such as the PIC16F88, this method gives 4*64 = 256 bytes).

Given a byte index, the quickest way I know of to convert that index into a pointer and read through it is below. (The byte index goes from 0 to 255 on PIC16F88 and PIC16F877 and PIC16F876 ... on PIC16F688, byte indexes 0 to 191 ( 0 to 0xBF ) access real RAM; the rest access "unimplemented" locations and read as zero).

```; offset of byte (0.. 255) (actually 0..191) in array is in "offset".
; Set up FSR so we can read and write that byte through INDF.
; (The array is stored in banks 0,2,1,(3) rather than consecutive 0,1,2,(3)
; to make the code faster/smaller).
; array runs from array_start to array_start+0x3F in every bank.
; array_start must be at least 0x20 to avoid hitting SFRs.
; array start must be at most 0x30 to avoid hitting the shared RAM.
array_start EQU 0x20; any number from 0x20 to 0x30 is OK.
; copy bit 6 of offset to bank select bit IRP
bcf  STATUS, IRP
btfsc offset, 6
bsf  STATUS, IRP
; copy rest of offset, except for bit 6, into W
movlw  0xBF ; ~(1 << 6 )
andwf  offset,w
; (If offset too big, we end up pointing into)
; ( "unimplemented -- reads as zero" in "bank 3". Harmless. )
; tweak to keep it in general RAM, avoid pointing to SFRs
addlw array_start
movwf FSR
; now ready to read or write the byte at that offset using INDF.
; 7 instructions, 7 cycles ...
```

If you want an array bigger than 192 bytes, I suspect you'll want to upgrade to a processor bigger than the 'F688, instead of fiddling around with more complex ways to fill up the 256 bytes of RAM. (For smaller arrays -- up to 128 bytes -- a simpler 2-bank version works:)

```; offset of byte (0..0x7F)(0..127) in array is in "offset".
; Set up FSR so we can read and write that byte through INDF.
; array runs from
; array_start to array_start+0x3F in bank 0,
; array_start+0x80 to array_start+0x80+0x3F in bank 1.
; array_start must be at least 0x20 to avoid hitting SFRs.
; array start must be at most 0x30 to avoid hitting the shared RAM.
array_start EQU 0x20; any number from 0x20 to 0x30 is OK.
; force bank 0 or 1
BANKISEL array_start; bcf  STATUS, IRP
movf offset,w
; if too big to fit in bank 0, fit "second half" into bank 1
btfsc offset,6
addlw 0x40 ; skip over end of bank 0 and SFRs in bank 1
; tweak to keep it in general RAM, avoid pointing to SFRs
addlw array_start
movwf FSR
; now ready to read or write the byte at that offset using INDF.
; 6 instructions, 6 cycles ...

```

2002-07-25:DAV: David Cary started.