updated 2005-04-08
contents:
This is just my quick reference for series approximations and mathematical functions.
Similar information can be found in:
You'll probably find the above references more understandable.
Lots of technical mathematical things here. Bit sequences. DSP. Calculus. Infinite series. Etc.
"Whatever Happened to Numbers?" by Robert W. Lucky _IEEE Spectrum_ March 1994 http://www.argreenhouse.com/papers/rlucky/spectrum/numbers.shtml says that numbers are over-emphasized in school. [FIXME: does this fit better over at bignums.html ?]
"Not everything that can be counted counts, and not everything that counts can be counted."
-- Albert Einstein
See also
function approximation
DAV is most interested in "simple" approximations: very quick methods that get you within an order-of-magnitude.
I'm also interested in "functional software compression", so occasionally I start with a program that calculates things to very high precision, and mangle it until it *still* calculates things to just as high a precision, but it takes much less ROM space (or, in some cases, the source code is much shorter).
[FIXME: cross-link to appropriate places in data_compression.html and machine_vision.html] [FIXME: seach for ``approx'' and ``math'' and gather appropriate stuff here]
"Postcard Dimensions: Each card (each stamped card or postcard or each half of a double stamped card or postcard) claimed at a card rate must be:
local.html#pmail
mentions that
First-Class envelopes weighing 1 ounce or less
require an additional $0.12 nonmachinable surcharge
if any one of the following apply
...
The length divided by height is less than 1.3 or more than 2.5
...
--
http://postcalc.usps.gov/MailpieceDimensions.asp
(That's about height 89 mm to 108 mm; length 127 to 152 mm, right?)
DAV: It is good that the "logical" European standard paper and envelope sizes, based on 2^(1/2) = 1.414... , fall in this "1.3 to 2.5" range of aspect ratios. Hm... which of my favorite numbers can be coerced into that aspect ratio ? Note that the normal TV / computer monitor ratio (4:3 = 1.333) is acceptable, as is movie theatre and HDTV widescreen (16:9 = 1.77).
The only ISO 216 size in the post card range is A6 (105 x 148) (close to the 4x6 index card, 101 x 152). The B7 (88 x 125) is a couple mm too small, B6 (125x176) is too large. The C7 (81x114) is too small, C6 (114x162) is too large. See 43folders:Hipster_Variants#Postcard_PDA.
Some related simple fractions and their inverses:
See
[FIXME: move to massmind ?]
external links:
-- "Hardware Hacker: New shaft encoder designs: Binary chain code secrets" by Don Lancaster, September, 1994 http://63.140.207.28/musev.pdf/hack80.pdf
"A digital code is some meaning that someone has placed on a certain arrangement of ones and zeros. In past columns, we've looked at pseudorandom codes, short portions of which appear as random noise. The Gray codes whose bits are only allowed to change one at a time. Or Huffman codes that compress data. CRC codes which can fix many of their own communication errors. Or those rare Barker codes that perfectly autocorrelate. And, of course, that ASCII coding that lets us store and transmit letters and numbers. Plus many others. I've recently run across an odd class of digital codes called binary chain codes."
Let's look at the special case where the uncompressed file represents distances as signed 32 bit quantities,
Naturally a compression algorithm tuned for 32 bit quantities will get better compression than a byte-oriented compressor. However, with a little pre-compression filtering, I think we can get almost as good compression with a lot less programming effort by using a simple pre-compression filter in front of an off-the-shelf byte-oriented compression algorithm.
Let's say the original file has length N bytes (i.e., N/4 distances, each one a 32-bit quantity),
Let's say that nearly all of the distances can be represented in 1 byte or less. (As a very crude estimate, let's say distances from -255 to +255 are evenly distributed, and larger distances are extremely rare -- so there are roughly N/8 positive values and N/8 negative values).
If we use standard signed binary numbers, then we end up with lots of 0x00 and 0xff bytes (in the most-significant bytes of each distance). With standard Huffman compression, this makes one of those bytes compress to 1 bit, the other one compress to 2 bits, and all other bytes compress to 2 or more bits (roughly, 8 or 9 bits). Then the Huffman-compressed file has length ...
similar to sign+magnitude, but lossless on 2's complement machines.
Here's something similar to sign magnitude (except it puts the sign bit at the least significant bit) that forces all the most-significant bytes to zero for all small values, both positive and negative. Note that this is completely lossless and width preserving -- each possible 32 bit value is mapped to a unique 32 bit code. [FIXME: is there a name for lossless width preserving transforms ?]
// assumes 2's complement machine.
// Should work no matter how wide int is -- 16 bits, 32 bits, 64 bits, whatever.
// (cast to unsigned before shifting, since left-shifting a negative signed int is undefined in C)
unsigned int signed_to_modified_signed_magnitude( signed int x ){
    bool negative = ( x < 0 );
    unsigned int u = ((unsigned int)x) << 1; // delete sign bit, shift in 0.
    if(negative){ u = ~u; };
    return( u );
}
signed int modified_signed_magnitude_to_signed( unsigned int x ){
    bool negative = ( x & 1 );
    x >>= 1; // delete sign bit, shift in zero.
    if(negative){ x = ~x; };
    return( (signed)x );
}
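(This transform is nowadays usually called "zigzag" encoding. A self-contained round-trip sketch -- function names are mine -- showing that small values, positive or negative, map to small codes:)

```c
#include <assert.h>
#include <stdbool.h>

/* map small signed values (positive or negative) to small unsigned codes:
   0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, 2 -> 4, ...
   completely lossless and width preserving. */
unsigned int to_zigzag(signed int x)
{
    bool negative = (x < 0);
    unsigned int u = ((unsigned int)x) << 1; /* delete sign bit, shift in 0 */
    if (negative) { u = ~u; }
    return u;
}

signed int from_zigzag(unsigned int u)
{
    bool negative = (u & 1);
    u >>= 1;                                 /* delete sign bit */
    if (negative) { u = ~u; }
    return (signed int)u;
}
```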
A more well-known width-preserving transform is the binary-to-Gray-code converter. Here is a transform based on it that also forces the most-significant bytes to zero for all small values (positive or negative).
unsigned int code_from_binary( signed int b ){
    return( rotate_left( b2g( b ) ) );
}
signed int binary_from_code( unsigned int c ){
    return( g2b( rotate_right( c ) ) );
}
(The b2g() leaves the sign bit alone, but transforms all the other leading ones to zeros ... then the rotate moves the sign to the end, similar to the above alg. For small negative and also small positive numbers, the result has leading zero bytes.)
/* find gray code from corresponding binary code */
unsigned int b2g( signed int b )
{
    /* well-known algorithm, as implemented by David Cary */
    unsigned int u = (unsigned int)b;
    return( u ^ ( u >> 1 ) );
}
Gray Code Conversion "The well-known algorithm for conversion from Gray to binary is a linear sequence of XORs that makes it seem each bit must be dealt with separately. Fortunately, ... For 32-bit Gray code values produced as described above, the conversion from Gray code back to unsigned binary is:
/* find binary code from corresponding gray code */
unsigned int g2b(unsigned int gray)
{
    /* algorithm from http://www.aggregate.org/MAGIC/#Gray%20Code%20Conversion */
    gray ^= (gray >> 16);
    gray ^= (gray >> 8);
    gray ^= (gray >> 4);
    gray ^= (gray >> 2);
    gray ^= (gray >> 1);
    return(gray);
}
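(A self-contained sketch of the pair, following the cited aggregate.org algorithm, with a round-trip check:)

```c
#include <assert.h>

/* binary -> Gray: the well-known one-liner */
unsigned int bin_to_gray(unsigned int b)
{
    return b ^ (b >> 1);
}

/* Gray -> binary: the O(log n) XOR-shift ladder, for values up to 32 bits */
unsigned int gray_to_bin(unsigned int gray)
{
    gray ^= (gray >> 16);
    gray ^= (gray >> 8);
    gray ^= (gray >> 4);
    gray ^= (gray >> 2);
    gray ^= (gray >> 1);
    return gray;
}
```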
Further transforms: If we don't use Huffman, but instead use a compressor that is efficient at compressing long repeated strings, perhaps "rotate" (blocks of) the file, so the first 1/4 of the file (block) holds the top byte, the next 1/4 of the file (block) holds the second byte ... and the last 1/4 of the file (block) holds the least-significant byte. A really smart byte-oriented compressor might even notice the statistics changing in each section and tune compression differently for each quarter.
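(A minimal sketch of that block "rotation" -- function name is mine, and it assumes 4-byte values stored most-significant byte first:)

```c
#include <stddef.h>

/* "rotate" a block of 32-bit values so all the most-significant bytes
   come first, then all the 2nd bytes, ..., then all the least-significant
   bytes. in[] holds n_values*4 bytes (big-endian per value); out[] is the
   same size. A byte-oriented compressor then sees long runs of similar
   bytes in each quarter. */
void rotate_byte_planes(const unsigned char *in, unsigned char *out,
                        size_t n_values)
{
    for (size_t i = 0; i < n_values; i++)
        for (size_t k = 0; k < 4; k++)
            out[k * n_values + i] = in[4 * i + k];
}
```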
[map compression][data compression][width preserving transform ?]
The Tiger database [FIXME: tiger ] stores latitude and longitude in decimal degrees (6 decimal digits after the decimal point), although the features don't have nearly that accuracy -- DAV's test seems to indicate that the data is rounded to the nearest 10 meters. [FIXME: has this changed ?]
U.S.G.S. topo maps say "at best +-167 feet" which is roughly +- 50 meters. [FIXME: has this changed ?]
... interpolation ?
[FIXME: move to massmind]
[FIXME: install and play with this program.] [FIXME: volunteer ?]
"Mifit is a program used to generate the assembler code needed to approximate a data set using simple functions (like straight lines and quadratics). Currently Mifit generates code only for the Microchip's PIC family of microcontrollers, but the program can be adapted to generate code for other processors. We are looking for volunteers to add support for other chips."
See vlsi.html#arithmetic for more on interpolating sin() and cos().
[FIXME: aren't there better explanations elsewhere on the web ?]
basically from ``Educated Guessing Games'' article by Don Morgan in _Embedded Systems Programming_ 2002-03 talks about linear interpolation, the Lagrange interpolator, and briefly mentions Chebyshev polynomials. Promises future articles with other ``ways to increase the speed and accuracy of interpolation using relatively small tables''
(fixing a few typographic glitches in my paper copy of the magazine)
Many times we know the value of a function at a few selected points, and we want to interpolate.
The polynomial interpolation theorem: if x0, x1, x2, ... xn are distinct, then for any y0, y1, y2, ... yn there exists a unique polynomial F of degree n or less such that
F(x[i]) == y[i], for i = 0,1,2,...,n.
(i.e., the unique polynomial of degree n or less that passes through the n+1 data points).
We break that function down into parts
F(x) = F0(x) + F1(x) + F2(x) + ... + Fn(x)
where each part gives exactly one sample point -- F0(x[0]) = y[0], Fn(x[n]) = y[n], etc. -- but each function is exactly equal to 0 at all other sample points x[j].
First define a polynomial that is zero for all sample points except one. In other words,
fi(x[j]) == 0, for j = 0,1,2,...,n except for i=j.
That polynomial is
fi(x) = Π(j = 0 to n; except for j=i)( x - x[j] );
At the point where x = x[i], that polynomial equals the value
fi(x[i]) = Π(j = 0 to n; except for j=i)( x[i] - x[j] );
We use this value to normalize the polynomial to give us exactly y[i] at x = x[i]:
Fi(x) = fi(x) / fi(x[i]) = Π(j = 0 to n; except for j=i)( ( x - x[j] ) / ( x[i] - x[j] ) );
#define n 10
// WARNING: untested code
float x[n] = { .... }; // sample locations
float y[n] = { .... }; // sample values
float evaluate_lagrange( float t ){
    float sum = 0;
    for( int i = 0; i < n; i++){
        float product = 1.0;
        for( int j = 0; j < n; j++){
            if( i != j){
                product *= (t - x[j]) / (x[i] - x[j]);
            }
        }
        sum += product * y[i];
    }
    return(sum);
}
If we pre-calculate the intermediate products, this can run very fast.
#define n 10
// WARNING: untested code
float x[n] = { .... }; // sample locations
float g[n] = { .... }; // pre-computed intermediate products g[i] = y[i] / fi(x[i])
float evaluate_lagrange( float t ){
    float sum = 0;
    for( int i = 0; i < n; i++){
        float product = 1.0;
        for( int j = 0; j < n; j++){
            if( i != j){
                product *= t - x[j];
            }
        }
        sum += product * g[i];
    }
    return(sum);
}
This gives n*n multiplies per evaluation.
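(A quick numeric check of the formula above, in double precision with hypothetical sample data: since the interpolating polynomial is unique, feeding it samples of y = t^2 must reproduce t^2 everywhere, not just at the sample points.)

```c
#include <assert.h>
#include <math.h>

#define NPTS 4
/* hypothetical sample points: y = t*t sampled at t = 0, 1, 2, 3 */
static const double xs[NPTS] = { 0.0, 1.0, 2.0, 3.0 };
static const double ys[NPTS] = { 0.0, 1.0, 4.0, 9.0 };

double evaluate_lagrange(double t)
{
    double sum = 0.0;
    for (int i = 0; i < NPTS; i++) {
        double product = 1.0;
        for (int j = 0; j < NPTS; j++) {
            if (i != j) {
                product *= (t - xs[j]) / (xs[i] - xs[j]);
            }
        }
        sum += product * ys[i];
    }
    return sum;
}
```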
How to implement floating point multiplies, the sin() function, etc. Also has some information on "fixed point", which runs much faster than "floating point" on *some* hardware.
This paper ... gives a number of simple floating point trig approximations. The code includes test cases to check for maximum error. Free source code for sin(), cos(), tan(), atan(), and a test stub, in low precision ("3.2 digits of accuracy") and high precision ("12.1 digits of accuracy") versions.
DAV: the test only checks for integer degrees. Sometimes DAV also wants high precision for something less than 1 degree.
fixed point logarithm for PIC.
; Input: xhi:xlo unsigned Q0.16 (modified)
; Output: x2hi:x2lo unsigned Q6.10 (implicit minus)
i.e., 0 <= x < 1.
http://www.piclist.com/techref/microchip/math/power/16lr-ng.htm
Reduces range to 1/2 <= x < 1, then uses an interpolated lookup table (17 values, 10 bit numbers).
8-bit logarithm algorithm PIC assembly source code http://www.dattalo.com/technical/software/pic/piclog.html "based on a table look-up plus first order interpolation"
Logarithms by Scott Dattalo Here's a collection of different ways to compute logarithms. http://www.dattalo.com/technical/theory/logs.html
Here's a completely different approach I've been thinking about for approximating ln(x) for large integers: since ln(x)-ln(x-1) is very roughly equal to 1/x for large x (... perhaps closer to 1/(x-1/2) = 2/(2x-1) ...) and ln(2*x) = ln(2) + ln(x), maybe it would be simple to do this:
float ln( unsigned int x ){
    const float ln2 = log(2);
    int powers_of_2 = 0;
    float sum = 0;
    while( 2 <= x ){
        if( x & 1 ){ sum += 1.0 / (float)x ; };
        x = x >> 1;
        powers_of_2++;
    };
    return( ln2 * powers_of_2 + sum );
}
or perhaps
float ln( unsigned int x ){
    const float ln2 = log(2);
    int powers_of_2 = 0;
    float sum = 0;
    while( 2 <= x ){
        if( 1 == (x & 3) ){ sum += 1.0 / (float)x ; x-- ; };
        if( x & 1 ){ sum -= 1.0 / (float)x ; x++; };
        x = x >> 1;
        powers_of_2++;
    };
    return( ln2 * powers_of_2 + sum );
}
Hey, it gives exactly the correct answer for x = 1 and x = 2, and in fact all x = 2^n where n is a positive integer.
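(A double-precision version of the first sketch above, renamed so it doesn't collide with math.h, to measure how rough it actually is: exact at powers of 2, and within roughly 0.1 elsewhere in my spot checks.)

```c
#include <assert.h>
#include <math.h>

/* shift-and-add ln() sketch: each time we drop an odd low bit,
   add roughly ln((2k+1)/(2k)) =~= 1/x, then count the halvings. */
double approx_ln(unsigned int x)
{
    const double ln2 = log(2.0);
    int powers_of_2 = 0;
    double sum = 0.0;
    while (2 <= x) {
        if (x & 1) { sum += 1.0 / (double)x; }
        x >>= 1;
        powers_of_2++;
    }
    return ln2 * powers_of_2 + sum;
}
```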
inverse hyperbolic tangent
arctanh(x) = (1/2)*ln( (1+x)/(1-x) ) = sum( k=0, inf, x^(2*k+1)/(2*k+1) ).
or
t = x^2 / (1-x^2);
arctanh(x) = ( x / (1-x^2) ) * ( 1 - (2/3)*t + (2*4)/(3*5)*t^2 - (2*4*6)/(3*5*7)*t^3 + ... )
The classic Taylor series around x = 1:
ln(x) = (x-1) - (x-1)^2/2 + (x-1)^3/3 - (x-1)^4/4 + ...
It's only convergent for 0 < x <= 2. (quickly convergent (better than 1 bit per term) for -1/2 < x-1 < 1/2, i.e., 1/2 < x < 3/2)
("classic" meaning that people learning calculus -- in particular, learning about Taylor polynomials and the "absolute ratio test to determine the interval of convergence" -- often re-derive this. It's good for that learning experience, but maybe not the best for actually computing ln(x). ). More details: http://www.sosmath.com/calculus/tayser/tayser01/tayser01.html and http://spot.pcc.edu/~ssimonds/200202/m253/taylor/
z equals (x-1)/x;
i.e.,
x equals 1/(1-z)
and then found the Taylor series for
ln(x) equals ln( 1/(1-z) )
around z equals 0. This converges for x >= 1/2 (i.e., -1 <= z < 1).
x ~ epsilon: z ~ -infinity
x ~ +1: z ~ 0
x ~ +infinity: z ~ +1. (??)
ln(x) = f( (x-1) / x )
f(z) = ln( 1/(1-z) ) = z + z^2/2 + z^3/3 + z^4/4 + z^5/5 + ... + z^n/n + ....
of course this is only (slowly) convergent for -1 < z < 1, but that covers
1/2 < x < +inf
This is more quickly convergent (1 bit per term) for -1/2 < z < 1/2, which covers 0.67 < x < 2.0.
DAV thinks the Taylor series (around 1 ?) for
x*ln(x)
is *much* easier to approximate with a polynomial. It doesn't have that pole at 0 anymore --
lim{x-->+0}(x*ln(x)) = 0.
(of course, x < 0 is still undefined/complex).
z = x-1
ln(x) = f(x-1)/x = f(z)/(z+1).
f(z) = (z+1)*ln(z+1) = x*ln(x).
f(z) = (z+1)*ln(z+1) = z + z^2/2 - z^3/6 + z^4/12 - z^5/20 + z^6/30 - z^7/42 + ... + (-1)^n * z^n/(n*(n-1)) + ...
this looks like it's only convergent for -1 < z < +1, which covers 0 < x < 2.
Terms in this series alternate, so the error is less than the last term added. Wait a minute -- since the first derivative of f is f'(z) = ln(z+1) + 1, the term-by-term derivative of this series is identical (up to the constant term) to the "classic" series. So the range of convergence is the same. Maybe it's not as much better as I thought.
DAV:
Perhaps the combo f(z) = (1/(1-z)) * ln( 1/(1-z) ) would be even better ?
I think we want to map the input x=0 to z=-infinity, and map x=+infinity to z=+infinity, in order to take advantage of ``smooth'' polynomial approximation. One function that does this is z = (x-1)^2 / ( x ). Is there a better mapping ?
The output range (-inf at left edge to +inf at right edge, without any poles; smooth) is already good for polynomial approximation ... although the fact that d/dx(ln(x)) goes to 0 as x goes to +inf is somewhat difficult to approximate. ( range-reduction algorithm to handle large x, and x very close to zero ? ) since ln(x) = ln(x*k/k) = ln(x*k) - ln(k) ...
"However, I have no clue on how to calculate a natural logarithm. If anyone can offer any ideas, I'll be glad to write one." Todo: suggest one.
General Math Functions http://www.efg2.com/Lab/Library/Delphi/MathFunctions/General.htm ??? anything useful here ?
CodeWarrior for Palm Pilot Math Tutorial http://godot.urol.uic.edu/pilotfloat.html
The GNU C Library -- Arithmetic Functions ???
Functions for IEEE arithmetic http://www.rocketaware.com/man/man3/ieee.3.htm
exponential, logarithm, power functions http://www.rocketaware.com/man/man3/exp.3.htm
http://www.eskimo.com/~scs/C-faq/q14.4.html refers to David Goldberg, ``What Every Computer Scientist Should Know about Floating-Point Arithmetic''. Isn't that online ? If so, email Steve Summit and ask him to linkify that reference.
http://www.eskimo.com/~scs/C-faq/q14.8.html I suspect the sentence should really say ``If you need pi, you'll have to #define it yourself as #define PI (4*atan(1)) or const double PI = 4*atan(1); .'' see http://www.snippets.org/snippets/dos/PI+H.php3 . [YARMAC ?]
C and C++ tutorials http://www.programmersheaven.com/zone3/index.htm
is this vector and matrix library any good ? 3d Libmatrix http://www.programmersheaven.com/zone3/cat414/index.htm What about the Bezier code ?
several different expression evaluators ? Reverse polish notation calculator http://www.programmersheaven.com/zone3/cat414/index.htm
C_MATH.ZIP: Math functions for C http://www.programmersheaven.com/zone3/cat414/index.htm evaluates ln(x) using a rational polynomial in z, where z = ( (x-1)/(x+1) )^2, ... but it slightly modifies the result from that polynomial before returning it. That page also points to the Cephes Mathematical Library.
FIXFLOA2.ZIP a library of routines to handle fixed point numbers http://www.programmersheaven.com/zone3/cat414/index.htm
e_logf.c -- float version of e_log.c. http://www.openbsd.org/cgi-bin/cvsweb/src/lib/libm/src/e_logf.c http://www.openbsd.org/cgi-bin/cvsweb/src/lib/libm/src/ Conversion to float by Ian Lance Taylor, Cygnus Support, ian@cygnus.com, 1993. Seems terribly complicated ... very bit-twiddly. Special cases for denormalized numbers, other special cases for 4 different ranges of numbers. Calculates y = ln(x) via f = x - 1.0 ; s = f / (2.0 + f) ; z = s^2 ; R = 7th order polynomial in z ; then in the simplest case it returns y = f - s*(f-R). [???]
self-reproducing cellular automata http://www.programmersheaven.com/zone3/cat35/index.htm [???]
GNU SUPEROPTIMIZER http://www.programmersheaven.com/zone3/cat35/index.htm [???]
TC2PROM.ZIP Routines & Help For Writing Romable Code http://www.programmersheaven.com/zone3/cat35/index.htm ???
Embedded Links http://www.programmersheaven.com/zone7/links/link265.htm ???
Embedded System Design Issues (the Rest of the Story) : with source code http://www.programmersheaven.com/zone7/articles/article11.htm [FIXME: read]
Source to a fast compression algorithm (LZRW1KH, C src) http://www.programmersheaven.com/zone3/cat856/2272.htm [FIXME: read]
[astro] Astronomy and numerical software source codes http://www.moshier.net/ has a copy of the Cephes Mathematical Library . This page claims its copy is the most complete distribution package of the function library (but not the most up-to-date one). This page crosslinks back to http://www.netlib.org/cephes/
What does the if( e ) { fe = e; y += -2.12194440e-4f * fe; } in the Cephes logf() function do ? -- David Cary
DAV: What I find odd is the popularity of ``degrees''. Most software uses either radians or "binary radians". Angles in binary radians are angles in units of 1/256 of a turn -- or some other power of 2, 1/(2^n).
(1) Pick some smallish number ``b'' (perhaps b=1/256 of a turn = 1 brad, or b=1/120 of a turn = 3 degrees, or b = 1/(64π) of a turn = 1/32 radians.). Use Taylor series or Chebyshev series to compute sd = sin(b) and cd = cos(b) to an adequate number of bits.
(2) Use Trig Identity 1:
sin[0] = 0; cos[0] = 1.
sd = sin(b); cd = cos(b); // pre-calculated using Taylor series.
for(i=0;i<N;i++){
    sin[i+1] = sin[i]*cd + cos[i]*sd;
    cos[i+1] = cos[i]*cd - sin[i]*sd;
};
// Now you have a lookup table. If you want to know
// sin(17*b) (perhaps 17/32 radians or 17*3 degrees), then
// look it up at sin[17].
This can be used at run-time as-is. Or you could do this off-line to generate tables of numbers for use later.
(3). Using a table of values, use table look up plus 1st order interpolation to look up any random value.
Scott Dattalo claims ``if we have 8 equally spaced samples of the first quadrant of a sine wave, we can use interpolation to find the exact value of the sine wave to within 1%.''
Brad Eckert claims that his code (using a table of 19 equally-spaced samples of the sine wave) "gives better than 16-bit precision". http://www.tinyboot.com/cubic.txt . It uses no divides, and uses a cubic spline to approximate sin(x) from the nearest 4 points in the table.
cos(2a) = 2*( cos a )^2 - 1
or
cos(3a) = 4*( cos a )^3 - 3*( cos a ) = ( cos a )*( 4*( cos a )^2 - 3 ).
sin(3a) = 3*( sin a ) - 4*( sin a )^3 = ( sin a )*( 3 - 4*( sin a )^2 ).
and for a "small enough",
sin(a) =~= a
cos(a) =~= 1 - (a^2)/2
(I suppose I should mention De Moivre's formula http://en.wikipedia.org/wiki/De_Moivre%27s_formula here.)
This method doesn't require any quadrant-detection code. It just repeatedly divides the angle by 3 until it is "small enough", uses the "small angle" approximation, then expands back to the final result.
It's pretty simple to write recursively ... but even for tiny PIC processors that can't handle recursive code, it looks like it would work. (I suspect that, for the same accuracy, it is slower, but requires less ROM than table-lookup code).
// Untested code:
// note that this is non-recursive, doesn't require any tables,
// and doesn't use divide except for divide-by-2.
double cos( double x ){
    int depth = 0;
    while( epsilon < fabs(x) ){ // while( x+2 != 2 ) ?
        depth++;
        x = x/2;
    };
    // now find F(x) = cos(x) - 1 -- this gives more accuracy than finding cos(x) directly
    x = -(x*x)/2; // use cos(x) =~= 1 - (x^2)/2 approximation, i.e. F(x) =~= -(x^2)/2.
    while( 0 < depth ){
        depth--;
        x = 2*x*(x+2); // cos(2a) = 2*( cos a )^2 - 1, i.e. F(2a) = 2*F(a)*( F(a) + 2 )
    };
    // return cos(x) = F(x) + 1
    return( x+1 );
}
// note that this is non-recursive, doesn't require any tables,
// and doesn't use divide except for divide-by-3.
double sin( double x ){
    int depth = 0;
    while( epsilon < fabs(x) ){ // while( (3 - 4*(x*x)) != 3 ) ?
        depth++;
        x = x/3; // (perhaps x = x * 0.010101010101010...(base 2) ?)
    };
    // use sin(x) =~= x approximation.
    while( 0 < depth ){
        depth--;
        x = x*( 3 - 4*(x*x) ); // use sin(3a) = 3*( sin a ) - 4*( sin a )^3.
    };
    return( x );
}
(I think I first saw this divide-by-3 idea in HAKMEM ITEM 14 -- that code is shorter and recursive. )
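(Here is the cos() half of that idea made runnable -- renamed so it doesn't collide with math.h, with epsilon hard-coded to 1e-6:)

```c
#include <assert.h>
#include <math.h>

/* halve the angle until it is tiny, approximate
   F(a) = cos(a) - 1 =~= -a^2/2, then double back up with
   F(2a) = 2*F(a)*(F(a)+2)  (from cos(2a) = 2*cos(a)^2 - 1). */
double cos_by_halving(double x)
{
    int depth = 0;
    while (1e-6 < fabs(x)) {
        depth++;
        x /= 2;
    }
    double F = -(x * x) / 2;   /* small-angle approximation of cos(x)-1 */
    while (depth-- > 0) {
        F = 2 * F * (F + 2);
    }
    return F + 1;              /* cos(x) = F(x) + 1 */
}
```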
see #interpolation for more on interpolation.
In embedded environments it is sometimes necessary to calculate a trig function. The options for doing this are:
- 1. use software floating point
- 2. use hardware floating point (requires extra hardware)
- 3. use a table (fastest; lowest precision)
- 4. use series approximation (potentially highest precision; lowest ROM requirements; slowest; )
(uses decimal fixed-point (i.e., result from calculating sin is the integer (10,000 * sin(x) ).) , rather than binary fixed-point or floating-point ... heavily oriented towards degrees, rather than radians ... (i.e., angles are in integer multiples of 1/100 of a degree, 36000 is a complete circle ...) )
algorithm summary:
* break into 8 octants; use an 8-way case statement that calls the special-case routines ``(sin)'' or ``(arctan)'', then fixes up the result for each octant.
pi = 355/113 ; approximation
arctan accepts 2 integers n and d, internally uses 2 integers n and d to represent the fraction x = n/d = tangent(theta). Uses trig identity arctan(x) = π/2 - arctan(1/x) to swap n and d to make sure n <= d, or in other words, to make sure -1 < x < 1. calls ``(arctan)'' with n_new = min( abs(n), abs(d) ); d_new = max( abs(n), abs(d) ). then fixes up the 8 cases (8 octants): is n positive ? * is d positive ? * is abs(n) < abs(d) ?.
special case routine for arctan: (arctan) accepts 2 integers n and d, assumes that n <= d and that both are positive. returns angle arctan(n/d) in tenths-of-degree.
arctan(x) = sum(n=0, inf, (-1)^n * x^(2*n+1) / (2*n+1) ) = x - x^3/3 + x^5/5 - x^7/7 + x^9/9 - ... \ (-1 < x < 1)
scale = internal scale factor \ to allow integer arithmetic
\ 1st term = scale*(n/d)
\ 2nd term = (1st term)*(1/3)*(n/d)*(n/d)*(-1)
\ 3rd term = (2nd term)*(3/5)*(n/d)*(n/d)*(-1)
\ nth term = (n-1 term)*((2n-1)/(2n+1))*(n/d)*(n/d)*(-1)
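(The same term recurrence in double-precision C -- the Forth code keeps everything scaled to integers; this sketch, with my function name, just checks the recurrence itself:)

```c
#include <assert.h>
#include <math.h>

/* arctan(n/d) for 0 <= n <= d, via the term recurrence:
   nth term = (previous term) * ((2k-1)/(2k+1)) * (n/d)^2 * (-1) */
double arctan_series(double n, double d)
{
    double x = n / d;
    double term = x;           /* 1st term */
    double sum = term;
    for (int k = 1; k < 100; k++) {
        term *= -((2.0 * k - 1.0) / (2.0 * k + 1.0)) * x * x;
        sum += term;
    }
    return sum;
}
```
convergence at the edge n = d (x = 1) is slow -- after 100 terms the alternating-series error is still about 1/200, which is why the fixup identities above keep n <= d but the series is happiest for n well below d.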
general sin: given a 16 bit signed integer ( -32768 to +32767 ), representing 100ths of a degree, the representable range is about -327.68 degrees to +327.67 degrees, so when we do ``/mod'' we get 8 quadrants (each quadrant repeated twice). Once we've reduced the value to -90 < x < +90 (???), then we call ``(sin)''. No more manipulation needed after that.
special case (sin) for (-90 degrees < x < 90 degrees)
pi = 355/113 ; approximation
degree scale factor = (10000/180)*pi = (500/9)*(355/113) = 174.533
sin(x) = sum( n=0, inf, (-1)^n * (x^(2*n+1))/ (2*n+1)! ) = x - x^3/6 + x^5/120 - ... \ (-inf < x < +inf)
\ 1st term = x*(10000/180)*(355/113) = x*(500/9)*(355/113)
\ 2nd term = (1st term)*(1/6)*(x^2)*(-1)
\ 3rd term = (2nd term)*(1/20)*(x^2)*(-1)
\ nth term = (n-1 term)*(x/(2n+1))*(x/(2n))*(-1)
cos(x) = sum( n=0, inf, (-1)^n * (x^(2*n))/ (2*n)! ) = 1 - x^2/2 + x^4/24 - ... \ (-inf < x < +inf)
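(The sin() term recurrence above, checked in double-precision C -- function name is mine:)

```c
#include <assert.h>
#include <math.h>

/* sin(x) by the factorial-free term recurrence:
   nth term = (previous term) * (x/(2k)) * (x/(2k+1)) * (-1) */
double sin_series(double x)
{
    double term = x;           /* 1st term */
    double sum = term;
    for (int k = 1; k < 12; k++) {
        term *= -(x / (2.0 * k)) * (x / (2.0 * k + 1.0));
        sum += term;
    }
    return sum;
}
```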
includes a chapter:
Evaluation of continued fractions
http://www.ulib.org/webRoot/Books/Numerical_Recipes/bookcpdf/c5-2.pdf
gives this formula:
tan(x) = x / ( 1 - x^2 / ( 3 - x^2 / ( 5 - x^2 / ( 7 - x^2 / 9 .... ) ) ) )
t2 = x^2
tan(x) = x / ( 1 + t2 / ( t2 / ( 5 + t2 / ( t2 / ( 9 .... ) - 7 ) ) - 3 ) )
= 9 \ t2 - 7 \ t2 + 5 \ t2 - 3 \ t2 + 1 \ x ; (estimate tangent of t)
or, in RPN,
t2 9 / 7 - t2 swap / 5 + t2 swap / 3 - t2 swap / 1 + t swap / ; (estimate tangent of t)
or
t2 9 / 7 + t2 swap / 5 - t2 swap / 3 + t2 swap / 1 - t swap / neg ; (estimate tangent of t)
http://mathworld.wolfram.com/Tangent.html gives that same continued fraction formula for tangent, and also gives this equivalent continued fraction (can be derived from the above by multiplying above and below each fraction bar by 1/x):
tan(x) = 1 / ( (1/x) - 1 / ( (3/x) - 1 / ( (5/x) - 1 / ( (7/x) - ... ) ) ) ) = the continued fraction [0, 1/x, 3/x, 5/x, 7/x, ...]
which DAV mangles into
z = 1/x
...
y = 1 / ( z - 1 / ( 3*z - 1 / ( 5*z - 1 / ( 7*z - ... ) ) ) ) ; estimate y = tan(x)
= 9*z \ 1 - 7*z \ 1 + 5*z \ 1 - 3*z \ 1 + z \ 1
or, in RPN,
9 z * INV NEG 7 z * + INV NEG 5 z * + INV NEG 3 z * + INV NEG z + INV
or
9 z * INV 7 z * - INV 5 z * + INV 3 z * - INV z + INV ;
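(The continued fraction truncates nicely; here it is evaluated from the inside out in C -- function name is mine, and "depth" counts the fraction bars:)

```c
#include <assert.h>
#include <math.h>

/* tan(x) = x / (1 - x^2/(3 - x^2/(5 - x^2/(7 - ...)))),
   truncated at denominator 2*depth+1 and evaluated inside-out. */
double tan_cf(double x, int depth)
{
    double t2 = x * x;
    double f = 2.0 * depth + 1.0;     /* innermost denominator */
    for (int k = depth; k >= 1; k--) {
        f = (2.0 * k - 1.0) - t2 / f; /* wrap one more fraction bar */
    }
    return x / f;
}
```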
For 0 <= x <= 1 (i.e., angles from 0 to π/2),
cos( π*x / 2) =~= 1 - x^2 / ( x + (1-x)*sqrt( (2-x)/3 ) )
with a worst-case error of about 0.003 in that range.
ways to compute arctan, a trig function
There are a few more methods over at #trig .
From: Ron < cyberron_1998 at hotmail.com >
Newsgroups: comp.arch.embedded.piclist
Subject: Arctan math function
Date: 20 Feb 2003 14:03:14 -0800

Hello everyone, I would like to know if someone has an idea how to implement a function to compute the arctan of a variable x which is a 16-bit integer. I want to make this function fit on my PIC16F877A from Microchip. I can probably do this with Taylor series, Tchebytchev or with CORDIC, but I want to know which is best, fastest and more accurate than the others. I have found this code but it's for 8-bit only:

"""
;----------------------------------------------------------
;
;arctan (as adapted from the similar arcsin function)
;
; The purpose of this routine is to take the arctan of an
;8-bit number that ranges from 0 < x < 255/256. In other
;words, the input, x, is an unsigned fraction whose implicit
;divisor is 256.
; The output is in a conveniently awkward format of binary
;radians (brads?). The output corresponds to the range of zero
;to pi/4 for the normal arctan function. Specifically, this
;algorithm computes:
;
;   arctan(x) = real_arctan(x/256) * 256 / (pi/4)
;      for 0 <= x <= 255
;
; where, real_arctan returns the real arctan of its argument
;in radians.
;
; The algorithm is a table look-up algorithm plus first order
;linear interpolation. The pseudo code is:
;
;unsigned char arctan(unsigned char x)
;{
;   unsigned char i;
;
;   i = x >> 4;
;   return(arctan[i] + ((arctan[i+1] - arctan[i]) * (x & 0xf))/16);
;}
;

arctan
        SWAPF   x,W
        ANDLW   0xf
        ADDLW   1
        MOVWF   temp            ;Temporarily store the index
        CALL    arc_tan_table   ;Get a2=atan( (x>>4) + 1)
        MOVWF   result          ;Store temporarily in result
        DECF    temp,W          ;Get the saved index
        CALL    arc_tan_table   ;Get a1=atan( (x>>4) )
        SUBWF   result,W        ;W=a2-a1, This is always positive.
        SUBWF   result,F        ;a1 = a1 - (a1-W) = W
        CLRF    temp            ;Clear the product
        CLRC
        BTFSC   x,0
         ADDWF  temp,F
        RRF     temp,F
        CLRC
        BTFSC   x,1
         ADDWF  temp,F
        RRF     temp,F
        CLRC
        BTFSC   x,2
         ADDWF  temp,F
        RRF     temp,F
        CLRC
        BTFSC   x,3
         ADDWF  temp,F
        RRF     temp,W
        ADDWF   result,F
        RETURN

arc_tan_table
        ADDWF   PCL,F
        RETLW   0
        RETLW   20      ;atan(1/16) = 3.576deg * 256/45
        RETLW   41
        RETLW   60
        RETLW   80
        RETLW   99
        RETLW   117
        RETLW   134
        RETLW   151
        RETLW   167
        RETLW   182
        RETLW   196
        RETLW   210
        RETLW   222
        RETLW   234
        RETLW   245
        RETLW   0       ;atan(32/32) = 45deg * 256/45

The other part of the problem is implementing the division

FRAC_DIV:
;-------------------
;Fractional division
;
; Given x,y this routine finds:
;   a = 256 * y / x
;
        movlw   8       ;number of bits in the result
        movwf   cnt
        clrf    a       ; the result
        movf    x,w
L1:
        clrc
        rlf     y,f     ;if msb of y is set we know x<y
        rlf     a,f     ;and that the lsb of 'a' should be set
        subwf   y,f     ;But we still need to subtract the
                        ;divisor from the dividend just in
                        ;case y is less than 256.
        skpnc           ;If y>x, but y<256
         bsf    a,0     ; we still need to set a:0
        btfss   a,0     ;If y<x then we shouldn't have
         addwf  y,f     ;done the subtraction
        decfsz  cnt,f
         goto   L1
        return

It's easy enough to combine these two routines to obtain a 4-quadrant arctan(y/x) routine. However, you do need to keep in mind that the arctangent routine posted above is only valid over 1/8th of the unit circle. To obtain the other 7/8th's you'll need to apply the appropriate trig identities.
"""
[FIXME: I had a brief description of the difference between simple correlation, normalized correlation, and the dot product ... Put it online here. Or, if the Mathworld description is better, just point there. ]
then describes one technique for efficiently calculating the normalized cross correlation. (also briefly describes ``Phase Correlation'')
``Although it is well known that cross correlation can be efficiently implemented in the transform domain, the normalized form of cross correlation preferred for feature matching applications does not have a simple frequency domain expression.
... The algorithm described in this paper was developed for the movie Forest Gump (1994) ''
[FIXME: gather stuff I have scattered around here on this topic ...]
For any simple continued fraction of a number x,
x = [a0; a1, a2, a3, ... ak, ...]
(paraphrased by DAV) If you only want to compute 1 convergent, pk/qk, approximating a number, simply start at the kth item in the continued fraction and use the definition of continued fraction expansion to compute from right to left. If you want to compute many convergents, or you want to keep calculating convergents until the approximation is ``good enough'', you can compute from left to right:
p(-1) = 1, p0 = a0
q(-1) = 0, q0 = 1
pk = ak*p(k-1) + p(k-2)
qk = ak*q(k-1) + q(k-2)
lim(k --> +inf)( pk/qk ) = x
[FIXME: is this right ?]
http://www.bloodworth.org | mirror http://members.tripod.com/careybloodworth/contfrac.htm gives a much better explanation of evaluating continued fractions from left to right. [FIXME: ... could I help Carey Bloodworth complete this page ? ]
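(The left-to-right recurrence above in C -- function name is mine. The convergents of pi = [3; 7, 15, 1, 292, ...] include the 355/113 approximation used elsewhere on this page:)

```c
#include <assert.h>

/* left-to-right convergents p[k]/q[k] of a simple continued fraction
   [a0; a1, a2, ...], using
   pk = ak*p(k-1) + p(k-2), qk = ak*q(k-1) + q(k-2). */
void convergents(const int *a, int n, long *p, long *q)
{
    long p_prev = 1, q_prev = 0;    /* p(-1), q(-1) */
    long p_cur = a[0], q_cur = 1;   /* p0, q0 */
    p[0] = p_cur;
    q[0] = q_cur;
    for (int k = 1; k < n; k++) {
        long p_next = a[k] * p_cur + p_prev;
        long q_next = a[k] * q_cur + q_prev;
        p_prev = p_cur; q_prev = q_cur;
        p_cur = p_next; q_cur = q_next;
        p[k] = p_cur;
        q[k] = q_cur;
    }
}
```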
HAKMEM
lots of fun mathematical tips and tricks ... mixed in with obsolete information. [FIXME: link to "processor agnostic"] [FIXME: fix links from other pages to #hakmem]
[1d design ... or Visual:graphics_algorithms]

Bezier splines (cubic curves)

t ranges from 0 to +1.
x(t) = a3*t^3 + a2*t^2 + a1*t + a0
y(t) = b3*t^3 + b2*t^2 + b1*t + b0

u ranges from -1 to +1 ("symmetric cubic spline").
x(u) = c3*u^3 + c2*u^2 + c1*u + c0
y(u) = d3*u^3 + d2*u^2 + d1*u + d0
(so dx(u)/du = x'(u) = 3*c3*u^2 + 2*c2*u + c1).

The curve's initial direction from point x0,y0 starts heading directly towards the first "handle" x1,y1 ... and its final direction as it hits the last point x3,y3 is directly along the line from the second "handle" x2,y2 to x3,y3.

In equations, we want endpoints x0, x3:
x(-1) = x0
x(+1) = x3
and we want the curve from x0 to initially head directly to x1 (in 2d), and the final direction as it hits x3 to come from x2 (see above):
x'(-1) = ( x1 - x0 )*3/2 = K*(x7-x0)*3/2
x'(+1) = ( x3 - x2 )*3/2 = K*(x3-x8)*3/2

It seems I could *arbitrarily* choose some constant K ... or in other words, I can *arbitrarily* pick guide points along the line x7 = x0 + (x1-x0)/K (where x7 is my "new" guide point, and x1 is PostScript / Lancaster's "traditional" guide point). (which implies x1 = K*x7 + (1 - K)*x0 and x2 = K*x8 + (1 - K)*x3 ).

Some possible ways to pick K:
* pick K so that the midpoint of the curve (t=1/2, u=0) is centered on the line between the new guide points. (I think this makes recursive bezier drawing a bit faster).
* pick K so that the slope at the midpoint (c1) is exactly the same as the slope between x7 and x8. (I think this makes recursive bezier drawing a bit faster).
* pick K so that when guide points coincide, I get a parabola.
* pick K to make the math "simple" ...
* pick K "small enough" so the curve lives inside the (convex hull) bounding box of the 4 points. (If K is too large, x7 is so close to x0 that the curve pokes outside the x0,x7,x8,x3 bounding box).
* ... ?

Choosing K: ....
If we then add the constraint that forces x(u=0) = c0 to equal
(x8 + x7)/2, then we find
    c0 = ( x3 + 3*x2 + 3*x1 + x0 )/8
       = ( x3 + 3*(K*x8 + (1 - K)*x3) + 3*(K*x7 + (1 - K)*x0) + x0 )/8
       = ( (4-3*K)*x3 + 3*K*x8 + 3*K*x7 + (4 - 3*K)*x0 )/8
... so we force (4-3*K) = 0 by setting K = 4/3, so now
    c0 = ( x8 + x7 ) / 2.
This curve is not contained in the (convex hull) bounding box of
x0,x7,x8,x3. But maybe that's OK.

Choosing K: If we choose a different constraint, that the *slope* at the
midpoint depends only on the guide points x7 and x8 (i.e., the curve, at
the midpoint, is exactly tangent to the line between the guide points ...
the curve midpoint touches that line, but is often *not* exactly centered
between x7 and x8), then
    dx/du at (u=0) = c1 = ( x3 + x2 - x1 - x0 )*3/8  ; slope at midpoint
    ....
    K = 2
    x7 = x0 + (x1-x0)/K.
This seems to imply that if those x7 and x8 guide points *touch*, i.e.,
if the traditional guide points make a perfect rectangle,
    x2 x1
    x0 x3 ,
then there will be "no" slope at the midpoint, i.e., that's how to make a
singularity (cusp) at the midpoint.

Choosing K: I think that the *closest* the bezier curve *ever* gets to
the line between x1 and x2 is 1/4 the distance between x0 and x1, or 1/4
the distance between x2 and x3, whichever is smaller, assuming that x0
and x3 are both on the same side of the x1-x2 line. So I could choose
    K = 4/3
    x7 = x0 + (x1-x0)/K
and the bezier curve would still always fall in the x0,x7,x8,x3 bounding
box. I think.

What are the max and min values of x(u) ? Since it is a polynomial, an
extreme value is *either*
* at the endpoints of the curve, *or*
* where x'(u) = 3*c3*u^2 + 2*c2*u + c1 = 0.
The second case occurs where (if 0 == c3)
    u = -c1/(2*c2),
or where (if 0 != c3)
    u = ( -2*c2 +- ( (2*c2)^2 - 4*3*c3*c1 )^(1/2) ) / (6*c3),
and -1 < u < +1.

t equation space to graph space:
    x0 = a0
    x1 = a0 + a1/3
    midpoint = a3/8 + a2/4 + a1/2 + a0
    x2 = a0 + a1*2/3 + a2/3
    x3 = a0 + a1 + a2 + a3.
u equation space to graph space:
    x0 = -c3 + c2 - c1 + c0
    x1 = c3 - c2/3 - c1/3 + c0
    midpoint = c0
    x2 = ...
    x3 = c3 + c2 + c1 + c0
graph space to t equation space:
    a0 = x0                  ; starting point
    a1 = 3*( x1 - x0 )       ; initial slope. Clearly this must be some
                               K*( x1 - x0 ) to get the proper initial
                               trajectory, but why K=3 ?
    a2 = (x2 - 2*x1 + x0)*3  ; initial curvature
    a3 = x3 - 3*x2 + 3*x1 - x0.
graph space to u equation space:
    c0 = ( x3 + 3*x2 + 3*x1 + x0 )/8  ; midpoint = (x3+x0)/2 - c2
    c1 = ( x3 + x2 - x1 - x0 )*3/8    ; slope at midpoint = (x3-x0)/2 - c3
    c2 = ( x3 - x2 - x1 + x0 )*3/8
    c3 = ( x3 - 3*x2 + 3*x1 - x0 )/8
u equation space to t equation space: t = (u+1)/2
    a3 = 8*c3
    a2 = -12*c3 + 4*c2
    a1 = 6*c3 - 4*c2 + 2*c1
    a0 = -c3 + c2 - c1 + c0
t equation space to u equation space: u = 2*t - 1
    c3 = a3/8
    c2 = a3*3/8 + a2*1/4
    c1 = a3*3/8 + a2*1/2 + a1*1/2         ; slope at midpoint
    c0 = a3*1/8 + a2*1/4 + a1*1/2 + a0    ; location of midpoint

In terms of the control points,
    x(t) = (x3 - 3*x2 + 3*x1 - x0)*t^3 + (3*x2 - 6*x1 + 3*x0)*t^2
           + (3*x1 - 3*x0)*t + x0

By inspection, we get a perfect "vertical" parabola by making
    x = a1*t + a0
    y = b2*t^2 + b1*t + b0
i.e., we force all the high-order terms to zero. In the case of y, we
force the cubic term (b3) to zero:
    0 = y3 - 3*y2 + 3*y1 - y0,
or in other words, the length y3-y0 is exactly three times the length of
y2-y1. To force the parabolic term (a2) to zero:
    0 = x2 - 2*x1 + x0,
or in other words, the length x2-x1 exactly equals the length x1-x0. To
force the parabolic term (c2) to zero:
    0 = x3 - x2 - x1 + x0,
the length x1-x0 exactly equals the length x3-x2. In the case of x, we
set them both to zero; in other words, x3, x2, x1, x0 are precisely
evenly spaced apart.
Clearly every point of this "bezier" curve exactly touches the parabola
-- but is the reverse true: if we force the t=0, t=1/3, t=2/3, t=1 points
to go through randomly selected points on some parabola (even if they are
not precisely evenly spaced apart in the x direction), will that bezier
curve still be parabolic ?

I'm pretty sure that if we take 4 randomly selected points on some
straight line (not necessarily horizontal or vertical), the bezier curve
*will* stay on that straight line.

Another form of the same equation:
    x(t) = t*(x3) + (1-t)*(x0) + t*(1-t)*(...)
         = t*x3 + (1-t)*( t*(...) + x0 )
    x(t) = t*(x3) + (1-t)*(x0) + t*(1-t)*( a3*t + U4 )
         = t*x3 + (1-t)*( t*( a3*t + U4 ) + x0 )
This has exactly the same number of multiplies ... it might even be
faster on architectures where t*a + (1-t)*b ("twist") is faster than 2
independent multiplies.
    x(t) = t*(x3) + (1-t)*(x0) + t*(1-t)*(p(t))
         = t*x3 + (1-t)*( t*(p(t)) + x0 )
forces the curve to hit the endpoints, while p(t) gives the shape of the
curve between the endpoints. Various expressions for p(t):
    k3*t + k2 ...
    k3*(t-1/2) + k2  : ~height of midpoint (from the x0-x3 line) and
                       slope at midpoint
    k3*t + (1-t)*k2  : ~slopes at t=0 and t=1.

At the endpoints of the curve,
    x(t=0) = x(u=-1) = x0
    x(t=1) = x(u=+1) = x3.
    dx/dt at (t=0) = a1 = ( x1 - x0 )*3
    dx/dt at (t=1) = ( x3 - x2 )*3
    dx/du at (u=-1) = 3*c3-2*c2+c1 = ( x1 - x0 )*3/2
    dx/du at (u=+1) = 3*c3+2*c2+c1 = ( x3 - x2 )*3/2
At the "midpoint" of the curve, where t=1/2, u=0,
    x(t=1/2) = x(u=0) = c0 = ( x3 + 3*x2 + 3*x1 + x0 )/8.
    dx/dt at (t=1/2) = ( x3 + x2 - x1 - x0 )*3/4     (slope at midpoint)
    dx/du at (u=0) = c1 = ( x3 + x2 - x1 - x0 )*3/8  (slope at midpoint)
(At every point in the curve, the slope in terms of u is only half the
slope in terms of t, because the t formulation must climb the entire
curve in only 1 unit of t, compared to 2 units of u.)

We get a "cusp" when dx/dt = 0 *and* dy/dt = 0. Most splines do not have
a cusp.
The cusp may be located anywhere along the spline, not just at the
midpoint.

The 1/3 points:
    x(t=1/3) = x(u=-1/3) = ( x3 + 6*x2 + 12*x1 + 8*x0 )/27
    dx/dt at (t=1/3) = ( x3 + 3*x2 + 0*x1 - 4*x0 )/3 = (x3-x0)/3 + (x2-x0).
    x(t=2/3) = x(u=+1/3) = ( 8*x3 + 12*x2 + 6*x1 + x0 )/27
    dx/dt at (t=2/3) = ( 4*x3 + 0*x2 - 3*x1 - x0 )/3 = (x3-x0)/3 + (x3-x1).
Are the quarter-points interesting ?

To fully define a cubic curve (just the x(u) of the bezier curve)
requires 4 numbers:
- a0, a1, a2, a3 of the formula (one endpoint and the velocity,
  acceleration, and jerk).
- endpoints x0, x3, and slope at endpoints (x1-x0), (x3-x2).
- go through 4 given points x0, x7, x8, x3.
- endpoints x0, x3, midpoint xm, and ... something else.

Is it possible to adjust K so that x4, x5 uniquely guide the curve,
*and* have the midpoint of the curve exactly hit (x4+x5)/2 ???

Formula for length of the curve (or at least upper and lower bounds) ???

What "interesting" points lie on or near the Bezier curve ?

How to approximate a circle/ellipse with bezier curves ? What is the
worst-case error ?
circles: x^2 + y^2 = r^2
    y = ( r^2 - x^2 )^(1/2)
or
    x = sin(A), y = cos(A).
How to approximate a parabola with a bezier curve ? (there is zero error
... focus ... line ...)
How to approximate a hyperbola with a bezier curve ?
How to approximate exponential / logarithm curves with bezier ?
To approximate *any* curve given only the points ... given the points
and slopes ...
How to recursively draw a bezier ???

To force a bezier *through* a bunch of points, collect them in groups
of 4:
    0th curve goes through x0, x1, x2, x3
    1st curve goes through x3, x4, x5, x6
    2nd curve goes through x6, x7, x8, x9
    ... etc.
(This method gives the odd effect of a cusp at x0, x3, x6, x9, etc. but
smooth curves through the other points ... is there a better way ?)
(yes -- see tinyboot article above)

To force a bezier *through* 4 points:
    x(t=0) = x(u=-1) = x6
    x(t=1/3) = x(u=-1/3) = x7
    x(t=2/3) = x(u=+1/3) = x8
    x(t=1) = x(u=+1) = x9
In terms of t:
    a0 = x6
    a1 = (-11*x6 + 18*x7 - 9*x8 + 2*x9 )/2
    a2 = ( 2*x6 - 5*x7 + 4*x8 - x9 )*9/2
    a3 = ( -x6 + 3*x7 - 3*x8 + x9 )*9/2
In terms of u:
    b3 = ( -x6 + 3*x7 - 3*x8 + x9 )*9/16
    b2 = ( +x6 - x7 - x8 + x9 )*9/16
    b1 = ( +x6 - 27*x7 + 27*x8 - x9 )/16
    b0 = ( -x6 + 9*x7 + 9*x8 - x9 )/16

To approximate a circle ... can't be done exactly.
1. break up the curve into pieces.
2. approximate each piece with a bezier, matching endpoints and slopes.
For any small piece of the circle (arc), we can set the endpoints of the
bezier to the endpoints of the arc, and the guide points along a line
tangent to the circle ... by symmetry, it seems best for both tangent
lines to have the same length -- but how long ? Some ways to pick a
length:
* adjust so the midpoint of the bezier actually does
  intersect/touch/tangent the midpoint of the arc
* adjust so the *curvature* at the endpoints is correct (this is
  more-or-less the "smooth" option of the tinyboot cubic approximation
  to sin)
* adjust so the *curvature* at the midpoint is correct
* adjust so the -1, -1/3, +1/3, +1 points of the curve all touch the
  circle (is it possible to do this *and* keep the tangent at the
  endpoints ? Alas, no -- this is the "more accurate, less smooth"
  option of the tinyboot cubic approximation to sin)
....
Adjust so the midpoint of the bezier actually does
intersect/touch/tangent the midpoint of the arc: to approximate the
upper-right quarter-circle with the center of the circle at 0,0, use
points
    p0 = r, 0
    p1 = r, r*(4/3)*( 2^(1/2) - 1 ) =~= r, r*(0.55228)
    p2 = r*(4/3)*( 2^(1/2) - 1 ), r
    p3 = 0, r
To approximate the bottom quarter-circle with the center of the circle
at 0,0, use points
    p0 = -r / 2^(1/2), -r / 2^(1/2)
    p1 = , -r*( 4 - 1/2^(1/2) )/3 =~= , -r*(1.0976)
    ...

recursive spline drawing:
// draw spline ending at points p0 and p3, using guide points p1 and p2.
drawspline( p0, p1, p2, p3 ){ // this works for p0 in 1D, 2D, 3D, etc.
    Are points "close enough together" ?
        then draw pixels / lines / parabolas ... and return.
        (Checking just p0 and p3 isn't enough -- sometimes p0 and p3 are
        on the *same* pixel, yet the guide points stretch the spline out
        into a big loop.)
    Optional: Are p0, p1, p2, p3 close enough to a straight line ?
        Then draw a straight line and return.
    else:
        Calculate new midpoint pm:
            pm = p(t=1/2) = p(u=0) = ( p3 + 3*p2 + 3*p1 + p0 )/8.
            /* with a different K, we could just do (p2 + p1)/2. */
        Calculate offset of new guide points from new midpoint:
            d = ( p3 + p2 - p1 - p0 )/8
            /* x'(t=1/2) = ( x3 + x2 - x1 - x0 )*3/4 */
        Calculate new guide point for right curve:
            p_right = pm + d
            /* x'(t=0) = ( x1 - x0 )*3, so x1 = x0 + x'(t=0)/3 */
        Calculate new guide point for left curve:
            p_left = pm - d
        recursion:
            /* to limit recursion depth, do the *smallest* one first. */
            drawspline( p0, (p0+p1)/2, p_left, pm );
            drawspline( pm, p_right, (p2+p3)/2, p3 );
}
(How does this compare to the de Casteljau algorithm ? the Cox-de Boor
recursive algorithm ?)
----
Part I: Splines http://www.cg.inf.ethz.ch/~pauly/CGC/PartI.pdf instead defines cubic curves this way:
... x(t) = b0*(1-t)^3 + 3*b1*t*(1-t)^2 + 3*b2*t^2*(1-t) + b3*t^3 Coefficients b0,...,bn are called Bezier points or control points. Set of control points defines the so-called control polygon.
Properties of Bezier-Curves:
* Affine invariance: the affine transform of all points on the curve can be computed by the affine transform of its control points.
* Convex hull property: The curve lies in the convex hull of its control polygon. ...
Keeps mentioning a "Java Applet". Where is it ?
---- "Explorations in Geometric Modeling: Parametric Curves and Surfaces" by Kenny Hoff COMP236 Final Project Spring 1996 http://www.cs.unc.edu/~hoff/projects/comp236/curves/ ... includes "Derivation of Incremental Forward-Difference Algorithm for Cubic Bezier Curves using the Taylor Series Expansion for Polynomial Approximation", "Derivation of 3D Rational Curves", and "C++ Source Code" (# Bezier curves * Direct evaluation using Bernstein polynomials (with coefficient lookup tables) * Recursive subdivision * Forward-differencing (using matrix simplification) ) (Cox de Boor recursive algorithm) ---- "De Boor's algorithm is a generalization of de Casteljau's algorithm. It provides a fast and numerically stable way for finding a point on a B-spline curve given a u in the domain." http://www.cs.mtu.edu/~shene/COURSES/cs3621/NOTES/spline/B-spline/de-Boor.html "B-spline curves share many important properties with Bézier curves, because the former is a generalization of the later." http://www.cs.mtu.edu/~shene/COURSES/cs3621/NOTES/spline/bspline-curve-prop.html http://www.cs.mtu.edu/~shene/COURSES/cs3621/NOTES/spline/
How to Draw a Circle on a Computer
http://www.math.toronto.edu/mathnet/questionCorner/drawcircle.html
"the formula for finding the roots of any quadratic, cubic or quartic polynomial." http://forum.swarthmore.edu/dr.math/problems/rootsform.html
2002-12-04:DAV: Today I was looking for values of sin( 2 * pi / n )^2 that are rational for integer values of n. (Since sin(x)^2 = ( 1 - cos( 2*x ) ) / 2, these are also the values of cos( 4 * pi / n ) that are rational for integer values of n).
Here are all the rational results I found so far -- plus a few of the irrational ones.
I suspect these are the only rational values of sin( 2 * pi / n )^2 for any rational value of n > 1.
    n    (sin( 2 * pi / n ))^2
    1    0
    2    0
    3    3/4
    4    1
    5    (2 + phi)/4
    6    3/4
    7    228/373 + ...
    8    1/2
    9    69/167 + ...
    10   (3 - phi)/4
    12   1/4
    15   22/133 + ...
    16   (2 - sqrt( 2 )) / 4
    20   (2 - phi)/4
    24   (2 - sqrt( 3 )) / 4
where
phi = (1+sqrt(5))/2 = the golden ratio = [ 1; 1 1 1 1 1 1 1 ....]
-- David Cary
where a,b,c,d are all positive integers. (works with complex numbers ?)
a/b
-a/b
PI*a/b
-PI*a/b
(a/b)*sqrt(a/b)
-(a/b)*sqrt(a/b)
ln(a/b)
exp(a/b) // DAV thinks this form can represent all complex numbers
-exp(a/b)
the romance that is mathematics. ... We think of a math whiz as someone who can do in his head what a calculator can do on silicon. But that is not math. That is accounting. Real math is not crunching numbers but contemplating them and the mystery of their connections. ...
... Mathematicians do not like to admit that, because when they do, ... they are suspected of just playing around, which of course they are.
... We have our means and ends reversed. What could be more important than divining the Absolute? ...
What higher calling can there be than searching for useless and beautiful truths? ...
... mathematics ... operates very close to religion, which is why numerology is important to so many faiths and why a sense of the transcendent is so keenly developed in many mathematicians, even the most irreligious. ... Says Erdös: “You don't have to believe in God, but you should believe in the Book.”
This site is dedicated to mathematical, historical and algorithmic aspects of some classical mathematical constants (like π, e, the Euler constant γ, ζ(3), ...). A few results on prime numbers are added. Easy and fast programs are also included and can be downloaded.
...
e = [2;1,2,1,1,4,1,1,6,1,1,8,1,1,10,1,1,12,1,1,14,...] (continued fraction expansion)
√e = [1;1,1,1,5,1,1,9,1,1,13,1,1,17,1,1,21,1,1,25,...],
(e-1)/2 = [0;1,6,10,14,18,22,26,30,34,38,42,46,50,...]
exp(z) = 1 + z/1! + z^2/2! + z^3/3! + ... + z^n/n! + ... (converges for -inf < z < +inf)
for small values of x,
exp(x) =~= (12 + 6*x + x^2)/(12 - 6*x + x^2) =~= (2+x)/(2-x)
Newton's method: xnew = x - f(x)/f'(x).
for finding x=a^(1/2):
f(x) = x^2 - a.
f'(x) = 2*x.
xnew = x - (x^2 - a)/(2*x) = x - x/2 + a/(2*x) =
xnew = ( x + a/x ) / 2.
this is taking the average of x and a/x, and since either
x < a^(1/2) < a/x, or
a/x < a^(1/2) < x,
the average is closer ...
http://numbers.computation.free.fr/Constants/Sqrt2/sqrt2.html and http://numbers.computation.free.fr/Constants/Algorithms/inverse.html mentions an even simpler method (avoids division):
for finding x = a^(-1/2):
f(x) = x^(-2) - a
f'(x) = (-2)*x^(-3)
xnew = x - (x^(-2) - a) / ( (-2)*x^(-3) ) =
= x + (x^(-2) - a) * (x^3/2) =
= x + (x - a*x^3)/2 =
xnew = x + x*( 1 - a*x^2 )/2.
To find y = b^(1/2) using this method, multiply the final x = b^(-1/2) by b, since b * b^(-1/2) = b^(1/2) (still no division needed).
Newton's method: xnew = x - f(x)/f'(x).
for finding x = 1/a,
f(x) = 1/(a*x) - 1
f'(x) = (1/a)*(-1)*x^(-2) = -1/(a*x^2)
xnew = x - ( 1 / (a*x ) - 1 ) / ( -1/(a*x^2) ) =
= x + ( 1/(a*x) - 1 ) * ( a * x^2 ) =
= x + x - a*x^2 =
= 2*x - a*x^2 =
xnew = x + x*( 1 - a*x )
Normally Newton's method requires a division, but in this special case it only requires multiplication and addition. (no divides).
I've heard some early mainframe computers didn't have division built into the hardware, and used Newton's method to approximate it in software.
DAV: I remember deriving this in Calculus class ...
1.7320508... = sqrt(3) = [1; 1 2 1 2 1 2 1 2...]
1.732057.... = 362/209
1.73214..... = 97/56
1.7333...... = 26/15
p(n+1) = (A+B)*p(n) + 2*A*q(n)
q(n+1) = 2*B*p(n) + (A+B)*q(n)
then
lim(n --> inf) p(n)/q(n) = ( A / B ) ^ (1/2).
The closer A/B is to 1, the faster the speed of convergence.
DAV: range reduction: x^(1/2) = 2*(x/4)^(1/2) ...
(1+x)^(1/2) = 1 + x*(1/2) - x^2*(1/(2*4)) + x^3*((1*3)/(2*4*6)) - ...
bitcount.h (population count: counting the bits in a word)
bit2pow.h (functions like floor(log2()))
bitrotate.h (rotating words)
revbin.h (reversing the order of bits in a word)
graycode.h (graycode and parity)
tinyfactors.h (determining whether d divides x for d,x < BITS_PER_LONG)
greencode.h (gray code towards high bits)
branchless.h (code that avoids branches)
and some code dealing with shift register sequences (LFSR).
[FIXME: ... binary cleverness ... video_game.html ?]
[FIXME: many of these in branchless are almost identical to tricks on PIC ...]
From: Rob Warnock (rpw3@rigden.wpd.sgi.com)
Subject: Re: Sound compression
Newsgroups: comp.compression
Date: 1991-05-15 14:01:22 PST

In article <19524@crdgw1.crd.ge.com> davidsen@crdos1.UUCP (bill davidsen) writes:
+---------------
| Sorry I don't have source any more, but the way to quickly find the
| high one bit (on a 2's complement machine) is:
|
|	h1bit = n & (-n)
+---------------

Sorry, this is quite incorrect! This will find the *lowest*-order one
bit set! To find the *highest*-order bit set (a.k.a. a "priority
encoder") you need a find-highest-one function, or a loop, or a "comb".
The find-highest function exists in some CPU instruction sets (e.g., the
PDP-10's "JFFO", the Am29000's "CLZ") but not in others (e.g., the VAX's
"FFS" finds the *lowest* bit). Many software implementations use some
version of a binary tree search:

	bit_num_of_highest_one(unsigned int x)
	{
		int answer, i, mask;
		if (x == 0)
			return -1;
		for (answer = 0, i = NBITSPERWORD/2; i; i >>= 1) {
			mask = (~0) << i;
			if (x & mask) {
				x >>= i;
				answer += i;
			}
		}
		return answer;
	}

Often the loop is unrolled, since the wordlength is usually fixed for a
given machine. A co-worker (Brendan Eich) first showed me the "comb"
approach:

	bit_num_of_highest_one(unsigned int x)
	{
		int answer = 0;
		if (x == 0)
			return -1;
		if (x & 0xffff0000) answer += 16, x &= 0xffff0000;
		if (x & 0xff00ff00) answer += 8,  x &= 0xff00ff00;
		if (x & 0xf0f0f0f0) answer += 4,  x &= 0xf0f0f0f0;
		if (x & 0xcccccccc) answer += 2,  x &= 0xcccccccc;
		if (x & 0xaaaaaaaa) answer += 1;
		return answer;
	}

While this is cute, it's not necessarily any faster than an unrolled
version of the "search" code above, since the 32-bit constants needed
for the "comb" version tend to take two instructions to generate on
current RISCs.

By the way, there was a paper in CACM (??? sorry, forgot the ref) a
number of years ago which proved that there is no closed functional form
without conditionals which will compute "the highest-order one bit".
-Rob

-----
Rob Warnock, MS-1L/515		rpw3@sgi.com		rpw3@pei.com
Silicon Graphics, Inc.		(415)335-1673		Protocol Engines, Inc.
2011 N. Shoreline Blvd.
Mountain View, CA  94039-7311

... the "leftmost bit" a.k.a. "binary logarithm" of an integer ...

From: Steve Peltz (peltz@cerl.uiuc.edu)
Subject: Re: Sound compression
Newsgroups: comp.compression
Date: 1991-05-22 16:24:19 PST

In article <105539@sgi.sgi.com> rpw3@sgi.com (Rob Warnock) writes:
>Let's re-open the question to others: What *other* fast, practical methods
>are there to find the "leftmost bit" a.k.a. "binary logarithm" of an integer,
>besides:

Here's what is used on our system (Cyber, 60 bit word, one's complement;
I'm translating to C). It requires a fast population count and shifting.

	#define blur(i,o) d = d | ((d = d | ((o) >> i)) >> (2 * i))
	#define hibit(x) bitcnt(blur(16, blur(4, blur(1, d = x))))

This also obviously depends on order of operation; let's see if I can
unroll it a bit.

	/* blur(1, d=x) */
	d = x;
	d |= d >> 1;
	d |= d >> 2;
	/* blur(4, ...) */
	d |= d >> 4;
	d |= d >> 8;
	/* blur(16, ...) */
	d |= d >> 16;
	d |= d >> 32;
	return bitcnt(d);

d is a signed 60-bit value, so right shifting does shift in the sign
bit, but that doesn't matter for the algorithm. I don't know if there is
another origin for this algorithm, but it is listed in our library as
having been written by Mike Huybenz, Feb '77. For a 32 bit value, the
last shift isn't necessary.
--
Steve Peltz
Internet: peltz@cerl.uiuc.edu	PLATO/NovaNET: peltz/s/cerl

From: Doug Gwyn (gwyn@smoke.brl.mil)
Subject: Re: Sound compression
Newsgroups: comp.compression
Date: 1991-05-16 11:57:46 PST

In article <104045@sgi.sgi.com> rpw3@sgi.com (Rob Warnock) writes:
>By the way, there was a paper in CACM (??? sorry, forgot the ref) a number
>of years ago which proved that there is no closed functional form without
>conditionals which will compute "the highest-order one bit".
And of course that "theorem" is wrong: using the operand to index into a
table can immediately produce the appropriate answer. For large word
sizes or small machines, that will often be impractical, but it can
serve as a basis for an implementation using smaller tables (which does
require use of conditionals):

	int ffo( register unsigned long x )
	{
		static int bitnum[256] = {
		-1, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3,
		 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
		 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
		 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
		 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
		 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
		 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
		 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
		 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
		 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
		 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
		 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
		 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
		 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
		 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
		 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7 };

		if ( (x & 0xFF000000) != 0 )
			return 24 + bitnum[x >> 24];
		else if ( (x & 0x00FF0000) != 0 )
			return 16 + bitnum[x >> 16];
		else if ( (x & 0x0000FF00) != 0 )
			return 8 + bitnum[x >> 8];
		else
			return bitnum[x];
	}

From: William E Davidsen (davidsen@rdsunx.crd.GE.COM)
Subject: Re: Sound compression
Newsgroups: comp.compression
Date: 1991-05-17 10:21:56 PST

In article <104045@sgi.sgi.com>, rpw3@rigden.wpd.sgi.com (Rob Warnock) writes:
|> In article <19524@crdgw1.crd.ge.com> davidsen@crdos1.UUCP
|> (bill davidsen) writes:
|> +---------------
|> | Sorry I don't have source any more, but the way to quickly find the
|> | high one bit (on a 2's complement machine) is:
|> |
|> |	h1bit = n & (-n)
|> +---------------
|>
|> Sorry, this is quite incorrect! This will find the *lowest*-order one bit set!
You obviously read what I said, not what I meant ;-) Yes, my note was woefully incorrect. It turned out that for the data I had it was quicker to find the low bit and discard repeatedly than to shift looking for the high. I should have said more or nothing. There was a discussion of this in comp.lang.c some few years ago, and for random data a binary search is faster. Sound data isn't random, at least not in the programming sense.
One way to implement max(a,b), from (Tony Kennedy) Oct 1993:

    max(x,limit) == max(x-limit,0) + limit

and if we define "z > 0" as a word of all ones if z is greater than zero
and a word of all zeros if it is not, then

    max(z,0) = z bitand (z > 0)

In C terms this leads to the following function (for 32-bit long ints
and floats with the sign bit at the left end):

	float max(float x, float y)
	{
		union fl {float f; long l;} z;
		z.f = x - y;
		z.l &= ~z.l >> 31;
		return z.f + y;
	}

(Caveat: I have not directly checked the version given here).
[FIXME: convert to PIC assembly code -- or has someone already done this ? If so, point to that code.]
What's Special About This Number? If you know a distinctive fact about a number not listed here, please e-mail me.-- Erich Friedman http://www.stetson.edu/~efriedma/numbers.html
Another clever snippet from http://www.pobox.com/~qed/optimize.html [FIXME: binary bit fiddling]
before:

	#define abs(x) (((x)>0)?(x):-(x))

after:

	static long abs(long x)
	{
		long y;
		y = x>>31;	/* Not portable */
		return (x^y)-y;
	}

	...

	static int int_abs(int a)
	{
		return a - ((a+a) & (a>>31));
	}
while http://www.jjj.de/bitwizardry/files/branchless.h uses
	static inline ulong upos_abs_diff(ulong a, ulong b)
	// Return abs(a-b)
	// Both a and b must not have the most significant bit set
	{
		long d1 = b - a;
		long d2 = (d1 & (d1>>(BITS_PER_LONG-1)))<<1;
		return d1 - d2;  // == (b - d) - (a + d);
	}
sin 0  = sqrt(0)/2 = 0
sin 30 = sqrt(1)/2 = 1/2
sin 45 = sqrt(2)/2
sin 60 = sqrt(3)/2
sin 90 = sqrt(4)/2 = 1
http://www.everything2.com/index.pl?node_id=916568
...
If an angle is an integral multiple of one second, then its sine and cosine are obtainable by simple multiplication from the formulas
sin(a + Δa) = sin a cos Δa + cos a sin Δa
sin(a - Δa) = sin a cos Δa - cos a sin Δa
and
cos(a + Δa) = cos a cos Δa - sin a sin Δa
cos(a - Δa) = cos a cos Δa + sin a sin Δa
[DAV: these are exactly true for any 2 angles a and Δa]
If the angle also contains fractions of a second, one must first interpolate with respect to the fractional part ... and only afterwards combine ... in accordance with the relevant trigonometric addition theorem.
...
The tangent of any angle can be obtained easily by division, by use of one or the other of the two formulas
tan( β + Δβ ) = ( cos 2Δβ - cos 2β ) / ( sin 2β - sin 2Δβ ) = ( sin 2β + sin 2Δβ ) / ( cos 2β + cos 2Δβ )
and
tan( β - Δβ ) = ( cos 2Δβ - cos 2β ) / ( sin 2β + sin 2Δβ ) = ( sin 2β - sin 2Δβ ) / ( cos 2β + cos 2Δβ )
[DAV: Are these exactly true for any 2 angles β and Δβ ?]
For computing the logarithm of a trigonometric function to sixteen or more decimal places, the most suitable procedure is as follows. First, the value of the trigonometric function is computed as above. Next, a value as close to it as possible and with at least six places is determined and its twenty-one place logarithm read off from the tables of Steinhauser or Callet ... Finally, this last is used in conjunction with the series
log(n+Δn) = log n + 2 M { Δn/( 2n + Δn ) + (1/3)*( Δn / (2n + Δn) )^3 + .... }
to obtain the desired logarithm. The convergence of the series, in this application, is extremely rapid. M denotes the modulus of the Briggsian (common) logarithm ...
M = log_10( e ) =~= 0.43429_44819_03251_827651_13
1/M = log_e( 10 ) =~= 2.30258_50929_94045_684017_99
...
For the converse case -- to determine the angle, a trigonometric function of the angle being given -- ... Whatever the given function may be, the first step is to find as good an approximation to the desired angle as possible with the aid of seven-place... tables of logarithms. If y is this approximation and if y + Δy is the desired angle, then Δy is to be determined from the relations
Δy = ( sin(y+Δy) - sin y ) / ( arci*( cos y - (1/2)*Δy*arci*sin y ) )
Δy = ( cos y - cos(y+Δy) ) / ( arci*( sin y + (1/2)*Δy*arci*cos y ) )
Δy = ( tan(y+Δy) - tan y ) / ( arci*( 1 + tan(y)*tan(y+Δy) ) )
Δy = ( cot y - cot(y+Δy) ) / ( arci*( 1 + cot(y)*cot(y+Δy) ) )
[DAV: arci = 0°0'1" * π/180 = π/(180 * 60 * 60) =~= 0.00000_48481_36811_095359_94, which is the conversion factor between radians and *seconds* of angle. ]
[DAV: Δy is in units of *seconds* of a degree of angle.]
[DAV: these are not exact equivalences, but an approximation for when Δy is small]
where the appropriate relation is to be chosen, depending on whether the given number, from which y + Δy is to be determined, was sin(y + Δy), cos(y + Δy), tan(y + Δy), or cot(y + Δy). Since the unknown Δy occurs in the denominators of the right-hand sides of the first two relations, Δy can be determined from these only by successive approximations; the term involving Δy in the denominator is taken to be zero to begin with, and a first approximation Δy_1 is computed under this assumption; the equation is then solved with this value substituted in the denominator; and so forth, until the approximations remain stable.
...
For Table II (Twenty-One Place Tables of Sine and Cosine), all 540 entries were checked not only by means of the formulas
sin a = cos( 30° - a ) - cos( 30° + a )
but also, to 24 places, by means of differences up to and including those of order 9. ... When rounding off to 21 decimal places, I undertook fresh, and completely independent, computations of all those values that end in 499, 500, or 501 in the 24th decimal place, thus insuring the accuracy to within half a unit in the 21st decimal place of all entries in the two tables.
...
-- _Eight-place tables of trigonometric functions for every second of arc_ book by Jean Peters 1939 (William F. Cary bought a 1968 printing in 1972 Sept. 5 for $27.50)
DAV: since the slope = rise/run = derivative,
then the difference (rise) between 2 consecutive entries in the table should be the derivative times the run.
In other words, given a table of values
x | sin(x) | 1st difference | 2nd difference | 3rd difference
0
Δx
2*Δx
...
(Peters prints the difference in-between the 2 numbers that were subtracted, so the 1st difference is offset ... the 2nd difference is lined up with x and sin(x) ... the 3rd difference is offset, but lined up with the 1st difference).
then the 1st difference should be about
(d/dx)*sin(x)*Δx = cos(x)*Δx
and the 2nd difference should be about
(d/dx)^2*sin(x)*(Δx)^2 = -sin(x)*(Δx)^2
... If the table had been printed so that Δx was easy to read off (say we pick a Δx of 100 (decimal) micro-radians, or print the table in hex and pick a Δx of 0.01 (hex) radians, so there are 402 (dec) entries to cover the first quadrant from 0 to π/2), then the digits would be (very nearly) the same (just the decimal point shifted) in the 2nd difference column as in the original column, for both sin() and cos(); and the 1st difference column of the sin table would be the same as the cos() column.
-- "The Tetragamma Function and Numerical Craftsmanship: MATLAB's special mathematical functions rely on skills from another era" article by Cleve Moler http://www.mathworks.com/company/newsletter/clevescorner/winter02_cleve.shtml ... discusses "argument reduction" and why it is necessary ... giving sin(), exp(), and ln() as examples....
A Web- and CD-based successor to A&S, the Digital Library of Mathematical Functions, is under development at the National Institute for Standards and Technology, see http://dlmf.nist.gov/
Much of A&S is devoted to tables that we don't need anymore, except to check the results computed by MATLAB and other mathematical software. But the algorithms we use in modern software for special mathematical functions are derived from techniques used in hand computation over 50 years ago. There are too few people today who know how to write robust software for computing these functions.
...
To appreciate the subtleties involved in the computation of special functions, consider a more basic and familiar function, sin(x). ...
The resulting algorithm for approximating sin(x), as well as other trig functions, exp(x) and log(x), is carefully implemented in fdlibm, a
Freely Distributable Math Library, developed at Sun Microsystems by K. C. Ng and others (see www.netlib.org/fdlibm ). We use fdlibm in MATLAB for all the machines we support. This ensures uniform behavior, particularly for exceptional conditions. ...
This approach to computing sin(x)... carries over to other mathematical functions. The general outline is:
- Transform the argument to one or more intervals where the function can be accurately approximated.
- Evaluate a polynomial or rational approximation for the transformed argument.
- If necessary, adjust the result to account for the initial argument transformation.
...
The algorithms that MATLAB uses for gamma functions, Bessel functions, error functions, Airy functions, and the like are based on Fortran codes written 20 or 30 years ago by W. J. Cody at Argonne National Labs and Don Amos of Sandia National Labs.
...
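The three-step outline quoted above (transform, approximate, adjust) can be sketched in Python. This is a toy using a truncated Taylor polynomial, nothing like fdlibm's carefully tuned minimax coefficients and extended-precision argument reduction:

```python
import math

def toy_sin(x):
    # Step 1: transform the argument -- reduce to r in [-pi/4, pi/4],
    # remembering which multiple of pi/2 we removed.
    k = round(x / (math.pi / 2))
    r = x - k * (math.pi / 2)
    # Step 2: polynomial approximation on the small interval
    # (truncated Taylor series here; real libraries use minimax fits).
    sin_r = r - r**3 / 6 + r**5 / 120 - r**7 / 5040
    cos_r = 1 - r**2 / 2 + r**4 / 24 - r**6 / 720
    # Step 3: undo the transformation: sin(r + k*pi/2) cycles through
    # sin r, cos r, -sin r, -cos r as k mod 4 = 0, 1, 2, 3.
    return (sin_r, cos_r, -sin_r, -cos_r)[k % 4]
```

Even this crude polynomial stays within about 1e-5 of math.sin for any argument, because the reduction step keeps |r| below pi/4 where the series converges fast.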
... each and every micro family has its insider snippets. Short and sneaky code sequences that do utterly amazing things. In ways previously unthunk of. As another insider snippet, we looked at a PIC generating a high quality sinewave in an astonishing six bytes of code back in HACK85.PDF.
          /|
     H   / |
        /  | Op
       /   |
      /____|
     T   A
Given a right triangle (one right angle, another angle T) with 3 sides: hypotenuse H opposite the right angle, opposite side Op opposite angle T, and adjacent side A between T and the right angle. Then we have:
cos T = A / H (plane geometry)
cos T = sin( A/r ) / sin( H/r ) (spherical geometry, where r = radius of earth)
Today 2004-03-20 I derived this from vector calculus ... In particular, let A be the arc length from the equator to the Tropic of Cancer ( A/r =~= 24° ), and let T be the current latitude (for Tulsa, T =~= 36°). Then at the top of a spherical or cylindrical sundial, to completely catch the full range of sun positions at sunset, we need to cover an angle H/r from due West (sunset at equinox) north-ish to sunset at summer solstice, and an angle H/r from due West south-ish to sunset at winter solstice (and symmetrically the same around due East at sunrise on the other side of the sundial).
that angle H/r is
H/r = arcsin( sin( A/r) / cos T ).
In Tulsa, that angle is H/r =~= arcsin( sin( 24° ) / cos( 36° ) ) =~= 30.2°.
What an astonishing coincidence that it's almost exactly 1/6 π. So I can cut the southern 1/3 out of my cylindrical sundial; the sun never casts a shadow in that direction (in Tulsa). [FIXME: sundial]
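The arithmetic above is easy to check. A small Python sketch of the formula H/r = arcsin( sin(A/r) / cos T ) (function name is mine; 24° and 36° are the rounded declination and latitude figures used above):

```python
import math

def sunset_swing_deg(declination_deg, latitude_deg):
    # H/r = arcsin( sin(A/r) / cos(T) ): angle of the sunset point
    # away from due west, for a given solar declination and latitude
    A = math.radians(declination_deg)
    T = math.radians(latitude_deg)
    return math.degrees(math.asin(math.sin(A) / math.cos(T)))
```

sunset_swing_deg(24, 36) gives about 30.2°, just over π/6 radians (30°); at the equinox (declination 0) the swing is 0, i.e. sunset due West, as expected.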
(Herf gives an example of point-by-point dividing 2 arrays c[] := a[] ./ b[] mentions that in double-precision, it's even better to do 3 at a time)
Suppose you want to compute both a/b and c/d. In the real number system, this is the same as ad/(bd) and cb/(bd). In case this isn't getting through, that's:
    inv := 1 / (b * d)
    a/b  = inv * a * d
    c/d  = inv * c * b
One (expensive) division buys both quotients.
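A sketch of that trick in Python (the function name is mine):

```python
def two_quotients_one_division(a, b, c, d):
    # a/b = (a*d) * inv  and  c/d = (c*b) * inv, where inv = 1/(b*d):
    # one division (expensive) replaces two, at the cost of
    # three extra multiplies (cheap).
    inv = 1.0 / (b * d)
    return a * d * inv, c * b * inv
```

In floating point the results can differ from a direct a/b and c/d in the last bit or so, since b*d is rounded before the reciprocal is taken.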
... Normalizing a vector invariably requires dividing by the magnitude -- that is, multiplying by the inverse square root of the sum of the squared components: y = inverse_sqrt( x ) = 1/(sqrt(x)) = x^(-1/2) . With that in mind, here is some fast code to compute the inverse square root ... function that returns the sine and cosine of a single value at once. ... sincos(theta) ... In a nutshell, the most useful application for the arctangent function is to convert between rectangular and polar coordinates. To be more specific, starting with the location (x, y), we want to find (theta, r) in polar coordinates. ... cycles are being thrown away here. The calculation of the arctangent automatically generates the square root as a by-product, so why not use it? atan2r() ... given (x,y), calculates (theta, r) all at once.
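The quoted article's actual code is elided above. As one well-known example of the genre (not necessarily the method that article uses), here is the classic bit-level inverse-square-root approximation, transliterated into Python. It assumes IEEE-754 single precision and x > 0; accuracy after one Newton step is roughly 0.2%.

```python
import struct

def fast_inv_sqrt(x):
    # well-known bit-level initial guess (magic constant 0x5F3759DF)
    # followed by one Newton-Raphson refinement step;
    # operates on the IEEE-754 single-precision bit pattern of x.
    i = struct.unpack('<I', struct.pack('<f', x))[0]
    i = 0x5F3759DF - (i >> 1)
    y = struct.unpack('<f', struct.pack('<I', i))[0]
    return y * (1.5 - 0.5 * x * y * y)
```

The shift and subtract produce a rough estimate of x^(-1/2) directly in the exponent/mantissa bits; the Newton step y*(1.5 - 0.5*x*y*y) then roughly quadruples the number of good digits.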
             1209 Hz   1336 Hz   1477 Hz   1633 Hz
    697 Hz      1         2         3         A
    770 Hz      4         5         6         B
    852 Hz      7         8         9         C
    941 Hz      *         0         #         D
-- http://en.wikipedia.org/wiki/DTMF
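Given that table, detecting a DTMF digit amounts to finding the strongest row tone and the strongest column tone. A Python sketch using the Goertzel single-bin DFT (my own illustration; the 8 kHz sample rate and 256-sample block are arbitrary choices):

```python
import math

DTMF_ROWS = [697, 770, 852, 941]      # Hz
DTMF_COLS = [1209, 1336, 1477, 1633]  # Hz
KEYS = ["123A", "456B", "789C", "*0#D"]

def dtmf_tone(key, rate=8000, n=256):
    # synthesize one key as the sum of its row tone and column tone
    r = next(i for i, row in enumerate(KEYS) if key in row)
    c = KEYS[r].index(key)
    f_row, f_col = DTMF_ROWS[r], DTMF_COLS[c]
    return [math.sin(2 * math.pi * f_row * t / rate) +
            math.sin(2 * math.pi * f_col * t / rate) for t in range(n)]

def goertzel_power(samples, freq, rate=8000):
    # squared magnitude of the DFT evaluated at freq (Goertzel recurrence)
    w = 2.0 * math.cos(2 * math.pi * freq / rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + w * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - w * s1 * s2

def detect(samples, rate=8000):
    # strongest row frequency picks the row, strongest column the column
    row = max(DTMF_ROWS, key=lambda f: goertzel_power(samples, f, rate))
    col = max(DTMF_COLS, key=lambda f: goertzel_power(samples, f, rate))
    return KEYS[DTMF_ROWS.index(row)][DTMF_COLS.index(col)]
```

Goertzel needs only one coefficient and two state variables per frequency, which is why it (rather than a full FFT) is the traditional choice for DTMF on small micros.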
In each bank, we have
0x00 ... : SFRs
0x10 ... : SFRs
0x20 ... : RAM
0x30 ... : RAM
0x40 ... : RAM
0x50 ... : RAM
0x60 ... : RAM
0x70 ... 0x7F : shared RAM
The biggest contiguously-addressable block is 96 bytes.
; offset of byte (0..95) in array is in "offset".
; Set up FSR so we can read and write that byte through INDF.
; array runs from array_start to array_start+95 in a single bank.
; array_start must be at least 0x20 to avoid hitting SFRs.
; maximum contiguous array starts at either 0x20, 0xA0, or 0x120.
array_start EQU 0x120
        ; set up bank select bit IRP
        BANKISEL array_start
        ; copy offset into W
        movf offset,w
        ; tweak to keep it in general RAM, avoid pointing to SFRs
        addlw array_start
        ; if offset is too big, might overflow buffer into SFRs -- oops!
        movwf FSR
        ; now ready to read or write the byte at that offset using INDF.
        ; 4 instructions, 4 cycles ...
What if we want an even bigger array?
With a little bit of bit-twiddling, we can split up an array into all 3 banks. The largest "simple to address" RAM array seems to be 3*64 bytes = 192 bytes (on chips that implement "bank 3", such as the PIC16F88, this method gives 4*64 = 256 bytes).
Given a byte index, the quickest way I know of to convert that index into a pointer and read from it is below. (The byte index goes from 0 to 255 on the PIC16F88, PIC16F877, and PIC16F876 ... on the PIC16F688, byte indexes 0 to 191 (0x00 to 0xBF) access real RAM; the rest access "unimplemented" locations and read zero.)
; offset of byte (0..255) (actually 0..191) in array is in "offset".
; Set up FSR so we can read and write that byte through INDF.
; (The array is stored in banks 0,2,1,(3) rather than consecutive 0,1,2,(3)
;  to make the code faster/smaller).
; array runs from array_start to array_start+0x3F in every bank.
; array_start must be at least 0x20 to avoid hitting SFRs.
; array_start must be at most 0x30 to avoid hitting the shared RAM.
array_start EQU 0x20 ; any number from 0x20 to 0x30 is OK.
        ; copy bit 6 of offset to bank select bit IRP
        bcf STATUS, IRP
        btfsc offset, 6
        bsf STATUS, IRP
        ; copy rest of offset, except for bit 6, into W
        movlw 0xBF ; ~(1 << 6)
        andwf offset,w
        ; (If offset too big, we end up pointing into
        ;  "unimplemented -- reads as zero" in "bank 4". Harmless.)
        ; tweak to keep it in general RAM, avoid pointing to SFRs
        addlw array_start
        movwf FSR
        ; now ready to read or write the byte at that offset using INDF.
        ; 7 instructions, 7 cycles ...
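The bank-interleaving arithmetic above can be sanity-checked on a PC. This Python model (my own check, not part of the original code) mirrors the instruction sequence -- IRP gets bit 6 of the offset, FSR gets the remaining bits plus array_start -- and verifies that all 256 offsets map to distinct addresses outside the SFR regions:

```python
ARRAY_START = 0x20  # same constraint as the assembly: 0x20..0x30

def fsr_for_offset(offset):
    # mirror the PIC code: IRP <- bit 6 of offset,
    # FSR <- (offset & ~0x40) + array_start
    irp = (offset >> 6) & 1
    fsr = ((offset & 0xBF) + ARRAY_START) & 0xFF
    return (irp << 8) | fsr  # 9-bit indirect address IRP:FSR

addresses = [fsr_for_offset(i) for i in range(256)]
# every offset must map to a unique address
assert len(set(addresses)) == 256
# no address may land in the SFR region (0x00-0x1F of each 0x80-byte bank)
assert all((a & 0x7F) >= 0x20 for a in addresses)
```

Running it shows offsets 0x00-0x3F land in bank 0, 0x40-0x7F in bank 2, 0x80-0xBF in bank 1, and 0xC0-0xFF in bank 3, matching the "banks 0,2,1,(3)" comment.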
If you want an array bigger than 192 bytes, I suspect you'll want to upgrade to a processor bigger than the 'F688, instead of fiddling around with more complex ways to fill up the 256 bytes of RAM.
; offset of byte (0..0x7F)(0..127) in array is in "offset".
; Set up FSR so we can read and write that byte through INDF.
; array runs from
;   array_start to array_start+0x3F in bank 0,
;   array_start+0x80 to array_start+0x80+0x3F in bank 1.
; array_start must be at least 0x20 to avoid hitting SFRs.
; array_start must be at most 0x30 to avoid hitting the shared RAM.
array_start EQU 0x20 ; any number from 0x20 to 0x30 is OK.
        ; force bank 0 or 1
        BANKISEL array_start
        bcf STATUS, IRP
        movf offset,w
        addlw array_start
        ; if too big to fit in bank 0, fit "second half" into bank 1
        btfsc offset,6
        addlw 0x40 ; skip over end of bank 0 and SFRs in bank 1
        movwf FSR
        ; now ready to read or write the byte at that offset using INDF.
        ; 6 instructions, 6 cycles ...
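The same kind of PC-side check works for this two-bank version (my own model, not part of the original code): FSR = offset + array_start, plus an extra 0x40 when bit 6 of the offset is set, with IRP held at 0 since only banks 0 and 1 are used.

```python
ARRAY_START = 0x20  # same constraint as the assembly: 0x20..0x30

def fsr_two_bank(offset):
    # mirror the PIC code: FSR <- offset + array_start, plus 0x40 more
    # when bit 6 of offset is set (skips bank-0 tail and bank-1 SFRs)
    fsr = (offset + ARRAY_START) & 0xFF
    if offset & 0x40:
        fsr = (fsr + 0x40) & 0xFF
    return fsr  # IRP = 0 throughout (banks 0 and 1 only)

addrs = [fsr_two_bank(i) for i in range(128)]
# all 128 offsets map to unique, non-SFR addresses
assert len(set(addrs)) == 128
assert all((a & 0x7F) >= 0x20 for a in addrs)
```

Offsets 0x00-0x3F land at 0x20-0x5F (bank 0) and offsets 0x40-0x7F at 0xA0-0xDF (bank 1), exactly the two ranges the header comment describes.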
2002-07-25:DAV: David Cary started.
errors, bug reports to
Return to index // end http://david.carybros.com/html/1d_design.html