584 Pages

## C++ version Edit

Hi Pce3. I've done a direct translation of your VB code into C++ code, for efficiency and portability, and comparison to other implementations written in C and C++. It compiles and runs. Thought you might find it helpful. I also list the results I got after compiling with g++ with maximum optimization on and running.

I have a copy of Microsoft C++ Version 6 Introductory Edition which will not compile this code. It seems to have a problem with what it called "a redefinition of k." Once k was removed from the include statement that error went away but then I got other warnings that 1.) main should return a value and 2.) n is an undefined local variable. I probably just need to upgrade to a standard version of C++. I knew the completion time in C++ would be unreal and figured no one would believe even the Visual Basic time. Guess few people ever immagined that a sort routine could be so simple yet so fast so no one ever actually considerd testing the possibility including Seward. It takes so little time to check all of the address space for a non-zero number that other sorts which appear to be highly efficient in theory are really victims of the power of brute force and just don't know what they are missing. Thanks for the code anyway. Maybe I can find a standard version of C++ on eBay where this introductory version came from for what I though was a reasonable price. imho 02:05, 8 June 2006 (UTC)
Yeah, VC++6 had problems with the scope of for loop initializers - I should probably avoid that construct. I've fixed the program to make it acceptable to that compiler. I must admit, this variation is especially useful for simultaneous sorting and duplicate elimination in a list of limited range - this has application in areas such as the database operation UNION. Deco 06:41, 8 June 2006 (UTC)
Your changes eliminated the "k" issue but the MSVC++v6 compiler still does not like what is calls an "unknown" character of value "0xff". I have also replaced "botnum" with "j" in the following line of code:
cout << "1.) Total number of items sorted:...................... " << j << endl;
and deleted the following line of code:
"const long elements = botnum;"
since it was only used in another configuration, is only an unseen carry over and is now extraneous. imho 08:05, 8 June 2006 (UTC)
• With the following line removed the compile is clean, however, execution still hangs over the fact that this is only an introductory version of MSC++ so I will definitely need to get another compiler.
cout << "4.) Total length of key:............................... " << (k == 0 ? 0 : (int)ceil(log((double)k)/log(10.0))) << " decimal digits." << endl;
imho 08:16, 8 June 2006 (UTC)

After downloading a copy of MS C++ 2005 Express I did a command line compile that was clean but the executable still brings up the debug window which does not call up the debugger in 2005. Looks like I'll have to try something else. Any suggestions? imho 03:01, 9 June 2006 (UTC)

• It appears that the errors were due mostly to the difficulty of MS C++ v6 and MX C++ Express 2005 to increment reallocate memory withing the loop so I have moved reallocation outside the loop and when I get more time will put a copy back inside the loop and test whether it first needs to be reallocated before being reallocated within the loop. Other minor changes were to designate constant literals. Code should now run on whatever. Thanks. imho 16:20, 12 June 2006 (UTC)
• Windows does not like reallocation within the loop so leaving it outside as a pre-allocation at maximum size. Leaving comments in and resetting assignments. imho 16:54, 12 June 2006 (UTC)

### checksort.cpp Edit

#include <iostream>
#include <fstream>
#include <time.h>
#include <stdlib.h>
#include <math.h>

using namespace std;

inline double Timer() { return ((double)clock())/CLOCKS_PER_SEC; }
inline double Rnd()   { return ((double)rand())/RAND_MAX; }

int main() {
cout << "Please wait for the RESULTS: (Check sort) is working... " << endl;
long* a = 0L;
long* b = 0L;
double start1, start, finish2, finish1, finish;
const int one = 1L;
const int zero = 0L;
const long maxi = 10000000L;
const long topnum = maxi;
const long botnum = one;
long i = 0L, n = 0L, j = 0L, k = maxi;
ofstream out("checksort.txt");
out << endl;
out.flush();
//	k = maxi;
delete[] a;
delete[] b;
a = new long[k];
b = new long[k];
start1 = Timer();
for(k=botnum; k <= topnum; k += topnum/100) {
start = Timer();
//	delete[] a;
//	delete[] b;
//	a = new long[k]; // Windows does not like reallocation within the loop
//	b = new long[k];
for (i=botnum; i <= k; i++) {
n = ((int)(k * Rnd())) + botnum;
a[n] = one; // (check sort is a(n) = one)
}
finish1 = Timer() - start;
start = Timer();
j = 0L;
for (n = botnum; n <= k; n++) {
if (a[n] > zero) { b[j] = n; j = j + one; }
}
finish2 = Timer() - start;
finish = Timer() - start1;
out << k << "\t" << finish1 << "\t" << finish2 << endl;
out.flush();
}
out.close();
cout << endl;
cout << endl;
cout << "1.) Total number of items sorted:...................... " << j << endl;
cout << "2.) Total primary array size to accomodate the sort:... " << maxi << endl;
cout << "3.) Total secondary array size to accomodate the sort:. " << j << endl;
cout << "4.) Total length of key:............................... " << (k == 0 ? 0 : (int)ceil(log((double)k)/log(10.0))) << " decimal digits." << endl;
cout << "5.) Total time for setup per 100,000 items:............ " << finish1 << " seconds." << endl;
cout << "6.) Total time for sort per 100,000 items:............. " << finish2 << " seconds." << endl;
cout << "7.) Total time for program:............................ " << finish << " seconds." << endl;
cout << endl;
cout << endl;
/*
for (i=1; i<=j; i++) {
cout << i << b[i] << endl;
}
*/
return 0;
}


### Console output Edit



RESULTS: (Check sort)

1.) Total number of items sorted:...................... 1542936
2.) Total primary array size to accomodate the sort:... 10000000
3.) Total secondary array size to accomodate the sort:. 1542936
4.) Total length of key:............................... 8 decimal digits.
5.) Total time for setup per 100,000 items:............ 1.109 seconds.
6.) Total time for sort per 100,000 items:............. 0.078 seconds.
7.) Total time for program:............................ 59.14 seconds.

Press any key to continue



### checksort.txt Edit


1	0	0
100001	0.015	0
200001	0	0.016
300001	0.016	0
400001	0.046	0
500001	0.047	0
600001	0.062	0
700001	0.079	0
800001	0.078	0.015
900001	0.094	0
1000001	0.109	0.016
1100001	0.109	0.016
1200001	0.125	0.016
1300001	0.14	0
1400001	0.157	0.015
1500001	0.156	0.016
1600001	0.172	0.016
1700001	0.187	0
1800001	0.203	0.016
1900001	0.203	0.016
2000001	0.218	0.016
2100001	0.234	0.016
2200001	0.234	0.016
2300001	0.266	0.015
2400001	0.266	0.015
2500001	0.266	0.016
2600001	0.297	0.015
2700001	0.297	0.016
2800001	0.312	0.016
2900001	0.328	0.016
3000001	0.343	0.016
3100001	0.344	0.015
3200001	0.36	0.015
3300001	0.375	0.016
3400001	0.375	0.015
3500001	0.391	0.031
3600001	0.391	0.031
3700001	0.406	0.031
3800001	0.422	0.016
3900001	0.437	0.032
4000001	0.453	0.015
4100001	0.469	0.031
4200001	0.454	0.031
4300001	0.484	0.031
4400001	0.485	0.031
4500001	0.5	0.031
4600001	0.516	0.031
4700001	0.531	0.032
4800001	0.531	0.031
4900001	0.547	0.047
5000001	0.547	0.031
5100001	0.578	0.031
5200001	0.594	0.031
5300001	0.594	0.031
5400001	0.61	0.031
5500001	0.625	0.031
5600001	0.625	0.047
5700001	0.641	0.031
5800001	0.656	0.032
5900001	0.671	0.032
6000001	0.672	0.046
6100001	0.719	0.047
6200001	0.703	0.031
6300001	0.704	0.046
6400001	0.719	0.047
6500001	0.734	0.047
6600001	0.735	0.047
6700001	0.75	0.046
6800001	0.766	0.047
6900001	0.781	0.047
7000001	0.797	0.047
7100001	0.797	0.047
7200001	0.812	0.047
7300001	0.828	0.047
7400001	0.844	0.047
7500001	0.843	0.047
7600001	0.86	0.046
7700001	0.875	0.047
7800001	0.891	0.047
7900001	0.89	0.063
8000001	0.891	0.062
8100001	0.906	0.063
8200001	0.922	0.062
8300001	0.922	0.063
8400001	0.953	0.062
8500001	0.953	0.063
8600001	0.969	0.046
8700001	0.985	0.062
8800001	1	0.063
8900001	1	0.062
9000001	1.016	0.062
9100001	1.188	0.062
9200001	1.047	0.063
9300001	1.047	0.062
9400001	1.063	0.062
9500001	1.078	0.063
9600001	1.094	0.062
9700001	1.094	0.062
9800001	1.125	0.063
9900001	1.109	0.078



## tally sort vs check sort Edit

This article claims that "neither the Check sort or the Rapid sort should be confused with the Counting sort".

Alas, I am confused. What is the difference between this "Check sort" and the tally sort? What is the difference between this "Rapid sort" and the counting sort?

The only significant difference I can see in the code is in how the min and max value expected in the input is determined.

The implementation I see at Wikipedia of counting sort includes a first pass of reading through all input values to find the max and min input value. The code I see here for the Check sort skips that first pass, and guesses that the min input value is at least 1, and the max input value is at most 10000000.

After setting up the min and max, I see no significant difference between this implementation of check sort and that implementation of tally sort.

There is one teensy difference: this implementation of check sort has a buffer overflow if any input value is outside the guessed range.

Is there some other significant difference that I'm overlooking?

--DavidCary 17:00, 12 September 2008 (UTC)

Note the date of this post since it is after the date of the posts below...
In regard to the difference between the counting sort (not the recently added "Tally" sort. Find an older (prior to 1996) printed description of the counting sort and post it here for me to critique. My recollection is that in addition to the need to find minimum and maximum values there were other differences. However, in terms of increasing speed, a search for minimum and maximum values in the unsorted data is an unnecessary waste of time. All other things being the same between a counting sort and the Check sort or the Rapid sort, including the limit imposed by the amount of available memory to create an array, the difference is in faster completion of sorting the data which is afforded by dropping the step of finding the minimum and maximum values in the unsorted data. If there are no other differences between the counting sort and the Check sort or the rapid sort then I would agree that they are variations of the counting sort. The name "Tally" sort, however, is not used by Bentley nor any association with the counting sort stated and so it sounds like an arbitrary name based on Bentley's description of the Check sort since he provided no other reference and the Check sort was first published here in 1996. imho 17:17, 14 September 2008 (UTC)

### National Institute of Standards and TechnologyEdit

The National Institute of Standards and Technology lists the original description of both the Rapid sort and the Counting sort.

Some of the most recent online descriptions of the Counting sort, however, seem to be using the description of the Rapid sort. This may be the point of confusion. imho 22:33, 14 September 2008 (UTC)

### earlier repliesEdit

I don't know. I dare to speculate. When I originally published the Check sort in the Wikipedia I was not really aware of what the concept of original research meant. I was really angry when they wanted to delete it. So much so in fact being angry got my user name banned. I have a little better understanding but the claim that the Wikipedia is an online encyclopedia that anyone can edit still has me a little confused. I thought the concept of original research applied only to things that were not self-evident. Anyway that is why it is here. It was banned from the Wikipedia as original research and ironically I think it was User:Deco who sponsored the nomination for deletion.
I was hoping to convince him to do a peer review so that I might one day be able to return it to the Wikipedia. Ironically he did in fact help me with it here by converting the Visual Basic code to C++ so I could see just how much faster it would run under c++. That study is published here as well. I dare think that he came up with the name "tally" and decided that instead of telling me about it or doing a peer review that he would just claim it as a previously unknown version of the counting sort. I've called it the "Simple" sort but no matter what you call it, it still does the same thing.
I thought his user name change was a bit odd but dismissed it because there is a sports figure named DECO. However, though recently instead of agreeing to do a peer review he did a 180 on me. I figured that he was having some kind of personal problem so I decided not to contact him again and to leave him alone. He was really upset that I called him Deco instead of by his new user name User:Dcoetzee. Hi Deco
The sort looks like it was posted in the counting sort article by an IP address instead of a user name. I'm sad and upset about this. User:Deco must be going through some awful time in his life. The C++ code he wrote was extremely beneficial and helpful to me in providing verification of just how fast a sort it is, despite the fact that it has to go sequentially through the entire array checking each and every indexed variable in the second step to see if there are any content values greater than zero.
Dang. I hate this worse than anything. I want to help User:Deco out of whatever problem he is having instead of going to the Wikipedia administration until he has had time to deal with this himself. Do you have any suggestions how to proceed? imho 01:40, 13 September 2008 (UTC)

Okay, I have found "tally" sort by doing a Google search here referenced to "Programming Pearls" by Jon Louis Bentley. I'll have to find a copy and look through it to see if the method it describes for a tally sort is the same as for the Check sort or if it is different. His first publication was in 1976. I was at Chapel Hill the summer of Woodstock (1969) taking a course in Chemistry but obtained an Associates Degree in Science in Business Applications Programming elsewhere in 1972 before failing to find a professor interested in multi-valued logical equation reduction. (Dr. Kleene may have been who I was looking for) Anyway this (1972) is when I began researching logic which resulted in this. My first recall of creating the Check sort is not earlier than 1989 and most likely by at least 1996 when I published it on the web and claimed it was the World's fastest sort routine. At any rate Bentley's book deserves a look-see. The Rapid sort is ideal for use in counting multiset clusters so it is quite possible that I may not have been the first person to use or to discover it. (Incidentally the last name of a programmer at the school I graduated from was Tally. He left and went to work for Disney World ~ 1972.) imho 02:56, 13 September 2008 (UTC)

As noted below although the year 2000, second edition, online publication contains the sort routine is does not call it by any name, much less calling it the "Tally" sort. imho 23:04, 13 September 2008 (UTC)

### "Programming Pearls"Edit

Okay, found an online copy here. On page six Bentley does in fact describe what appears to be the identical of the Check sort. The key to understanding this is what he calls Phase 3 where the index of the array is printed out based on the contents of the indexed variable. I'm still looking for the date of the second edition and whether or not the algorithm is contained in the first addition as well as it's date of publication. imho 03:39, 13 September 2008 (UTC)

Okay, date of publication appears to be 2000.
Programming Pearls
By Jon Louis Bentley, Jon Bentley
ISBN 0201657880, 9780201657883
239 pages
The last date of modification of my essay, "Chicken and the Egg - Which Came First?" is October 28, 1998, 7:39:50 PM, on my GTE user site named PatShaw. I do not know if this date can be verified by Verizon.
I have found the html web page for the Rapid sort routine. It has a copyright date of 1996. imho 04:04, 13 September 2008 (UTC)

### rapid.html c 1996Edit

My thoughts on this...

What I was talking about is that my original purpose was to minimize code in hopes of improving speed and this is what actually happened like for instance eliminating a loop cycle if two arrays could be used instead of one and it is this that resulted in finally ending up with the World's fastest sort routine. I remember doing this in 1995 or 1996 because of the house I was living in at the time. I remember that I had finally reduced the code to the point that it could not be reduced anymore and this is what resulted in the Rapid sort routine not some scientifically thought out approach or need like sorting sets or multisets or even counting multiset clusters. My "thing" was logical equation reduction and I applied it to the Shell-Metzner sort with the idea of sacrificing memory and cycles to gain speed. Think about it... all of the cool sorts around center on the idea of cycling through an array many times to keep memory requirements light and therefore sacrifice time to save memory. I figured, humm... why not sacrifice memory and cycles to save time?

Okay, so in any event what is important here are the dates of publication. Bentley's date of publication is 2000 and mine is 1996. Verizon should be able to verify this. Another point is that Bentley could have come up with this idea on his own but I don't think so even though his presentation sounds much more computer scientific than mine. I think he took my idea and then made it sound scientific so as to claim it for his own. However, what betrays it not being his own is 1.) the title of his book. I mean, "Programming Pearls." and 2.) notice how cleverly he left out the name "tally" sort. I mean where did that name come from? Certainly not from him. Its not in his book. He just failed to give me credit because I published the method online and he could not find it anywhere already published in ink. imho 05:30, 13 September 2008 (UTC)

### original unenumerated editionEdit

Okay, I have found an original unenumerated edition of "Programming Pearls" published in 1986 through my local library that should be here by next week which I can examine to see if it contains the Check sort routine. If it does then I am not the first person to discover, recognize or to develop this sort method. The next edition is called the first edition and was not published until 1999 followed by the second edition in 2000 which my copyrighted online publication proceeds by 3 or 4 years. I'll post what I find here as soon as I've had a look at the original edition that is an unenumerated edition. imho 23:14, 13 September 2008 (UTC)

In fact since the Check sort and the Rapid sort serve so perfectly to count and sort set and multisets as in this example count multisets I doubted that this method had not previously been discovered but I was not able to find one. Therefore I still seek to find any reference with a publication date prior to 1996 even though I, myself may have discovered this method when first using computers to study logic as early as 1963 or especially later during modification of the Harvard Chart Method starting in 1978 to see if I could get it to reduce multiple states (as in multi-valued logic) as noted here. imho 11:15, 14 September 2008 (UTC)

## check sort vs counting sort Edit

I am not familiar with the term "class L sort" -- could someone please link that term when it first appears in this article to some page that defines it?

DONE. Class refers to the type of performance derived from the plot of the ratio of time in seconds to number of items sorted. If the ratio is constant the curve will be straight or linear, hence a Class L sort routine is one with a linear performance curve. imho 17:17, 18 September 2008 (UTC)

A brief mention of the asymptotic behavior (using wikipedia: big O notation) would be a nice addition to this article.

The Class L notation now includes the following: Where $f(n) = \frac{t}{n}$ or f of n is theta g of n in Big O notation. Also I have added links on the first page to the performance analysis pages where use of class notation versus Big O notation is explained in more detail. imho 17:56, 18 September 2008 (UTC)

After reading NIST: "rapid sort" and NIST: "counting sort", I think I now understand the difference between them.

The main functional difference is that this "check sort"/"rapid sort" "only sorts keys. In other words, items to be sorted consist solely of the key; there is no additional data in items." On the other hand, the NIST "counting sort" can sort items that, in addition to the key, also have arbitrary data associated with each item.

As far as I can tell, the "Check sort" described on this page is the same as the NIST "rapid sort", and is therefore different from the NIST "counting sort".

I see no reason to doubt imho's claim that he invented the "check sort"/"rapid sort" algorithm. But I also suppose it is possible that more than one person independently discovered this sort.

My perspective is that most endeavors like developing a sort routine are highly focused on the application. Therefore a method such as the counting sort who's creator's goal was to sort alphanumeric data may have not recognized the potential of the key and the data being one since alphanumeric data, when converted to a numeric data/key, requires a 26 base word and requires a memory capacity of $26^6 = 308,915,776$ integers for only 7 places. Because my goal was to maximize speed the data/key offered an advantage rather than a disadvantage and from that point I had no desire to reduce speed by assigning alphanumeric data to the key, sorting the key, and then re-associating the data with the key. However, since Check/Rapid is so applicable to sorting sets/multisets it is a wonderment that this application did not spur the creation of the Check/Rapid sort method in the days when set theory development was the rage. imho 17:17, 18 September 2008 (UTC)

Should this "article" mention closely related sorting algorithms -- perhaps even "METHOD AND SYSTEM FOR SORTING WITHOUT COMPARATOR (1971) -- and briefly describe what distinguishes this method from those other methods?

--DavidCary 03:54, 18 September 2008 (UTC)

### Robbins' patentEdit

Okay, looking at the patent description image drawings and text it is clear that Robbins' April 6, 1954 patent number 2674733 is a hardware version of the radix sort in which the least significant binary digit is used to separate the elements into groups respective of their binary values, followed by use of the next most significant digit within each group to create sub-groups and so on for each digit in the element. Dirks, et al. January 30, 1973 Patent number 3,714,634 appears to perform the same radix sort covered in Robbins' earlier patent.
The hardware method I call the "Instant sort" simply treats the data as an address and routes the data to the address bus of a memory chip and then increments whatever value is stored in the location of memory, which is addressed by the data, as the first step. To retrieve the data in sorted order requires only that the memory contents be checked for a value greater than zero as the memory addresses are incremented starting from zero and listing or printing the memory address whenever a value in memory is found to be greater than zero. The count at the memory location is then decremented and this sub-step repeated until the memory contents are zero, and then continuing the sequential stepping through memory until all memory locations have been queried for values greater than zero. imho 11:50, 19 September 2008 (UTC)
However, make note that even though the patent title claims the method to be a "METHOD AND SYSTEM FOR SORTING WITHOUT COMPARATOR (1971) that the output from an AND or an OR gate which it relies upon does in effect compare the differences in input line values or states to determine the resulting output just as a person in the dark compares their memory of light and darkness to the current condition of lightness or darkness to determine whether it is light or dark. The comparison of states is simply hardwired into the logic gate.imho 18:53, 19 September 2008 (UTC) --->

### Programming Pearls (1986)Edit

Okay, I just picked up the 1986 addition of "Programming Pearls" from the local library inter-library loan system and both the date of publication and the dates of prior check outs confirms the sort routine I named the "Check sort" was indeed published prior to my publication in 1996. However, Bentley provides no name or reference for the sort routine he shows on page six but mentions that it came from his "Programming Pearls" column in the Communications of the Association for Computing Machinery with publication history in the introductions to Parts I,II and III. While many publications may be viewed online some are not. Trying to pin down whether a method to count multisets has been previously described may take a much longer period of time than anticipated so an answer to this question is not expected to come soon. Bentley's approach is to use a string of 27,000 bits with the position of each bit representing the presence or absence of an integer in the file equal to the positional count of the bit in the string. A bit is "on" only if an integer is in the file and "off" if it is not. The integer apparently represents a voting district. He states that the input is from a small range, that it has no duplicates and no data is associated with the record beyond the single integer. This is followed by an example which uses an integer array named "Bit" and the sorting method is identical to the Check sort with the exception that the Bit array is cleared by loading it with zeros instead of using a "Dim" statement or a "Redim" statement. I concede that the Check sort is therefore not the first implementation of the method to be described. However, this leaves the Rapid sort which uses a byte or word array rather than a bit array so as to be able to count duplicates which allows the method to be used to count multisets, again with no data associated with the record beyond the single integer as also distinguishes the counting sort from either of these methods. What of course must be done here is to pin down the dates of each development precisely so that a determination can be made as to which method is an actual variation of another method versus being simply a logical variation,i.e., to determine if development of the counting sort proceeded the other methods. An interesting side note is that in Column 8 (chapter 8) where Bentley describes another "Pearl", he uses an equals sign to show equivalence rather than accumulation. Although the formula looks identical to the multiset counting function it is clearly not. However, I am quite happy nonetheless to have found the perfect application for the Rapid sort routine in counting multisets as required by the method of Optimal Classification such that the benefit of this application makes first description for any other application interesting, but of not consequence, to say the least. imho 22:51, 19 September 2008 (UTC)

### Seward, Harold H. 1954 MIT Master's ThesisEdit

It appears that Seward claims to be the originator of both the Counting and the Radix sort. Since Robbin's hardware version of the Radix sort was patented in April 6, 1954[1] without mention of Seward I'm curious if I can pin down the month of publication of Seward's 1954 MIT Master's Thesis and determine if it does or does not describe the Counting sort or any variation that might duplicate the Rapid sort function. imho 00:08, 24 September 2008 (UTC)

#### Date FoundEdit

Information was found online at the:

Computer History Museum
1401 N Shoreline Blvd.
Mountain View, CA 94043
 Details Type Text Title Project Whirlwind R-232 Information sorting in the Application of Electronic Digital Computers to Business Operations Author Seward, Harold H. Publisher Massachusetts Institute of Technology (MIT) Electronic Systems Laboratory Date 5/24/1954 Category Technical report Collection Title MIT Computing Projects Series Title Hardware: Technical Reports/Notes - Whirlwind Accession Number 102652827

Since Seward claims to have originated both the Radix and the Counting sort yet his dates of publication come after Robbins' more information is needed to determine if a hardware version of the Counting sort was also produced by Robbins. imho 00:48, 24 September 2008 (UTC)

Donald E Knuth[2] says the Distribution Counting sort was first mentioned in print in 1956 by E.H. Friend but was first developed by Seward in 1954 for use with his Radix sort and was also published under the name "Mathsort" by W. Feurzig in 1960. He makes no mention of Robbin's hardware version and patent of the Radix sort filed in 1952. My guess is that depending on the purpose and application anyone might derive this sort and claim themselves as the originator without even osmosis being at play. Knuth eludes to the fact that it might have been around long before it was published and since even Hollerith had a sort machine it might have been known by him. imho 01:12, 2 October 2008 (UTC)

### notesEdit

1. Patent number 2674733
2. "Sorting and Searching" Vol. 3, Page 79, ISBN 0-201-03803-x, LCCC # 67-26020