The recently discovered NoChop algorithm, with its STree (sparse-tree) data-structure, uniquely enables this logic-chain:

Goal: Sort, retrieve and maintain data significantly faster and more efficiently than is currently possible
Problem: A speed-limit exists, controlled by n log(n), where n is the number of keys
Solution: Break this speed-limit, and set a new one controlled by (n/8) log(n)
Mechanism: Exploit hidden structure in the keys to achieve this 8-fold improvement

Breaking the sound-barrier is analogous to breaking the n log(n) limit. A new engine and airframe were needed to break the sound-barrier. Similarly, a new algorithm (NoChop) and data-structure (STree) are needed to break the n log(n) barrier.

The critical understanding to take from the accompanying diagram is that legacy algorithms (BTree etc.) all have a maximum of 2 branches per processing-node, compared to NoChop's 256. This is overhead-free (no "chop" or partitioning is needed) and results in up to 8 times fewer processing-levels (2^8 = 256, the number of possible 8-bit character codes), with a corresponding 8-fold increase in the speed-limit.
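The level arithmetic can be sanity-checked directly (a sketch only, not the NoChop code): a 256-branch node consumes a full byte of the key per level, where a binary node consumes one bit, so tree depth shrinks by roughly log2(256) = 8.

```python
import math

def levels(n, fanout):
    """Tree depth needed to distinguish n keys at a given branching factor."""
    return math.ceil(math.log(n, fanout))

n = 1_000_000
print(levels(n, 2))    # binary node: 1 bit of the key per level
print(levels(n, 256))  # 256-way node: 8 bits (one full byte) per level
```

With ceilings applied, the observed ratio for a given n can be a little under 8 (here 20 levels versus 3); the 8-fold figure is the limit as n grows.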

A complementary way of understanding the NoChop algorithm is that it first builds an index (the STree) of a set of records in super-fast time. For sorting records it then reads the entire index in ascending order to deliver the sorted records, again in super-fast time. For retrieving and maintaining data it only needs to read a single index entry.
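The "build an index, then read it in order" idea can be modelled with an ordinary 256-way trie. This is an illustrative sketch under that assumption, not the published NoChop/STree code (the name trie_sort is invented here):

```python
def trie_sort(keys):
    """Build a 256-way index (one child per possible byte value), then walk
    it in ascending byte order to emit the keys sorted."""
    END = object()            # marker: a key terminates at this node
    root = {}
    for key in keys:
        node = root
        for byte in key.encode("latin-1"):
            node = node.setdefault(byte, {})
        node[END] = node.get(END, 0) + 1      # count duplicate keys

    out = []
    def walk(node, prefix):
        for _ in range(node.get(END, 0)):     # keys ending here come first
            out.append(prefix.decode("latin-1"))
        for byte in sorted(k for k in node if k is not END):
            walk(node[byte], prefix + bytes([byte]))
    walk(root, b"")
    return out
```

Each key is inserted by following at most one pointer per character, and the final read-off visits each index entry once, which is the two-phase shape the paragraph describes.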

STree stores nodes as row-pointers in a sparse-matrix of n+ rows and 256 columns, and branches (or "junk-DNA") in a string-array of n+ rows. This "junk-DNA" usually comes at the end of keys (e.g. "ffalo" in "Buffalo"), but can occur anywhere (e.g. "est " in "West Carson" and "West Chester"). If a row-pointer is non-zero its column-position represents the ascii code concerned (e.g. a 123 in column 66 means "B"s are located in row 123, and a 4567 in column 117 means "u"s are located in row 4567).
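A minimal reconstruction of that layout, filling in details the prose leaves open (how key-terminations are flagged, and how a stored tail is split when a new key diverges inside it); the class and method names are invented here and this is not the published implementation:

```python
class STree:
    """Sparse matrix of row-pointers (rows x 256 columns) plus a parallel
    string-array of unbranched key tails ("junk-DNA")."""

    def __init__(self):
        self.rows = [[0] * 256]   # row 0 is the root; 0 means "no branch"
        self.junk = [""]          # unbranched tail stored alongside each row
        self.ends = [False]       # does a key terminate at this row?

    def _new_row(self, tail, is_end):
        self.rows.append([0] * 256)
        self.junk.append(tail)
        self.ends.append(is_end)
        return len(self.rows) - 1

    def insert(self, key):
        row, rest = 0, key
        while True:
            if not rest:
                self.ends[row] = True
                return
            ptr = self.rows[row][ord(rest[0])]   # column = character code
            if ptr == 0:
                # no branch yet: park the whole remaining tail as junk-DNA
                self.rows[row][ord(rest[0])] = self._new_row(rest[1:], True)
                return
            row, rest = ptr, rest[1:]
            tail = self.junk[row]
            p = 0                                # shared prefix of tail/rest
            while p < len(tail) and p < len(rest) and tail[p] == rest[p]:
                p += 1
            if p < len(tail):
                # keys diverge inside the stored tail: split it, pushing the
                # old tail's remainder (and this row's branches) down a level
                child = self._new_row(tail[p + 1:], self.ends[row])
                self.rows[child] = self.rows[row]
                self.rows[row] = [0] * 256
                self.rows[row][ord(tail[p])] = child
                self.junk[row] = tail[:p]
                self.ends[row] = False
            rest = rest[p:]

    def contains(self, key):
        row, rest = 0, key
        while True:
            if not rest:
                return self.ends[row]
            ptr = self.rows[row][ord(rest[0])]
            if ptr == 0:
                return False
            row, rest = ptr, rest[1:]
            tail = self.junk[row]
            if not rest.startswith(tail):
                return False
            rest = rest[len(tail):]
```

Inserting "Buffalo" into an empty tree creates one non-zero pointer in row 0 (in the "B" column) and stores "uffalo" as that row's junk-DNA, matching the example in the text.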

This STree dual data-structure (unlike hashing) enables searching for both exact record matches (e.g. = "Buffalo") and next/previous record matches (e.g. > "Buffalo", >= "Buffalo", < "Buffalo" and <= "Buffalo").
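The five match types can be expressed against any ordered index. In the sketch below a plain sorted list stands in for an in-order STree walk; a hash table could answer only the "=" case, since hashing destroys key order:

```python
from bisect import bisect_left, bisect_right

keys = sorted(["Akron", "Buffalo", "West Carson", "West Chester"])

def match_eq(k):                       # = k
    i = bisect_left(keys, k)
    return keys[i] if i < len(keys) and keys[i] == k else None

def match_gt(k):                       # > k  (next record)
    i = bisect_right(keys, k)
    return keys[i] if i < len(keys) else None

def match_ge(k):                       # >= k
    i = bisect_left(keys, k)
    return keys[i] if i < len(keys) else None

def match_lt(k):                       # < k  (previous record)
    i = bisect_left(keys, k)
    return keys[i - 1] if i > 0 else None

def match_le(k):                       # <= k
    i = bisect_right(keys, k)
    return keys[i - 1] if i > 0 else None
```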

As every algorithm has a different processing-overhead at each level, the best comparison method is to "race" them (i.e. follow an empirical approach to reinforce the theoretical one). This is exactly what the subsequent web-pages explore, with NoChop showing a speed advantage of +50% over QuickSort, +350% over BinaryChop and +100% over BTree.
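A minimal racing harness along those lines (the actual comparisons are in the downloadable VBA materials; this Python sketch only illustrates the empirical method, and the contender names and key set are placeholders):

```python
import random
import string
import timeit

def race(contenders, keys, repeats=5):
    """Time each contender on identical copies of the same key list, taking
    the best of several runs to average-out background processing."""
    return {name: min(timeit.repeat(lambda f=fn: f(list(keys)),
                                    number=1, repeat=repeats))
            for name, fn in contenders.items()}

random.seed(1)
keys = ["".join(random.choices(string.ascii_letters, k=8))
        for _ in range(10_000)]
times = race({"builtin-sort": sorted}, keys)
print(times)
```

Each contender receives its own fresh copy of the key list, so an algorithm that sorts in place cannot hand a pre-sorted list to the next racer.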

Detailed white-papers, working examples with test-data, and the computer-code itself are all available for examination. Additional test-data can easily be added by the user. Smoothing is used to average-out background-processing. Access to all these materials is via the Downloads web-page.

Developers and researchers are encouraged to build on the above. The initial coding and the comparison class-library algorithms are written in the VBA programming language due to its ability to integrate easily with front-end tools such as spreadsheets, which aids knowledge-transfer and education. The next steps would be to replicate and improve upon these results in other more "industrial" languages such as Python and C, then to expand into faster implementation languages such as Go, and eventually to assembler and machine-code. New class-libraries for NoChop/STree could then be built using these advanced languages.

The STree data-structure is very amenable to massive parallelization, the simplest design having 256 processors. Simple triaging would assign each key to its appropriate core based on its first ascii character. For sorting data, the result could be read-off with zero latency as soon as the last key was processed. For retrieving and maintaining data an in-memory model would provide a near instantaneous response.
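A sketch of that triaging step, assuming single-byte character codes so that per-bucket results concatenate directly into the final order (a thread pool stands in for the 256 cores, and the function names are invented here):

```python
from concurrent.futures import ThreadPoolExecutor

def triage(keys):
    """Assign each key to one of 256 buckets by its first character code --
    the 'simplest design' described above, one bucket per prospective core."""
    buckets = [[] for _ in range(256)]
    for k in keys:
        buckets[ord(k[0]) if k else 0].append(k)
    return buckets

def parallel_sort(keys, workers=8):
    buckets = triage(keys)
    with ThreadPoolExecutor(workers) as pool:
        sorted_buckets = pool.map(sorted, buckets)
    out = []
    for b in sorted_buckets:        # buckets are already in first-byte order
        out.extend(b)
    return out
```

Because the buckets are independent, no core ever waits on another, and the final read-off is a simple concatenation in bucket order.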

NoChop's performance is stable, with no input conditions that make it run slowly. This is unlike other current algorithms (e.g. QuickSort's worst-case pivot choices and BTree re-balancing).

As NoChop achieves its supremacy by performing fewer operations than its legacy competitors, not more, it should generate a green-dividend in line with the performance-dividend (e.g. significant reductions in electricity-consumption, improved environmental "footprints" of data-centres, etc.). Existing pools of DRAM and flash-memory can be repurposed by NoChop to build the required STrees in-memory.

This then is the potential of the NoChop algorithm: simultaneously delivering significant performance increases with commensurate reductions in electricity consumption. From a game-theory analysis with 3 "players", the end-user wins (quicker response times), the developer wins (increased competitiveness), and the planet wins (less greenhouse gases).

NoChop/STree image