Explore & Exploit New Ideas & Developments in IT

fast & efficient languages, algorithms & data-structures from www.agdresearch.com

In R&D terms (research-and-development), AGD Research is a pure R organization (business and network). This site showcases its' current-findings to software-developers, and all who have an interest. The site-content is free-source IP, is in the public domain, and there to be utilized by any-and-all, however they wish.

The web-pages are designed to convey a summary understanding. For a detailed understanding there are downloads available on the Downloads web-page. These contain working examples, the computer code (MS VBA), and white-papers.

For a lighter, less techy perspective on AGD Research, additional pages cover "adult" stories focusing on the realities of business-consulting, other stories, amusements, and a few "blue-sky" thoughts.

Symbolic-processors form a putative new application-type which possesses a universal data-model (the spectral-data-model) A symbolic-processor is a new approach to a new requirement, which in IT "speak"/vernacular is termed a "new-new" solution.

UK/Netherlands based, AGD Research (founded in 1999) provided niche services to the major IT consultancies and fortune-500 companies in the US, Europe and APAC. These services have focused on implementing sophisticated optimization algorithms for supply-chains and value-chains.

AGD Research is also an informal network of consultants and technicians who have worked closely with each other over many years on multiple projects, and whose expertise has been combined and maximized to deliver significant business benefits.
About imageAbout image
A symbolic-processor (the Downloads web-page contains a proof-of-concept) is a putative new application-type that interprets SPL (symbolic-processing-language) and has been designed to enable this logic-chain:

Goal: A new, simple, single, & universal way of processing data (so commoditizing it)
Problem: A myriad of existing, competing and complex computer-languages
Solution: Extend the universal notation of arithmetic into computer-programming
Mechanism: Adopt the 4 principles below:

1. No Mathematical-Platonism
Traditional computer-languages, and indeed mathematics itself, use symbols to represent underlying objects that exist "out-there", such as integers, reals. bytes etc. These objects can then participate in various operations that again exist "out-there", such as addition, multiplication and concatenation etc.

Unlike this mathematical-Platonism, a symbolic-processor just processes symbols; no pretense is made of any underlying spooky "out-there" reality that they refer to; e.g. the statement abc + def is equally as valid as 123 + 456; it just returns a symbolic-string of zero length (nothing) as compared to the 3 symbol-string (579).

2. Spectral Data-Model

The spectral data-model maps symbolic-strings to physical data-structures and data-stores, whether these be in an application's fast-memory, an underlying file-system or database, on the web, etc. This allows a common way of maintaining data in any application, anywhere, whether it be a spreadsheet, a word-processor, an ERP system, accounting software, B2C retail, or whatever. The accompanying fractal tree-diagram (the pattern repeats itself in ever increasing granularity/resolution) represents the spectral data-model.

Data within a string has left-to-right asymmetry; leading symbols are more "well-known" and represent its key (the how-to-find) and are represented by colours towards the blue-end of the spectrum, while trailing symbols are less "well-known" and represent the record itself (the what-to-find) and are represented by colours towards the red-end of the spectrum.

The spectral data-model is implemented via 2 approaches. Firstly a thin processing layer is introduced at the application level to translate the local data-model (e.g. relational, file-system, spread-sheet etc.) into and out of the spectral data-model. Secondly data within the spectral model is held as a "Trie" data-structure. This data-structure has been optimized via "go-faster-striping" meta-data to deliver maximized performance (see the "Striped-Trie" web-page for details).

3. Combinatory Binary-Operations

With the exception of read-operations which utilize wild-cards and filters, all other operations in a symbolic-processor are binary; they have 1 operator, 2 input operands and 1 output, as in 1 + 2 which returns 3. These operations can be combined in various combinations across symbolic-strings to produce bijections and various Cartesian-products:
1 2 + 3 4 5 returns 1 5 4 5 (example standard)
1 2 ++ 3 4 5 returns 4 6 (example bijection)
1 2 +++ 3 4 5 returns 13 14 (example Cartesian-accumulate)
1 2 ++++ 3 4 5 returns 4 8 13 5 9 14 (example Cartesian-carry)
1 2 +++++ 3 4 5 returns 4 5 6 5 6 7 (example Cartesian-classic)

4. No Explicit Loops, Routines or If/Then Statements
The functionality above almost-entirely removes the need for a symbolic-processing-language to have explicit loops, routines or if/then statements. More fundamental operations (comparison and replication) can be used to provide this implicit functionality if required.

Details
For a detailed understanding there are downloads available on the Downloads web-page. These contain working examples, the computer code (MS VBA), and white-papers. In an example of determining the standard-deviation of a sample-set the number of lines-of-code is reduced from 25 in "C" to just 5 in SPL (symbolic-processing-language).






Symbolic-Processor imageSymbolic-Processor image
A striped-trie (an "STrie" for short - the Downloads web-page contains examples and white-papers) is a data-structure optimized for input-output performance for near key/entry matches (i.e. for next and previous key/entry location and maintenance - not just exact matches as in other algorithms such as hashing).

Goal: A trie data-structure optimized for use in the spectral data-model & other applications (e.g. data sorting, retrieval & maintenance etc.)
Problem: Standard tries are inefficient at data-storage and sub-optimal in near key/entry matching
Solution: Minimize data-storage and maximize near key/entry location performance
Mechanism: A 256 bit ascii pointer vector approach in combination with meta-data striping/clustering

The first accompanying trie diagram of 4 colours has 3 nodes and 6 branches (one of which is empty). Navigating from the root-node outwards through all the branches and sub-nodes returns the 4 colours "Marigold, Magenta, Magnolia and Mauve" (the 4 keys/entries) as a sorted sequence.

In general, if n is the number of keys/entries and n+ the number of nodes, the nodes are represented as a 256 bit ascii pointer vector ("IT-speak"!). In perhaps simpler-but-equivalent "math-speak", they are represented as a sparse 2-dimensional integer matrix of n+ rows (the nodes) and 256 columns (the ascii character-set). This forms a "DNA" sequence of characters/symbols that navigates from the root-node outwards and reconstructs each individual key/entry.

The second accompanying diagram demonstrates STrie's superiority over BTree in that having multiple branches per node results in less processing levels being required to locate a key/entry. The maximum of 256 branches for each node results in 8 times fewer levels (2 ^ 8 = 256), giving a rough-and-ready speed-limiter of n(log (n) / 8 in "big O" notation; 8 times faster.

Branches are represented as a 1-dimensional string-array of n+ rows. This forms a "junk-DNA" sequence of characters/symbols which do not affect the "node-navigation", but do contribute to the reconstruction of the key/entry. This "junk-DNA" usually comes at and/or near the end of keys (e.g. "ffalo" in "Buffalo"), but can occur anywhere (e.g. "est " in "West Carson" and "West Chester"). If a row-pointer is non-zero its column-position represents the ascii code concerned (e.g. a 123 in column 102 means "B"s are located in row 123, and a 4567 in column 165 means "u"s are located in row 4567).

The third accompanying striping/clustering diagram shows how the sparse integer matrix is refined by meta-data which indicates where non-zero row-pointers occur within the nodes (i.e. "where the action is"). The vertical stripes partition the ascii character-set into like-bands where data tends to cluster (e.g. alphabetics, numbers, etc.). The horizontal lines and points demarcate within each node where the data actually clusters which minimizes the number of linear-scans required. In this implementation if there are 3 or less row-pointers they are represented as point-vectors (e.g. "r", "g" and "u"), otherwise they are represented as a range-vector and a single point-vector (e.g. "B to Q" and "Z").

This approach is essential to maximize near key/entry location performance which is required to out-compete current algorithms that sort, retrieve and maintain data (e.g. QuickSort, BinarySearch and BTree etc.). These are required for applications such as symbolic-processors, sort-engines, spell-checkers and text-auto-fillers, dictionaries, relational and NoSQL databases, file-systems, etc.

The Downloads web-page contains working-examples and white-papers of STrie versus QuickSort (50+ percent faster), STrie versus BinarySearch (350% faster) and STrie versus BTree (100+ percent faster). The proof-of-concept code is written in MS VBA which was chosen for its ability to seamlessly integrate with Excel.

Although these are significant the compelling differentiator of the algorithm is its suitability for massive parallelization. Both data and processing can easily be spread across CPUs. The simplest way of doing this would be to front-end the algorithm with a triaging routine that distributes data (and hence processing) to 1 of 256 CPUs, based on the initial character of the STrie. A more sophisticated triaging routine would better-balance loads based on usage criteria.







Striped-Trie imageStriped-Trie image

Download (to .zip compressed-format), unzip (to .xslm Excel/VBA-format), allow security if required (right-click on the file in file-editor, select Properties, then click the Security-Unblock box), open the file and respond to the "Enable" prompt(s). The user can first view the overview, tabs and white-paper safely (decline "Enable"), before enabling algorithms/VBA (accept "Enable"). Compatible with both 32 and 64 bit pc's.

Symbolic-Processor

Theory (white-paper) & Proof-of-Concept (command-line). Version 14.

STrie-vs-QuickSort

Theory (white-paper) & Proof-of-Concept (command-line). Version 14.

STrie-vs-BinarySearch

Theory (white-paper) & Proof-of-Concept (command-line). Version 14.

STrie-vs-BTree

Theory (white-paper) & Proof-of-Concept (command-line). Version 14.

AGD Research exists as both a network of IT consultants and as a provider of professional services to customers via associate organizations EDO and Peningo. Services include consulting, implementation, design, education-and-training, coding, and documentation etc.

To join the conversations surrounding Symbolic-Processors and supply-chain/value-chain optimization please contact AGD Research directly via the Contact web-page. To explore resourcing options please contact the above associate-organizations directly.






Services image

This page includes Word-downloads of "adult" consulting-stories, in approximate chronological order. These took place mostly in the 1990's before AGD-Research bv. was established in Amsterdam in 1999. Only the names and roles of the participants have been changed to protect the innocent, and not-so-innocent!

Worst-Ever-Taxi-Ride

This story focuses on taxi rides, a key part of the consulting process that enables the consultant to transfer from-and-to the airport or hotel to the client. Of the thousands of rides that consultants associated with AGD-Research undertook, this is the most memorable for all the "wrong" reasons.

Autobahn-by-Kompressor

This story follows a "day-in-the-life" of a sales-consultant and focuses on a particular car journey to-and-from the prospect.

Best-Ever-Taxi-Ride

This story focuses on taxi rides, a key part of the consulting process that enables the consultant to transfer from-and-to the airport or hotel to the client. Of the thousands of rides that consultants associated with AGD-Research undertook, this is the most memorable for all the "right" reasons.

Shinjuku-East

IT-system-implementations allow the international-consultant opportunities to explore exotic locations at no costs to themselves. In this account I describe visiting Kabukicho in Shinjuku-East, central Tokyo twice, once in the 90's and again twenty years later, and learning at first-hand how that justifiably famous red-area has evolved.

Taking-the-Boss-to-an-SM-Club

This story explores the pros-and-cons, do's and don'ts, for all those taking their boss to an S&M club (or thinking of doing so) for a fun night-out before an important business conference the next morning, and recounts a real-life-example.

Coming-Out-as-SM-to-HR

This account describes the experience of "coming-out" as SM to HR (human resources), why that difficult-conversation was necessary, and the consequences of it.

Hospital-Pass

This story explores the importance of trust-and-teamwork during the early-sales-cycle, and recounts a real-life-example of when it goes horribly-wrong!

The-Italian-Connection

This account explores the difficulties involved in securing recreational-supplies over a stopover-weekend while implementation-consulting in a picturesque hill-town in Italy.

Three-Unforgettable-Questions

This account focuses on three unforgettable-questions I was asked when toothache struck while consulting in Switzerland.

Work-Life-Balancing

This story highlights the conflicts involved in balancing in running a successful Amsterdam based IT-consultancy with an active SM lifestyle.

Games can (arguably) be characterized as satisfying two goals; being easy-to-play and having large-user-control over the result. Games that are easy-to-play include fruit-machines, bingo, roulette, pachinko, snakes-and-ladders, rock-paper-scissors, snap and most patience card-games. Ones that have large-user-control include chess, bridge, poker, scrabble, and most multi-player card-games. The few that satisfy both goals are (arguably) checkers/drafts, dominoes, monopoly, pinball and many video-games from pong onwards. 3x3 (three-by-three) is a patience card-game designed to join these few by being both easy-to-play and having a large-user-control over the result.

3x3 is so simple-to-play it actually easier to explain in a single paragraph of words rather than pictures. Take a pack-of-cards, and if it has no jokers, remove/discard a single court-card (jack, queen or king), then shuffle it. Then turn-over the top card, and place it in an empty "square" anywhere you like in a 3x3 grid (hence the name!), and continue until the grid is full. If at any time the sum of cards in a row, column or diagonal add-up to a number that is divisible by five, they can be removed from the grid (court-cards and jokers count as 10). You win if all the cards from the deck and grid are removed, otherwise you lose. The accompanying diagrams show an example game of 3x3 utilizing some common tactics and strategies.

In-diagram-1 the grid is being filled from the left-to-right, bottom-to-top. The queen is placed out-of-sequence in the top-left "square" as this ensures the sum-of-cards in the first-column is divisible by five (10 + 5 + 10 = 25 = 5 x 5). In-diagram-2 the rest of the grid is populated left-to-right, bottom-to-top. In-diagram-3 the cards in the first-column are then removed/discarded.

In-diagram-4 the first-column is then re-populated, with the six immediately placed in the centre as this ensures the sum-of-cards in the middle-row is divisible by five (6 + 4 + 10 = 20 = 5 x 4). In-diagram-5 the cards in the middle-row are then removed/discarded.

In-diagram-6 the middle-row is then re-populated, with the seven immediately placed at the centre as this ensures the sum-of-cards in the middle-column is divisible by five (3 + 7 + 10 = 20 = 5 x 4). The two is then placed to the right as this ensures the sum-of-cards in the right-row is also divisible by five (4 + 2 + 2 = 10 = 5 x 2), before the king is placed to the left.

In-diagram-7 the cards in the middle-and-right-rows are then removed/discarded. In-diagram-8 the  middle-and-right-rows are then re-populated, left-to-right, bottom-to-top, and somewhat-luckily, the sum-of-cards in the backslash-diagonal is divisible by five (1 + 1 + 8 = 10 = 5 x 2). In-diagram-9 the cards in the backslash-diagonal are then removed/discarded.

In-diagram-10 the backslash-diagonal is then re-populated, with the seven immediately placed at the centre as this ensures the sum-of-cards in the middle-column is divisible by five (1 + 7 + 7 = 15 = 5 x 3). In-diagram-11 the cards in the middle-column are then removed/discarded. In-diagram-12 the middle-column is then re-populated, bottom-top. Unfortunately this results in no rows, columns or diagonals being divisible by 5, and the game is lost.

The above example highlights some tactics and strategies that can increase the chances of winning the game, including planning to remove horizontal and vertical rows alternatively, and being aware that cards that count as ten are much more common than others. Also, placing a card that is divisible by five in the central "square" creates four opportunities for cards that count as five or ten to complete a removable threesome (1 row, 1 column and 2 diagonals), whereas placing it in a corner creates three opportunities (1 row, 1 column and 1 diagonal), leaving just two opportunities anywhere else (1 row and 1 column). An additional strategy is card counting; cards that are removed/discarded will weight the likelihood of subsequent ones having a different value.

The game can be made easier-or-harder by subtlety changing the rules. The easiest game includes the divisor being five, diagonals being removable, and the grid can migrate (i.e. if there are just two rows populated, the re-populated row can be placed above or below them, and likewise for columns). With these lax rules in-place, and with significant-player-experience, leads towards fifty-percent of games being winnable.

The hardest game includes the divisor being ten, diagonals not-being removable, and the grid cannot migrate (i.e. if there are just two rows populated, re-population can only occur in the empty-row, and likewise for columns). With these stringent rules in-place, winning is a significant challenge!

Finally, what the best tactics and strategies are, and the resulting likelihood of winning the game, is a non-trivial mathematical calculation that the author has not even attempted to ascertain!





3x3 image3x3 image
To contact AGD Research (the network and business) about any aspect of Symbolic-Processors or supply-chain/value-chain optimization please email aguthriedow@yahoo.com or novella.foster@gmail.com to receive additional information. This includes copious additional implicit knowledge (i.e. that which is not recorded on any media, such as what didn't work and why, and the underlying sub-cultural influences that helped drive these discoveries and developments). This augments-and-complements the explicit knowledge detailed in these web-pages and download-files.









Contact image