Sunday, June 10, 2018

Checklist for Artificial Intelligence

Blaise Byron Faint

I propose the establishment of a specialized, all-purpose repository of artificial intelligence code and processed datasets to facilitate education, research, and practical applications of artificial intelligence.


1 Introduction

Checklists are great.[1]

As someone who is new to studying artificial intelligence (AI), machine learning (ML), and deep learning (DL), I am overwhelmed by the complexity of the topic.

For example, here is a Machine Learning Algorithm MindMap by Vineet Verma.[2]

Right now, when I consider a set of data, I have no clue which of these many algorithms is (are) appropriate for that particular use case.

In addition, if I’m considering tokenizing the text of War and Peace, shouldn’t I investigate whether that text has already been tokenized, to avoid duplication of effort?

(Along those lines, I suggest the establishment of a sister website to the various iterations of Project Gutenberg, in which all of the available texts are tokenized, with boilerplate text that is replicated in many texts commented out.)
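As a minimal sketch of what such a sister site would do for each text, the snippet below strips the license boilerplate and tokenizes what remains. It assumes the "*** START OF ..." and "*** END OF ..." marker lines commonly found in Project Gutenberg plain-text files; the function names and the simple word-level tokenizer are illustrative choices, not an established standard.

```python
import re

# Assumption: the book text sits between marker lines that begin with
# "*** START OF" and "*** END OF", as in many Project Gutenberg files.
START_RE = re.compile(r"^\*\*\* ?START OF.*$", re.MULTILINE)
END_RE = re.compile(r"^\*\*\* ?END OF.*$", re.MULTILINE)


def strip_boilerplate(raw: str) -> str:
    """Return the text between the START and END markers, if both exist."""
    start = START_RE.search(raw)
    end = END_RE.search(raw)
    if start and end and start.end() < end.start():
        return raw[start.end():end.start()]
    return raw  # markers not found: leave the text untouched


def tokenize(text: str) -> list:
    """Lowercase word tokenizer that keeps internal apostrophes ("don't")."""
    return re.findall(r"[a-z0-9]+(?:'[a-z0-9]+)*", text.lower())
```

Calling `tokenize(strip_boilerplate(raw_text))` once and publishing the result is the step that would spare every later reader from repeating it.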

If I do happen to have a guess as to which algorithm to use, I currently have to manually search the web (GitHub, specifically) for sample code to fit my imagined use case.

In addition, as I understand it, the next big goal of AI is artificial general intelligence (AGI), in which a computer is able to mimic all of the functions of the human brain.[3]

Especially for someone in the process of learning AI, and given that it’s difficult to ascertain which algorithm is most appropriate for a particular use case, why not use all of them and find out?

I suggest the establishment of a generalized repository of mature artificial intelligence algorithms, similar to Chocolatey or some other package manager, optimized for simplifying the task of processing data and designed for 3 main purposes:
A. Teaching
B. Pure Research
C. Practical Applications

2 Example proto-checklist

The checklist repository would be downloadable or accessible via the cloud. It could have 3 modes: a teaching mode, in which the master program walks the user through each step, offering sample code and explanations of why the program recommends a particular algorithm for a specific set of data; a practical use case mode, in which the end user already knows which algorithms and hyperparameters they wish to use; and a God mode, in which the master program manipulates the data in every conceivable way, subject to the limitations imposed by the available processing power.

For example, I open the master program and choose “Teaching,” “Practical Use Case,” or “God Mode”. Based on my initial selection, the computer leads me through an opening dialog using checkboxes, drop-down menus, or natural language processing, to determine what I am trying to accomplish, or whether I even know what I’m trying to accomplish. For example, in “Teaching” mode, I select a file or folder on my local machine.

The master program examines the folder (or file) and by file extension(s) determines whether the data is homogeneous, or a mixture of csv, txt, jpg, or other files.
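The extension check described above could be sketched roughly as follows. The function names `survey_extensions` and `describe` are hypothetical, and judging a file by its extension alone is an assumption that a real tool would want to verify by inspecting the contents.

```python
from collections import Counter
from pathlib import Path


def survey_extensions(path: str) -> Counter:
    """Count files by lowercase extension under a file or folder path."""
    root = Path(path)
    if root.is_file():
        files = [root]
    else:
        # Walk the folder recursively and keep regular files only.
        files = [p for p in root.rglob("*") if p.is_file()]
    return Counter(p.suffix.lower() or "<none>" for p in files)


def describe(counts: Counter) -> str:
    """Label the selection as empty, homogeneous, or mixed."""
    if not counts:
        return "empty"
    return "homogeneous" if len(counts) == 1 else "mixed"
```

For a folder of stock-price downloads, `describe(survey_extensions("some/folder"))` would report "homogeneous" only if every file shares one extension, ignoring case.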

The master program opens each file and determines whether the data has been pre-processed, or whether pre-processing is required. It could then either attempt to pre-process the data or inform me of what pre-processing is required.

If the file is a csv of historical stock prices, the master program might suggest a time-series algorithm. If the data is known to be pre-processed, the master program could find this information via the checklist repository.
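One hedged way to picture the stock-price suggestion: sample the first rows of the csv and, if some column consists entirely of dates, propose a time-series approach. The date formats, the 50-row sample, and the labels "time-series" and "tabular" are all assumptions made for illustration; a real recommender would need far richer checks.

```python
import csv
import io
from datetime import datetime

# Assumed date formats; deliberately not exhaustive.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def looks_like_date(value: str) -> bool:
    """Return True if the value parses under any of the assumed formats."""
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(value.strip(), fmt)
            return True
        except ValueError:
            continue
    return False


def suggest_family(csv_text: str, sample_rows: int = 50) -> str:
    """Return a coarse, hypothetical algorithm family for a csv file."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return "unknown"
    # Inspect only the first few rows to keep the check cheap.
    rows = [row for _, row in zip(range(sample_rows), reader)]
    if not rows:
        return "unknown"
    for col in range(len(header)):
        values = [row[col] for row in rows if col < len(row)]
        if values and all(looks_like_date(v) for v in values):
            return "time-series"
    return "tabular"
```

The design choice here is to recommend a family rather than one algorithm, which leaves room for the teaching mode to explain the candidates within that family.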

If the data is images, the checklist repository could convert the images to text. If the data is text, the master program could convert the text to graphs and images.

Ultimately, by working through the checklist of mature algorithms, the master program could find answers to questions the end user didn’t even think to ask.

End users could then share their conclusions in a specialized format intended to extend the functionality of the checklist repository, just as Chocolatey relies on files optimized for that particular use case. In other words, instead of the end user Googling “MNIST”, finding The MNIST Database,[4] downloading the files manually, and processing them for the undecillionth time, the checklist repository would access this type of information automatically and walk the end user through the sample code, results, and so on.

3 Advantages and disadvantages

The major disadvantage of this approach is that the processing power required could grow exponentially, quickly exceeding what an end user with limited computing resources can manage. The advantage is analogous to Frank Zappa’s explanation of the decline of the music business:

The executives of the day were “cigar-chomping old guys who looked at the product and said, 'I don’t know. Who knows what it is? Record it, stick it out. If it sells, alright!'”[5]

Given that I don’t know which algorithm is most appropriate, rather than guess, I believe that allowing the checklist repository to thoroughly examine the data will yield potential benefits that will ultimately expedite the arrival of AGI and artificial super intelligence (ASI).
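The processing-power concern raised at the start of this section can be made concrete with a little arithmetic. In the sketch below, the algorithm names and the number of candidate values per hyperparameter are invented purely for illustration; the point is only that an exhaustive search trains one model per combination of settings, so every added hyperparameter multiplies the work.

```python
from math import prod

# Hypothetical search space: for each algorithm, the number of candidate
# values to try for each hyperparameter. All figures are illustrative.
SEARCH_SPACE = {
    "k_nearest_neighbors": {"k": 10, "metric": 3},
    "random_forest": {"n_trees": 5, "max_depth": 6, "min_leaf": 4},
    "neural_network": {"layers": 5, "units": 6, "learning_rate": 5, "dropout": 4},
}


def runs_per_algorithm(grid: dict) -> int:
    """An exhaustive grid trains one model per combination of settings."""
    return prod(grid.values())


def total_runs(space: dict) -> int:
    """Total training runs needed to try every algorithm exhaustively."""
    return sum(runs_per_algorithm(grid) for grid in space.values())
```

With these made-up numbers, `total_runs(SEARCH_SPACE)` is 30 + 120 + 600 = 750 training runs for a single dataset, and adding one more hyperparameter with 5 candidate values to the neural network alone would raise its share from 600 to 3,000.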

4 Conclusions and future work

Don’t be the “hip, young executives” who turned out to be far more conservative than the “old guys” ever were.

Who knows?


I acknowledge that water is wet.


[1] Gawande, Atul (2009). The Checklist Manifesto: How to Get Things Right. Metropolitan Books.
[2] Verma, Vineet (2015). “Machine Learning Algorithm MindMap.”
[3] Woodford, Chris (2018). “Neural networks.”
[4] LeCun, Yann, Cortes, Corinna, and Burges, Christopher J.C. (2013). The MNIST Database.
[5] Zappa, Frank (1987). “Frank Zappa Explains the Decline of the Music Business.”

This paper originally published June 10, 2018.

Thursday, June 07, 2018

Personal Recollection of 1980s Tech

This post contains my personal recollection of various equipment I used or aspired to use in the approximate time-frame of the 1980s. My point is that all of this equipment has largely been supplanted by smartphones today.

Discriminatory Adversarial Nets

Author's note: I am practicing writing papers. The discriminating reader may detect my dry sense of humor.

Blaise Byron Faint

The goal for generative adversarial nets is to empower generators to defeat discriminators in as many use cases as is practicable. In my humble opinion, this is dangerous. I call for the development of discriminatory adversarial networks, in which the goal is for the discriminators to defeat the generators.

1 Introduction

“Generative Adversarial Nets” (GANs),[1] a 2014 paper by Ian Goodfellow et al., has led to powerful growth in the deep learning sphere. However, I believe that this approach is dangerous, as artificial intelligence is already so powerful that the average consumer is overwhelmed and unable to accurately distinguish fact from fiction.

Various tools have been developed to combat the disinformation generated by bots and other agents of chaos. Elon Musk, for example, has proposed the development of a website, Pravda,[2] to help combat purveyors of disinformation. Even this is insufficient, as “A Lie Can Travel Halfway Around the World While the Truth Is Putting On Its Shoes.”[4]

Elon Musk has accused the media of a multitude of abuses.[3] Any discriminatory agent will ultimately prove ineffective under the GANs model, as this model is optimized for the generators to defeat the discriminators.

In this scenario, Ian Goodfellow is a modern Prometheus who has gifted artificial narrow intelligence with fire. I believe that what is needed is water, to fight the fire. I propose Discriminatory Adversarial Networks (DANs), optimized so that no matter how good the fakes become, the discriminators are able to detect the fakes and warn the end user, to minimize the damage created by “fake news”.

I am just beginning my study of artificial intelligence, machine learning, deep learning, and so on, and I do not yet have the technical proficiency to develop the precise mathematical formulas or computer code to achieve this goal. For the time being, I leave that task to others. In general, I believe that this can be accomplished by minor tweaks to the current GANs model. I assume it will be sufficient to “reverse the polarity of the neutron flow.”[5]
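For reference, the objective that any such tweak would start from is the two-player value function given in the GANs paper.[1] Here D is the discriminator, G is the generator, p_data is the distribution of real data, and p_z is the noise distribution fed to the generator. How a DAN would alter this objective is not worked out here, so only the original formula is shown.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

In that paper's analysis, the theoretical optimum is reached when the generator's output distribution matches the real data, at which point the best possible discriminator outputs 1/2 for every input and can no longer tell real from fake. That end state is the one the DANs proposal aims to avoid.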

2 Advantages and disadvantages

The advantage of this new model is self-evident. GANs are already more powerful than most humans can comprehend, as evidenced by the recent proliferation of “deep fakes”.[6]

In my humble opinion, wizards should develop discriminatory tools to help end users detect fakes at the operating system level, with adjustable settings ranging from low-level warnings to high-level blocking of fake information.

The disadvantage to this approach is that there will be an unending arms race between discriminators and generators. In my view this is unavoidable.

3 Conclusions and future work

The GANs model has been influential for 4 years, and to my knowledge, there has not been an equal and opposite effort to empower the DANs model. Most people have no grasp of the implications of the coming Singularity.

To review, there have been 2 major revolutions in human history:
1) Agricultural Revolution: powerful tools that manipulate flora and fauna.
2) Industrial Revolution: powerful tools that manipulate raw materials.

There are 3 revolutions pending:
3) Artificial Super Intelligence Revolution: powerful tools that manipulate 1s and 0s.
4) Atomically Precise Manufacturing Revolution: powerful tools that manipulate matter.[7]
5) Profound Energy Manipulation: powerful tools that manipulate energy.

From our current vantage point, Homo sapiens does not have the capability to determine whether the most benign or the most malignant future scenario will be a net benefit for humanity. Therefore, all of us have a responsibility to take all practical steps to proceed with caution.[8] Humanity must not play with fire without having tools (water) to combat its potential and likely unintended negative consequences.


I acknowledge the tuition of Siraj Raval.


[1] Goodfellow, Ian J., Pouget-Abadie, Jean, Mirza, Mehdi, Xu, Bing, Warde-Farley, David, Ozair, Sherjil, Courville, Aaron, and Bengio, Yoshua (2014). Generative Adversarial Nets. In NIPS 2014.

[2] Musk, Elon, May 23, 2018. “Going to create a site where the public can rate the core truth of any article & track the credibility score over time of each journalist, editor & publication. Thinking of calling it Pravda …”

[3] Holley, Peter, May 24, 2018. “Pravda: Elon Musk’s solution for punishing journalists.”

[4] Unknown, Unknown. “A Lie Can Travel Halfway Around the World While the Truth Is Putting On Its Shoes.”

[5] The Doctor, Pertwee, Jon, and Dicks, Terrance, 1972. “Reverse the polarity of the neutron flow.” Doctor Who, 9th Season, 3rd Serial, “The Sea Devils”.

[6] Chesney, Robert, and Citron, Danielle, February 21, 2018. “Deep Fakes: A Looming Crisis for National Security, Democracy, and Privacy?”

[7] Drexler, K. Eric, 2013. Radical Abundance. PublicAffairs.

[8] Lee, Stan, and Parker, Uncle Ben, 1962. Amazing Fantasy #15 (first appearance of the Amazing Spider-Man). “With great power comes great responsibility.” Timely Comics (Marvel Comics).

This paper originally published June 7, 2018.