CHAPTER 7

Continuations

My second patent, No. 9,754,205, was issued on September 5, 2017. For this and the other “continuations,” the text of the parent is the foundation: a continuation adds new text that begins where the parent’s text ends, so that added text is the only new material. To present these examples, I’m including only the additional text of each variation on the theme. As before, a set of closing double quotes indicates the end of the continuation text.

Dangerous Documents

Let’s begin.

“Here, deep learning is used to identify specific, potential risks to an enterprise (of which product liability is the prime example here) while such risks are still internal electronic communications (IECs). The system involves mining and using existing classifications of data (e.g., from an internal litigation database, or from external sources such as customer complaints, and/or warranty claims) to train one or more deep learning algorithms, and then examining the enterprise’s IECs with the trained algorithm, to generate a scored output that will enable enterprise personnel to be alerted to risks and take action in time to prevent the risks from resulting in harm to the enterprise or others.

“The difficulty of addressing this category of litigation is high. Product liability lawsuits are based on a strict liability theory. Generally, a plaintiff only knows that he or she has been injured and by what product. All of these facts are post-sale of the product and external to the enterprise. Hence, in the absence of a whistleblower, product liability lawsuits generally have no factual allegations as to facts internal to the enterprise.

“The system remains text based. The external training data consists of the text in warranty claims and/or the text associated with customer complaints to the enterprise or from call centers. Such information should be directed to the legal department before moving inward. To the best of the inventor’s knowledge, this workflow does not currently exist, and the risks go undetected. However, such claims and/or complaints may include signals of product deficiencies. Such signals would put legal department employees on the hunt for the internal training data which, in part, consists of ‘dangerous documents.’

“A ‘dangerous document’ is the appellation given to one or more internal communications wherein competing goals are at odds in ways that are potentially dangerous to the enterprise, e.g., when the desire for safety is compromised by a desire for profit or a desire to save time (or avoid a delay). The combination exposes the enterprise to product liability litigation, which is why the system of the present invention is warranted.

“The following example explains the concept of a ‘dangerous document.’ Suppose that an enterprise engineer reports to her manager that a supplier has unexpectedly switched to a cheaper but flammable material, such that the product no longer meets the specifications for being nonflammable, as intended and advertised. She recommends further testing, manufacturing changes, and a recall of the tens of thousands of defective units already shipped with the out-of-specification material. Her manager responds by telling the engineer that she (the manager) is here to bank profits for the company, not incur unexpected costs; and that the engineer is never to use the term ‘defect’ and instead should use only the phrase ‘does not meet specifications.’ The manager ends by telling the engineer to meet her for lunch instead of replying by e-mail. (This example is partially drawn from a Wall Street Journal blog article in May of 2014 by Tom Gara entitled ‘The 69 Words You Can’t Use at GM.’)

“Returning to the external documents, the problem with warranty claims is that they are too often evaluated as to whether the claim should be accepted or rejected, with no further action taken. In the same vein, enterprises often only want to know whether customers who make complaints, e.g., to call centers, are either satisfied or not satisfied, in varying degrees, by the enterprise response.

“However, the external information provided by customers in warranty claims or customer complaints is valuable to the enterprise both as training data for product liability risk and as test data.

“In this variation, the enterprise corporate legal department deploys and operates the system. In the end, the role of corporate counsel will expand to include deciding whether to initiate an internal product liability risk investigation and then, where warranted, advising technical control group executives, i.e., executives with engineering, science, or technology oversight and decision-making responsibilities, that external communications by customers are indicating a risk, which may prompt them to conduct an internal investigation into whether risky internal communications exist.

“Risky internal communications comprise IECs either between technical personnel or between one or more technical employees and one or more managers.

“As training data for a deep learning algorithm, warranty claims and customer complaints, in the form of either paper or audio inputs, and ‘dangerous documents’ extracted from previous and now-closed product liability lawsuits, may be digitized, translated, transformed into text, aggregated, and converted to number strings by using a word-embedding technique such as word2vec or GloVe, as previously indicated. However, where previous product liability lawsuits have been settled under agreements with confidentiality provisions, the training text need not be specific; it must not contain customer, employee, or product names, and must be redacted unless viewed only by enterprise personnel. Even where confidentiality provisions are not evident, and third parties may view the training data, the names of customers, employees, and products constitute ‘noise’ in the data and are best redacted.

“Such text-based data is suitable for input to an RNN-type deep learning algorithm as training data.
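The redaction and word-embedding conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not the patented system: the `EMBEDDINGS` table and `REDACT` list are hypothetical stand-ins for a pretrained word2vec/GloVe vocabulary and an enterprise redaction list.

```python
# Sketch of the preprocessing step: redact names, then convert the
# remaining words to number strings via a word-embedding lookup.
# EMBEDDINGS and REDACT are tiny hypothetical stand-ins for a real
# pretrained word2vec/GloVe table and an enterprise redaction list.
EMBEDDINGS = {
    "warranty": [0.05, 0.44, -0.28],
    "claim":    [0.17, 0.09, 0.52],
    "defect":   [0.61, -0.12, 0.33],
}
REDACT = {"acme", "jones"}  # customer/employee/product names ("noise")

def to_number_strings(text):
    """Drop redacted names; map each known word to its vector."""
    vectors = []
    for word in text.lower().split():
        if word in REDACT:
            continue  # names are best redacted
        if word in EMBEDDINGS:
            vectors.append(EMBEDDINGS[word])
    return vectors

# Three vectors survive: warranty, claim, defect ("Acme" is redacted).
print(to_number_strings("Acme warranty claim defect"))
```

In a real deployment the vectors would come from a pretrained embedding file and the redaction list from the legal department, but the flow is the same: redact first, then embed.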

“Once an algorithm has been trained by using external and/or internal sources, it can then score internal risky e-mails, just as the system of the present invention instructs, and output to users the algorithm scores and IECs in displays which alert and augment the intelligence of the users as to whether a product liability risk exists or not.
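The scoring-and-alerting step might look like the following sketch. The keyword weights are hypothetical stand-ins for a trained model's learned parameters; the invention contemplates a trained RNN, not this simple weighted average.

```python
# Sketch of scoring IECs and ranking them for display to users.
# RISK_WEIGHTS stands in for a trained model's parameters
# (hypothetical values, not learned from real data).
RISK_WEIGHTS = {"defect": 0.9, "flammable": 0.8, "recall": 0.7, "lunch": 0.1}

def score(iec):
    """Average per-word risk weight for one internal e-mail."""
    words = iec.lower().split()
    if not words:
        return 0.0
    return sum(RISK_WEIGHTS.get(w, 0.0) for w in words) / len(words)

iecs = [
    "supplier switched to flammable material recall needed",
    "team lunch at noon",
]
# Rank the IECs so users see the riskiest communications first.
ranked = sorted(iecs, key=score, reverse=True)
print(ranked[0])
```

The output display would pair each IEC with its score, alerting legal or technical personnel to the highest-risk communications first.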

“In this way, the enterprise users, be they legal or technical personnel, may receive an early warning of a defect or similar issue with a particular product. In the event a recall is necessary, the number of units will, potentially, be far less than the number of units that would have to be recalled if the deficiencies noted by customers remain undetected.

“Thus, with an early warning from either external sources or from the legal department, technical personnel will be able to address the design and manufacturing issues raised by the claims and/or complaints sooner rather than later.

“While the invention herein disclosed has been described by means of specific embodiments and applications thereof, numerous modifications and variations could be made thereto by those skilled in the art without departing from the scope of the invention set forth in the claims.”

Contract Failure

My third patent is No. 9,754,206. It was issued right after the previous patent, which is why the number is sequential. The issue date is the same: September 5, 2017.

Let’s begin.

“In this variation, one example of a drafting flaw in a document is ambiguity. In a contract, ambiguity may be fatal if incurable by reference to conduct or in some other way, because the parties may have failed to have a ‘meeting of the minds.’

“Using the system of the invention, and with the ambiguity flaw as the focus, a Deep Learning algorithm would be trained with examples of appellate decisions in the PACER database, and other, similar sources, in which a contract is the subject of a dispute and the issue is whether the language of a provision is ambiguous.

“Using various sources of such appellate rulings, and by data-mining the factual recitations by a court in such cases, the ambiguous contract provisions themselves may be identified, along with some amount of text before and after the court’s discussion of that drafting flaw.

“Once such appellate decisions are identified, the court opinions (and concurring opinions if they deal with the same subject) may be aggregated into a set of training documents.

“The Deep Learning algorithms of the present invention may be trained with such documents to identify documents related to that risk, along with other documents unrelated to it. Documents unrelated to the risk may consist of documents with text written for a wholly different purpose, e.g., news reports, poetry, or science fiction short stories.

“Thereafter, the algorithms trained for various specific drafting flaws may be made accessible to attorneys and paralegals who focus their practice on drafting contracts for transactions, whether such attorneys and paralegals are practicing in the private sector, either as solo practitioners or in law firms, or are employed by enterprise legal departments.

“In this variation, a practitioner would pass, i.e., upload, a draft contract either of his or her own creation, or as drafted by an opposing practitioner (i.e., the test data), to the algorithm. The algorithm would score the entire contract text as against the vector space created by the training documents. The output would be a score flagging any language in the contract that may constitute an ambiguity drafting flaw, i.e., language whose score corresponds to a high degree with the positive training data for that drafting risk.
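The idea of scoring contract language against a trained vector space can be sketched with simple bag-of-words cosine similarity. This is a stand-in, not the patented method: the one-line "positive" training text below is hypothetical, and the invention would compare against a Deep Learning model, not a word-count centroid.

```python
# Sketch: score clauses by similarity to positive training examples
# of ambiguous language (one hypothetical phrase stands in for a
# corpus mined from appellate opinions).
import math
from collections import Counter

def bow(text):
    """Bag-of-words vector as a word-count dictionary."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

positives = bow("reasonable efforts shall be made as appropriate")
clause_a = "the seller shall make reasonable efforts as appropriate"
clause_b = "payment is due on the first day of each month"

# Clause A overlaps heavily with the ambiguous training phrase;
# clause B does not, so A scores higher.
print(cosine(bow(clause_a), positives) > cosine(bow(clause_b), positives))
```

A practitioner would then review the highest-scoring clauses and research the underlying case law before redrafting.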

“With such an identification of the flaw, a user would access an appropriate appellate court database, e.g., via Google Scholar, and search for and then read the case(s) with the exact or similar phrases, and then decide how best to remedy the flaw.

“While the invention herein disclosed has been described by means of specific embodiments and applications thereof, numerous modifications and variations could be made thereto by those skilled in the art without departing from the scope of the invention set forth in the claims.”

Consider the Opposite Purpose

My fourth patent is No. 9,754,218, also issued on September 5, 2017.

Let’s begin.

“One or more embodiments of the present invention may be deployed for the opposite purpose. In the description above, the experiments pertained to a specific type of risk to be prevented and involved the corporate legal department. However, as the other side of the same coin, the same system can be deployed to surface evidence that would provide support for a financial advantage and provide notice to appropriate employees of the enterprise for confirmation or rejection.

“As such, the system need not be installed in and operated by the legal department and instead would be the province of the Chief Financial Officer or the departments which concern themselves with accounting, tax, or financial matters.

“For example, in major industries such as energy, oil and gas, and telecommunications, companies generate a plethora of documents with unstructured text which may or may not support a Research and Development (R&D) tax credit. A tax credit would reduce the enterprise taxes and preserve net profit, just as avoiding the expense of a lawsuit would.

“However, in order to discern which of these documents support an R&D tax credit and which of those documents are not supportive, a sorting process is, in current practice, undertaken manually. In each instance, the cost is, like litigation, enormous.

“To achieve greater accuracy at a reduced cost, the same system described herein may be trained with documents that were manually sorted in the past, e.g., in one or more previous years, and for each company in any specific commercial sector of endeavor.

“For each sector and a previous manual sorting process, some documents have been identified as positive support for, or as being related to, a financial advantage. Again, as an example, one particular financial advantage is the sought-after R&D tax credit. The documents supportive of claiming an R&D tax credit constitute a positive training set. The documents which are not supportive may also be useful, however, because they generally identify a classification of documents that are unrelated to an R&D tax credit. Such documents constitute a negative set of training documents.

“As a result, and as with the litigation risk example, a binary set of training documents is available. With such documents, the Deep Learning algorithm(s) in the system may be trained to make a binary choice, namely to score the current set of documents that have not been sorted. The documents with high scores according to the algorithm(s), which are related to the positive training set, may be output and displayed to a user in various ways.
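The binary train-then-score idea can be sketched with a naive word-frequency comparison. The one-line training documents are hypothetical stand-ins for the manually sorted sets; the invention would train a Deep Learning model rather than compare counts.

```python
# Sketch of binary training/scoring from positive and negative sets.
# The training documents are hypothetical one-line stand-ins for
# manually sorted R&D-tax-credit documents.
from collections import Counter

positive_docs = ["prototype experiment new process development testing"]
negative_docs = ["invoice payment received thank you"]

pos_counts = Counter(w for d in positive_docs for w in d.split())
neg_counts = Counter(w for d in negative_docs for w in d.split())

def score(doc):
    """Higher score -> leans toward the credit-supportive set."""
    words = doc.split()
    return sum(pos_counts[w] - neg_counts[w] for w in words) / max(len(words), 1)

unsorted_docs = ["new prototype testing results", "invoice payment due"]
# Display the unsorted documents ranked by score, highest first.
scored = sorted(unsorted_docs, key=score, reverse=True)
print(scored[0])
```

The high-scoring documents would then be output to a user, for example as a ranked list or bar chart, for human confirmation.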

“By using appropriate visualization methods such as bar charts and spreadsheets, a user can make use of the true positives to identify a potential financial advantage to be realized.

“While the invention herein disclosed has been described by means of specific embodiments and applications thereof, numerous modifications and variations could be made thereto by those skilled in the art without departing from the scope of the invention set forth in the claims.”

I hope you’re getting my point. I’m using my life and subject matter experience to explain these variations. Hold off on thinking about the deep learning applications that may be occurring to you. Make notes, but please keep reading.

Stories

Let’s begin.

“As yet another variation of the invention, consider the entertainment industry. An enormous problem in the industry is the cost of producing fictional entertainment in the form of books, feature films, television, and other media. However, the benefits can also be enormous for stories that are sometimes first rejected but then become so successful that they become, in their own right, entertainment franchises.

“As a result of success after repeated rejections, a problem has surfaced. The problem is that many thousands of writers and other proponents of allegedly meaningful stories present their work for consideration only to find their works rejected by gatekeepers who supposedly ‘know’ which proposals will be commercially viable and which of them will fail. Of course, their views are subjective and often wrong.

“This problem of success after repeated rejections has been identified. A partial list of the authors who were repeatedly rejected by publishers includes Agatha Christie (rejections for 5 years), J. K. Rowling (12 publishing rejections), Louis L’Amour (200 rejections before first publication), Dr. Seuss, and, among others, Zane Grey.

“And, as for films, there are at least eight (8) films that were rejected before a studio bought the script and wound up with either a major hit or, better still, a franchise: Pulp Fiction (1994); E.T. the Extra-Terrestrial (1982); Back to the Future (1985); Star Wars (1977); Twilight (2008); The Exorcist (1973); Dumb and Dumber (1994); and Boogie Nights (1997).

“This problem, phrased differently, is this: is there an objective way to assess whether an entertainment proposal will attract an audience large enough to be commercially viable?

“The present invention solves this problem.

“The system enables users to use classified text in a particular context. More specifically, the system begins by amassing an amount of training data from a particular category of text in order to train a Deep Learning algorithm. In this case, however, the corporate legal department may play a role only after a decision has been made to go forward with a book or movie project.

“In this variation, the entertainment categories are well known but have been unappreciated by practitioners in the broad field of artificial intelligence. These categories are called genres. Such genres include, for example, dramas, comedies, and science fiction. There are 18 genres and many fan-transcribed movie and television scripts, classified by genre.

“From the above sources (and others) of classified text, in either book or script form, two training sets may be created. For example, a ‘positive’ training set may consist of commercially successful books or film scripts, while a ‘negative’ training set may consist of unrelated text such as Wikipedia articles.

“Moreover, it is clear that impactful stories may have structure, as Kurt Vonnegut showed. Using a blackboard, Vonnegut demonstrated that a structure may be depicted over time, with ‘good fortune’ on the positive portion of the vertical y-axis, ‘ill fortune’ on the negative portion, and time on the horizontal x-axis. Following Vonnegut, at least one Deep Learning practitioner has been able to closely approximate Vonnegut’s shapes and the shapes of Disney stories as well.
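A Vonnegut-style story arc can be sketched as a running sum of per-sentence sentiment over time. The word sentiment values below are hypothetical; a real system would learn sentiment from data rather than use a hand-made list.

```python
# Sketch of a Vonnegut-style fortune trajectory: sum per-sentence
# sentiment over time. SENTIMENT is a hypothetical hand-made list.
SENTIMENT = {"happy": 1, "won": 1, "love": 1, "lost": -1, "died": -1}

def arc(sentences):
    """Cumulative good-fortune/ill-fortune value per sentence."""
    fortune, trajectory = 0, []
    for s in sentences:
        fortune += sum(SENTIMENT.get(w, 0) for w in s.lower().split())
        trajectory.append(fortune)
    return trajectory

story = ["The hero lost everything", "Then she won", "They lived happy in love"]
print(arc(story))  # [-1, 0, 2]: ill fortune, recovery, good fortune
```

Plotting that trajectory with time on the x-axis reproduces the kind of shape Vonnegut drew on his blackboard.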

“But while the graphic trajectories of stories may be helpful to decision-makers, the point here is to use classified text and Deep Learning algorithms to bring words and sentiments into focus.

“Once trained, the strength of a Deep Learning algorithm may be assessed and visualized using such techniques as Barnes-Hut t-Distributed Stochastic Neighbor Embedding (t-SNE), and by the Receiver Operating Characteristic (ROC) curve and the Area Under the Curve (AUC), phrases which are typically combined and referred to as ROC-AUC.
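The ROC-AUC assessment can be sketched from scratch: AUC equals the probability that a randomly chosen positive example outscores a randomly chosen negative one (with ties counting half). The labels and scores below are a hypothetical toy test set.

```python
# Sketch: compute ROC-AUC as the probability that a positive
# example outscores a negative one (ties count 0.5).
def roc_auc(labels, scores):
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical model scores for four labeled test documents.
labels = [1, 1, 0, 0]
scores = [0.9, 0.6, 0.7, 0.2]
print(roc_auc(labels, scores))  # 0.75 for this toy example
```

A score of 1.0 would mean the model ranks every positive example above every negative one; 0.5 would mean it is no better than chance.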

“For humans, words are the carriers of emotions. The system of this invention is capable of learning which assemblages of words and sentiments provide output scores more reflective of commercially viable entertainment projects than otherwise. High output scores may be persuasive to entertainment decision-makers, as they constitute an additional, arguably more objective way to assess the potential for a project before substantial costs are incurred.

“Accordingly, this invention calls for a user to select a category; aggregate, in digital form, the most commercially successful material the user has the right to use, whether by copyright, because the material is in the public domain, or otherwise; and then train a Deep Learning algorithm with that material.

“To clarify, the training data consists of the genre as the ‘label,’ or classification, and the positive set within each genre consists of the material judged by the marketplace to be the most commercially viable. The negative set would consist of unrelated examples from Wikipedia, or some other dataset, and may include the books or film scripts that resulted in substantial commercial losses.

“Next, this now-classified text would be passed to a Deep Learning algorithm for processing. Once the algorithms’ computations are made, each genre-specific algorithm will be in a position to score material it has never before seen, i.e., the material for new projects as they are initially proposed, and as they may be changed and be more fully developed over time by the editing process. Having ‘learned’ what material within a genre has been commercially viable, the system may provide the publishing and entertainment industries with an additional way to better decide which new proposals and projects may be viable, and so enable them to better decide how to allocate their resources.

“In other words, the system will help industry executives avoid the risk of spending enormous sums of money on costly mistakes that could have been avoided.

“As a result, with differently trained algorithms, the submissions of the hopeful creators of new material may be evaluated not only subjectively by humans skilled in the entertainment arts, but also by the more objective system of this invention.

“While the invention herein disclosed has been described by means of specific embodiments and applications thereof, numerous modifications and variations could be made thereto by those skilled in the art without departing from the scope of the invention set forth in the claims.”

Medical

The “medical risk” patent is No. 9,754,220, and it’s the last of the five patents that were issued on September 5, 2017.

Let’s begin.

“One such variation directs the above-described system to the problem of a missed or mistaken diagnosis, which is an outcome not unlike litigation in that a missed or mistaken diagnosis is an adverse and potentially lethal situation. This problem translates into not realizing that specific tests could be ordered to confirm or reject a diagnosis which the healthcare provider has not made but which is indicated by the patient’s electronic health records (EHRs), including but not limited to the provider’s notes, procedures, medications, tests and test results, pathology reports, and/or diagnostic codes, all of which are collectively hereinafter referred to as Notes. Here, a healthcare provider refers to any professional licensed to provide healthcare services and may include but is not limited to physicians, nurse practitioners, nurses and chiropractors.

“In this variation, a corporate (hospital) legal department may be part of the workflow but is not central to it.

“Nevertheless, this variation involves using the same or substantially similar software system to identify a particular medical risk and provide early warning to healthcare providers. So stated, this variation is entirely different from recent academic papers wherein Deep Learning is trained to make diagnoses or predictions, as if in competition with healthcare providers rather than as an aid to them. See Lipton et al., Learning to Diagnose with LSTM Recurrent Neural Networks (published as a conference paper), arXiv:1511.03677v6 [cs.LG] 1 Mar 2016; and Choi, et al., Doctor AI: Predicting Clinical Events via Recurrent Neural Networks, Proceedings of Machine Learning for Healthcare, arXiv:1511.05942v11 [cs.LG] 28 Sep 2016.

“Briefly stated, the system would be trained by accessing a multiplicity of previous Notes for specific diagnoses, e.g., multiple sclerosis or lupus, where a diagnosis is established by the provider using all the information available in the Notes.

“The training data, consisting of a sufficient number of EHRs for each specific diagnosis, would be gathered up from now-closed cases which have been de-identified in the sense of being HIPAA (‘Health Insurance Portability and Accountability Act’) compliant, e.g., where patient names, among other personally identifiable information, have been redacted before the data is passed to the algorithm. A Deep Learning algorithm would be trained for each specific disease by using ‘word embeddings’ such as word2vec or GloVe, both of which were previously noted.

“For each diagnosis, the provider-made EHRs (in the now-closed, anonymous examples pertaining to a specific diagnosis) would constitute the training set of documents. These documents would be passed, i.e., uploaded, to the algorithm in order to train it. There would be a trained algorithm for each specific disease, as if it were a specific disease risk filter.

“After training, including testing and tuning, the system would ingest provider-made EHRs (e.g., Notes) in open, pending cases, in near real-time or perhaps overnight or in batches for specific blocks of time (e.g., weekly). The system would pass the Notes as data to the algorithms, each of which would evaluate (i.e., score) the Notes, and, when an accuracy score exceeds a user-variable threshold, send its score as an alert to that patient’s healthcare provider.
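The per-diagnosis "risk filter" and user-variable threshold can be sketched as follows. Each filter stands in for a separately trained model; the keyword weights and the threshold value are hypothetical, chosen only to illustrate the alerting flow.

```python
# Sketch: one trained "risk filter" per diagnosis, with a
# user-variable alert threshold. Weights are hypothetical stand-ins
# for trained per-disease models.
FILTERS = {
    "multiple sclerosis": {"numbness": 0.4, "vision": 0.3, "fatigue": 0.3},
    "lupus": {"rash": 0.5, "joint": 0.3, "fever": 0.2},
}
THRESHOLD = 0.5  # user-variable

def alerts(notes):
    """Return (diagnosis, score) pairs whose score exceeds the threshold."""
    words = notes.lower().split()
    out = []
    for diagnosis, weights in FILTERS.items():
        s = sum(w for term, w in weights.items() if term in words)
        if s > THRESHOLD:
            out.append((diagnosis, round(s, 2)))
    return out

# Notes mentioning numbness, vision, and fatigue trip the MS filter.
print(alerts("patient reports numbness and blurred vision with fatigue"))
```

In the patented workflow, each alert would also carry the suggested tests and findings the provider should consider, so the provider, not the system, makes the diagnosis.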

“The alert would also include a specification of what disease or diagnosis is indicated by the physician’s Notes, as scored by the algorithm, and may also include a list of symptoms, physical exam findings, tests and/or procedures the provider should consider ordering so that the physician may confirm or deny the diagnosis indicated by the algorithm. As is evident, the alert ignores whether the provider has or has not made a diagnosis.

“In summary, in this variation, the system functions to provide the early warning of a specific medical risk—the risk of a missed or incorrect diagnosis—and so helps to prevent the expense and harm associated with tests, procedures, therapies, or other interventions that may not be germane while also augmenting the physician’s ability to avoid an adverse outcome for the patient.”

Blockchain

The next description is my eighth patent. Where’s the seventh? Since the text of the seventh was the same as the text of the first, and only the claims were different, I’ve skipped it.

For this “blockchain” patent, I have one introductory comment. This variation uses blockchain to solve the problem of a high-value use case where the amount of data for training purposes is small. In a nutshell, I used blockchain to enable deep learning.

The patent is No. 10,095,992, issued on October 9, 2018. It’s the first patent issued by the USPTO that uses the terms “blockchain” and “deep learning” in a patent’s Claims.

Let’s begin.

“In the context of text and supervised learning, a Deep Learning algorithm needs a label, category, or classification, along with a training set of examples (‘Classified Examples’). The unsolved problem is that of a set of Classified Examples that is small.

“How large must the number of Classified Examples be to adequately train a Deep Learning model for the indicated classification? In the early experiments, a Deep Learning model of the invention successfully identified an instance of discrimination risk in Enron e-mails after training a model with only 50 Classified Examples. This was possible even though MetaMind consultants had suggested that the minimum number of Classified Examples needed to train a Deep Learning algorithm was 200 examples. Hence, getting positive results using only 50 Classified Examples was unexpected.

“Moreover, it is generally understood that, when it comes to Classified Examples as training data, the more the better.

“Thus, one or more embodiments of the present invention address the ‘small training set’ problem, i.e., the case in which the training dataset of publicly available Classified Examples is small, a number substantially less than 200. In such a case, a Deep Learning algorithm engaging in supervised learning will likely have an insufficient number of Classified Examples for the algorithm to function with an acceptable level of accuracy.

“But what if an enterprise owns or has access to a ‘small training set’ obtained as the result of one or more Classified Examples of risks that were settled prior to litigation (such as low-value but high-frequency risks, e.g., employment discrimination disputes, or high-value but infrequent risks, e.g., class actions) or internal investigations into specific risks, such as potential violations of the Foreign Corrupt Practices Act (FCPA), that never became public? Why not combine that data with similar data owned by other enterprises facing the same risk? In current practice, to our knowledge, such combinations of enterprises have not been tried. At least one reason for this lack of cooperation is that no single enterprise is likely to want any other (and possibly competing) enterprise, much less a regulatory authority, to see the internal and private data which pertains to a potentially adverse risk situation (hereinafter a ‘Situation’).

“‘Adverse risk’ includes losses, such as monetary losses, e.g., fines, penalties, verdicts, or orders for disgorgement; damage to brand or product reputations; legal and associated costs amounting potentially to many tens of millions of dollars.

“Losses may also include diminution of health, e.g., ‘zebras,’ which are rare but adverse health diagnoses (‘Zebras’). See, e.g., U.S. Patent No. 9,754,220, titled ‘Using classified text and deep learning algorithms to identify medical risk and provide early warning,’ to Brestoff et al., which is herein incorporated by reference. Thus, the term Situation should be understood to include Zebras. In that case, where Classified Examples consist of the positive data for the symptoms, tests, and diagnoses which are owned by the patients themselves for each Zebra, the data (and negative data) may be crowd-sourced into a variation of the system described herein.

“As for risks to an enterprise, a few examples will make clear that there are many Situations to address. A potential violation of the FCPA is a clear example of one such Situation. For example, Alcoa’s penalty of $384 million in 2014 only ranked it in 10th position. On November 16, 2017, according to an article in Corporate Counsel, Wal-Mart reserved $283 million as a probable loss, but noted that, in previous filings, the company had reported, for the cost of its previous internal investigation, global compliance reforms, and ensuing shareholder lawsuits, a total of about $870 million.

“Other examples are (1) the potential loss of one or more trade secrets, because, by definition, the loss of almost anything marked as ‘secret’ would be considered as both significant and adverse; and (2) a potential product liability lawsuit or class action.

“One or more embodiments of the present invention address the ‘small training set’ problem.

“As illustrated in FIG. 6, each enterprise (‘Owner’) identifies training data, i.e., Classified Examples, pertaining to a specific Situation risk, and extracts and stores the data internally (step 602). At step 604, the training data is then processed using a Word Embedding Tool, e.g., GloVe. Word Embedding Tools such as GloVe recognize and skip named entities (that is, words beginning with capital letters, such as company names, individual names, locations, and product names) when creating the number strings.

“Hence, each Owner, as the owner of the data, can turn its data into number strings by using a Word Embedding Tool with the same or similar capability, i.e., one where named entities are skipped. The result is a small set of Classified Examples, in number string format, for each Situation.
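The entity-skipping conversion can be sketched as follows. The crude heuristic here treats any capitalized word as a named entity (which also catches sentence-initial words), and the embedding vectors are hypothetical stand-ins for real pretrained GloVe-style values.

```python
# Sketch: convert Classified Examples to number strings while
# skipping named entities, approximated here as capitalized words.
# (This crude rule also skips sentence-initial words; the vectors
# are hypothetical stand-ins for pretrained embeddings.)
EMBEDDINGS = {
    "bribe":    [0.71, -0.22],
    "payment":  [0.10, 0.35],
    "official": [0.48, 0.05],
}

def embed_skipping_names(text):
    vectors = []
    for word in text.split():
        if word[0].isupper():
            continue  # skip named entities (companies, people, products)
        if word.lower() in EMBEDDINGS:
            vectors.append(EMBEDDINGS[word.lower()])
    return vectors

# "Acme" is skipped; "payment" and "official" yield two vectors.
print(len(embed_skipping_names("Acme made a payment to an official")))
```

Only these number strings, not the underlying words, would leave the Owner's firewall.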

“Thus, as an initial step to ensure privacy, at step 604, each Owner uses the Word Embedding Tool to convert the words that describe an internal investigation into Situation number strings, i.e., Word Embeddings, and does so while the words remain behind the enterprise’s firewall. Thus, the entire process 600 is off-chain. The data an Owner would input to the blockchain consists only of the Word Embeddings derived from the data. However, given that a Word Embedding Tool such as GloVe is a publicly available resource with an open lookup table, these number strings need further obscuring.

“So, even though each participating Owner stands to benefit from the results of the future aggregation of its data with the data from other Owners, and the Deep Learning models that may be built from them, each Owner’s data (i.e., number strings) must be kept private and obscured from not only the other Owners but also the entity building the Deep Learning model for a given Situation (herein the ‘Modeler’). While these and other goals may be achieved contractually, the system and the technology must achieve these goals as well.

“Now two issues arise. The first issue is cheating. Each Owner must be blocked from inputting data that is unrelated in some significant way to a given Situation. Neither the Modeler nor any other Owner would benefit from having any Owner pretend that it is submitting data that is related to a Classified Example when it is instead unrelated. Cheating would unjustly reward a cheating Owner as well as diminish the accuracy of the aggregated data and the Deep Learning model built from that data.

“Instead of a technical solution, then, we assume that each Owner behaves rationally for the following reason: If the resulting Deep Learning model is weakened, the ‘cheating’ Owner who relies on it also will be harmed.

“Thus, the ‘verification’ aspect of this system requires each Owner to enter into the same contract obligating every Owner to provide data related to the given Situation (as opposed to unrelated), where the data consists of communications and documents surfaced by the Owner which are related to the Situation.

“This contract would also bar any Owner from either being the Modeler or colluding with the Modeler to attempt to reverse engineer any of the data input by any of the other Owners.

“Alternatively, in the event a contract proves unsatisfactory, verification may be accomplished by contracting with a trusted third party such as a professor of law who is knowledgeable as to the Situation and is at least familiar with neural network and blockchain technologies.

“Thus, each Owner’s input will consist of the Word Embeddings converted (at step 604) from its Classified Examples using the Word Embedding Tool, e.g., GloVe, plus an agreed-upon shared secret number (described below as the ‘Consensus Number’) which is also unknown to the Modeler (step 606). Each Owner inputs its data to the blockchain using a shared private key (the ‘Input Private Key’) (at step 608). Contractually and otherwise, the Modeler is not given access to (a) the words of any Owner’s Classified Examples or the related Word Embeddings, (b) the shared Consensus Number or (c) the shared Input Private Key.

“The output will be the aggregation of all the Word Embeddings of all the Owners, as illustrated in FIG. 7.

“As indicated in FIG. 6, and while each Owner’s data remains behind its own firewall, each Owner communicates with the other Owners to form a consensus as to a single number that is not shared with the Modeler, but which is known by each of the Owners (i.e., the Consensus Number).

“Each Owner adds the Consensus Number to its own Word Embeddings (step 606) before using the Input Private Key to upload the Owner’s data to the blockchain (step 608). This Input Private Key would not allow an Owner to receive the blockchain’s output.
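The masking step can be sketched as follows. Reading “adds the Consensus Number to its own Word Embeddings” as element-wise addition is an assumption; the essential point is that the Modeler, not knowing the number, can no longer match the uploaded values against a public lookup table such as GloVe.

```python
# Hypothetical sketch of step 606: each Owner adds the shared Consensus
# Number to every component of its Word Embeddings before upload.
# All values are illustrative.

CONSENSUS_NUMBER = 7.25  # agreed among the Owners only; unknown to the Modeler

def mask_embeddings(embeddings, consensus_number):
    """Obscure raw embeddings so they no longer match any public lookup table."""
    return [[value + consensus_number for value in vector]
            for vector in embeddings]

raw = [[0.12, -0.40, 0.33]]          # GloVe-style vector for one word
masked = mask_embeddings(raw, CONSENSUS_NUMBER)
print(masked)  # shifted values, meaningless without the Consensus Number
```

The masked vectors, not the raw ones, are what the Owner uploads with the Input Private Key at step 608.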

“The second issue is that the Modeler, who receives the blockchain output (the aggregate of all the Owners’ data), must be blocked from reverse engineering any Owner’s Word Embeddings.

“To avoid reverse engineering by the Modeler, the system contemplates a contractual provision permitting the Modeler not only to build the blockchain but also to hold a private key the Modeler can use to obtain the output of the aggregated Owner data. This key is the Output Private Key. The Output Private Key would not permit the Modeler to provide any input to the blockchain, and the contract would bar the Modeler from sharing it with any Owner.

“Thus, the Modeler does not know the Consensus Number shared amongst the Owners, or the Input Private Key used by the Owners to input their Word Embeddings to the blockchain. The Modeler knows nothing about any Owner’s data except for the aggregated result which the Modeler can obtain using the Output Private Key. The Modeler’s lack of knowledge concerning each Owner’s number strings will make it nearly (if not entirely) impossible for the Modeler to reverse engineer any Owner’s input back to the original text.

“On the other hand, the Owners know that the Output Private Key the Modeler uses to receive output from the blockchain does not enable the Modeler to gain access to any Owner’s input to the blockchain.

“It is now appropriate to describe specific examples of the highly adverse Situations this system is intended to address. For example, consider the threat posed by a violation of the FCPA. If an FCPA violation is the Situation, the publicly available Classified Examples may come from the databases operated by governmental agencies such as the Department of Justice (DOJ) for FCPA. According to Southern Illinois University Law Professor Mike Koehler, a well-recognized expert in FCPA matters, the number of unique FCPA actions available for public review is, since 1977, only about 75.

“Trade secrets are another example of a Situation. The Brooklyn Law School houses a Trade Secrets Institute, but the number of trade secrets (‘TS’) complaints is also sparse. As of November 17, 2017, the dataset consists of only 86 complaints.

“Product liability is a third example of a Situation. Examples of ‘dangerous documents’ are scarce, however, for at least four reasons: (1) product liability consultants generally advise that companies train their employees to write e-mails and other documents in a careful manner, in part so as to avoid writing e-mails that clearly express ‘safety v. profit’ risks; (2) corporate attorneys occasionally (and inappropriately) offer ‘word rugs’ to employees so that, e.g., instead of saying ‘defect or defective,’ the employees are advised to say ‘does not meet specifications;’ (3) defense counsel in product liability cases cannot, without client permission, disclose the ‘dangerous documents’ they know about; and (4) by regulation, the Consumer Product Safety Commission, after it collects ‘dangerous documents’ during an investigation, cannot disclose them to the public without the company’s consent. See, e.g., U.S. Patent No. 9,754,219, titled ‘Using classified text and deep learning algorithms to identify entertainment risk and provide early warning,’ to Brestoff et al., which is herein incorporated by reference.

“In each of these Situations, companies—and their insurance carriers—would benefit from being able to deploy a Deep Learning model that would provide them with an early warning of the risks, so that corporate counsel could investigate them, and either avoid or mitigate the damages.

“In these and similar Situations, the Modeler will find that the factual allegations from the publicly available, but small, datasets are likely inadequate to build a robust Deep Learning model. Such publicly available data may, however, be used as a hold-out set for testing and tuning purposes. Alternatively, such publicly available data may be used to improve the accuracy of the Situation model the Modeler develops after receiving the blockchain output.
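The hold-out use of the sparse public data can be sketched as follows. The scoring function below is a toy stand-in for a trained Situation Model, and the two labeled complaints are invented for illustration; only the train-on-private, test-on-public division reflects the text above.

```python
# Hypothetical sketch: sparse public datasets (e.g., the ~75 FCPA
# actions) are too small to train on, but can serve as a hold-out set
# for testing a Model built from the aggregated Consortium data.

def score(text):
    # toy stand-in for the trained Situation Model:
    # fraction of words drawn from an assumed risk vocabulary
    risk_words = {"payment", "official", "facilitate"}
    words = text.lower().split()
    return sum(w in risk_words for w in words) / max(len(words), 1)

# public complaints held out for evaluation, never used for training
holdout = [
    ("facilitate payment to official", True),
    ("quarterly sales summary attached", False),
]

correct = sum((score(text) > 0.5) == label for text, label in holdout)
accuracy = correct / len(holdout)
print(accuracy)  # 1.0 on this toy hold-out set
```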

“FIG. 7 is an illustration of the ‘on-chain’ process of creating the training dataset for the Deep Learning algorithms. As illustrated, the Word Embeddings 600 from all the Owners (e.g., Enterprises A through N) are aggregated using a secure multi-party computation (SMPC). In an SMPC, mutually distrusting Owners cooperatively make computations using their still-private data. In the blockchain, the computation is a simple addition or aggregation of each Owner’s Word Embeddings (step 702). That is, data from Owner ‘A’ is added to the data provided by Owner ‘B’; the result is added to the data provided by Owner ‘C’; and so on, through the last Owner.
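A minimal sketch of that running aggregation. The per-Owner vectors are invented for illustration; the only operation, as in the text, is element-wise addition across Owners.

```python
# Hypothetical sketch of the on-chain computation (step 702): a running
# element-wise sum of each Owner's uploaded (masked) Word Embeddings,
# A + B + C + ... The values below are illustrative, not real data.

owner_inputs = {            # one masked vector per Owner
    "A": [7.37, 6.85, 7.58],
    "B": [7.30, 7.47, 7.15],
    "C": [6.94, 7.33, 7.52],
}

def aggregate(inputs):
    """Element-wise running sum across Owners, as in an additive SMPC."""
    total = None
    for vector in inputs.values():
        if total is None:
            total = list(vector)
        else:
            total = [t + v for t, v in zip(total, vector)]
    return total

result = aggregate(owner_inputs)
print([round(x, 2) for x in result])  # [21.61, 21.65, 22.25]
```

The Modeler sees only this aggregate, retrieved with the Output Private Key, never any single Owner’s input.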

“After the Owner Word Embeddings are aggregated, the blockchain will be able to output the resulting aggregation to the Modeler, who will use the Output Private Key to obtain it (step 704). The Modeler will then use that training data 700 to build a Situation model (‘Model’), as illustrated in FIG. 1.

“As previously noted, the Model can be tested and tuned by accessing the publicly available data, e.g., the sparse datasets for FCPA and TS categories. In addition, the Model can be tested by a willing Owner in one or more pilot studies. Eventually, a tested Model will be provided to the Owners by the Modeler for deployment within their respective enterprises in accordance with the Consortium contract with the Modeler. Similarly, the Modeler may offer a Model to businesses that are not part of the Consortium, and then the Modeler and the Consortium will share in those revenues.

“In addition, the system provides for both positive and negative feedback loops. For example, an Owner may conduct additional internal investigations for a recurring Situation and input the associated (and additional) training data to the blockchain using the Input Private Key.

“And, after deployment, an Owner may pass its internal enterprise communications through the Model and see that the Model reports a small fraction of the results as ‘related’ to the Situation, and to what degree. Owner personnel, when reviewing such results, may find that some of the results are true positives and warrant the opening of an internal investigation. Some of the results will be false positives. Owner-specific true positives and false positives, whether from publicly available text in filed lawsuits or from private internal investigations, may be flagged over time, and aggregated, such that the data may be sufficient for a company-specific variation of the Model. This variation would not involve the other Owners.
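The feedback loop above can be sketched as follows. The record layout, the helper names, and the threshold of 50 examples are all illustrative assumptions; only the flagging of true and false positives toward a company-specific training set comes from the text.

```python
# Hypothetical sketch of the post-deployment feedback loop: reviewers
# flag each Model hit as a true or false positive, and once enough
# flagged examples accumulate, the Owner has a company-specific
# training set. MIN_EXAMPLES = 50 is an assumed threshold.

MIN_EXAMPLES = 50

def collect_feedback(reviewed_hits):
    """Split reviewed Model hits into positive/negative training examples."""
    positives = [h["text"] for h in reviewed_hits if h["true_positive"]]
    negatives = [h["text"] for h in reviewed_hits if not h["true_positive"]]
    return positives, negatives

def ready_for_company_model(positives, negatives):
    """True once enough flagged examples exist for a company-specific Model."""
    return len(positives) + len(negatives) >= MIN_EXAMPLES

# simulated review of 60 Model hits, every third one a true positive
hits = [{"text": "email %d" % i, "true_positive": i % 3 == 0}
        for i in range(60)]
pos, neg = collect_feedback(hits)
print(len(pos), len(neg), ready_for_company_model(pos, neg))  # 20 40 True
```

As the text notes, this company-specific variation would not involve the other Owners.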

“Also, in the Owner-Modeler contract, or by amendment to it, the system could be re-configured to permit the inclusion of new Owners.

“The system of this invention contemplates that each multi-party group of Owners is organized by the Modeler and may be viewed as a Situation ‘Consortium.’ However, the system would function in the same way if one or more Owners formed a Consortium and engaged a Modeler. Notably, the Modeler is a member of the Consortium primarily for the purposes of providing the resulting Situation-specific Deep Learning Models to the Members on such terms as may be set by contract, and to market the Models to non-Members on such terms as may be commercially viable.

“Since the contractual and business provisions governing a Consortium are not pertinent to the operation of the system, they will not be discussed.

“However, it now appears that there is one significant business reason for an Owner to join a Consortium, and not cheat the others. When the system described above is deployed and operated in good faith, it may protect not only each Owner but also the individuals comprising each Owner’s decision-makers from criminal actions against them by regulatory authorities such as the U.S. Department of Justice. The reason the system may provide such protection is that its deployment and good faith operation stand as evidence of a specific intent to avoid harm, which undercuts any contention of a specific intent to do harm. Since there is currently no such thing as a ‘compliance defense,’ the deployment and ‘good faith’ operation of the system described herein for early warning may be the best available defense to any accusations of wrongdoing.

“Finally, the scope of this invention is not meant to exclude Convolutional Neural Networks (CNNs). Typically, when we think about CNNs, computer vision comes to mind. After all, CNNs were used in the major breakthroughs involving image classification, e.g., for use by self-driving cars and for automated photo tagging. In that realm, the problem of small training sets is very well appreciated. However, unless barred by the nation’s antitrust laws, the preventive nature of this invention could apply to images as well, especially in the realm of machines engaged in manufacturing a product. In this context, instead of avoiding risk, the system may increase the efficacy of ‘preventive maintenance’ procedures and operations.

“As an example, consider the problem of property or injury-causing accidents caused by self-driving cars. By design, the CNNs therein are trained with large datasets of video which, among other things, are intended to minimize if not avoid accidents. Accordingly, property and injury-causing accidents are intended to be rare. Video examples may be few and far between, and the same ‘small training set’ issue may arise. However, the manufacturers of self-driving cars may have, when aggregated, a sufficiently large dataset for analyzing the images leading up to and just prior to the accident itself. Accordingly, they may wish to form a Consortium and use the system of this invention as well.

“While the invention herein disclosed has been described by means of specific embodiments and applications thereof, numerous modifications and variations could be made thereto by those skilled in the art without departing from the scope of the invention set forth in the claims.”
