MAIN ACADEMIC CONTRIBUTIONS OF
G. WILLIAM MOORE, MD, PhD.
9/23/2005.
http://www.netautopsy.org/mainacad.htm


Send comments and correspondence to: George.Moore4@va.gov

See also: http://www.medparse.com/gwmcv.cv .............


SUMMARY OF ACADEMIC CONTRIBUTIONS:
MATHEMATICAL MODELS OF DISEASE PATHOGENESIS.



More than 180 papers published in peer-reviewed journals, spanning over forty years of experience in the study of mathematical models of human disease.
The major subject headings are: 1. Separate. 2. Connect. 3. Estimate. 4. Swap. 5. Translate. 6. Harm. 7. Order. 8. Grow.

0. INTRODUCTION.



1. Separate. In medical diagnosis and therapy, the first step is to separate patients who need medical care from those who do not. In the simplest case, if patients are measured along a single, linear scale, then one may observe a histogram, or distribution, of patients with respect to this measurement. If this histogram is bimodal (i.e., has two peaks), then possibly these peaks correspond to the absence (left) or presence (right) of a medical condition.



2. Connect. The second step in medical care is to connect one group of patients with another group. In the simplest case, one may associate patients with many features in common; equivalently, one may dissociate patients with few features in common.



3. Estimate. Considerable medical information is collected at varying levels of uncertainty, from which one must estimate the true value state of the patient's illness .............

4. Swap. For data collected under uncertainty, there will often be a certain proportion of misclassifications, either false negatives or false positives. ................ swap .............

5. Translate. Medical histories, physical findings, and anatomic pathology reports are written in free-text, and will remain so for the foreseeable future. Therefore, we must translate this essential information about a patient into a form that can be reviewed for quality assurance and other administrative purposes.

6. Harm. Medical ethics begins with Hippocrates' famous dictum: First do no harm. In science, one pursues truth at all costs. In patient care, the pursuit of truth in diagnosis must be tempered by the harm caused in obtaining the information. This fact is recognized in all civilized societies by the profusion of regulations and laws that prevent patient harm, or punish a health-provider who causes patient harm.

7. Order. ............. order .............

8. Grow. ............. grow .............

1. Separate.
MINIMUM SQUARES RATIO METHOD.



In a bimodal histogram (i.e., histogram with two peaks), the question arises whether two coherent clusters of data are present in the histogram:

[Figure: bimodal histogram with a dotted dividing line.]


Coherence is understood here as low variance or low sum-of-squares. That is, the sum-of-squares of the histogram to the left of the dotted line plus the sum-of-squares of the histogram to the right of the dotted line should be small relative to the sum-of-squares of the entire histogram.

The MINIMUM SQUARES RATIO METHOD examines data that have been separated at every possible dividing point along the x-axis (horizontal-axis), and tests whether the resulting clusters are significantly coherent. The TOTAL SUM OF SQUARES, or SECOND MOMENT, denoted TSS, for the histogram as a whole, is given by the expression:
$TSS = \sum_{i=1}^{n} (x_i - X)^2$
where the ith histogram-block has value $x_i$, and $X$ is the grand mean of all the n points, i.e., $X = (\sum_{i=1}^{n} x_i) / n$.

If one divides the histogram into a LEFT HALF and a RIGHT HALF (dotted line), then the LEFT SUM OF SQUARES is given by the expression:
$LSS = \sum_{j} (Lx_j - LX)^2$
and the RIGHT SUM OF SQUARES is given by the expression:
$RSS = \sum_{k} (Rx_k - RX)^2$
where the $Lx_j$ are left-sided histogram-blocks; the $Rx_k$ are right-sided histogram-blocks; $LX$ is the left-sided mean, i.e., $LX = (\sum_{j} Lx_j) / n_L$, with $n_L$ the number of left-sided blocks; and $RX$ is the right-sided mean, i.e., $RX = (\sum_{k} Rx_k) / n_R$, with $n_R$ the number of right-sided blocks, so that $n_L + n_R = n$.

The computer algorithm examines all possible left-right dividers, computes the squares ratio (LSS+RSS)/TSS for each, and returns the divider at which this ratio is smallest. That smallest value is the MINIMUM SQUARES RATIO, MSR. A SIGNIFICANCE TEST for a given histogram of sample size n is obtained by comparing the calculated MSR for that histogram against the distribution of MSR values for random histograms of the same sample size.
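The divider search described above can be sketched in a few lines of Python. This is a minimal illustration, not the published Visual Basic or Perl program: it assumes the histogram is supplied as a list of raw observations, and the function names and the midpoint convention for reporting the divider are choices made here for the example.

```python
from typing import List, Tuple


def sum_of_squares(values: List[float]) -> float:
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def minimum_squares_ratio(data: List[float]) -> Tuple[float, float]:
    """Return (MSR, divider) over all left-right splits of the sorted data.

    The divider is reported as the midpoint between the last left-sided
    value and the first right-sided value (an illustrative convention).
    """
    xs = sorted(data)
    tss = sum_of_squares(xs)
    if tss == 0:
        raise ValueError("TSS = 0: all values are equal, ratio undefined")
    best_ratio, best_divider = 1.0, xs[0]
    for k in range(1, len(xs)):  # left = xs[:k], right = xs[k:]
        ratio = (sum_of_squares(xs[:k]) + sum_of_squares(xs[k:])) / tss
        if ratio < best_ratio:
            best_ratio, best_divider = ratio, (xs[k - 1] + xs[k]) / 2
    return best_ratio, best_divider


sample = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]  # two obvious clusters
msr_value, divider = minimum_squares_ratio(sample)
print(msr_value, divider)
```

For this sample TSS = 154 and the best split leaves LSS = RSS = 2, so the ratio is 4/154 with the divider midway between 3 and 11.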

The distribution of random histograms of sample size n is obtained by MONTE-CARLO SIMULATION (Cashwell and Everett, 1959; Moore and Berman, 1991; Berman and Moore, 1992). The standard distribution for comparison may be a normal distribution, or any other suitable distribution. For example, a histogram of sample size n=100 is compared to Monte Carlo samples of size n=100 drawn repeatedly from a normal (Gaussian) distribution. The significance level at p=0.05 corresponds to the lowest 5% of minimum-squares-ratios among the samples drawn.
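A Monte Carlo comparison of this kind might look like the following sketch. It assumes a standard normal reference distribution (the squares ratio is unchanged by shifting or rescaling the data), and the function names, the number of trials, and the seed are illustrative choices rather than details of the original programs.

```python
import random
from typing import List


def msr(data: List[float]) -> float:
    """Minimum of (LSS + RSS) / TSS over all splits of the sorted data."""
    xs = sorted(data)
    n = len(xs)
    total, total_sq = sum(xs), sum(x * x for x in xs)
    tss = total_sq - total * total / n
    if tss <= 0:
        raise ValueError("TSS = 0: all values are equal, ratio undefined")
    best = 1.0
    left_sum = left_sq = 0.0
    for k in range(1, n):  # running sums give LSS and RSS for each split
        left_sum += xs[k - 1]
        left_sq += xs[k - 1] ** 2
        lss = left_sq - left_sum ** 2 / k
        rss = (total_sq - left_sq) - (total - left_sum) ** 2 / (n - k)
        best = min(best, (lss + rss) / tss)
    return best


def monte_carlo_p_value(data: List[float], trials: int = 500, seed: int = 0) -> float:
    """Fraction of simulated normal samples whose MSR is <= the observed MSR."""
    rng = random.Random(seed)
    observed = msr(data)
    n = len(data)
    hits = sum(
        1
        for _ in range(trials)
        if msr([rng.gauss(0.0, 1.0) for _ in range(n)]) <= observed
    )
    return hits / trials


# Two tight, well-separated clusters of 50 points each.
clustered = [i * 0.01 for i in range(50)] + [10 + i * 0.01 for i in range(50)]
print(monte_carlo_p_value(clustered))
```

A histogram would be called significantly bimodal at p=0.05 when the returned fraction is at most 0.05, that is, when its MSR falls within the lowest 5% of the simulated values.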

It can be proved by ordinary algebra that if TSS > 0, i.e., if not all histogram-blocks have the same value, then 0 ≤ squares ratio ≤ 1.

THEOREM. Total sum-of-squares, TSS = 0, if and only if all xi are equal.
PROOF: IF. Suppose all $x_i$ are equal. By definition of the grand mean, $X = (\sum_{i=1}^{n} x_i) / n = x_1 = x_2 = \dots = x_i = \dots = x_n$. Then $TSS = \sum_{i} (x_i - X)^2 = \sum_{i} (x_i - x_i)^2 = 0$.
PROOF: ONLY IF. ..............

THEOREM. If TSS > 0, the squares ratio lies between 0 and 1, i.e., 0 ≤ (LSS+RSS)/TSS ≤ 1, and it is strictly less than 1 if at least two $x_i$ are unequal.
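One way to see why the ratio is bounded is the standard within-group/between-group decomposition of the total sum of squares, sketched below. The symbols $n_L$ and $n_R$ (the numbers of left-sided and right-sided blocks) are introduced here for the sketch; the cross terms vanish because deviations from a group mean sum to zero within that group.

```latex
\begin{aligned}
TSS &= \sum_{j} (Lx_j - X)^2 + \sum_{k} (Rx_k - X)^2 \\
    &= \sum_{j} (Lx_j - LX)^2 + n_L (LX - X)^2
     + \sum_{k} (Rx_k - RX)^2 + n_R (RX - X)^2 \\
    &= LSS + RSS + n_L (LX - X)^2 + n_R (RX - X)^2 .
\end{aligned}
```

Every term on the right is non-negative, so 0 ≤ LSS + RSS ≤ TSS. The ratio equals 1 only when LX = RX = X, which cannot happen for a divider along the x-axis with blocks on both sides unless all values are equal; it equals 0 only when each side has zero spread.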

NOTE. The minimum squares ratio for the uniform distribution is 1/2. The minimum squares ratio for the normal distribution (large sample) is approximately 0.38.

COMPUTER PROGRAMS. A computer program for performing MSR was originally described in:
Albert S, Wolf PL, Pryjma I, Moore W.
Thymus development in high- and low-leukemic mice.
J Reticuloendothel Soc. 1965;2:218-237.

Albert S, Wolf PL, Loud AV, Pryjma I, Potter R, Moore W.
Spleen development in mice and high- and low-leukemic strains.
J Reticuloendothel Soc. 1966;3:176-201.
and later in:
Moore GW, Berman JJ, Sydnor DL.
Automated edge detection in image analysis: distinguishing the nucleus from the cytoplasm without a user's threshold estimate.
Am J Clin Pathol. 1994;102:539.
http://www.netautopsy.org/ascpedge.htm

Moore GW, Berman JJ, Moore GW, Brown LA.
Software for image segmentation and analysis in pathology (ISAP): public domain image software and source code developed at the Baltimore VA Medical Center.
Am J Clin Pathol. 1994;102:538-539.
http://www.netautopsy.org/ascpisap.htm
The program is a U.S. Government work, uncopyrighted and in the public domain, and is available as Microsoft® Visual Basic® or Perl source code. See:
Microsoft® Visual Basic® version:
http://www.medparse.com/isapvisb.htm

Perl version:
http://www.medparse.com/isapver2.htm


For a histogram (univariate random variable), divide the distribution at all possible points along the x-axis, and calculate the squares ratio SR as:
$SR = \frac{\sum_{i} (Lx_i - LX)^2 + \sum_{j} (Rx_j - RX)^2}{\sum_{i} (Lx_i - X)^2 + \sum_{j} (Rx_j - X)^2}$
where the $Lx_i$ are left-sided histogram-blocks; the $Rx_j$ are right-sided histogram-blocks; $LX$ is the left-sided mean; $RX$ is the right-sided mean; and $X$ is the grand mean. The minimum squares ratio, MSR, determines the best left-right separation of histogram-blocks. This method was used to demonstrate the appearance of two populations in experimental studies of murine leukemia.

2. Connect.
SET THEORY/GRAPH THEORY APPROACH
TO MOLECULAR EVOLUTION.



In evolutionary biology, a common ancestor for all animal species, and perhaps for all living species, is inferred from their shared genetic elements, or genes, which in turn give rise to protein products. Therefore, species with a more recent common ancestor can be expected to share more common genes and gene-products than more distantly separated species.

Many of the same mathematical models used in evolutionary theory have applications in the study of cell growth, differentiation, and cancer (=unbounded cell growth).

In mathematical set theory, the shared elements in a pair of sets form the INTERSECTION, denoted ∩, of the sets. Ouchterlony immunodiffusion plates are used to determine the amount of immunoglobulins not shared in the serum for a pair of species. A sparse matrix of such data-elements can be solved for a ranking of species-common-ancestors only if the Leontief matrix is non-singular. Wassily Leontief was awarded the Nobel Prize in Economics in 1973 for his work on sparse matrices of industrial output data, showing how various segments of the industrial economy interact with one another.

The central idea is that immunoglobulin proteins may be regarded as members of a mathematical set, and that the hierarchy (graph) of set-intersections corresponds to molecular evolutionary distance between species. This method was used to demonstrate the close molecular relationship between humans and great ape species.
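The set-intersection idea can be illustrated with a toy example. The species names and the determinant labels below are invented purely for illustration and are not experimental data; the sketch only shows how pairwise intersection sizes single out the most closely related pair.

```python
from itertools import combinations
from typing import Dict, Set, Tuple

# Hypothetical antigenic determinants per species (labels are invented).
determinants: Dict[str, Set[str]] = {
    "human": {"a", "b", "c", "d", "e"},
    "chimpanzee": {"a", "b", "c", "d", "f"},
    "macaque": {"a", "b", "g", "h"},
}


def shared_counts(sets: Dict[str, Set[str]]) -> Dict[Tuple[str, str], int]:
    """Size of the intersection for every unordered pair of species."""
    return {
        (s, t): len(sets[s] & sets[t])
        for s, t in combinations(sorted(sets), 2)
    }


counts = shared_counts(determinants)
closest_pair = max(counts, key=counts.get)  # pair sharing the most elements
print(counts)
print(closest_pair)
```

Repeating the comparison after merging the closest pair into a single group would build the hierarchy of intersections one level at a time.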


Moore GW.
A Mathematical Model for the Construction of Cladograms.
North Carolina State University. Institute of Statistics. Mimeograph Series No. 731 (1971). Ph.D. Dissertation.
Abstract and Full Text:
http://www.netautopsy.org/mathclad.htm


Method also used for demonstrating the hierarchical distribution of metastases in human cancers.

3. Estimate.
FORMALISM OF SUTTON'S LAW.



Sutton's Law, named after the notorious bank robber, Willie Sutton, is the assertion that in the face of uncertainty, one should choose the most likely alternative ("go where the money is").

Mathematical logic is an appealing formalism in pathology informatics, because of its superficial resemblance to ordinary reasoning, as might be seen in pathology reports: either X or Y is true; both X and Y are true; if X then Y, etc. Even the syntax of logic is similar to that of declarative sentences in natural language.

Logic has the additional advantage over natural language that logic must be consistent: inconsistencies are readily detected by routine computing methods.

In efforts to apply the classical mathematical logic of Aristotle and Boole, one faces a paradox when an unlikely event occurs. That is, one makes a diagnosis and offers therapy based upon incomplete data, and the diagnosis may subsequently be overturned by additional data. In classical logic, Aristotle's Law of Contradiction states that if there is any contradiction in a mathematical system, i.e., a diagnosis that is both true and false, then anything is true. (Latin: Ex Falso Quodlibet; From a contradiction, whatever you please.)
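The classical principle can be checked mechanically with a two-variable truth table. The short sketch below only confirms that, in classical logic, (P and not P) implies Q under every truth assignment; the helper name is chosen here for illustration.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material implication of classical logic."""
    return (not p) or q


# (P and not P) -> Q is true for all four assignments of P and Q.
explosion_is_tautology = all(
    implies(p and not p, q) for p, q in product([False, True], repeat=2)
)
print(explosion_is_tautology)
```

Because the antecedent is never true, the implication holds vacuously for any Q, which is why a single contradictory diagnosis would license every conclusion in a classical system.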

Logically, there is a paradox when the most likely alternative is contradicted by subsequent medical events, including autopsy findings. The formalism of Sutton's Law generalizes classical (Aristotelian/Boolean) symbolic logic by removing the Law of Contradiction (Ex Falso Quodlibet), so that a contradiction no longer entails every statement. The approach is akin to fuzzy symbolic logic.

Method used in describing congenital heart malformations, organelle pathology, and gynecologic cytopathology screening.

4. Swap.
DESIGNER CONTINGENCY TABLES:
TOKEN SWAP TEST OF SIGNIFICANCE.



In comparing a new medical test, or HEURISTIC, against an established GOLD-STANDARD, one may collect patient-observations in a 2×2 CONTINGENCY TABLE, also known as a MISCLASSIFICATION MATRIX or CONFUSION MATRIX. This table or matrix (2 rows, 2 columns) has NUMBERS OF PATIENTS listed in each row-column box, or CELL, of the table (Cios, 2006). In the following example:

                  Heuristic:
Gold Standard ↓   No     Yes
No                650    150
Yes               150    50


there are 650 patients in the upper-left cell; 150 patients in the upper-right cell; 150 patients in the lower-left cell; and 50 patients in the lower-right cell, a total of 1000 patients.

Patients in the upper-left cell and patients in the lower-right cell in this 2×2 contingency table (2×2CT) represent agreement between the gold-standard and the heuristic; patients in the upper-right cell represent FALSE POSITIVE PATIENTS (gold-standard-no, heuristic-yes); and patients in the lower-left cell represent FALSE NEGATIVE PATIENTS (gold-standard-yes, heuristic-no). Classically the NULL HYPOTHESIS asserts that the gold-standard and heuristic are STATISTICALLY INDEPENDENT of one another. REJECTION OF THE NULL HYPOTHESIS suggests that the gold-standard and heuristic are CORRELATED. We consider an ensemble of different DESIGNER NULL HYPOTHESES and a novel TOKEN SWAP MISCLASSIFICATION PARADIGM, which seems more appropriate for medical reasoning.

A 2×2 contingency table (2×2CT), also known as a misclassification matrix or confusion matrix, is a 2×2 rectangular table whose cells contain numbers of patients, or tokens. The two rows (no versus yes) correspond to a gold standard, or best possible knowledge with respect to a particular disease; the two columns (no versus yes) correspond to a heuristic test for that disease. In classical statistics, one employs either the chi-square (χ²) contingency test or the Fisher exact test. Both classical tests have a standard null hypothesis, namely, that the gold standard is completely independent of the heuristic values. In the token swap test of significance, there is no set null hypothesis, and the user may construct a designer null hypothesis to custom-fit a particular medical application.
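For comparison, the classical chi-square test on the example table (650, 150, 150, 50) can be worked through in plain Python. This sketch covers only the classical test of independence, without continuity correction; it does not implement the token swap test, whose procedure is not spelled out here, and the function name is an illustrative choice.

```python
import math
from typing import List, Tuple

table = [[650, 150],  # gold standard No:  heuristic No, heuristic Yes
         [150, 50]]   # gold standard Yes: heuristic No, heuristic Yes


def chi_square_2x2(t: List[List[int]]) -> Tuple[float, float]:
    """Pearson chi-square statistic and p-value for a 2x2 table (1 d.f.)."""
    row = [sum(r) for r in t]
    col = [t[0][j] + t[1][j] for j in range(2)]
    n = sum(row)
    stat = sum(
        (t[i][j] - row[i] * col[j] / n) ** 2 / (row[i] * col[j] / n)
        for i in range(2)
        for j in range(2)
    )
    # For 1 degree of freedom, P(chi2 > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


stat, p_value = chi_square_2x2(table)
sensitivity = table[1][1] / sum(table[1])  # true positives / gold-standard Yes
specificity = table[0][0] / sum(table[0])  # true negatives / gold-standard No
print(stat, p_value, sensitivity, specificity)
```

The expected counts under independence are 640, 160, 160 and 40, giving a statistic of 3.90625, just above the conventional 1-d.f. cutoff of 3.841, even though the heuristic detects only 50 of the 200 gold-standard-yes patients.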

It may be clearer to regard each cell as a CONTAINER, or SET, containing PATIENTS (人, "ren", is the Chinese ideogram for person) or TOKENS corresponding to patients.
.NoYesTotal
No人人人人
人人人人
人人10
Yes人人人 人人人
人人人
9
Total820010


Method used to examine pain crisis in sickle cell disease.

5. Translate.
COMPUTER TRANSLATION OF PATHOLOGY REPORTS,
INCLUDING BARRIER WORD METHOD:
QUANTITATIVE NATURAL LANGUAGE PROCESSING.



MOORE'S THEORY OF ANATOMIC PATHOLOGY REPORTS states that every well-formed anatomic pathology report has an unambiguous (unique) mapping into a semantic model, that encompasses all possible anatomic pathology reports. The mapping is one-way: many well-formed anatomic pathology reports may map into the same semantic-model-element. The semantic model is a general hierarchy, that includes bodysite, diagnosis (neoplastic, non-neoplastic), differentiation, size, invasion, margins, metastases, and any consultative and/or notification information. By defining a well-formed anatomic pathology report as having a unique semantic-model-element, then, in theory, every well-formed anatomic pathology report has an exact PARSING FORMULA. Subsets of a given parsing formula may also be valid, but a superset parsing formula always supersedes any of its subsets. This superset principle is the basis for both dictionary lookup and the computer parsing algorithm.
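The superset principle can be illustrated by a longest-match dictionary lookup. The sketch below is a generic greedy matcher, not the barrier word method or the author's parser, and the phrase dictionary and output codes are invented placeholders.

```python
from typing import Dict, List, Tuple

# Hypothetical phrase dictionary; the codes are invented placeholders.
PHRASES: Dict[Tuple[str, ...], str] = {
    ("carcinoma",): "DX:carcinoma",
    ("squamous", "cell", "carcinoma"): "DX:squamous-cell-carcinoma",
    ("lung",): "SITE:lung",
}


def parse(report: str) -> List[str]:
    """Greedy left-to-right lookup in which the longest matching phrase wins."""
    words = report.lower().replace(",", " ").replace(".", " ").split()
    longest = max(len(p) for p in PHRASES)
    codes: List[str] = []
    i = 0
    while i < len(words):
        for size in range(min(longest, len(words) - i), 0, -1):
            key = tuple(words[i:i + size])
            if key in PHRASES:
                codes.append(PHRASES[key])
                i += size
                break
        else:
            i += 1  # word not in the dictionary: skip it
    return codes


print(parse("Lung, squamous cell carcinoma."))
```

The three-word phrase supersedes its one-word subset, so the report yields the squamous-cell entry rather than the generic carcinoma entry.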

PREVIOUS TEXT. One of the long-standing controversies in pathology informatics is whether anatomic pathologists should write their diagnostic reports in free-text (natural language), or should select diagnoses from a system of pick-lists. Computer specialists and researchers have always preferred pick-lists, because they are easier to organize and tabulate. The issue has recently come to widespread attention because of controversial mandates by the College of American Pathologists and the American College of Surgeons for hospital accreditation (Ackerman, 2004; AJCC). In these mandates, pathologists are required to issue SYNOPTIC REPORTS on large specimens resected for cancer therapy. The current regulation only demands that the required information be present in the reports in some form. But the handwriting is on the wall: regulators want more structured reports.
Mene mene teckel upharsin: מנא   מנא   תקל   ופרסין (Dan 5:25). The proverbial handwriting on the wall.
Quantitative natural language processing (QNLP) is computer-translation, using quantitative properties of a natural language (English). It is assumed that any pathology report is unambiguous with respect to a medical semantic model for pathology reports.

6. Harm.
FORMAL MEDICAL ETHICS



The scientific component of diagnostic medicine on the individual patient consists of collecting data on the patient (history, physical examination, laboratory tests, etc.), and inferring a diagnosis and indicated therapy. At each step in the process, the patient must be persuaded that the next step is necessary, and the patient must give consent. Classical mathematical logic may be extended to include additional operators for certainty ($), necessity (#), and attempt (!). Medical investigations on the individual patient should be PROACTIVE (do if you must) and HIPPOCRATIC (first do no harm). That is, if a test or therapy needs to be performed, then it should be attempted; and if attempted, then the attempt should be justified. Formally: ...............

7. Order.
ORDER-LOGIC FOR PATHOGENESIS.



ORDER-LOGIC is the paradigm: (1) that the entire medical reasoning component of pathology informatics, not including image-recognition, may be expressed in the form of hierarchical tables; and (2) that these tables may be tested for consistency. The first part of this program has been sketched in various textbooks of anatomic pathology (Sinard, 1996; Haber et al, 2002). The second part is as follows. Let the Hebrew letter aleph (ℵ) represent a LOGICAL ORIGIN, or ULTIMATE PARENT, in a system of hierarchical reasoning. Then every PARENT has one-or-more CHILDREN, including possibly itself, and every child has exactly one parent. In tabular form, the first child of each parent is placed in the cell immediately below and immediately to the right of the parent. Additional children, if present, are placed under the first child, with no intervening blank rows. For example, in this order-logic table, ℵ has two children, namely, A and B; parent A has two children, namely, C and D; and parent B has two children, namely, E and F:

ℵ    .    .
.    A    .
.    .    C
.    .    D
.    B    .
.    .    E
.    .    F
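Reading such a table back into parent-child relations can be sketched as follows. The sketch assumes one entry per row, with the column index giving the depth in the hierarchy; the root is spelled out as the string "aleph", and the function name is an illustrative choice.

```python
from typing import Dict, List, Optional

# The example table, one list per row; None marks an empty cell.
TABLE: List[List[Optional[str]]] = [
    ["aleph", None, None],
    [None, "A", None],
    [None, None, "C"],
    [None, None, "D"],
    [None, "B", None],
    [None, None, "E"],
    [None, None, "F"],
]


def children_by_parent(table: List[List[Optional[str]]]) -> Dict[str, List[str]]:
    """Map each parent to its children; a cell's column index is its depth."""
    children: Dict[str, List[str]] = {}
    last_at_depth: Dict[int, str] = {}
    for row in table:
        filled = [(col, cell) for col, cell in enumerate(row) if cell is not None]
        if len(filled) != 1:
            raise ValueError("each row must contain exactly one entry")
        depth, name = filled[0]
        if depth > 0:
            if depth - 1 not in last_at_depth:
                raise ValueError(f"{name!r} has no parent")
            children.setdefault(last_at_depth[depth - 1], []).append(name)
        last_at_depth[depth] = name
        # Entries deeper than this one no longer have a live parent.
        for d in [d for d in last_at_depth if d > depth]:
            del last_at_depth[d]
    return children


print(children_by_parent(TABLE))
```

Because each entry is attached to the most recent entry one column to its left, every child receives exactly one parent, as the definition requires.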


The interpretation of this table in classical logic is:
$\aleph \wedge G_{-2} \wedge G_{-1} \wedge G_0 \Rightarrow C_1 \vee C_2 \vee \dots$
where $\wedge$ denotes logical-and; $\vee$ denotes logical-inclusive-or; $G_0$ denotes the parent; $G_{-i}$ denotes the parent of $G_{-i+1}$; and $C_i$ denotes the ith child of parent $G_0$.

An order-logic table satisfies a distributive property of logic:



THEOREM 1.

ℵ    .    .
.    +A   .
.    .    +C
.    .    +D
.    +B   .
.    .    +C
.    .    +D


is equivalent to

ℵ    .    .
.    +C   .
.    .    +A
.    .    +B
.    +D   .
.    .    +A
.    .    +B






THEOREM 1. PROOF. The nandsets for the first table of Theorem 1 are: {ℵ, -A, -B}, {ℵ, +A, -C, -D}, and {ℵ, +B, -C, -D}, which imply {ℵ, -C, -D}.

The nandsets for the second table of Theorem 1 are: {ℵ, -C, -D}, {ℵ, +C, -A, -B}, and {ℵ, +D, -A, -B}, which imply {ℵ, -A, -B}. Therefore, the two tables are equivalent.
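The equivalence can also be confirmed by brute force over all truth assignments. The sketch below assumes the classical reading given earlier, in which the ancestors together with a parent imply the disjunction of that parent's children; the function names are illustrative.

```python
from itertools import product


def table_one(aleph: bool, a: bool, b: bool, c: bool, d: bool) -> bool:
    """aleph has children A, B; A and B each have children C, D."""
    return (
        (not aleph or a or b)
        and (not (aleph and a) or c or d)
        and (not (aleph and b) or c or d)
    )


def table_two(aleph: bool, a: bool, b: bool, c: bool, d: bool) -> bool:
    """aleph has children C, D; C and D each have children A, B."""
    return (
        (not aleph or c or d)
        and (not (aleph and c) or a or b)
        and (not (aleph and d) or a or b)
    )


# Compare the two tables on all 32 assignments of the five variables.
equivalent = all(
    table_one(*v) == table_two(*v) for v in product([False, True], repeat=5)
)
print(equivalent)
```

When the origin holds, both tables reduce to (A or B) and (C or D), which is why the assignments agree.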

A corollary is that

THEOREM 2.

ℵ    .    .
.    +A   .
.    .    +B
.    .    -B
.    -A   .
.    .    +B
.    .    -B


is vacuous.

THEOREM 2. PROOF. The nandsets for the table of Theorem 2 are: {ℵ, -A, +A}, {ℵ, +A, +B, -B}, and {ℵ, -A, +B, -B}. Every nandset (=Quine's nullity) containing both +X and -X is vacuous. Therefore, the entire table is vacuous.
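The vacuity check in this proof is easy to mechanize. In the sketch below a nandset is built from the ancestors of a parent plus the negation of each child, matching the nandsets listed in the proofs; literals are written as strings such as "+A" and "-A", the origin as "aleph", and all function names are illustrative.

```python
from typing import FrozenSet, List


def negate(literal: str) -> str:
    """Flip the sign of a literal such as '+A' or '-A'."""
    return ("-" if literal[0] == "+" else "+") + literal[1:]


def nandset(ancestors: List[str], children: List[str]) -> FrozenSet[str]:
    """Literals that cannot all hold: the ancestors plus every negated child."""
    return frozenset(ancestors) | frozenset(negate(c) for c in children)


def is_vacuous(literals: FrozenSet[str]) -> bool:
    """True when the set contains some +X together with -X."""
    return any(negate(lit) in literals for lit in literals if lit[0] in "+-")


theorem_2 = [
    nandset(["aleph"], ["+A", "-A"]),
    nandset(["aleph", "+A"], ["+B", "-B"]),
    nandset(["aleph", "-A"], ["+B", "-B"]),
]
all_vacuous = all(is_vacuous(s) for s in theorem_2)
print(all_vacuous)
```

By contrast, a nandset such as nandset(["aleph"], ["+A", "+B"]) contains no complementary pair and therefore carries real content.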

8. Grow.
INFINITE PAPILLOMA.



Cancer is defined formally as unbounded growth of cells, or more accurately, cell growth bounded by injury to surrounding (invasive) and/or distant (metastatic) tissues. An infinite papilloma is defined formally as unbounded cell growth into a defined area, possibly infinite.

REFERENCES.



1. Ackerman AB.
Protocols for the reporting of cutaneous melanoma.
Am J Clin Pathol. 2004 Nov;122(5):815-817.
Comment in: Am J Clin Pathol. 2004 Nov;122(5):817-818. Discussion 818-819.
PMID: 15540388.

2. Cios KJ.
Assessment of the Generated Data Model.
2006. In press.

3. Haber MH, Gattuso P, Spitz DJ, David O.
Differential Diagnosis in Surgical Pathology.
Amsterdam: Elsevier Science. 2002.
ISBN 0-7216-9053-X, 1150 pages.


4. Sinard JH.
Outlines in Pathology.
New York: W. B. Saunders Company. A Harcourt Health Sciences Company. 1996.
ISBN 0-7216-6341-9, 229 pages.

5. American Joint Committee on Cancer.
AJCC Cancer Staging Manual. Sixth Edition.
New York: Springer. 2004.
ISBN 0-387-95271-3, 421 pages.

6. Court C.
GMC finds doctors not guilty in consent case.
British Medical Journal. 1995;311:1245-1246.

7. Cashwell ED, Everett CJ.
A Practical Manual on the Monte Carlo Method for Random Walk Problems.
New York: Pergamon Press. 1959.

8. Berman JJ, Moore GW.
The role of cell death in the growth of preneoplastic lesions: a Monte Carlo simulation model.
Cell Prolif. 1992 Nov;25(6):549-557.
PMID: 1457604.
Full Text of Article:
http://www.netautopsy.org/celdeath.htm


9. Berman JJ, Moore GW.
Spontaneous regression of residual tumour burden: prediction by Monte Carlo simulation.
Anal Cell Pathol. 1992 Sep;4(5):359-368.
PMID: 1445794.
Full Text of Article:
http://www.netautopsy.org/sponregr.htm


10. Moore GW, Berman JJ.
Cell growth simulations predicting polyclonal origins for 'monoclonal' tumors.
Cancer Lett. 1991 Nov;60(2):113-119.
PMID: 1933835.
Full Text of Article:
http://www.netautopsy.org/monoclon.htm


Public-domain source code:
http://www.netautopsy.org/monoclon.htm#table1


11. Moore GW, Hutchins GM, Miller RE.
Token swap test of significance for serial medical data bases.
Am J Med. 1986 Feb;80(2):182-190.
PMID: 3511687.

12. Moore GW, Hutchins GM, Miller RE.
A new paradigm for hypothesis testing in medicine, with examination of the Neyman Pearson condition.
Theor Med. 1986 Oct;7(3):269-282.
PMID: 3798393.

13. Moore GW, Riede UN, Sandritter W.
Application of Quine's nullities to a quantitative organelle pathology.
J Theor Biol. 1977 Apr 21;65(4):633-651.
PMID: 875397.

14. Seddon F, ed.
Aristotle & Lukasiewicz on the Principle of Contradiction.
Modern Logic. 1996.
ISBN 1884905048

15. Wolenski J, ed.
Philosophical Logic in Poland.
Kluwer. 1994.
ISBN 0792322932.

16. Lukasiewicz J.
Elements of Mathematical Logic.
Warsaw: Panstwowe Wydawnictwo Naukowe. 1963.
Multi-valued logic was introduced in 1917 by Prof. Jan Lukasiewicz.

17. Lukasiewicz J.
Selected Works.
North-Holland Publishing Co. 1970.
ISBN 0720422523.

Last updated: 9/23/2005, by G. William Moore, MD, PhD.