MAINTAINING PATIENT CONFIDENTIALITY
IN THE PUBLIC DOMAIN
INTERNET AUTOPSY DATABASE (IAD).
DRAFT COPY ONLY.
7/10/2009.

Berman JJ, Moore GW, Hutchins GM.
http://www.netautopsy.org/confiden.htm



From the Pathology and Laboratory Medicine Service, Veterans Affairs Maryland Health Care System, Baltimore, Maryland [1]; Department of Pathology, University of Maryland Medical System, Baltimore, Maryland [2]; and Department of Pathology, The Johns Hopkins Medical Institutions, Baltimore, Maryland [3].

Berman JJ, Moore GW, Hutchins GM.
Maintaining patient confidentiality in the public domain Internet Autopsy Database (IAD).
JAMIA (Suppl). 1996;20:328-332.
Proc AMIA Annu Fall Symp. 1996;20:328-332.
PMID: 8947682.
Full Text: http://www.netautopsy.org/confiden.htm

Send comments and correspondence to: George.Moore4@va.gov


Related Publications:
Anatomic Pathology Data Mining: http://www.netautopsy.org/apdmchap.htm
Automated Edge Detection, Pathology Images: http://www.netautopsy.org/ascpedge.htm
Fractal Dimensions in Pathology: http://www.netautopsy.org/ascpfrac.htm
Image Segmentation, Analysis: http://www.netautopsy.org/ascpisap.htm
Automated SNOMED Coding: http://www.netautopsy.org/autocode.htm
Anatomic Pathology Procedure Manual: http://www.netautopsy.org/axsop/axsop.htm
Basal Cell Carcinoma, Histologic Discontinuities: http://www.netautopsy.org/basalcel.htm
DNA Analysis, Cardiac Myxoma: http://www.netautopsy.org/camyxoma.htm
Cell Death, Preneoplasia: http://www.netautopsy.org/celdeath.htm
Clear Cell Dysplasia, Bladder: http://www.netautopsy.org/clearcel.htm
Maintaining Patient Confidentiality: http://www.netautopsy.org/confiden.htm
Elevated PSA, African-American Males (Lancet): http://www.netautopsy.org/epsalanc.htm
Elevated PSA, African-American Males (Mod Pathol): http://www.netautopsy.org/epsamopa.htm
Bibliography, Staged Human Embryos: http://www.netautopsy.org/embrbibl.htm
Image Segmentation, Analysis, Pathology (ISAP): http://www.netautopsy.org/isapwlcm.htm
Johns Hopkins Autopsy Resource: http://www.netautopsy.org
Bibliography, Johns Hopkins Autopsy Resource: http://www.netautopsy.org/jharpubl.htm
Autopsy Report Words, Johns Hopkins Autopsy Resource: http://www.netautopsy.org/jharaurw.htm
Zipf Distribution, Johns Hopkins Autopsy Resource: http://www.netautopsy.org/jharzipf.htm
DNA Flow Cytometry, Keratoacanthoma: http://www.netautopsy.org/keratflw.htm
Dysplasia, Atypical Liver Nodule: http://www.netautopsy.org/lvrdyspl.htm
Cell Simulation, Polyclonal Tumors: http://www.netautopsy.org/monoclon.htm
SNOMED-Encoded Surgical Pathology Databases: http://www.netautopsy.org/snomedsp.htm
Pathology Natural Language Processing: http://www.netautopsy.org/natlngpr.htm
Practice Guidelines, Autopsy Pathology: http://www.netautopsy.org/pracguid.htm
Internet Autopsy Database: http://www.netautopsy.org/protoiad.htm
Internet-based Quality Improvement: http://www.netautopsy.org/qimpmopa.htm
Unfunded Research, Pathologists, Internists, Surgeons: http://www.netautopsy.org/unfunded.htm
Uniqueness, Medical Data Mining: http://www.netautopsy.org/uniqmddm.htm
Linguistic Inventory, Johns Hopkins Surgical Pathology: http://www.netautopsy.org/vhpsapsx.htm
Developmental Neoplasm Lineage: http://www.julesberman.info/devclass.htm
Biomedical Informatics: http://www.jbpub.com/catalog/9780763741358/
Neoplasms, Development, Diversity: http://www.jbpub.com/catalog/9780763755706/
Precancer: http://www.jbpub.com/catalog/9780763777845/

Last tested: July 10, 2009.


0. DISCLAIMER.



DISCLAIMER. United States Government Work, uncopyrighted, public-domain, DRAFT COPY ONLY. This document does not necessarily represent the views or policies of any United States Government agency. This document is provided "as is", without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and non-infringement. In no event shall the authors be liable for any claim, damages or other liability, whether in an action of contract, tort or otherwise, arising from, out of, or in connection with the document or the use or other dealings made with the document.



1. ABSTRACT.


The Internet makes it possible to give the public access to large databases of patient information that can be shared and used by epidemiologists, health planners, and medical researchers. Until now, large databases containing patient information have been held in strict confidence, with access granted only to approved researchers, or limited to specific portions of the database. The Internet Autopsy Database (IAD) consists of demographic and pathologic data from over 49,000 autopsies contributed by over a dozen academic medical institutions. Each autopsy record in the public database consists of a uniform set of demographics and SNOMED-compatible terms. To make the database publicly available, a strategy had to be devised that assured the privacy of every person included in the database. A key step involved translating the autopsy facesheets into a listing of SNOMED-compatible terms that effectively eliminated identifying terminology, replacing free text with a generic nomenclature that preserves diagnostic information. The entire database is available on the Internet at:
http://www.netautopsy.org/ .



2. INTRODUCTION.


In 1975, the College of American Pathologists (CAP) proposed to develop a computerized National Autopsy Databank, as a central repository for pathologic, biomedical, demographic, and epidemiologic information, which would potentially benefit a wide range of scientific and research endeavors [1,2,3]. Following a show of interest at a recent CAP conference on autopsy practice [4], an autopsy database has been published on the Internet (URL
http://www.netautopsy.org/ ), and contains data condensed from over 49,000 autopsy facesheets collected from over a dozen academic medical institutions. The autopsy facesheet, also known as FAD or Final Anatomic Diagnoses, is a standard form included in autopsy reports that contains patient demographics and a list of all the pathologic findings from the autopsy.

The Internet Autopsy Database (IAD) is currently hosted by The Johns Hopkins University, Department of Pathology, is in the public domain, and is available for downloading by any institution or individual with the time, resources and imagination to use the data productively. Contributions to the database from medical institutions throughout the world are welcomed.

Public access to these data has several purposes. First, it allows interested researchers from virtually any location on the planet to have equal access to a large autopsy database. Second, it exposes researchers who use the database in their publications to the strictest form of review. Anyone wishing to invest the time and energy can scrutinize the data in published studies that derive from the database, and test whether the results can be repeated. To our knowledge, this is the first example of a database composed of confidential patient information made available for public examination.

The traditional exclusion of patient-related medical data from public view relates primarily to privacy issues and legal issues that arise from violations of patient confidentiality.

As an illustration, in a recent case described in the British Medical Journal [5], a patient brought charges of professional misconduct against three psychiatrists who described her case in such detail that acquaintances guessed the identity of the patient. The psychiatrists were charged with misconduct and brought before the regulatory body for British physicians. The General Medical Council found the physicians not guilty of professional misconduct, but considered that "the information contained in the paper was such that it enabled Miss C to be identified." It seems prudent to conclude that even when the name of a patient is withheld from a public record, there is still the possibility that the patient may suffer grief resulting from public disclosure of the information. There is an implied fiduciary responsibility for health care workers to ensure that information placed in the public domain does not violate patient privacy.


3. METHODS.


On November 25, 1995, a set of over 49,000 autopsy facesheets was made publicly available on the Internet:
http://www.netautopsy.org/ .
Each autopsy facesheet received by the Internet Autopsy Database (IAD) consists of a demographic listing, followed by individual medical terms, representing body-sites, disease-names, morphologies, or modifiers. A typical header for demographic information appears as follows, where <CRLF> denotes carriage-return-line-feed (ASCII 13 then ASCII 10): <CRLF>


      ###123456123456^67Y^W^M^1985^2^NONE<CRLF>

That is, <CRLF> then ### (three consecutive ASCII 35), then a 12-digit autopsy identifier, then age, then race, then gender, then year of autopsy, then location, then occupation, then <CRLF>. The in-line separator character is ^ (ASCII 94). Every facesheet has a unique identifier, consisting of 12 or fewer decimal digits, with no letters or punctuation. The contributor identifying number and the contributor institution are known to the database administrator, but are not disclosed in the public database. A random number generator is used to create a published identifying number, and the mapping between the contributor and published identifying numbers is kept as an off-line file in a physically secure location known to the database administrator.

Age is either SB (=stillborn) or else a decimal number, followed by a single letter denoting units of time. If there are no stated time units, then years is assumed by default. The allowable units are:
H = hours.
D = days.
W = weeks.
M = months (NOT minutes).
Y = years (assumed as default).

Race/ethnicity is, according to the U.S. Public Health Service, as follows:
B = Black, not of Hispanic origin.
W = White, not of Hispanic origin.
H = Hispanic.
A = Asian or Pacific Islander.
I = American Indian or Alaskan Native.
M = Multiracial or other.

Gender is:
F = female.
M = male.
U = undetermined.

Year of autopsy is the four-digit year, NOT abbreviated.

Location code consists of the first digit of the U.S. Postal Service ZIP code. For countries outside the USA, the multiple-digit international telephone exchanges (e.g., 044=United Kingdom, 049=Germany, 081=Japan, etc.) are used.

Occupation is given as English terms, separated by a semicolon (;) when there are multiple occupations. These occupation names are translated into SNOMED-compatible terms before the information is published on the IAD.
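
The header layout described above can be illustrated with a short parser. The following is a minimal sketch in Python, not part of the IAD software; the names Header, parse_age and parse_header are hypothetical, and the sketch assumes the submission format only (seven fields, no trailing separator). The source does not state how an unknown age is written, so the sketch simply rejects age fields it does not recognize.

```python
import re
from typing import List, NamedTuple, Optional

# Code tables taken from the field descriptions above.
AGE_UNITS = {"H": "hours", "D": "days", "W": "weeks", "M": "months", "Y": "years"}
RACE_CODES = set("BWHAIM")
GENDER_CODES = set("FMU")


class Header(NamedTuple):
    identifier: str
    age_value: Optional[float]  # None when the age field is SB (stillborn)
    age_unit: str
    race: str
    gender: str
    year: int
    location: str
    occupations: List[str]


def parse_age(field: str):
    """Return (value, unit); years are assumed when no unit letter is given."""
    field = field.strip().upper()
    if field == "SB":
        return None, "stillborn"
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([HDWMY]?)", field)
    if match is None:
        raise ValueError(f"unrecognized age field: {field!r}")
    return float(match.group(1)), AGE_UNITS[match.group(2) or "Y"]


def parse_header(line: str) -> Header:
    """Parse one ###-prefixed, caret-separated demographic header line."""
    line = line.strip()  # drops the surrounding carriage return and line feed
    if not line.startswith("###"):
        raise ValueError("header must begin with ###")
    fields = [f.strip() for f in line[3:].split("^")]
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, found {len(fields)}")
    identifier, age, race, gender, year, location, occupation = fields
    if not re.fullmatch(r"\d{1,12}", identifier):
        raise ValueError("identifier must be 1 to 12 decimal digits")
    if race not in RACE_CODES or gender not in GENDER_CODES:
        raise ValueError("unknown race or gender code")
    if not re.fullmatch(r"\d{4}", year) or not re.fullmatch(r"\d+", location):
        raise ValueError("year must be four digits; location must be digits")
    age_value, age_unit = parse_age(age)
    occupations = [o.strip() for o in occupation.split(";") if o.strip()]
    return Header(identifier, age_value, age_unit, race, gender,
                  int(year), location, occupations)


if __name__ == "__main__":
    print(parse_header("###123456123456^67Y^W^M^1985^2^NONE\r\n"))
```

Keeping the age as a value plus a unit, rather than converting everything to years, preserves the distinction between months and years that matters for fetal and neonatal cases.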

There must be a field consisting of a list of pathologic diagnoses verified by the autopsy. Each line in the list is written as a short sentence in English with common terminology, and should include an anatomic site. In the published IAD, these short sentences are converted into SNOMED-compatible terms, as a measure for both standardization and anonymity (i.e., any distinctive word usages are blunted by translation into SNOMED; dates and proper names are not translated). The automatic coder is easily confused if a sentence is terminated in an ambiguous or non-standard manner. Clinical history and cause of death, if present, should have a similar syntax, and are likewise translated into SNOMED-compatible terms.

Facesheets consist of IBM-compatible, computer-readable files in 7-bit ASCII, i.e., only ASCII characters numbered 10, 13, and 32 through 126. Insofar as possible, lines should be at most 60 characters long, followed by <CRLF>.

The clinical history portion of the facesheet begins with the text:
<CRLF>CLINICAL HISTORY:<CRLF>.
The anatomical diagnosis portion of the facesheet begins with the text:
<CRLF>ANATOMICAL DIAGNOSIS:<CRLF>.
The cause of death portion of the facesheet begins with the text:
<CRLF>CAUSE OF DEATH:<CRLF>.

Submitted text should be terse, without syntactical complexities, and should end with an unambiguous sentence-terminator. That is, the sentence-terminator should not appear anywhere on the facesheet except at the end of a sentence.

We recommend:
period-space-space
period-carriagereturn-linefeed
semicolon-space
semicolon-carriagereturn-linefeed
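
The character-set rule, the section markers and the recommended sentence terminators can be combined into a simple reader. The following Python sketch is an illustration under stated assumptions, not the IAD software: the function names are hypothetical, and a sentence that ends a section with a bare period or semicolon is also accepted, because the carriage-return-line-feed that follows it belongs to the next section marker.

```python
import re

SECTION_NAMES = ("CLINICAL HISTORY", "ANATOMICAL DIAGNOSIS", "CAUSE OF DEATH")

# Section marker: CRLF, section name, colon, CRLF.
_SECTION_RE = re.compile(r"\r\n(" + "|".join(SECTION_NAMES) + r"):\r\n")

# Recommended terminators: period-space-space, period-CRLF,
# semicolon-space, semicolon-CRLF.
_TERMINATOR_RE = re.compile(r"\.(?: {2}|\r\n)|;(?: |\r\n)")


def illegal_characters(text):
    """Return the characters outside ASCII 10, 13 and 32 through 126."""
    return sorted({c for c in text
                   if not (ord(c) in (10, 13) or 32 <= ord(c) <= 126)})


def split_sections(text):
    """Map each section name found in the facesheet to its body text."""
    parts = _SECTION_RE.split(text)
    # parts is [preamble, name, body, name, body, ...]
    return {parts[i]: parts[i + 1] for i in range(1, len(parts), 2)}


def split_sentences(body):
    """Split a section body at the recommended sentence terminators."""
    sentences = []
    for piece in _TERMINATOR_RE.split(body):
        cleaned = piece.strip().rstrip(".;").strip()
        if cleaned:
            sentences.append(cleaned)
    return sentences


if __name__ == "__main__":
    sample = ("\r\n###123456123456^67^W^M^1985^2^NONE"
              "\r\nCLINICAL HISTORY:\r\nHypertension.\r\nHeart failure."
              "\r\nANATOMICAL DIAGNOSIS:\r\nGallstones.\r\nDiverticula, colon.\r\n")
    for name, body in split_sections(sample).items():
        print(name, split_sentences(body))
```

Because a comma is not a terminator, "Diverticula, colon" stays together as one sentence, which is what allows a diagnosis and its anatomic site to be coded as a unit.
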
IAD records are rendered anonymous so that neither the investigator, the IAD database administrator, nor the contributing institution alone can trace the identity of patients included in the IAD. First, the contributing institution strips or encodes patient identifiers from its submitted records, so that the IAD database administrator cannot know the identity of the patient. The IAD database administrator then provides a new, encoded identifier for the IAD. The resulting record is anonymous to the institution that contributed the autopsy, as well as to the IAD database administrator and to anyone retrieving the autopsy record from the IAD web page. Anyone desiring further information, glass slides or tissue blocks from a particular case would e-mail the IAD database administrator, identifying the (doubly encoded) IAD autopsy record of interest and his/her research objective. The database administrator then decodes the published record number and restores the contributor record code provided by the contributing institution. The database administrator then forwards the institutionally coded record to the institution. At this point, the institution may decide to do nothing, or to establish a collaboration, with or without divulging the patient's identity, according to its own internal procedures.
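
The administrator's half of the double encoding can be sketched as follows. This is a minimal Python illustration, not the procedure actually used for the IAD: the class name IdentifierRegistry and its methods are hypothetical, the choice of the secrets module as the random number generator is an assumption, and the in-memory dictionary stands in for the off-line mapping file.

```python
import secrets


class IdentifierRegistry:
    """Administrator-side table linking published and contributor codes."""

    def __init__(self):
        # published identifier -> (institution, contributor record code)
        self._published_to_contributor = {}

    def publish(self, institution, contributor_code):
        """Assign a random published identifier of at most 12 digits."""
        while True:
            published = str(secrets.randbelow(10 ** 12))
            if published not in self._published_to_contributor:
                break  # identifier is unused, so it is unique
        self._published_to_contributor[published] = (institution, contributor_code)
        return published

    def resolve(self, published):
        """Recover the contributor code so a request can be forwarded."""
        return self._published_to_contributor[published]


if __name__ == "__main__":
    registry = IdentifierRegistry()
    # "X-001" is already an institution-encoded code, not a patient name.
    public_id = registry.publish("Institution A", "X-001")
    print(public_id, registry.resolve(public_id))
```

The published identifier is drawn at random rather than computed from the contributor code, so nothing in the public record can be reversed without the administrator's table, and the table itself holds only institution-encoded codes.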


4. TRANSLATION PROGRAM.


The computer translation program for converting free-text English diagnoses into corresponding SNOMED diagnoses is based upon the public-domain computer translation program, TRANSOFT [6, M source code provided at IAD website]. In the initial processing, the translator separates the free-text portion of the autopsy facesheet into distinct sentences, using the separators described above, as well as additional terms which often serve as concept separators in an autopsy facesheet, as follows:
by
with
showing
through
demonstrating
consistent with

Second, the translator expands text fragments which might otherwise be lost in subsequent steps. Ordinarily, numerals and one-letter and two-letter words are removed in subsequent steps, so that essential numerals and words must be preserved through prior expansion. For example, 'no', 'in', and '21' are ordinarily removed, but may be preserved by the following substitutions:
no => negative
in situ => insitu
in vitro => invitro
21 trisomy => twentyonetrisomy

Third, the translator converts all letters in a sentence to lower case; removes all punctuation, numerals, 1-letter and 2-letter words; and removes all stop words, namely, articles, prepositions, conjunctions, common modifiers, and other low-information words [6]. These three steps leave behind a residual free-text, which can more readily be converted into SNOMED-compatible terms.
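
The three preparatory steps can be sketched in a few lines of Python. This is an illustration only and is not the TRANSOFT source code: the function names are hypothetical, the stop-word list is a small placeholder (the actual list is described in [6]), and the concept separators and substitutions are limited to the examples given above.

```python
import re

# Step 1: concept separators; the longer phrase is listed first so that
# "consistent with" is not split at "with" alone.
CONCEPT_SEPARATORS = ["consistent with", "demonstrating", "showing",
                      "through", "with", "by"]
_SEPARATOR_RE = re.compile(
    r"\b(?:" + "|".join(CONCEPT_SEPARATORS) + r")\b", re.IGNORECASE)

# Step 2: substitutions that protect short words and numerals.
EXPANSIONS = {
    "no": "negative",
    "in situ": "insitu",
    "in vitro": "invitro",
    "21 trisomy": "twentyonetrisomy",
}

# Step 3: placeholder stop-word list (an assumption, not the real list).
STOP_WORDS = {"the", "and", "for", "from", "into"}


def split_concepts(sentence):
    """Break one sentence into concept fragments at the separator words."""
    return [p.strip() for p in _SEPARATOR_RE.split(sentence) if p.strip()]


def expand(text):
    """Apply the protective substitutions, ignoring letter case."""
    for phrase, replacement in EXPANSIONS.items():
        text = re.sub(r"\b" + re.escape(phrase) + r"\b", replacement,
                      text, flags=re.IGNORECASE)
    return text


def reduce_text(text):
    """Lower-case; drop punctuation, numerals, short words and stop words."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return [w for w in text.split() if len(w) > 2 and w not in STOP_WORDS]


def normalize(sentence):
    """Return one list of residual words per concept fragment."""
    return [reduce_text(expand(part)) for part in split_concepts(sentence)]


if __name__ == "__main__":
    print(normalize("Adenocarcinoma in situ, colon, with no metastasis."))
```

The order of the steps matters: if the reduction ran before the substitutions, 'no', 'in' and '21' would already be gone and the negation or karyotype would be lost.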

Finally, the translator attempts to match a single word to a corresponding SNOMED-compatible term; then a two-word term; then a three-word term; and so on, until no more matches are possible. The largest successful match is used for translation. Large, unmatched autopsy facesheet sentences are placed on a list for review by the database administrator, who performs a manual match, and updates the translator dictionary. In many cases, the database administrator can make an obvious match between facesheet free text and SNOMED-compatible terms, such as inflectional and adjectival forms (cyst, cysts, cystic); or synonyms and common abbreviations (ALS, amyotrophic lateral sclerosis, Lou Gehrig's disease).
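
The matching step can be sketched as a dictionary lookup that prefers the longest phrase. The Python below is an illustration under stated assumptions and is not the TRANSOFT implementation: the function name translate is hypothetical, the dictionary holds only a few pairs taken from the sample facesheet, words are matched in their original order, and every phrase length up to the longest dictionary entry is tried at each position.

```python
# Hypothetical translator dictionary; the pairs follow the sample facesheet.
DICTIONARY = {
    "hypertension": "Hypertensive disease, NOS",
    "gallstones": "Biliary calculus, NOS",
    "focal": "Focal",
    "congestion": "Congestion, NOS",
    "abdominal visceral": "Abdominal viscera, NOS",
    "pulmonary congestion": "Pulmonary congestion, NOS",
    "pulmonary emphysema": "Pulmonary emphysema, NOS",
}


def translate(words, dictionary=DICTIONARY):
    """Return (matched terms, unmatched words) for one list of words."""
    max_len = max((len(key.split()) for key in dictionary), default=1)
    terms, unmatched = [], []
    i = 0
    while i < len(words):
        best_len, best_term = 0, None
        # Try one word, then two, then three, and keep the largest match.
        for n in range(1, min(max_len, len(words) - i) + 1):
            key = " ".join(words[i:i + n])
            if key in dictionary:
                best_len, best_term = n, dictionary[key]
        if best_term is None:
            unmatched.append(words[i])  # left for manual review
            i += 1
        else:
            terms.append(best_term)
            i += best_len
    return terms, unmatched


if __name__ == "__main__":
    print(translate(["focal", "pulmonary", "emphysema"]))
    print(translate(["abdominal", "visceral", "congestion"]))
```

Returning the unmatched words separately mirrors the review list described above: the administrator can inspect them and add new entries to the dictionary.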

In addition, the database administrator can anticipate multiple-word medical phrases which might occur in medical texts, using the barrier word method [4,6,7].



5. TABLE 1A. SAMPLE AUTOPSY FACESHEET FOR SUBMISSION TO IAD.


 <CRLF>###123456123456^67^W^M^1985^2^NONE
 <CRLF>CLINICAL HISTORY:
 <CRLF>Hypertension.
 <CRLF>Massive cardiomegaly.
 <CRLF>Heart failure.
 <CRLF>ANATOMICAL DIAGNOSIS:
 <CRLF>Hypertrophy and dilatation, left ventricular myocardium.
 <CRLF>Generalized atherosclerosis, severe.
 <CRLF>Abdominal visceral congestion.
 <CRLF>Pulmonary congestion.
 <CRLF>Pulmonic artery atherosclerosis.
 <CRLF>Focal pulmonary emphysema.
 <CRLF>Bronchopneumonia.
 <CRLF>Gallstones.
 <CRLF>Benign hyperplasia, prostate.
 <CRLF>Adenomatous polyp, rectum.
 <CRLF>Diverticula, colon.
 <CRLF>



6. TABLE 1B. SAMPLE AUTOPSY FACESHEET, TRANSLATED INTO SNOMED-COMPATIBLE TERMS FOR INCLUSION IN IAD.


 ###54321^67^W^M^1985^2^NONE^
 Hypertensive disease, NOS^ .....
 Massive^ Cardiomegaly^ .....
 Heart failure, NOS^ .....
 Hypertrophy, NOS^ Dilatation, NOS^
 Left^ Ventricle, NOS^
 Myocardium, NOS^ .....
 Generalized^
 Atherosclerosis, NOS^
 Severe^ .....
 Abdominal viscera, NOS^
 Congestion, NOS^ .....
 Pulmonary congestion, NOS^ .....
 Pulmonary artery, NOS^
 Atherosclerosis, NOS^ .....
 Focal^ Pulmonary emphysema, NOS^ .....
 Bronchopneumonia, NOS^ .....
 Biliary calculus, NOS^ .....
 Benign^ Hyperplasia of prostate, NOS^ .....
 Adenomatous polyp, NOS^
 Rectum, NOS^ .....
 Diverticulum, NOS^
 Colon, NOS^ .....



7. RESULTS.


On July 20, 1996, the Internet Autopsy Database consisted of 49,351 autopsy facesheets from over a dozen academic medical institutions. There were 99 files containing autopsy facesheets, comprising 59,455,676 bytes of data. In addition, there were 12 supplementary files containing explanatory materials, translation tables, and search demonstration software (Perl source code included).

Patients ranged in age from stillborn to 112 years old, with autopsy dates ranging from 1889 to 1995. There were 956,272 sentence terminators, 2,905,520 SNOMED-compatible terms and 11,333 distinct (used once or more) SNOMED-compatible terms. A summary of these statistics is given in Table 2.



8. TABLE 2. INTERNET AUTOPSY DATABASE, 7/20/1996.


 Size of database in bytes              59,455,676
 No. of cases                           49,351
 No. of sentence terminators           956,272
 No. of  SNOMED-compatible terms     2,905,520
 No. of unique SNOMED-compatible terms  11,333

Number of patients in each decade:

 0 - 9 years                            16,425
 10 - 19 years                           1,839
 20 - 29 years                           2,665
 30 - 39 years                           3,833
 40 - 49 years                           5,412
 50 - 59 years                           6,411
 60 - 69 years                           6,370
 70 - 79 years                           4,219
 80 - 89 years                           1,544
 90 - 99 years                             181
 > 99 years                                  9
 age unknown                               443
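
As a consistency check, the age-group counts in Table 2 can be added and compared with the number of cases. The short Python sketch below does only this arithmetic; the variable names are hypothetical, and the second age group is taken to be 10 - 19 years.

```python
# Age-group counts as listed in Table 2.
AGE_GROUP_COUNTS = {
    "0-9": 16425,
    "10-19": 1839,  # second group, taken to be 10 - 19 years
    "20-29": 2665,
    "30-39": 3833,
    "40-49": 5412,
    "50-59": 6411,
    "60-69": 6370,
    "70-79": 4219,
    "80-89": 1544,
    "90-99": 181,
    ">99": 9,
    "unknown": 443,
}
TOTAL_CASES = 49351  # "No. of cases" in Table 2

if __name__ == "__main__":
    total = sum(AGE_GROUP_COUNTS.values())
    print(total, total == TOTAL_CASES)
    # Share of cases in the first decade of life.
    print(round(AGE_GROUP_COUNTS["0-9"] / TOTAL_CASES, 3))
```

The groups add up to the stated total of 49,351 cases, so every facesheet is counted exactly once, including those with unknown age.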



9. DISCUSSION.


The importance of databases composed of anatomic pathology records (surgical pathology report databases and autopsy databases) has been discussed previously [4,7]. Public access to an autopsy database extends beyond the role of the individual autopsy in patient care, to quality assurance, research, and disease surveillance [7]. In addition to studies that might derive wholly from the Internet Autopsy Database, additional studies might also be conducted that compare data from a private database with data from the public database. In other words, hypotheses derived from a single autopsy or from a series of autopsies could be compared with data collected from a large number of similar cases. Since the patient's age, sex, and year of autopsy are provided with each facesheet, the results on a large, potentially biased autopsy sample could be age-adjusted and sex-adjusted by standard epidemiologic methods.

What would be involved in tracing an autopsy record to an individual patient? An autopsy "spy" might know that an individual of a certain age was autopsied in a specific institution on a certain date. The spy wishes to acquire additional, confidential information from the autopsy database. Names of patients and institutions are omitted from the database, and there is no way of knowing whether any particular institution contributes to the IAD. Even if a particular institution were a known contributor to the IAD, there is no way of knowing whether the institution contributes all its autopsies to the IAD or only selects certain types of autopsies. The spy would have to query the database on the three known patient identifiers: the first digit of the U.S. postal ZIP code or country code of the institution, the age of the patient, and the year that the autopsy was performed. Although this might reduce the possible matches to a relatively small number, further inquiry would require the spy to have specific pathologic information on the patient that could reduce the size of the matching population. If the spy reached a point where a reasonable guess might be made that an IAD record matches the patient, the spy could never be certain of the match, because the database contains no mechanism to confirm identity. In other words, a spy who holds some confidential information about an individual's autopsy record has a chance of acquiring additional autopsy-related confidential information from the IAD, but the additional information obtained would not be verifiable. The additional, unverifiable information would consist only of a listing of SNOMED-compatible terms, devoid of textual details.
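
The query that the spy would run can be sketched as a filter on the three known identifiers. The Python below is an illustration only: the records, field names and function names are hypothetical and are not drawn from the IAD. The same grouping lets a database administrator see how many records share each combination of location, age and year.

```python
from collections import Counter

# Hypothetical published records; field names are illustrative only.
RECORDS = [
    {"id": "100001", "location": "2", "age_years": 67, "year": 1985},
    {"id": "100002", "location": "2", "age_years": 67, "year": 1985},
    {"id": "100003", "location": "2", "age_years": 67, "year": 1986},
    {"id": "100004", "location": "9", "age_years": 34, "year": 1985},
]


def candidate_matches(records, location, age_years, year):
    """Return the records that agree on all three known identifiers."""
    return [r for r in records
            if r["location"] == location
            and r["age_years"] == age_years
            and r["year"] == year]


def group_sizes(records):
    """Count how many records share each (location, age, year) triple."""
    return Counter((r["location"], r["age_years"], r["year"]) for r in records)


if __name__ == "__main__":
    print(len(candidate_matches(RECORDS, "2", 67, 1985)))
    print(min(group_sizes(RECORDS).values()))
```

A combination that occurs only once marks a record that the three identifiers alone would single out, although, as argued above, the spy would still have no means of confirming that the record belongs to the patient in question.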

One potential weakness in the confidentiality of the database lies in the mechanism proposed to retrieve tissue from autopsies of scientific interest. For instance, a researcher might wish to embark on a molecular biologic study of tissue samples of a rare neoplasm. Using the publicly available Internet Autopsy Database, he notes that there are 22 autopsies in which this rare lesion was found. Institutions maintain paraffinized tissue blocks of autopsy material that may be suitable for molecular biology studies [8]. The researcher contacts the database administrator (email address available at the IAD website), who forwards the researcher's message to the contributing institution. The researcher might then contact the institution and ask for the tissue blocks of interest, as well as the autopsy report. An unscrupulous person might pose as a researcher to obtain information under false pretenses. Under current guidelines, inquiries to the IAD are all referred to the database administrator, who then contacts the institution(s) that contributed the autopsy facesheets of interest and gives them the name and contact information pertaining to the researcher. The institution then contacts the researcher at its own discretion. Institutions that do not wish to pursue contact need not do so. Institutions that contact the researcher must take any necessary precautions to protect the confidentiality of their patients.

The IAD can be regarded as an experiment into a new era in which patient data records are made available on the Internet. The challenge in developing such databases is to protect patient confidentiality, to attract contributors to the database, and to provide data of value to the public.



10. TABLE 3. METHODS FOR MAINTAINING PATIENT CONFIDENTIALITY


1. Encode autopsy/patient identifiers by the contributing institution and again by the IAD database administrator, so that each autopsy appears with a doubly encoded identifier number that cannot be linked to a patient by the IAD database administrator, the contributing institution, or any user of the IAD.

2. Include autopsy data from a worldwide collection of institutions, and omit the names of the contributing institutions.

3. Identify patient location only as the first digit of the postal zip code (in the case of U.S. autopsies), or as the multiple-digit international telephone exchange in the case of contributions from foreign countries.

4. Use a large database (in excess of 40,000 cases).
     

5. Omit the exact dates of autopsy and ages of patients autopsied (permitting only the age in years and year of autopsy).

6. Omit all free text, restricting pathologic findings to a listing of SNOMED-compatible terms derived from the original autopsy facesheet.



11. REFERENCES.




1. Carter JR, Nash NP, Cechner RL, Platt RD.
Proposal for a national autopsy data bank: a potential major contribution of pathologists to the health care of the nation.
Am J Clin Pathol. 1981 Oct;76(4 Suppl):597-617.
PMID: 7282646
Last tested: July 10, 2009.

2. Kircher T, Carter JR, Sinton E.
The National Autopsy Data Bank.
Pathologist. 1985 Nov;39(11):22-26.
PMID: 10274305
Last tested: July 10, 2009.

3. Peery TM.
The Autopsy Data Bank: a proposal for pathologists to contribute to the health care of the nation.
Am J Clin Pathol. 1978 Feb;69(2 Suppl):258-259.
PMID: 626172
Last tested: July 10, 2009.

4. Moore GW, Berman JJ, Hanzlick RL, Buchino JJ, Hutchins GM.
A prototype Internet autopsy database. 1625 consecutive fetal and neonatal autopsy facesheets spanning 20 years.
Arch Pathol Lab Med. 1996 Aug;120(8):782-785.
PMID: 8718907
Last tested: July 10, 2009.

5. Court C.
GMC finds doctors not guilty in consent case.
British Medical Journal. 1995;311:1245-1246.

6. Moore GW, Berman JJ.
Object-oriented controlled-vocabulary translator using TRANSOFT + HyperPAD.
Proc Annu Symp Comput Appl Med Care. 1991:973-975.
PMID: 1807773.
Last tested: July 10, 2009.

7. Berman JJ, Moore GW.
SNOMED-Encoded surgical pathology databases: a tool for epidemiologic investigation.
Modern Pathol. 1996;9:944-950.

8. Kleiner DE, Emmert-Buck MR, Liotta LA.
Necropsy as a research method in the age of molecular pathology.
Lancet. 1995 Oct 7;346(8980):945-948. Review.
PMID: 7564732
Last tested: July 10, 2009.

12. ADDITIONAL READINGS.

1. Coxeter HSM, Greitzer SL.
Geometry Revisited.
New Mathematical Library.
Washington, DC: Math Assn America. 1967;:.
ISBN: 0883856190, 207 pages.

2. Honsberger R.
Episodes in Nineteenth and Twentieth Century Euclidean Geometry.
New Mathematical Library. Washington DC: Math Assn America. 1996. Second printing. 2005 ;:.
ISBN: 0883856395, 174 pages.

3. Coxeter HSM.
Introduction to Geometry.
New York: John Wiley & Sons, Inc. 1961;:.
Library Congress Catalogue # 72-93903.
SBN: 471-18283.


4. Moore GW, Berman JJ.
Cell growth simulations predicting polyclonal origins for 'monoclonal' tumors.
Cancer Lett. 1991 Nov;60(2):113-119.
PMID: 1933835.
PubMed Entry
Full Text: http://www.netautopsy.org/monoclon.htm
Public-domain open-source code: http://www.netautopsy.org/monoclon.htm#table1
Last tested: July 10, 2009.

5. Berman JJ, Moore GW.
Spontaneous regression of residual tumour burden: prediction by Monte Carlo simulation.
Anal Cell Pathol. 1992 Sep;4(5):359-368.
PMID: 1445794.
PubMed Entry
Full Text: http://www.netautopsy.org/sponregr.htm
Last tested: July 10, 2009.

6. Berman JJ, Moore GW.
The role of cell death in the growth of preneoplastic lesions: a Monte Carlo simulation model.
Cell Prolif. 1992 Nov;25(6):549-557.
PMID: 1457604.
PubMed Entry
Full Text: http://www.netautopsy.org/celdeath.htm
Last tested: July 10, 2009.

8. Moore GW, Berman JJ.
Anatomic Pathology Data Mining.
In: Cios KJ, ed. Medical Data Mining and Knowledge Discovery.
2001. XVIII, 502 pp. 98 figs., 98 tabs. Hardcover.
ISBN: 3-7908-1340-0.
Copyright Springer-Verlag: Berlin/Heidelberg 1999.
Full Text: http://www.netautopsy.org/apdmchap.htm
Last tested: July 10, 2009.

9. Berman JJ.
Tumor classification: molecular analysis meets Aristotle.
BMC Cancer. 2004 Mar 17;4:10.
PMID: 15113444
PubMed Entry
Aristotle (384-322 BCE), Greek philosopher.
This article is among the all-time most-viewed articles in BMC Cancer, and, as of September 2008, has been downloaded about 15,000 times from BiomedCentral.
Last tested: July 10, 2009.

10. Berman JJ.
Tumor taxonomy for the developmental lineage classification of neoplasms.
BMC Cancer. 2004 Nov 30;4(1):88.
PMID: 15571625.
PubMed Entry
Last tested: July 10, 2009.

11. Berman JJ.
Modern classification of neoplasms: reconciling differences between morphologic and molecular approaches.
BMC Cancer 2005, 5:100.
PMID: 16092965
PubMed Entry
Last tested: July 10, 2009.

12. Berman JJ.
Developmental Lineage Classification and Taxonomy of Neoplasms.
http://www.julesberman.info/devclass.htm
Last tested: July 10, 2009.

13. Berman JJ.
Doublet method for very fast autocoding.
BMC Med Inform Decis Mak. 2004 Sep 15;4:16.
PMID: 15369595
PubMed Entry
Last tested: July 10, 2009.

14. Berman JJ.
Resource page.
http://www.julesberman.info/resource.htm
Last tested: July 10, 2009.

15. Berman JJ, Moore GW.
Implementing an RDF schema for pathology images.
http://www.julesberman.info/spec2img.htm
Last tested: July 10, 2009.

16. Berman JJ.
Chronology of Earth.
http://www.julesberman.info/chronos.htm
Last tested: July 10, 2009.

17. Berman JJ.
Biomedical Informatics.
Boston, Toronto, London, Singapore: Jones & Bartlett Publishers; 1 edition (October 18, 2006)
ISBN-10: 0763741353, 459 pages.
ISBN-13: 978-0763741358, 459 pages.
http://www.jbpub.com/catalog/9780763741358/
http://www.julesberman.info/
Last tested: July 10, 2009.

18. Berman JJ.
Perl Pogramming for Medicine and Biology.
Boston, Toronto, London, Singapore: Jones & Bartlett Publishers; 1 edition (April 6, 2007)
ISBN-10: 076374333X, 407 pages.
ISBN-13: 978-0763743338, 407 pages.
http://www.jbpub.com/catalog/9780763743338/
http://www.julesberman.info/
Last tested: July 10, 2009.

19. Berman JJ.
Perl: The Programming Language.
Boston, Toronto, London, Singapore: Jones & Bartlett Publishers. 2009;:.
ISBN: 9780763757588, 52 pages.
http://www.jbpub.com/catalog/9780763757588/
http://www.julesberman.info/
Last tested: July 10, 2009.

20. Berman JJ.
Ruby Programming for Medicine and Biology.
Boston, Toronto, London, Singapore: Jones & Bartlett Pub; 1 edition (September 13, 2007)
ISBN-10: 0763750905, 378 pages.
ISBN-13: 978-0763750909, 378 pages.
http://www.jbpub.com/catalog/9780763750909/
http://www.julesberman.info/
Last tested: July 10, 2009.

21. Berman JJ.
Ruby: The Programming Language.
Boston, Toronto, London, Singapore: Jones & Bartlett Publishers. 2009;:.
ISBN: 9780763757571, 46 pages.
http://www.jbpub.com/catalog/9780763757571/
Last tested: July 10, 2009.

22. Berman JJ.
Neoplasms: Principles of Development and Diversity.
Boston, Toronto, London, Singapore: Jones & Bartlett Publishers. 2008 Oct 1.
ISBN: 9780763755706, 464 pages.
http://www.jbpub.com/catalog/9780763755706/
Last tested: July 10, 2009.

23. Berman JJ, with Moore GW.
Precancer: The Beginning and the End of Cancer.
Boston, Toronto, London, Singapore: Jones and Bartlett. 2009 Aug 11;:.
ISBN 9780763777845, 200 pages.
http://www.jbpub.com/catalog/9780763777845/
Last tested: July 10, 2009.

24. Berman JJ.
Web site: http://www.julesberman.info/
Last tested: July 10, 2009.

25. Berman JJ.
Blog site: http://julesberman.blogspot.com/
Last tested: July 10, 2009.

26. Hanahan D, Weinberg RA.
The hallmarks of cancer.
Cell 2000;100:57-70.

27. Kansal AR, Torquato S, Harsh GR IV, Chiocca EA, Deisboeck TS.
Simulated brain tumor growth dynamics using a three-dimensional cellular automaton.
J Theor Biol. 2000 Apr 21;203(4):367-382.
PMID: 10736214.
PubMed Entry
Last tested: July 10, 2009.

28. Kansal AR, Torquato S, Chiocca EA, Deisboeck TS.
Emergence of a subpopulation in a computational model of tumor growth.
J Theor Biol. 2000 Dec 7;207(3):431-441.
PMID: 11082311
PubMed Entry
Last tested: July 10, 2009.

29. Kansal AR, Torquato S.
Globally and locally minimal weight spanning tree networks.
Physica A. 2001;301:601-619.

30. Kansal AR, Trimmer J.
Application of predictive biosimulation within pharmaceutical clinical development: examples of significance for translational medicine and clinical trial design.
Syst Biol (Stevenage). 2005 Dec;152(4):214-220.
PMID: 16986263
PubMed Entry
Last tested: July 10, 2009.

31. Kansal AR.
Modeling approaches to type 2 diabetes.
Diabetes Technol Ther. 2004 Feb;6(1):39-47. Review.
PMID: 15000768.
PubMed Entry
Last tested: July 10, 2009.

32. Kansal AR, Torquato S, Stillinger FH.
Diversity of order and densities in jammed hard-particle packings.
Phys Rev E Stat Nonlin Soft Matter Phys. 2002 Oct;66(4 Pt 1):041109. Epub 2002 Oct 24.
PMID: 12443179.
PubMed Entry
Last tested: July 10, 2009.

33. Deisboeck TS, Berens ME, Kansal AR, Torquato S, Stemmer-Rachamimov AO, Chiocca EA.
Pattern of self-organization in tumour systems: complex growth dynamics in a novel brain tumour spheroid model.
Cell Prolif. 2001 Apr;34(2):115-134.
PMID: 11348426
PubMed Entry
Last tested: July 10, 2009.

34. Kansal AR, Torquato S, Harsh GR IV, Chiocca EA, Deisboeck TS.
Cellular automaton of idealized brain tumor growth dynamics.
Biosystems. 2000 Feb;55(1-3):119-127.
PMID: 10745115
PubMed Entry
Last tested: July 10, 2009.

35. Schmitz JE, et al.
A cellular automaton model of brain tumor treatment and resistance.
J Theor Med. 2002;4:223-239.

36. Holash J, et al.
Vessel cooption, regression, and growth in tumors mediated by angiopoietins and VEGF.
Science. 1999;284:1994-1998.

37. Helmlinger G, et al.
Solid stress inhibits the growth of multicellular tumor spheroids.
Nat Biotechnol. 1997;15:778-783.

38. Kitano H.
Cancer as a robust system: implications for anticancer therapy.
Nat Rev Cancer. 2004 Mar;4(3):227-235. Review.
PMID: 14993904.
PubMed Entry
Last tested: July 10, 2009.

39. Kitano H, Oda K, Kimura T, Matsuoka Y, Csete M, Doyle J, Muramatsu M.
Metabolic syndrome and robustness tradeoffs.
Diabetes. 2004 Dec;53 Suppl 3:S6-S15. Review.
PMID: 15561923.
PubMed Entry
Last tested: July 10, 2009.

40. Kitano H.
Biological robustness.
Nat Rev Genet. 2004 Nov;5(11):826-37. Review.
PMID: 15520792.
PubMed Entry
Last tested: July 10, 2009.

41. Kyoda K, Baba K, Onami S, Kitano H.
DBRF-MEGN method: an algorithm for deducing minimum equivalent gene networks from large-scale gene expression profiles of gene deletion mutants.
Bioinformatics. 2004 Nov 1;20(16):2662-75. Epub 2004 May 27.
PMID: 15166016.
PubMed Entry
Last tested: July 10, 2009.

42. Kitano H.
Cancer robustness: tumour tactics.
Nature. 2003 Nov 13;426(6963):125.
PMID: 14614483.
PubMed Entry
Last tested: July 10, 2009.

43. Gevertz JL, Gillies GT, Torquato S.
Simulating tumor growth in confined heterogeneous environments.
Phys Biol. 2008 Sep 29;5(3):36010.
PMID: 18824788.
PubMed Entry
Last tested: July 10, 2009.

44. Gevertz JL, Torquato S.
A novel three-phase model of brain tissue microstructure.
PLoS Comput Biol. 2008 Aug 15;4(8):e1000152.
PMID: 18704170.
PubMed Entry
Last tested: July 10, 2009.

45. Gevertz JL, Torquato S.
Modeling the effects of vasculature evolution on early brain tumor growth.
J Theor Biol. 2006 Dec 21;243(4):517-531. Epub 2006 Jul 15.
PMID: 16938311.
PubMed Entry
Last tested: July 10, 2009.

46. Conway JH, Torquato S.
Packing, tiling, and covering with tetrahedra.
Proc Natl Acad Sci U S A. 2006 Jul 11;103(28):10612-10617. Epub 2006 Jul 3.
PMID: 16818891.
PubMed Entry
Last tested: July 10, 2009.

47. Deisboeck TS, Berens ME, Kansal AR, Torquato S, Stemmer-Rachamimov AO, Chiocca EA.
Pattern of self-organization in tumour systems: complex growth dynamics in a novel brain tumour spheroid model.
Cell Prolif. 2001 Apr;34(2):115-134.
PMID: 11348426.
PubMed Entry
Last tested: July 10, 2009.

48. Wilimas JA, Dow LW, Douglass EC, Jenkins JJ 3rd, Jacobson RJ, Moohr J, Fialkow PJ.
Evidence for clonal development of Wilms' tumor.
Am J Pediatr Hematol Oncol. 1991 Spring;13(1):26-28.
PMID: 1851399.
PubMed Entry
Last tested: July 10, 2009.

49. Reddy AL, Fialkow PJ.
Evidence that weak promotion of carcinogen-initiated cells prevents their progression to malignancy.
Carcinogenesis. 1990 Dec;11(12):2123-2126.
PMID: 2124950.
PubMed Entry
Last tested: July 10, 2009.

50. Fialkow PJ.
Stem cell origin of human myeloid blood cell neoplasms.
Verh Dtsch Ges Pathol. 1990;74:43-47. Review.
PMID: 1708632.
PubMed Entry
Last tested: July 10, 2009.

51. Fialkow PJ, Singer JW, Raskind WH, Adamson JW, Jacobson RJ, Bernstein ID, Dow LW, Najfeld V, Veith R.
Clonal development, stem-cell differentiation, and clinical remissions in acute nonlymphocytic leukemia.
N Engl J Med. 1987 Aug 20;317(8):468-473.
PMID: 3614291.
PubMed Entry
Last tested: July 10, 2009.

52. Jacobson RJ, Temple MJ, Singer JW, Raskind W, Powell J, Fialkow PJ.
A clonal complete remission in a patient with acute nonlymphocytic leukemia originating in a multipotent stem cell.
N Engl J Med. 1984 Jun 7;310(23):1513-1517.
PMID: 6717542.
PubMed Entry
Last tested: July 10, 2009.

53. Moulton-Levy P, Jackson CE, Levy HG, Fialkow PJ.
Multiple cell origin of traumatically induced keloids.
J Am Acad Dermatol. 1984 Jun;10(6):986-8.
PMID: 6736343.
PubMed Entry
Last tested: July 10, 2009.

54. Reddy AL, Fialkow PJ.
Papillomas induced by initiation-promotion differ from those induced by carcinogen alone.
Nature. 1983 Jul 7-13;304(5921):69-71.
PMID: 6408484.
PubMed Entry
Last tested: July 10, 2009.

55. Fialkow PJ, Singer JW, Adamson JW, Berkow RL, Friedman JM, Jacobson RJ, Moohr JW.
Acute nonlymphocytic leukemia: expression in cells restricted to granulocytic and monocytic differentiation.
N Engl J Med. 1979 Jul 5;301(1):1-5.
PMID: 286882.
PubMed Entry
Last tested: July 10, 2009.

56. Fialkow PJ.
Clonal origin of human tumors.
Annu Rev Med. 1979;30:135-143. Review.
PMID: 400484.
PubMed Entry
Last tested: July 10, 2009.

57. Fialkow PJ, Najfeld V, Reddy AL, Singer J, Steinmann L.
Chronic lymphocytic leukaemia: Clonal origin in a committed B-lymphocyte progenitor.
Lancet. 1978 Aug 26;2(8087):444-6.
PMID: 79806.
PubMed Entry
Last tested: July 10, 2009.

58. Adamson JW, Fialkow PJ.
The pathogenesis of myeloproliferative syndromes.
Br J Haematol. 1978 Mar;38(3):299-303. Review.
PMID: 346048.
PubMed Entry
Last tested: July 10, 2009.

59. Fialkow PJ, Jackson CE, Block MA, Greenawald KA.
Multicellular origin of parathyroid "adenomas".
N Engl J Med. 1977 Sep 29;297(13):696-698.
PMID: 895789.
PubMed Entry
Last tested: July 10, 2009.

60. Fialkow PJ, Jacobson RJ, Papayannopoulou T.
Chronic myelocytic leukemia: clonal origin in a stem cell common to the granulocyte, erythrocyte, platelet and monocyte/macrophage.
Am J Med. 1977 Jul;63(1):125-130.
PMID: 267431.
PubMed Entry
Last tested: July 10, 2009.

61. Adamson JW, Fialkow PJ, Murphy S, Prchal JF, Steinmann L.
Polycythemia vera: stem-cell and probable clonal origin of the disease.
N Engl J Med. 1976 Oct 21;295(17):913-916.
PMID: 967201.
PubMed Entry
Last tested: July 10, 2009.

62. Barr RD, Fialkow PJ.
Clonal origin of chronic myelocytic leukemia.
N Engl J Med. 1973 Aug 9;289(6):307-309.
PMID: 4515677.
PubMed Entry
Last tested: July 10, 2009.

63. Fialkow PJ, Klein G, Clifford P.
Second malignant clone underlying a Burkitt-tumor exacerbation.
Lancet. 1972 Sep 23;2(7778):629-631.
PMID: 4116779.
PubMed Entry
Last tested: July 10, 2009.

64. Fialkow PJ.
Single or multiple cell origin for tumors?
N Engl J Med. 1971 Nov 18;285(21):1198-1199.
PMID: 5096643.
PubMed Entry
Last tested: July 10, 2009.

65. Fialkow PJ.
Is lyonisation total in man?
Lancet. 1970 Aug 8;2(7667):315.
PMID: 4194398.
PubMed Entry
Last tested: July 10, 2009.

66. Fialkow PJ, Klein G, Gartler SM, Clifford P.
Clonal origin for individual Burkitt tumours.
Lancet. 1970 Feb 21;1(7643):384-386.
PMID: 4189689.
PubMed Entry
Last tested: July 10, 2009.

67. Carter JR.
The office of decedent affairs.
JAMA. 1992 Jan 8;267(2):235-236.
PMID: 1727517
PubMed Entry
Last tested: July 10, 2009.

68. Carter JR.
The problematic death certificate.
N Engl J Med. 1985 Nov 14;313(20):1285-1286.
PMID: 4058510
PubMed Entry
Last tested: July 10, 2009.

69. Kircher T, Carter JR, Sinton E.
The National Autopsy Data Bank.
Pathologist. 1985 Nov;39(11):22-26.
PMID: 10274305
PubMed Entry
Last tested: July 10, 2009.

70. Carter JR, Nash NP, Cechner RL, Platt RD.
Proposal for a national autopsy data bank: a potential major contribution of pathologists to the health care of the nation.
Am J Clin Pathol. 1981 Oct;76(4 Suppl):597-617.
PMID: 7282646
PubMed Entry
Last tested: July 10, 2009.

71. Carter JR.
National autopsy data bank: potentially useful to so many, for so little.
Pathologist. 1981 Oct;35(10):548-553.
PMID: 10253253
PubMed Entry
Last tested: July 10, 2009.

72. Carter JR.
A renascence role of anatomic pathology in modern medicine.
Hum Pathol. 1977 May;8(3):237-241.
PMID: 323134
PubMed Entry
Last tested: July 10, 2009.

73. Cechner RL, Carter JR.
Storage and retrieval of SNOP-coded pathologic diagnoses using offsite computing and optical character recognizing systems.
Am J Clin Pathol. 1976 May;65(5):654-61.
PMID: 16535807
PubMed Entry
Last tested: July 10, 2009.

74. Peery TM.
The Autopsy Data Bank: a proposal for pathologists to contribute to the health care of the nation.
Am J Clin Pathol. 1978 Feb;69(2 Suppl):258-259.
PMID: 626172
PubMed Entry
Last tested: July 10, 2009.

75. Williams MJ, Peery TM.
The autopsy, a beginning, not an end.
Am J Clin Pathol. 1978 Feb;69(2 Suppl):215-216.
PMID: 626160
PubMed Entry
Last tested: July 10, 2009.

76. Moore GW, Hutchins GM.
The persistent importance of autopsies.
Mayo Clin Proc. 2000 Jun;75(6):557-558.
PMID: 10852414
PubMed Entry
Last tested: July 10, 2009.

77. Hutchins GM, Berman JJ, Moore GW, Hanzlick R.
Practice guidelines for autopsy pathology: autopsy reporting. Autopsy Committee of the College of American Pathologists.
Arch Pathol Lab Med. 1999 Nov;123(11):1085-92.
PMID: 10539932
PubMed Entry
Last tested: July 10, 2009.

78. Berman JJ, Moore GW, Hutchins GM.
Internet autopsy database.
Hum Pathol. 1997 Apr;28(4):393-394.
PMID: 9104935
PubMed Entry
Last tested: July 10, 2009.

79. Moore GW, Berman JJ, Hanzlick RL, Buchino JJ, Hutchins GM.
A prototype Internet autopsy database. 1625 consecutive fetal and neonatal autopsy facesheets spanning 20 years.
Arch Pathol Lab Med. 1996 Aug;120(8):782-785.
PMID: 8718907
PubMed Entry
Last tested: July 10, 2009.

80. Berman JJ, Moore GW, Hutchins GM.
Maintaining patient confidentiality in the public domain Internet Autopsy Database (IAD).
Proc AMIA Annu Fall Symp. 1996:328-332.
PMID: 8947682
PubMed Entry
Last tested: July 10, 2009.

81. Baumann RP, Moore GW.
[Comparison of 2 series of autopsies observed at Johns-Hopkins Medical Center, Baltimore (JHMI) and at the Neuchâtel Institute of Pathology (INAP)]
Schweiz Med Wochenschr. 1990 Dec 8;120(49):1876-1879. [French].
PMID: 2263930
PubMed Entry
Last tested: July 10, 2009.

82. Moore GW, Miller RE, Hutchins GM.
Determining cause of death in 45,564 autopsy reports.
Theor Med. 1988 Jun;9(2):179-186.
PMID: 3413706
PubMed Entry
Last tested: July 10, 2009.

83. Moore GW, Boitnott JK, Miller RE, Eggleston JC, Hutchins GM.
Integrated pathology reporting, indexing, and retrieval system using natural language diagnoses.
Mod Pathol. 1988 Jan;1(1):44-50.
PMID: 3070549
PubMed Entry
Last tested: July 10, 2009.

84. Moore GW, Hutchins GM, Miller RE.
Strategies for searching medical natural language text. Distribution of words in the anatomic diagnoses of 7000 autopsy subjects.
Am J Pathol. 1984 Apr;115(1):36-41.
PMID: 6546837
PubMed Entry
Last tested: July 10, 2009.

85. Kleiner DE, Emmert-Buck MR, Liotta LA.
Necropsy as a research method in the age of molecular pathology.
Lancet. 1995 Oct 7;346(8980):945-948. Review.
PMID: 7564732
PubMed Entry
Last tested: July 10, 2009.

86. Hall PA, Lemoine NR.
Comparison of manual data coding errors in two hospitals.
J Clin Pathol. 1986 Jun;39(6):622-626.
PMID: 3722414
PubMed Entry
Last tested: July 10, 2009.

87. Coles EC, Slavin G.
An evaluation of automatic coding of surgical pathology reports.
J Clin Pathol. 1976 Jul;29(7):621-625.
PMID: 977772
PubMed Entry
Last tested: July 10, 2009.

88. Earlam R.
Körner, nomenclature, and SNOMED.
Br Med J (Clin Res Ed). 1988 Mar 26;296(6626):903-905.
PMID: 3129068
PubMed Entry
Last tested: July 10, 2009.

89. Dodd W.
Körner, nomenclature, and SNOMED.
Br Med J (Clin Res Ed). 1988 Apr 23;296(6630):1198-1199.
PMID: 3132268
PubMed Entry
Last tested: July 10, 2009.

90. Earlam R.
Surgical audit in a district general hospital: a stimulus for improving patient care.
Ann R Coll Surg Engl. 1987 Sep;69(5):251-252.
PMID: 3674693
PubMed Entry
Last tested: July 10, 2009.

91. Brannigan VM.
Software quality regulation under the Safe Medical Devices Act of 1990: hospitals are now the canaries in the software mine.
Proc Annu Symp Comput Appl Med Care. 1991:238-242.
PMID: 1807596
PubMed Entry
Last tested: July 10, 2009.

92. Moore GW, Berman JJ.
Object-oriented controlled-vocabulary translator using TRANSOFT + HyperPAD.
Proc Annu Symp Comput Appl Med Care. 1991:973-975.
PMID: 1807773.
PubMed Entry
Last tested: July 10, 2009.

93. Stuart-Buttle CD, Brown PJ, Price C, O'Neil M, Read JD.
The Read Thesaurus--creation and beyond.
Stud Health Technol Inform. 1997;43 Pt A:416-420.
PMID: 10184896
PubMed Entry
Last tested: July 10, 2009.

94. Stuart-Buttle CD, Read JD, Sanderson HF, Sutton YM.
A language of health in action: Read Codes, classifications and groupings.
Proc AMIA Annu Fall Symp. 1996:75-79.
PMID: 8947631
PubMed Entry
Last tested: July 10, 2009.

95. Read JD, Sanderson HF, Drennan YM.
Terming, encoding, and grouping.
Medinfo. 1995;8 Pt 1:56-59.
PMID: 8591263
PubMed Entry
Last tested: July 10, 2009.



Last updated: 7/10/2009, by G. William Moore, MD, PhD.