UNIFIED MEDICAL LANGUAGE SYSTEM METATHESAURUS (UMLS-M) OF THE U.S. NATIONAL LIBRARY OF MEDICINE (USNLM): THE MOST COMPREHENSIVE, PUBLICLY AVAILABLE LIST OF STANDARDIZED MEDICAL TERMINOLOGY IN THE WORLD. WHAT IS ITS CONCORDANCE RATE FOR GENERAL PATHOLOGY AND EMBRYOLOGY TEXT?
INPUT TEXTS: SINARD'S OUTLINES IN PATHOLOGY (25 CHAPTERS, POPULAR REVIEW TEXT FOR PATHOLOGY BOARDS, COVERING ALL MAJOR AREAS OF ANATOMIC PATHOLOGY) AND STREETER'S DEVELOPMENTAL HORIZONS IN HUMAN EMBRYOS AND RELATED REFERENCES. BOTH WERE COMPUTER-ENCODED INTO UMLS, WITH AN ENRICHED SYNONYM LIST. CONCORDANCE: MEDICALLY SIGNIFICANT TERM PRESENT IN THE TEXTBOOK AND ALSO CAPTURED BY THE ENCODING PROGRAM. FALSE NEGATIVE: UMLS CONCEPT NOT PRESENT. AMBIGUOUS TERMS, AND COMPOUND TERMS CONTAINING SUBCONCEPTS, ARE INDEXED REDUNDANTLY.
UNIFIED MEDICAL LANGUAGE SYSTEM (UMLS): DEVELOPED BY U.S. NATIONAL LIBRARY OF MEDICINE (USNLM) IN 1986. PURPOSE: AID DEVELOPMENT OF SYSTEMS TO RETRIEVE ELECTRONIC BIOMEDICAL INFORMATION. http://www.nlm.nih.gov/research/umls/ LAST UPDATED: January 1, 2000. METATHESAURUS SIZE: 113,699,627 BYTES. CONCEPT UNIQUE IDENTIFIERS (CUIs): 729,248, MAX=C0813178, RETIRED=83,930. SYNONYMS: 1,598,176 OVER 50 SOURCE VOCABULARIES. OVER 20 PARTIAL TRANSLATIONS INTO FOREIGN LANGUAGES.
CELLULAR BLUE NEVUS (C0334448). BLUE NEVUS (C0206736). CELL (C0007634). BLUE (C0332584). NEVUS (C0027960).
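The redundant indexing of a compound term and its subconcepts, as in the CELLULAR BLUE NEVUS example, might be sketched as follows. This is only an illustration: the five CUIs are the ones listed above, but the tiny `LEXICON`, the `VARIANTS` map (CELLULAR to CELL) and the function `index_term` are hypothetical stand-ins, not the encoding program itself.

```python
# Hypothetical mini-lexicon; the CUIs are the ones listed in the example above.
LEXICON = {
    "cellular blue nevus": "C0334448",
    "blue nevus": "C0206736",
    "cell": "C0007634",
    "blue": "C0332584",
    "nevus": "C0027960",
}

# Assumed lexical-variant map (not taken from UMLS): adjective -> base noun.
VARIANTS = {"cellular": "cell"}


def index_term(term):
    """Return (phrase, CUI) for every contiguous sub-phrase found in LEXICON."""
    words = term.lower().split()
    hits = []
    for length in range(len(words), 0, -1):          # longest phrases first
        for start in range(len(words) - length + 1):
            phrase = " ".join(words[start:start + length])
            # Try the phrase itself, then its assumed lexical variant.
            key = phrase if phrase in LEXICON else VARIANTS.get(phrase)
            if key in LEXICON:
                hits.append((phrase, LEXICON[key]))
    return hits


for phrase, cui in index_term("CELLULAR BLUE NEVUS"):
    print(phrase, cui)
```

Under these assumptions the full term, the two-word subterm and each single word are all indexed, so the same stretch of text is retrievable at several levels of specificity.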
NATURAL-LANGUAGE MEDICAL TEXT: SEQUENCE OF MEDICAL CONCEPTS SEPARATED BY GRAMMATICAL OBJECTS. THE GRAMMATICAL OBJECTS, OR BARRIER WORDS: NUMERALS, PUNCTUATION, SINGLE LETTERS, ARTICLES, PREPOSITIONS, AND COMMON VERBS AND MODIFIERS. MEDICAL CONCEPTS, OR KEYWORDS: ARE ONE-WORD OR MULTIPLE-WORD TERMS CONSISTING OF MEDICALLY SIGNIFICANT WORDS.
LICHEN SIMPLEX CHRONICUS . CHRONIC FORM of any of above with IRRITATION and TRAUMA . EPIDERMIS undergoes a PSORIASIFORM THICKENING but with an increased THICKNESS of the GRANULAR LAYER . SCARRING and BROADENING of DERMAL PAPILLAE .
barrier words in lower case. KEYWORDS IN UPPER CASE.
TEXT NAME UMLS CUI
LICHEN SIMPLEX CHRONICUS C0149922
CHRONIC FORM C0205179 C0376315
of C0456627
any C0205392*
of C0456627
above C0205103
with C0332287
IRRITATION C0441718
and C0332287*
TRAUMA C0548346
EPIDERMIS C0014518
undergoes
a C0205447*
PSORIASIFORM THICKENING C0033860* C0332527
but C0332287*
with C0332287
an C0205447*
increased C0205216
thickness C0205400
of C0456627
the C0205435*
GRANULAR LAYER C0205247 C0205274*
SCARRING C0036287
and C0332287*
broadening C0332464*
of C0456627
DERMAL PAPILLAE C0221927 C0205312*
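The barrier-word segmentation described above can be sketched as a simple scan that closes the current keyword whenever a barrier word is met. The barrier list here is a short, assumed sample (a working list would be much longer), and `extract_keywords` is a hypothetical helper name.

```python
import re

# Assumed, abbreviated barrier-word list; a real list would be far longer.
BARRIER_WORDS = {"a", "an", "the", "of", "any", "above", "with", "and",
                 "but", "undergoes", "increased"}


def extract_keywords(text):
    """Split text into runs of medically significant words (keywords)."""
    keywords, current = [], []
    # Tokens are words, numerals, or single punctuation marks.
    for token in re.findall(r"[A-Za-z][A-Za-z'-]*|\d+|[^\sA-Za-z\d]", text):
        is_barrier = (
            not token[0].isalpha()              # numerals and punctuation
            or len(token) == 1                  # single letters
            or token.lower() in BARRIER_WORDS   # articles, prepositions, etc.
        )
        if is_barrier:
            if current:                         # a barrier ends the keyword
                keywords.append(" ".join(current).upper())
                current = []
        else:
            current.append(token)
    if current:
        keywords.append(" ".join(current).upper())
    return keywords


sample = ("Lichen simplex chronicus. Chronic form of any of above "
          "with irritation and trauma.")
print(extract_keywords(sample))
```

Each keyword returned this way would then be looked up in the Metathesaurus, first as a whole and then by its subterms.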
GUT TRACT and its DERIVATIVES . at this same time the PHARYNGEAL POUCHES , which heretofore have been relatively simple LATERAL EXPANSIONS of GUT EPITHELIUM intervening between the AORTIC ARCHES , are taking the form of SPECIALIZED STRUCTURES . one can RECOGNIZE the beginning TRANSFORMATION into an AUDITORY TUBE and TYMPANUM , also the PRIMORDIA of the THYMUS , LATERAL THYROID , and SUPERIOR and INFERIOR PARATHYROID GLANDS .
barrier words in lower case. KEYWORDS IN UPPER CASE.
TEXT NAME UMLS CUI
GUT TRACT C0699818 C0332208*
and C0332287*
its C0027344*
DERIVATIVES C0243070
at C0332285*
this C0205435*
same C0445243
time C0040213
the C0205435*
PHARYNGEAL POUCHES C0231067*
which C0043237*
heretofore C0332152*
have C0605770*
been C0392148*
relatively C0205345*
simple C0205347
LATERAL EXPANSIONS C0205091 C0205229*
of C0456627
GUT EPITHELIUM C0699818 C0014603
intervening C0205102
between C0205102
the C0205435*
AORTIC ARCHES C0442005
are C0392148*
taking
the C0205435*
form C0376315
of C0456627
SPECIALIZED STRUCTURES C0205548 C0678594*
one C0205429
can C0808716
RECOGNIZE C0524637*
the C0205435*
beginning C0439657
TRANSFORMATION C0040682
into C0332285
an C0205447*
AUDITORY TUBE C0439822 C0175730
and C0332287*
TYMPANUM C0242251
also C0332287*
the C0205435*
PRIMORDIA C0678727*
of C0456627
the C0205435*
THYMUS C0496916
LATERAL THYROID C0205091 C0795756
and C0332287*
SUPERIOR C0205103
and C0332287*
INFERIOR PARATHYROID GLANDS C0678975 C0030518 C0225352
ADNEXA WITHOUT NEARBY DISAMBIGUATING WORD: SKIN ADNEXA (C0221943) UTERINE ADNEXA (C0001575) OCULAR ADNEXA (C0229243)
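One way to picture the role of a nearby disambiguating word is a small context-window check. The three CUIs are those given above; the cue words, the window size and the function `disambiguate_adnexa` are assumptions made for illustration only.

```python
# CUIs are from the text above; the cue words and window size are assumptions.
ADNEXA_SENSES = {
    "C0221943": {"skin", "dermal", "cutaneous"},   # skin adnexa
    "C0001575": {"uterine", "uterus", "ovary"},    # uterine adnexa
    "C0229243": {"ocular", "eye", "orbit"},        # ocular adnexa
}


def disambiguate_adnexa(words, position, window=3):
    """Return the CUIs whose cue words occur within `window` words of `position`."""
    lo = max(0, position - window)
    nearby = {w.lower() for w in words[lo:position + window + 1]}
    matches = [cui for cui, cues in ADNEXA_SENSES.items() if cues & nearby]
    # No cue nearby: the term stays ambiguous, so every candidate is returned.
    return matches or list(ADNEXA_SENSES)


print(disambiguate_adnexa("tumor of the skin adnexa".split(), 4))
print(disambiguate_adnexa("the adnexa are involved".split(), 1))
```

When no cue word is found, the sketch returns all three candidates, which corresponds to indexing the ambiguous term redundantly.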
PATHOLOGY INPUT TEXT: 951 KB, 25 CHAPTERS. 120,677 WORDS, 11,240 DISTINCT WORDS. AVERAGE: 120,677/11,240 = 10.7 OCCURRENCES PER DISTINCT WORD. 77,498 (64.2%) EXACT MATCHES TO A UMLS SYNONYM; 33,348 (27.6%) ADDITIONAL, APPROXIMATE MATCHES TO UMLS CUIS. 8.1% UNMATCHED CONCEPTS.
EMBRYOLOGY INPUT TEXT: 1.26 MB. 110,314 WORDS, 9,087 DISTINCT WORDS. 5,323 (4.8%) MISSPELLINGS (OPTICAL MISTRANSLATIONS). AVERAGE: 110,314/9,087 = 12.1 OCCURRENCES PER DISTINCT WORD. AMONG CORRECTLY SPELLED WORDS: 48,758 (46.4%) EXACT MATCHES TO A UMLS SYNONYM; 46,250 (44.0%) ADDITIONAL, APPROXIMATE MATCHES TO UMLS CUIS. 9.5% UNMATCHED CONCEPTS.
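The percentages above follow from simple ratios, which can be rechecked with a few lines. This sketch assumes that the denominator for the second text is the count of correctly spelled words (110,314 minus 5,323), as the wording suggests. `match_rates` is a hypothetical helper, and its results may differ from the reported figures in the last digit because of rounding.

```python
def match_rates(total, exact, approximate):
    """Return (exact %, approximate %, unmatched %) of `total` words."""
    def pct(n):
        return round(100 * n / total, 1)
    return pct(exact), pct(approximate), pct(total - exact - approximate)


# First text: all 120,677 words form the denominator.
pathology = match_rates(120_677, 77_498, 33_348)

# Second text: assumed denominator excludes the 5,323 misspelled words.
embryology = match_rates(110_314 - 5_323, 48_758, 46_250)

print("pathology :", pathology)
print("embryology:", embryology)
```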
CONCORDANCE RATE: 90.9%. UNMATCHED CONCEPTS TENDED TO BE DESCRIPTIVE TERMS IN PATHOLOGY THAT CHARACTERIZE MICROSCOPIC FINDINGS. UMLS IS A HIGHLY INCLUSIVE CONCEPT SYSTEM FOR PATHOLOGY. HOWEVER, UMLS IS SYNONYM-POOR: MANY SYNONYMS MUST BE ADDED MANUALLY. UMLS: NEARLY COMPREHENSIVE METATHESAURUS FOR PATHOLOGY TEXT.
LEXICAL VARIANTS: NUCLEI ==> CELL NUCLEUS.
OBVIOUS SYNONYMS: CLUSTER ==> AGGREGATE.
OBVIOUS MISSPELLINGS: WILM'S ==> WILMS'. BRONCHITS ==> BRONCHITIS.
OBVIOUS CONTRACTIONS: ADDISON ==> ADDISON'S DISEASE. CUSHING ==> CUSHING'S DISEASE. SQUAMOUS ==> SQUAMOUS CELL.
COMPOUNDS: WITHOUT ==> NEGATIVE-WITH.
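A manually enriched synonym list of this kind can be thought of as a rewrite table applied before the Metathesaurus lookup. The entries below are the examples given above. The dictionary name, the function `normalize` and the choice to lower-case every term are assumptions made for this sketch.

```python
# Entries copied from the examples above; the lookup logic is an assumption.
ENRICHED_SYNONYMS = {
    "nuclei": "cell nucleus",          # lexical variant
    "cluster": "aggregate",            # obvious synonym
    "wilm's": "wilms'",                # obvious misspelling
    "bronchits": "bronchitis",         # obvious misspelling
    "addison": "addison's disease",    # obvious contraction
    "cushing": "cushing's disease",    # obvious contraction
    "squamous": "squamous cell",       # obvious contraction
    "without": "negative-with",        # compound
}


def normalize(term):
    """Map a raw text term to its enriched form; unknown terms pass through."""
    key = term.strip().lower()
    return ENRICHED_SYNONYMS.get(key, key)


print(normalize("NUCLEI"))
print(normalize("tumor"))
```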
RANK FREQUENCY WORD UMLS CUI
1 3,950 of C0456627
2 2,591 in C0439203
3 2,387 and C0332287*
4 1,873 with C0332287
5 1,779 to C0332285*
6 1,562 the C0205435*
7 1,297 or C0332270*
8 1,256 cells C0007625
9 904 usually C0332183*
10 899 cell C0007634
11 847 may C0806904
12 711 be C0014121
13 682 by C0336807
14 681 most C0205381
15 604 are C0392148*
16 537 common C0205213
17 521 is C0441912
18 469 often C0332181
19 446 can C0808716
20 439 tumor C0027651
21 435 for C0521117
22 433 small C0700320
23 418 from C0332285*
24 406 disease C0012633
25 384 but C0332287*
26 383 carcinoma C0007095
27 369 not C0205160*
28 364 more C0205171
29 358 seen C0205395
30 344 tumors C0027651
31 334 large C0549176
32 333 type C0332307
33 322 aka C0332287*
34 321 have C0605770*
35 307 at C0332285*
36 298 on C0332285*
37 294 as C0003818
38 269 which C0043237*
39 267 no C0205160*
40 263 tissue C0040300
41 254 patients C0030704
42 248 malignant C0205282
43 245 present C0392743
44 242 associated C0004083*
45 240 also C0332287*
46 236 chronic C0205179
47 234 all C0444867
48 232 lesions C0221198
49 226 prognosis C0220901
50 222 age C0001774
RANK FREQUENCY WORD
1 48 cord
2 33 still
3 30 eventually
4 30 need
5 28 cords
6 27 palisading
7 27 particularly
8 26 must
9 25 plump
10 24 polygonal
11 24 represent
12 24 sharply
13 23 counterpart
14 23 germ
15 23 host
16 22 immunoblastic
17 22 remain
18 22 should
19 22 undergo
20 21 mantle
21 21 parenchyma
22 19 just
23 19 prone
24 19 unclear
25 18 villi
26 17 subendothelial
27 16 amino
28 16 arranged
29 16 background
30 16 excellent
31 16 intrahepatic
32 16 odontogenic
33 15 excess
34 15 glans
35 15 goblet
36 15 half
37 15 untreated
38 15 villous
39 14 independent
40 14 laden
41 14 outflow
42 14 subtypes
43 13 bundles
44 13 entity
45 13 extrahepatic
46 13 granulomatosis
47 13 intracytoplasmic
48 13 invariably
49 13 oncocytic
50 13 perineural
RANK FREQUENCY TERM UMLS CUI
1 479 of the C0332285*
2 244 in the C0332285*
3 196 associated with C0332281
4 101 due to C0678226
5 86 giant cells C0017526
6 82 plasma cells C0032112
7 78 smooth muscle C0026843
8 54 tumor cells C0431085
9 50 type i C0441729
10 49 autosomal dominant C0443147
11 49 clear cell C0229473
12 49 well differentiated C0205615
13 44 into the C0332285*
14 44 type ii C0441730
15 41 soft tissue C0225317
16 39 gi tract C0017189
17 38 germinal centers C0282491
18 38 low grade C0205080
19 37 absence of C0332197
20 37 autosomal recessive C0441748
21 35 connective tissue C0009780
22 35 squamous metaplasia C0025570
23 34 spindle cell C0682540
24 34 squamous cell C0221910
25 33 bile ducts C0005400
26 33 within the C0332285*
27 32 chronic inflammation C0021376
28 32 from the C0332285*
29 31 foci of C0205234
30 31 poor prognosis C0278252
31 31 rather than C0489693*
32 31 well defined C0442825
33 30 differential diagnosis C0220820
34 30 giant cell C0017526
35 28 basement membrane C0004799
36 28 good prognosis C0278250
37 28 high grade C0205082
38 28 renal failure C0035078
39 27 bone marrow C0005953
40 27 in situ C0444498
41 26 bile duct C0005400
42 26 lymph nodes C0154054
43 26 squamous cell carcinoma C0007137
44 25 rheumatoid arthritis C0003873
45 24 cell type C0449475
46 24 soft tissues C0225317
47 23 but also C0332287*
48 22 blood vessels C0005847
49 22 inflammatory cells C0440752
50 22 of these C0332285*
RANK FREQUENCY WORD UMLS CUI
1 10,394 the C0205435*
2 5,441 of C0456627
3 3,574 in C0439203
4 3,123 and C0332287*
5 1,982 to C0332285*
6 1,947 is C0441912
7 1,042 that C0205435*
8 959 are C0392148*
9 947 by C0336807
10 919 as C0003818
11 919 be C0014121
12 903 it C0027361*
13 756 this C0205435*
14 695 from C0332285*
15 602 which C0043237*
16 597 mm C0439266
17 560 with C0332287
18 545 at C0332285*
19 541 embryos C0013935
20 534 cells C0007625
21 533 no C0205160*
22 515 age C0001774
23 505 embryo C0013932
24 438 group C0441832
25 422 stage C0684248
26 417 one C0205429
27 404 an C0205447*
28 403 for C0521117
29 389 its C0027344*
30 388 or C0332270*
31 372 has C0605674
32 348 on C0332285*
33 339 not C0205160*
34 330 been C0392148*
35 302 these C0205392*
36 296 form C0376315
37 295 more C0205171
38 294 can C0808716
39 294 fig C0349932
40 288 human C0020102
41 287 have C0605770*
42 269 embryonic C0521444
43 269 plate C0005971
44 268 specimens C0370003*
45 247 figure C0441469*
46 240 into C0332285
47 238 their C0027361*
48 234 was C0392148*
49 233 primitive C0033153*
50 230 shown C0332265*
RANK FREQUENCY WORD
1 101 stalk
2 52 way
3 48 anat
4 48 germ
5 46 pit
6 44 wash
7 43 prechordal
8 43 until
9 42 could
10 42 pairs
11 39 order
12 37 presumed
13 36 ones
14 36 taken
15 35 bars
16 34 cord
17 33 profile
18 32 come
19 32 shell
20 31 free
21 30 chordal
22 30 example
23 29 details
24 29 passage
25 29 polar
26 28 neuropore
27 28 sharply
28 27 consists
29 27 how
30 27 intercellular
31 27 lacunae
32 27 takes
33 27 tubal
34 26 epiblast
35 26 instead
36 25 detail
37 25 folds
38 25 just
39 25 meso
40 25 partly
41 24 gelatinous
42 24 manner
43 24 owing
44 23 cited
45 23 conspicuous
46 23 quite
47 22 field
48 22 particular
49 22 particularly
50 22 proper
RANK FREQUENCY TERM UMLS CUI
1 2,613 of the C0332285*
2 1,037 in the C0332285*
3 346 from the C0332285*
4 267 age group C0596048
5 147 has been C0392148*
6 141 of this C0332285*
7 127 for the C0521125*
8 123 yolk sac C0042893
9 110 embryonic disc C0231003
10 100 primitive streak C0033153
11 93 into the C0332285*
12 78 have been C0392148*
13 78 there is C0332287*
14 69 through the C0332273*
15 59 chorionic cavity C0230966
16 58 between the C0205103*
17 55 of these C0332285*
18 49 there are C0332287*
19 47 age groups C0027362
20 38 nervous system C0027763
21 36 neural tube C0231024
22 33 sinus venosus C0231084
23 32 the other C0205394*
24 30 central nervous system C0007679
25 29 chorionic villi C0008508
26 29 within the C0332285*
27 28 blood vessels C0005847
28 27 stage 5 C0441777
29 27 zona pellucida C0043519
30 26 referred to C0205543
31 24 in addition C0332287*
32 24 over the C0205136*
33 22 cloacal membrane C0231056
34 22 vascular system C0489903
35 20 along the C0205428*
36 19 due to C0678226
37 19 in addition to C0332287
38 18 site of C0449643
39 18 stage 2 C0441767
40 18 stage 3 C0441771
41 17 about the C0475806*
42 17 under the C0542339*
43 16 amniotic cavity C0230976
44 16 as well as C0332287*
45 16 rather than C0489693*
46 15 blood cells C0005773
47 14 chick embryo C0008046
48 14 germ cells C0017471
49 14 in vitro C0021135
50 14 of that C0332285*
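Rank/frequency lists like those above can be produced with a plain word count. The sketch below is an assumed reconstruction, not the program actually used: `rank_terms` is a hypothetical name, the tokenizer is a guess, and multi-word terms are counted as simple adjacent word sequences that ignore punctuation boundaries.

```python
import re
from collections import Counter


def rank_terms(text, n=1, top=50):
    """Rank the n-word terms of `text` by frequency (ties broken alphabetically)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    terms = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    counts = Counter(terms)
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, freq, term)
            for rank, (term, freq) in enumerate(ordered[:top], 1)]


sample = "giant cells and plasma cells and giant cells"
for row in rank_terms(sample, n=1):
    print(*row)
for row in rank_terms(sample, n=2, top=2):
    print(*row)
```

Attaching a CUI to each ranked entry, or leaving it blank for unmatched words, would then be a separate lookup step.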