Journal of Pathology Informatics

J Pathol Inform 2012,  3:23

The feasibility of using natural language processing to extract clinical information from breast pathology reports

1 Department of Surgical Oncology, Massachusetts General Hospital, Boston, Massachusetts, USA
2 Department of Surgical Oncology, Dana Farber Cancer Institute, Boston, Massachusetts, USA
3 Department of Surgical Pathology, Massachusetts General Hospital, Boston, Massachusetts, USA

Date of Submission: 20-Dec-2011
Date of Acceptance: 22-May-2012
Date of Web Publication: 30-Jun-2012

Correspondence Address:
Kevin S Hughes
Department of Surgical Oncology, Massachusetts General Hospital, Boston, Massachusetts

Source of Support: None, Conflict of Interest: None

DOI: 10.4103/2153-3539.97788


Objective: The opportunity to integrate clinical decision support systems into clinical practice is limited by the lack of structured, machine readable data in the current format of the electronic health record. Natural language processing has been designed to convert free text into machine readable data. The aim of the current study was to ascertain the feasibility of using natural language processing to extract clinical information from >76,000 breast pathology reports.

Approach and Procedure: Breast pathology reports from three institutions were analyzed using natural language processing software (Clearforest, Waltham, MA) to extract information on a variety of pathologic diagnoses of interest. Data tables were created from the extracted information according to date of surgery, side of surgery, and medical record number. The variety of ways in which each diagnosis could be represented was recorded, as a means of demonstrating the complexity of machine interpretation of free text.

Results: There was widespread variation in how pathologists reported common pathologic diagnoses. We report, for example, 124 ways of saying invasive ductal carcinoma and 95 ways of saying invasive lobular carcinoma. There were >4000 ways of saying invasive ductal carcinoma was not present. Natural language processor sensitivity and specificity were 99.1% and 96.5% when compared to expert human coders.

Conclusion: We have demonstrated how a large body of free text medical information, such as that seen in breast pathology reports, can be converted to a machine readable format using natural language processing, and described the inherent complexities of the task.

Keywords: Breast pathology reports, clinical decision support, natural language processing

How to cite this article:
Buckley JM, Coopey SB, Sharko J, Polubriaginof F, Drohan B, Belli AK, Kim EM, Garber JE, Smith BL, Gadd MA, Specht MC, Roche CA, Gudewicz TM, Hughes KS. The feasibility of using natural language processing to extract clinical information from breast pathology reports. J Pathol Inform 2012;3:23

How to cite this URL:
Buckley JM, Coopey SB, Sharko J, Polubriaginof F, Drohan B, Belli AK, Kim EM, Garber JE, Smith BL, Gadd MA, Specht MC, Roche CA, Gudewicz TM, Hughes KS. The feasibility of using natural language processing to extract clinical information from breast pathology reports. J Pathol Inform [serial online] 2012 [cited 2021 Sep 19];3:23. Available from:

Background and Significance

The promise that the Electronic Health Record (EHR) will increase quality while decreasing cost is largely dependent on widespread integration of computerized Clinical Decision Support (CDS). CDS systems apply algorithms and guidelines to the patient data to help determine the diagnosis and/or the best course of action, and then present that result to the clinician and the patient in a visualization that makes it easy to understand and that stimulates action. [1] The caveat is that CDS systems require data that are both structured and machine readable. As the vast majority of data in the EHR are free text, there is currently little opportunity to integrate CDS systems into clinical practice.

The simplest, but most time-consuming, approach to unlocking the data in free text is to have an expert read and interpret each report. While this approach works relatively well in the day-to-day care of individual patients, it is impractical when attempting real-time CDS on all patients seen at an institution or when undertaking a retrospective review of tens of thousands of cases.

To consider one such situation, pathology reports contain tremendously valuable data regarding the clinical situation of the patient. These reports are almost always written in a free text format. While synoptic reporting in some anatomic pathology systems has made an effort in the right direction by providing discrete data elements, there are still comment/note sections that allow results to be reported as free text.

Natural language processing (NLP) software has been designed to convert free text into machine readable, structured data. While NLP has been touted as a solution to the problem, this approach is not nearly as simple or effective as it may sound. The inherent linguistic and structural variability within any body of free text poses a significant challenge to efficient retrieval of data.

Objective

As a proof of principle of the utility, but also of the difficulty, of using NLP to decipher breast pathology reports, we undertook the creation of a database of results from breast pathology reports at the Massachusetts General Hospital (MGH), the Brigham and Women's Hospital (BWH), and Newton-Wellesley Hospital (NWH). Our goal was to identify which specimens had evidence of any or all of a number of diagnoses of interest.

Approach and Procedure

With the approval of the Partners Institutional Review Board (IRB), all electronically available pathology reports from MGH, BWH, and NWH between 1987 and 2010 that involved breast tissue were identified from the Research Data Repository, which holds pathology report data from all institutions. International Classification of Diseases, Ninth Revision (ICD-9) and Current Procedural Terminology (CPT) codes were used to identify those reports pertaining to breast.

We determined that the most important diagnoses for our study that might be found, either alone or in combination, within a pathology report were invasive ductal cancer (IDC); invasive lobular cancer (ILC); invasive cancer NOS; ductal carcinoma in situ (DCIS); severe atypical ductal hyperplasia (severe ADH); lobular carcinoma in situ (LCIS); atypical lobular hyperplasia (ALH); atypical ductal hyperplasia (ADH); and benign. As a preparatory step, a folder or "bucket" was then created for each diagnosis within the NLP software (Clearforest, Waltham, MA). A "bucket" would eventually hold a set of words and/or phrases that denoted a specific diagnosis.

Next, the layout of the pathology report was analyzed. The most important information pertaining to the diagnosis was contained in a section labeled "Final Diagnosis," which was present for each distinct specimen (a report might have an excision and four shaved margins as five distinct specimens for the same side, with the potential for different diagnoses by specimen) [Figure 1]. Thus there was often more than one final diagnosis in a single pathology report on a single day. We identified both the start and the end of the final diagnosis section for each specimen, and these sections were parsed out and associated with a Medical Record Number (MRN), Date, and Side [Figure 1] and [Figure 2]. Parsing techniques varied by institution, due to the unique, institution-specific report layouts.

Using NLP software (Clearforest, Waltham, MA), the "Final Diagnosis" section of a test set of 500 reports from each institution was processed. The NLP software displayed all words and phrases in these reports and the number of times each was used in this set of reports, and provided an interface to associate each word or phrase with one or more of the "buckets." Each entity generated by the software was then associated with the "bucket" it represented. For example, the entities "infiltrating ductal carcinoma," "invasive cancer with ductal features," "invasive cancer, ductal type," etc. all went into the "invasive ductal cancer" bucket. Some entities went into more than one bucket, such as "invasive carcinoma with both ductal and lobular features," which was both IDC and ILC. This approach was then applied to the larger data set, to test its functionality and to identify words or phrases missed in the test set.
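The bucket assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study used commercial software (Clearforest), whose interface is not reproduced here. The dictionary `BUCKETS`, the function `assign_buckets`, and the use of case-insensitive substring matching are hypothetical stand-ins; only the example phrases come from the text.

```python
# Hypothetical sketch of an entity-to-bucket lookup; NOT the Clearforest implementation.
# Each phrase (entity) maps to one or more diagnosis buckets.
BUCKETS = {
    "infiltrating ductal carcinoma": {"IDC"},
    "invasive cancer with ductal features": {"IDC"},
    "invasive cancer, ductal type": {"IDC"},
    # An entity may belong to more than one bucket.
    "invasive carcinoma with both ductal and lobular features": {"IDC", "ILC"},
}


def assign_buckets(final_diagnosis):
    """Return the set of buckets whose phrases occur in the final-diagnosis text."""
    text = final_diagnosis.lower()
    found = set()
    for phrase, buckets in BUCKETS.items():
        if phrase in text:
            found |= buckets
    return found
```

A real phrase list would be far longer (124 phrases for IDC alone, as reported in the Results), which is why the phrases are kept in a data structure rather than hard-coded into the matching logic.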
Figure 1: Sample pathology report showing the fields extracted (highlighted in bold type). Each specimen was parsed separately and generated its own "final diagnosis"

Figure 2: Sample datasheet displaying extracted diagnostic information from the sample report shown in Figure 1. As each specimen generated its own "final diagnosis," a single row was created for each specimen by MRN, date, side and specimen in the first of three databases created

We also found that an entity could be negated, and that the negation might lie either before or after the entity. For example, a report may state that there was "no evidence of invasive carcinoma," or "residual DCIS was not seen." All words and phrases that denoted negation, and their position in the sentence (before or after the diagnostic entity), were identified and placed in pre- and postnegation categories. A pattern was then created to recognize negation. If an entity was negated, it was not recorded in the final data set for that record.
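A minimal sketch of pre- and postnegation matching is shown below, assuming a regular-expression approach rather than the pattern language of the software actually used. The lists `PRE_NEGATIONS` and `POST_NEGATIONS` and the function `is_negated` are hypothetical; they contain only the two negation phrases quoted in the text, not the full sets identified in the study.

```python
import re

# Illustrative negation cues only (assumption): the two examples quoted in the text.
PRE_NEGATIONS = ["no evidence of"]   # cue appears before the entity
POST_NEGATIONS = ["was not seen"]    # cue appears after the entity


def is_negated(sentence, entity):
    """Return True if the entity is directly preceded or followed by a negation cue."""
    ent = re.escape(entity)
    pre = "|".join(re.escape(p) for p in PRE_NEGATIONS)
    post = "|".join(re.escape(p) for p in POST_NEGATIONS)
    # Either "<pre-cue> <entity>" or "<entity> <post-cue>".
    pattern = rf"(?:{pre})\s+{ent}|{ent}\s+(?:{post})"
    return re.search(pattern, sentence, flags=re.IGNORECASE) is not None
```

This sketch requires the cue to be adjacent to the entity; free text with intervening words would need looser patterns, which is one reason the number of negated forms grows so quickly.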

We counted the number of ways in which each entity was expressed, as well as the number of ways in which negation was stated.

A single row in an Access (Microsoft) table was created for each specimen, where the presence of each entity in that specimen was recorded in the appropriate column. This initial table had a row for each MRN, date, side, and specimen and denoted all diagnoses present in that specimen on that date [Figure 2].

The "final diagnoses" from a single date and side were then amalgamated into a single row in a second table that denoted an MRN, date, side, and all diagnoses on that date.
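The amalgamation step can be illustrated as a grouping operation. The sketch below is an assumption about how such a step might be written in Python (the study used Microsoft Access tables); the rows in `specimen_rows` are invented purely for illustration and are not study data.

```python
from collections import defaultdict

# Invented specimen-level rows: (MRN, date, side, specimen, diagnoses found).
specimen_rows = [
    ("0001", "2005-03-14", "left", "A", {"DCIS"}),
    ("0001", "2005-03-14", "left", "B", {"ADH", "benign"}),
    ("0001", "2005-03-14", "right", "C", {"benign"}),
]


def amalgamate_by_date_and_side(rows):
    """Union the diagnoses of all specimens that share an MRN, date, and side."""
    summary = defaultdict(set)
    for mrn, date, side, _specimen, diagnoses in rows:
        summary[(mrn, date, side)] |= diagnoses
    return dict(summary)
```

Here the two left-sided specimens collapse into one summary row, while the right-sided specimen keeps its own row.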

We then identified a "maximum diagnosis" on each date by establishing a trumping order (an "order of significance"), such that IDC, ILC, or invasive cancer NOS, would outweigh DCIS, which would outweigh severe ADH, which would outweigh LCIS, which would outweigh ALH, which would outweigh ADH, which would outweigh benign.
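The trumping order lends itself to a simple rank lookup. In the sketch below, `RANK` and `maximum_diagnosis` are hypothetical names. The text places IDC, ILC, and invasive cancer NOS at the same level and does not say how ties among them were resolved, so the alphabetical tie-break used here is an assumption.

```python
# Trumping order from the text, most significant first (lower number wins).
RANK = {
    "IDC": 0, "ILC": 0, "invasive cancer NOS": 0,
    "DCIS": 1,
    "severe ADH": 2,
    "LCIS": 3,
    "ALH": 4,
    "ADH": 5,
    "benign": 6,
}


def maximum_diagnosis(diagnoses):
    """Return the most significant diagnosis among those present.

    Sorting first makes ties (e.g. IDC and ILC) resolve alphabetically,
    which is an assumption made only to keep the result deterministic.
    """
    return min(sorted(diagnoses), key=RANK.__getitem__)
```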

Where multiple surgeries occurred in the course of treating the same problem on a given side, such as re-excisions for positive margins, we considered these a single "episode of care" for that patient; thus all pathology results from a single side within a 6-month time frame were amalgamated into a third Table organized by MRN, Date, Side, and Episode. Pathology reports outside this 6-month period or from the opposite side were considered as separate episodes.
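One way to group reports into episodes of care is sketched below. Two details are assumptions because the text does not specify them: "6 months" is approximated as 183 days, and the window is measured from the first report of the episode rather than from the most recent one. The function `assign_episodes` is a hypothetical name, and it is meant to be applied to the reports of one patient and one side at a time.

```python
from datetime import date, timedelta

# Assumption: 6 months is approximated as 183 days.
EPISODE_WINDOW = timedelta(days=183)


def assign_episodes(report_dates):
    """Number the reports for one patient and one side into episodes of care.

    Returns (date, episode_number) pairs in chronological order. A new
    episode starts when a report falls outside the window that began with
    the first report of the current episode (an assumption).
    """
    episodes = []
    episode_start = None
    episode_number = 0
    for d in sorted(report_dates):
        if episode_start is None or d - episode_start > EPISODE_WINDOW:
            episode_number += 1
            episode_start = d
        episodes.append((d, episode_number))
    return episodes
```

For example, an excision followed by a re-excision three weeks later would share one episode, while a biopsy ten months later would open a second one.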

The most significant or "maximum" diagnosis was taken as the primary diagnosis, with the others listed as secondary diagnoses in each table.

Thus, NLP created three data tables: an "MRN, Date, Side, Specimen" table, which separately listed all diagnoses from each specimen on a given day; an "MRN, Date, Side, Summary Diagnoses" table, which summarized all diagnoses from a given day; and an "MRN, Side, Episode of Care, Diagnoses" table, which summarized all diagnoses from a given episode.

As our first study was conducted to identify patients with high risk lesions, we opted to review a nonrandom sample of 6,711 pathology reports which were identified in patients who had a diagnosis of severe ADH, LCIS, ALH or ADH, without prior or concurrent cancer. These NLP results were reviewed by human coders who compared the result to the free text report to determine the accuracy of the NLP. The accuracy for the maximum diagnosis and the accuracy for all diagnoses were recorded separately.

Results

In 76,333 breast pathology reports, multiple entities were identified that represented each of the significant buckets. Excluding typographical errors and spacing errors, we identified 124 ways of saying invasive ductal cancer; 95 ways of saying invasive lobular cancer; 52 ways of saying DCIS; 14 ways of saying severe ADH; 53 ways of saying lobular carcinoma in situ; 17 ways of saying atypical lobular hyperplasia and 14 ways of saying atypical ductal hyperplasia [Table 1]. Examples of ways to describe ADH and invasive carcinoma are shown in [Table 2] and [Table 3].
Table 1: The number of ways in which each diagnosis was said in pathology reports

Table 2: Different ways in which pathologists describe the presence of atypical ductal hyperplasia

Table 3: Some examples of the 95 ways of saying "invasive lobular carcinoma"

In addition, we identified 21 ways of negating a diagnosis when the words appeared before the diagnosis (e.g., "No evidence of invasive ductal carcinoma"), and an additional 12 ways of negating the diagnosis when the words fell after the diagnosis (e.g., "ADH was not seen"). As each entity can potentially be negated by a pre- or postnegation phrase, one must multiply the number of ways of stating the negation by the number of ways of describing that particular diagnostic entity. For invasive ductal cancer, for example, 124 ways of saying IDC multiplied by 33 ways of saying "not" gives a total of 4092 potential ways to say IDC was not present.
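The arithmetic behind this figure can be written out explicitly. The function name `negated_forms` is hypothetical; the three input numbers are those reported in the text.

```python
def negated_forms(entity_forms, pre_negations, post_negations):
    """Number of potential negated phrasings: entity forms times all negation cues."""
    return entity_forms * (pre_negations + post_negations)


# 124 ways of saying IDC, 21 prenegation and 12 postnegation phrases.
idc_negated = negated_forms(124, 21, 12)  # 124 * 33 = 4092
```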

When the processor output was compared to reports as reviewed by expert human coders, 97% of reports were correct for all diagnoses and 97.8% were correct for the maximum diagnosis. [Figure 3] demonstrates examples of incorrect diagnoses, where the software did not identify diagnoses that were present in the report. Most commonly this occurred because the diagnosis was written in a pattern not recognized by the software, or because of a typographical error in the report.
Figure 3: Sample datasheet showing examples of missed diagnoses by the software. In row 1, "atypical hyperplasia" was not associated with either "ductal" or "lobular" and thus was not a pattern recognized by the software. In rows 2 and 3, the way in which "atypical ductal hyperplasia" was written was not a pattern recognized by the software. In row 3, typographical errors in the spelling of "carcinoma" meant the presence of DCIS was not detected by the processor

To calculate the sensitivity, specificity, and predictive value of NLP, we considered "all diagnoses." A true positive was defined as atypia present, correctly identified with NLP. A false positive was defined as atypia identified by NLP that was not present on the report. A true negative was defined as a benign diagnosis, correctly identified with NLP, and a false negative was defined as atypia present, but not identified by NLP. The sensitivity of NLP to correctly identify all diagnoses was 99.1%, with a specificity of 96.53%. The positive predictive value of NLP was 98.63% and the negative predictive value was 97.73%.
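For readers who want the definitions spelled out, the four measures follow from the true/false positive and negative counts as sketched below. The function `diagnostic_metrics` is a hypothetical helper, and the counts in the usage example are invented for illustration; the study's underlying counts are not given in the text, so they are not reproduced here.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard accuracy measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # atypia present and found by NLP
        "specificity": tn / (tn + fp),  # benign and reported as benign by NLP
        "ppv": tp / (tp + fp),          # NLP-positive reports that are truly positive
        "npv": tn / (tn + fn),          # NLP-negative reports that are truly negative
    }


# Invented counts, NOT study data.
example = diagnostic_metrics(tp=90, fp=5, tn=95, fn=10)
```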

Discussion

This study highlights one of the principal difficulties encountered in utilizing electronic data beyond its specific context. While a breast pathology report written in free text is easily read and interpreted on an individual patient basis, it is thus far not feasible to use information in this format in conjunction with other computerized systems, such as CDS systems. Pathology reports are long and contain multiple sections, which are frequently a mixture of free text and tabular data. Relevant clinical data may be contained within any section or format. Before NLP could be utilized to extract meaningful data, we first had to write a program to enable us to extract the sections which contained the data we were interested in. Due to the complexity of the task, we had to consciously exclude some valuable parts of the report, such as the "Note" section, where the pathologist might elaborate about alternative diagnoses.

Having extracted the text from the sections of interest to us, we then found that there was significant variability in the way that breast diagnoses can be expressed. Sentence structure and different descriptive phraseology accounted for the majority of differences. This is clearly illustrated by the large numbers of synonymous terms encountered (e.g., 124 phrases describing invasive ductal carcinoma, 95 phrases describing invasive lobular carcinoma). Previous studies have highlighted the issue of context or semantics as being one of the inherent difficulties with using NLP to derive specific clinical information from a large body of free text reports. [2] When medical notes or reports are read by medical personnel, the reader automatically applies their own knowledge of both medicine and the clinical condition of the patient to interpret the report correctly, even when the vocabulary and grammar differ between reporting physicians. To overcome this obstacle, we identified all possible ways a given entity might be represented, created patterns, and then used the NLP software to identify these entities in each report, which maximized the potential for NLP to correctly extract the information of interest.

Even more striking is the very high number of potential ways of stating negation of each entity. The use of the negative, and its position either before or after the diagnosis of interest, required the creation of a further set of patterns to facilitate processor recognition of the negative, thereby excluding negated phrases from the final data set.

For breast pathology reports, the frequent co-existence of several diagnoses within the same report such as "atypical ductal hyperplasia, ductal carcinoma in situ, and invasive ductal carcinoma" adds a further layer of complexity to the task of extracting interpretable data. We overcame this by assigning a level of importance to each diagnosis such that the final diagnosis was that which was most significant; however, each of the other diagnoses was also extracted and entered in the final database entry for that patient. In our study we used the concept of "episodes of care," where several reports from a single surgery were compressed into a single episode of care, represented by a single entry on our final datasheet. Any pathology report from that patient outside of 6 months was considered as a "new episode." Patients may then be followed over time, according to their "episodes" of care. Extraction and storage of data in this fashion facilitate future correlative and longitudinal studies examining the natural history of pathologic entities of interest such as LCIS and ADH.

In this study we have used natural language processing to extract data from free text breast pathology reports and organize them into a format which could be more easily utilized for statistical analysis. NLP can be defined as a "theoretically motivated range of computational techniques for analyzing and representing naturally occurring texts at one or more levels of linguistic analysis for the purpose of achieving human-like language processing for a range of tasks or applications." [3]

The utility of NLP has previously been tested in a number of studies in medicine. Elkins et al. used NLP to code neuroradiology reports, and reported processor accuracy of 84% compared to 86% accuracy of human coders. [4] Hripcsak et al. developed a natural language processor MedLEE, and used it to code over 800,000 chest radiograph reports demonstrating a processor sensitivity of 0.81 and a specificity of 0.99 when compared to expert human coders. [5] This group has also examined the use of NLP systems to identify drug interactions and adverse drug events from the electronic medical record, as well as comparing current disease-specific drug prescribing with recommendations in the published medical literature. [2],[6]

Pathology reports present a unique field of study in natural language processing. They contain a large amount of valuable clinical information in a patient's care pathway. The coding of free text pathology reports has been attempted with varying degrees of success since the study done by Pratt et al. in 1978. [7] The potential difficulties in extracting information from surgical pathology reports were highlighted by Liu et al., who concluded that some variables are better "targets" for extraction than others. Staging and grading of cancer appeared to be particularly difficult to auto-annotate. [8] We agreed with their assessment, and did not attempt to extract stage or grade in this study. Friedman and Xu found that tabular data and a lack of punctuation, combined with information on multiple specimens, made breast pathology reports difficult to process directly using NLP. [9] We were able to work around this difficulty by a multistep process that looked at specimens individually, and then by episode. Subsequent work from this group integrated the use of a preprocessor with an existing NLP system to overcome these issues, and reported combined system sensitivity of 90.6% and specificity of 91.6% compared with human coders. [10] In our study, our data also had to be preprocessed using a specifically written program to identify the regions of interest in each pathology report. We have demonstrated a low overall error rate of 2.22% when the processor was compared to human coders. The sensitivity of 99.1% and specificity of 96.53% are significantly better than those quoted in other studies using similar technologies.

During the 1970s and 1980s, the use of computer technology to provide decision making support to clinicians was widely tested with generally favorable results. [11] Advances in technological capabilities, along with a switch to electronic health records, have seen a resurgence in interest in computer-aided clinical decision support (CDS). Osheroff et al. defined the goal of CDS as a tool "to provide the right information, to the right person, in the right format, through the right channels, at the right point in workflow to improve health and health care decisions and outcomes." [1] The most common CDS systems currently in use are drug-drug interaction programs and allergy programs. CDS is potentially far more widely applicable, facilitating clinician and patient access to the latest scientific evidence and practice guidelines. However, a large amount of clinical information still exists in an unstructured, nonstandardized format, which significantly limits the utility of CDS. In our current study we present an example of how a large body of free text medical information, in this instance breast pathology reports, can be converted to a machine readable format. Storing clinical data in this way facilitates access to CDS systems, which can potentially provide up-to-date information to clinicians and patients on risk assessment, screening, and management.

Conclusion

We have created a large database of valuable clinical information from over 76,000 breast pathology reports. While we have demonstrated the utility of NLP, we have also been struck by the inherent complexity of using NLP in medical care. The time and effort required to use NLP for a single, well-defined problem should give pause to the idea that having data in any electronic format, even free text, will help us improve medical care. The design of Electronic Medical Records that use structured data and depend less and less on free text is critical.

Acknowledgements

The authors declare that they have no competing interests.

References

1. Osheroff JA, Teich JM, Middleton B, Steen EB, Wright A, Detmer DE. A Roadmap for national action on clinical decision support. J Am Med Inform Assoc 2007;14:141-5.
2. Chen ES, Hripcsak G, Xu H, Markatou M, Friedman C. Automated acquisition of disease drug knowledge from biomedical and clinical documents: An initial study. J Am Med Inform Assoc 2008;15:87-98.
3. Liddy ED. Natural Language Processing. In: Drake MA, editor. Encyclopedia of library and information science. 2nd ed. New York: Marcel Dekker Inc; 2001.
4. Elkins JS, Friedman C, Boden-Albala B, Sacco RL, Hripcsak G. Coding neuroradiology reports for the Northern Manhattan Stroke Study: A comparison of natural language processing and manual review. Comput Biomed Res 2000;33:1-10.
5. Hripcsak G, Austin JH, Alderson PO, Friedman C. Use of natural language processing to translate clinical information from a database of 889,921 chest radiographic reports. Radiology 2002;224:157-63.
6. Chen ES, Stetson PD, Lussier YA, Markatou M, Hripcsak G, Friedman C. Detection of practice pattern trends through Natural Language Processing of clinical narratives and biomedical literature. AMIA Annu Symp Proc 2007;11:120-4.
7. Dunham SG, Pacak GM, Pratt WA. Automatic indexing of pathology data. J Am Soc Inf Sci 1978;29:81-90.
8. Liu K, Mitchell KJ, Chapman WW, Crowley RS. Automating tissue bank annotation from pathology reports - Comparison to a gold standard expert annotation set. AMIA Annu Symp Proc 2005;460-4.
9. Xu H, Friedman C. Facilitating research in pathology using natural language processing. AMIA Annu Symp Proc 2003;1057.
10. Xu H, Anderson K, Grann VR, Friedman C. Facilitating cancer research using natural language processing of pathology reports. Stud Health Technol Inform 2004;107(Pt 1):565-72.
11. Cooper JG, West RM, Clamp SE, Hassan TB. Does computer-aided clinical decision support improve the management of acute abdominal pain? A systematic review. Emerg Med J 2011;28:553-7.






