Satya Sahoo
Assistant Professor
Division of Medical Informatics and Electrical Engineering and Computer Science Department
Office: WRB 6126
Phone: 216-368-3286
Fax: 216-368-0207
Email: satya dot sahoo at case dot edu
Dr. Sahoo's research interests include the development and application of computer science methods and technologies for:
1. Patient information capture at the point of care (more information)
2. Large scale data integration for translational research spanning bench side to patient bedside (more information)
3. Natural language processing (NLP) of clinical free text for secondary use of clinical data (e.g. patient cohort studies) (more information)
4. Matching patients to clinical trials using biomedical ontologies.
His computer science research involves Knowledge Representation and Reasoning (e.g. Ontology engineering), Semantic Web, Provenance metadata management, Data Integration, and Database Theory. He has also worked on Query Optimization and Indexing techniques, Scientific Workflows, and (Semantic) Web services.
Dr. Sahoo received his bachelor's degree in Mathematical Statistics (honors program) from the University of Delhi and his Ph.D. in Computer Science and Engineering from the Kno.e.sis Center. He is an Invited Expert in the World Wide Web Consortium (W3C) Provenance Working Group and a recipient of the 2012-13 Glennan Fellowship award from the CWRU University Center for Innovation in Teaching and Education.
News
1. ProvBench 2013: A community-based repository for provenance benchmarking (more information)
2. The W3C Provenance Working Group specifications, including the PROV Ontology (PROV-O), are now Proposed Recommendations
Publications
Google Scholar Profile (with citation score)
List Format
Research Projects
Ontology-driven Clinical Free Text Analysis
Processing of clinical free text is extremely challenging. Epilepsy Data Extraction and Annotation (EpiDEA) is an ontology-driven clinical free text processing platform that uses the Epilepsy and Seizure ontology (EpSO) as the core knowledge resource for processing, representing, and querying clinical text. By extending the cTAKES natural language processing tool developed at the Mayo Clinic, EpiDEA addresses the unique challenges of epilepsy and seizure-related clinical free text in patient discharge summaries. EpiDEA also incorporates a visual interface for cohort identification over data extracted from the discharge summaries, which clinical researchers can use directly. More information
Funding: The PRISM (Prevention and Risk Identification of SUDEP Mortality) Project (1-P20-NS076965-01)
Provenance Framework for Proteomics Data Management
Provenance is contextual metadata that facilitates effective data integration, reproducibility of results, correct attribution of original source, and answering queries involving “What”, “Where”, “When”, “Which”, “Who”, “How”, and “Why”. The SemPoD project is creating an integrated query environment for accessing and analyzing experiment data using well-known experiment reporting guidelines (e.g. MIMIx, MIAPE) as search criteria. SemPoD uses a provenance ontology (semantic provenance) to implement (a) an Ontology-driven Visual Query Composer, (b) a Result Explorer, and (c) a Query Manager. More information
Funding: CTSC Informatics Pilot (UL1TR000439)
Clinical Data Management: Physio-MIMI
Physio-MIMI is a data integration and querying platform initially developed as part of an NCRR-funded sleep medicine project for supporting multi-center clinical studies (PIs: Redline, Zhang, 2008-10). With additional funding, Physio-MIMI is being refactored so that its components can be used independently in separate projects. For example, VISAGE is an intuitive, ontology-driven query interface for querying clinical data for cohort identification and case-control studies. More information
Funding: UL1RR024989
Presentations
- Awakening Clinical Data: Semantics for Scalable Medical Research Informatics, Dagstuhl seminar on Semantic Data Management at the Leibniz Center for Informatics, 2012 (Slides)
- Role of Semantic Web in Health Informatics, Tutorial at 2nd ACM SIGHIT International Health Informatics Symposium 2012 (Slides)
- A Framework for Provenance Management in eScience, EECS Seminar at Case Western Reserve University, October 7, 2010 (details)
Service (selected)
- International Conference on Web Services (ICWS) 2013, Program Committee
- International Conference on Conceptual Modeling (ER) 2013, Program Committee
- Data Integration in the Life Sciences (DILS) 2013, Program Committee
- International Semantic Web Conference (ISWC) 2012, Program Committee
- American Medical Informatics Association (AMIA) Annual Symposium 2012
Semantic Web and Provenance Workshop Series
Proposed and co-organized a series of workshops exploring research issues at the intersection of Semantic Web and Provenance Management (SWPM).


