Epic Cogito: Reporting, Analytics, and Data Warehouse Implementation

Caboodle
Epic’s enterprise data warehouse – the SQL-queryable layer beneath Cogito reporting
Clarity
Epic’s relational reporting database – normalized tables extracted from the Chronicles operational database
Workbench
Epic Reporting Workbench – the end-user reporting tool built on Cogito’s data model
SlicerDicer
Epic’s self-service analytics tool – drag-and-drop cohort analysis without SQL

Epic Cogito is one of the most underspecified workstreams in Epic implementations – teams treat it as a post-go-live activity and then spend months explaining to clinical and operations leadership why they cannot answer basic questions from their own data. Cogito covers everything from end-user reporting in Reporting Workbench through self-service analytics in SlicerDicer to direct SQL querying of Clarity and Caboodle. This article gives analysts, data architects, and implementation leads the depth to plan, configure, and validate Cogito correctly from the start.

Epic Cogito Architecture: Chronicles, Clarity, and Caboodle

Epic’s data architecture has three distinct layers. Understanding all three is the prerequisite for working effectively in Cogito analytics and reporting.

Chronicles is Epic’s proprietary operational database. It stores all clinical and administrative data in a hierarchical, item-based format. Chronicles is optimized for transactional performance – fast reads and writes during patient care. It is not designed for analytical queries. Running complex aggregate queries directly against Chronicles degrades system performance for active users. Epic explicitly prohibits direct Chronicles access for reporting purposes at most installations.

Clarity is Epic’s relational reporting database. It is a separate SQL Server database populated via an ETL (Extract, Transform, Load) process from Chronicles. Clarity extracts run on a defined schedule – typically nightly for most data, with some near-real-time extracts for high-urgency operational data. Clarity presents Chronicles data in normalized relational tables that SQL-proficient analysts can query directly. It is the foundation for Reporting Workbench reports and direct SQL reporting.

Caboodle is Epic’s enterprise data warehouse (EDW). It is built on Microsoft SQL Server (or Azure Synapse in cloud deployments) and populated from Clarity via an additional ETL layer. Caboodle presents data in dimensional model format – fact tables and dimension tables organized for analytical querying. It includes pre-built star schema structures for common analytical domains: patient encounters, orders, results, medications, and diagnoses. Caboodle also supports non-Epic data source integration, making it the preferred platform for enterprise analytics that combines Epic data with external data sources. The full Epic module landscape is available in the Epic EHR Learning Hub.

| Layer | Technology | Data Model | Refresh Frequency | Primary Use |
|---|---|---|---|---|
| Chronicles | Epic proprietary (Caché/IRIS) | Hierarchical item-based | Real-time (operational) | Patient care transactions – no reporting |
| Clarity | Microsoft SQL Server | Normalized relational tables | Nightly + near-real-time for select tables | Operational reporting, Workbench, SQL queries |
| Caboodle | SQL Server / Azure Synapse | Dimensional (star schema) | Nightly (full) + incremental | Enterprise analytics, EDW, BI platform feeds |
| Reporting Workbench | Epic application (Clarity-based) | Report templates / data models | Runs on-demand against Clarity | End-user operational reports within Epic |

Epic Clarity: The Relational Reporting Database

Clarity contains thousands of tables organized by clinical and administrative domain. The naming convention is systematic: tables prefixed with PAT_ relate to patients, ORD_ to orders, CLB_ to billing, HSP_ to hospital records. Each table represents a Chronicles master file or record type. Understanding which Clarity table contains which clinical data element is the core competency of a Cogito reporting analyst.

Key Clarity Tables and Their Clinical Meaning

The PAT_ENC table is the central encounter record in Clarity – every patient visit, inpatient admission, and ED encounter has a row. It links to PAT_ENC_DX for diagnoses (ICD-10-CM codes), ORD_MED_ADMIN for medication administrations, and ORDER_RESULTS for lab and radiology results. A reporting analyst who understands the PAT_ENC table and its key relationships can answer most operational questions about patient volumes, length of stay, and service utilization.
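The join pattern described above can be sketched with a small in-memory example. The table and column names below approximate Clarity naming conventions but are illustrative assumptions – the authoritative schema is the Clarity data dictionary for your Epic version:

```python
import sqlite3

# In-memory sketch of a Clarity-style encounter/diagnosis join.
# Table and column names are simplified illustrations of Clarity
# conventions, not the actual Clarity schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PAT_ENC (
    PAT_ENC_CSN_ID INTEGER PRIMARY KEY,  -- contact serial number
    PAT_ID TEXT,
    CONTACT_DATE TEXT,
    DEPARTMENT_ID INTEGER
);
CREATE TABLE PAT_ENC_DX (
    PAT_ENC_CSN_ID INTEGER,
    LINE INTEGER,              -- one row per diagnosis on the encounter
    DX_ICD10_CODE TEXT,
    PRIMARY_DX_YN TEXT
);
""")
conn.executemany("INSERT INTO PAT_ENC VALUES (?,?,?,?)", [
    (1001, "Z1", "2025-01-10", 10),
    (1002, "Z2", "2025-01-11", 10),
    (1003, "Z3", "2025-01-11", 20),
])
conn.executemany("INSERT INTO PAT_ENC_DX VALUES (?,?,?,?)", [
    (1001, 1, "I50.9", "Y"),   # heart failure, primary diagnosis
    (1002, 1, "I50.9", "Y"),
    (1003, 1, "J18.9", "Y"),   # pneumonia, primary diagnosis
])

# Encounter volume by primary diagnosis -- the shape of a typical
# operational query: PAT_ENC joined to PAT_ENC_DX.
rows = conn.execute("""
    SELECT dx.DX_ICD10_CODE, COUNT(*) AS encounters
    FROM PAT_ENC enc
    JOIN PAT_ENC_DX dx ON dx.PAT_ENC_CSN_ID = enc.PAT_ENC_CSN_ID
    WHERE dx.PRIMARY_DX_YN = 'Y'
    GROUP BY dx.DX_ICD10_CODE
    ORDER BY encounters DESC
""").fetchall()
print(rows)  # [('I50.9', 2), ('J18.9', 1)]
```

The same query shape – encounter table joined to one or more detail tables, filtered and aggregated – recurs across most operational Clarity reporting.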

The CLARITY_ADT table tracks patient movement events – admit, transfer, and discharge timestamps. It is the source for length-of-stay calculations, boarding time reporting, and readmission analysis. Build analysts and reporting teams who understand ADT workflows (covered in the context of CPOE in the Epic EHR Orders and CPOE guide) will recognize how clinical workflow decisions directly shape what appears in CLARITY_ADT.

Clarity ETL Configuration and Data Latency

The Clarity ETL runs on a schedule configured by the Epic technical team. Most organizations run the full Clarity extract nightly – typically between 2:00 AM and 6:00 AM depending on database size. During this window, some Clarity tables may be locked or may return stale data. Build analysts must communicate this latency to reporting stakeholders – a Clarity-based report run at 7:00 AM reflects data through the prior evening, not the current moment.

Near-real-time Clarity extracts exist for high-urgency operational tables – ADT data, order status, and bed management. These run on shorter cycles (15-60 minutes) and provide fresher data for operational dashboards. Build analysts must identify which reporting use cases require near-real-time data and ensure the corresponding Clarity tables are configured on the appropriate extract schedule. Using a nightly-only Clarity table as the basis for a real-time operational dashboard produces misleading metrics.
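One way to guard against this mismatch is an automated latency check that compares each source table's last extract time against the freshness requirement of the use case it feeds. The table names and thresholds below are illustrative assumptions; in practice the extract timestamps come from the organization's ETL metadata:

```python
from datetime import datetime, timedelta

# Illustrative freshness requirements per reporting use case.
LATENCY_REQUIREMENTS = {
    "CLARITY_ADT": timedelta(minutes=30),   # near-real-time dashboard
    "ORD_MED_ADMIN": timedelta(minutes=60), # near-real-time dashboard
    "PAT_ENC": timedelta(hours=26),         # nightly extract acceptable
}

def stale_sources(last_extract, now):
    """Return tables whose extract lag exceeds the use-case requirement."""
    return sorted(
        table for table, max_lag in LATENCY_REQUIREMENTS.items()
        if now - last_extract[table] > max_lag
    )

now = datetime(2025, 1, 15, 8, 0)
last_extract = {
    "CLARITY_ADT": datetime(2025, 1, 15, 7, 45),   # 15 min ago - fresh
    "ORD_MED_ADMIN": datetime(2025, 1, 14, 3, 0),  # nightly only - stale
    "PAT_ENC": datetime(2025, 1, 15, 3, 0),        # within nightly window
}
print(stale_sources(last_extract, now))  # ['ORD_MED_ADMIN']
```

Running a check like this as part of dashboard monitoring surfaces the scenario described in the next section before stakeholders do.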

Real Scenario – Health System Quality Team, Clarity Data Latency Problem

A health system quality team implemented a sepsis bundle compliance dashboard using Clarity-based Reporting Workbench reports. The dashboard was designed to show whether sepsis patients had received antibiotics within 1 hour of sepsis recognition. After go-live, the ED medical director escalated that the dashboard was showing compliance rates that contradicted what nursing staff documented in real time. Investigation revealed that the antibiotic administration data in ORD_MED_ADMIN was on a nightly Clarity extract – not the near-real-time extract available for that table class. The dashboard was showing yesterday’s compliance data at 8:00 AM when the medical director reviewed it. The fix required switching the report to the near-real-time Clarity extract – a configuration change, not a report logic error. But the delay in discovery cost 6 weeks of misleading compliance reporting.

Epic Caboodle: The Enterprise Data Warehouse

Caboodle is Epic’s answer to the enterprise data warehouse requirement. Organizations that want to combine Epic clinical data with financial data, claims data, external benchmarks, or non-Epic system data use Caboodle as the integration layer. Caboodle’s dimensional model makes it significantly easier to write analytical queries than Clarity’s normalized structure – a query for “readmission rate by DRG for the last 12 months” requires fewer joins in Caboodle than in Clarity.

Caboodle Data Model: Facts and Dimensions

Caboodle organizes data as fact tables (measures and events) and dimension tables (descriptive attributes). The EncounterFact table is the central analytical record – it contains one row per patient encounter with keys that link to dimension tables for patient demographics (PatientDim), provider (ProviderDim), department (DepartmentDim), and diagnosis (DiagnosisDim). This star schema design means an analyst querying readmission rates by age group and primary diagnosis can join EncounterFact to PatientDim and DiagnosisDim without navigating multiple normalized tables.
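A minimal sketch of that star-schema query shape, using the Caboodle table names mentioned above with simplified, assumed columns (the real Caboodle dictionary defines many more):

```python
import sqlite3

# In-memory sketch of a Caboodle-style star schema. Table names follow
# Caboodle conventions (EncounterFact, PatientDim, DiagnosisDim); the
# columns are illustrative assumptions, not the actual Caboodle schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PatientDim   (PatientKey INTEGER PRIMARY KEY, AgeGroup TEXT);
CREATE TABLE DiagnosisDim (DiagnosisKey INTEGER PRIMARY KEY, DiagnosisName TEXT);
CREATE TABLE EncounterFact (
    EncounterKey INTEGER PRIMARY KEY,
    PatientKey INTEGER,
    PrimaryDiagnosisKey INTEGER,
    ReadmissionFlag INTEGER   -- pre-built derived metric in Caboodle
);
""")
conn.executemany("INSERT INTO PatientDim VALUES (?,?)",
                 [(1, "65+"), (2, "65+"), (3, "18-64")])
conn.executemany("INSERT INTO DiagnosisDim VALUES (?,?)",
                 [(10, "Heart failure"), (11, "Pneumonia")])
conn.executemany("INSERT INTO EncounterFact VALUES (?,?,?,?)", [
    (100, 1, 10, 1), (101, 2, 10, 0), (102, 3, 11, 0),
])

# Readmission rate by age group and primary diagnosis: two joins from
# the fact table to its dimensions, versus many more joins in
# normalized Clarity.
rows = conn.execute("""
    SELECT p.AgeGroup, d.DiagnosisName,
           AVG(f.ReadmissionFlag) AS readmit_rate
    FROM EncounterFact f
    JOIN PatientDim p   ON p.PatientKey = f.PatientKey
    JOIN DiagnosisDim d ON d.DiagnosisKey = f.PrimaryDiagnosisKey
    GROUP BY p.AgeGroup, d.DiagnosisName
""").fetchall()
print(rows)
```

The readmission rate falls out of a simple AVG over the pre-built flag – which is exactly why the flag's definition must be validated first, as the next paragraph explains.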

Caboodle includes pre-built derived metrics – calculated fields that Epic populates as part of the ETL. Readmission flags, length-of-stay calculations, and patient risk scores are examples of metrics that Caboodle computes and stores rather than requiring each report to recompute them. Build analysts must validate these pre-built metrics against the organization’s own definitions before accepting them as reporting standards – Epic’s default readmission definition (30-day all-cause unplanned) may differ from a payer’s contracted definition or the organization’s internally adopted definition.

Caboodle on Azure: Cloud Data Warehouse Considerations

Epic increasingly supports Caboodle deployments on Azure Synapse Analytics – Microsoft’s cloud-based data warehouse service. Cloud Caboodle offers elastic compute scaling, integration with Azure Machine Learning for predictive analytics, and connectivity to Microsoft Power BI without additional ETL infrastructure. The implementation model differs from on-premises Caboodle in compute configuration, network security, HIPAA BAA requirements with Microsoft Azure, and the ETL pipeline architecture.

Organizations moving to cloud Caboodle must assess data residency requirements, HIPAA compliance configurations in Azure, and the bandwidth requirements for moving large Epic data extracts to cloud storage. Azure Synapse’s distributed query engine handles large-scale parallel queries well – a query that takes 45 seconds on an on-premises SQL Server may run in 8 seconds on Azure Synapse with the right compute configuration. But the cost model is consumption-based, and poorly optimized queries can create unexpectedly high cloud compute costs.

Reporting Workbench: Build and Configuration in Epic Cogito

Reporting Workbench (RWB) is Epic’s native reporting tool for end users. It runs within the Epic application and provides a structured interface for building, running, and sharing reports without writing SQL. RWB reports are built on data models – pre-configured views of Clarity data that expose relevant columns and define the relationships between them. A data model for medication administration reports exposes the columns from ORD_MED_ADMIN and its related tables through a user-friendly column picker rather than requiring the report author to know the underlying table structure.

Data Models: The Foundation of Reporting Workbench

Epic provides hundreds of pre-built data models covering every clinical domain. Build analysts configure which data models are available to which user groups, and can create custom data models for organization-specific reporting needs. Custom data models require knowledge of Clarity table structure and SQL – they are built by Epic reporting analysts, not by end users.

The data model configuration decision – which models to expose to which user populations – is a governance decision with security implications. A data model that exposes patient financial information should not be available to clinical staff who do not need it. A data model that exposes PHI should only be accessible to staff with HIPAA-appropriate access. Build analysts must work with the data governance team and the HIPAA privacy officer to define data model access rules before Reporting Workbench goes live.

Report Templates and the Shipped Report Library

Epic ships hundreds of pre-built Reporting Workbench report templates with every implementation. These cover quality measures (CMS IQR/OQR metrics), operational reports (patient volumes, length of stay, provider productivity), and financial reports (charge capture, denial rates, AR aging). Build analysts must evaluate which shipped reports are relevant for the organization, validate their logic against the organization’s data, and configure them with the correct department, provider, and date filter defaults.

Shipped report validation is frequently skipped – teams assume Epic’s pre-built reports are correct out of the box. They are not always correct. A shipped report may use a denominator definition that does not match the organization’s patient population filter requirements. A CMS quality measure report may use a national benchmark comparison that does not apply to a critical access hospital. Build analysts must validate every shipped report that will be used for external reporting or performance evaluation before it goes live.

Real Scenario – Regional Health System, Shipped Report Validation Failure

A regional health system implemented Epic Cogito and activated the shipped door-to-provider time report for CMS OQR measure OP-18 without validation. The report was presented to the ED medical director as the official OP-18 performance metric. Six months later, a CMS data submission review found that the health system’s submitted OP-18 rate differed significantly from the Reporting Workbench report. Investigation revealed the shipped report used the patient’s registration timestamp as door time, while CMS OP-18 requires the earliest of registration time or triage time as the door time definition. The shipped report had been used for internal performance management for six months with an inflated metric. Correcting the report required a custom data model modification and retroactive recalculation of 6 months of ED data.

SlicerDicer: Self-Service Analytics Configuration in Epic

SlicerDicer is Epic’s self-service analytics tool. It allows clinical and operational users to explore patient populations interactively – filtering by diagnoses, medications, procedures, demographics, and visit types without writing any queries. A hospitalist physician can use SlicerDicer to identify all patients admitted with a principal diagnosis of heart failure in the last 12 months who were readmitted within 30 days, broken down by discharge disposition. No IT involvement required.

Data Slicers: Configuration and Access Control

SlicerDicer is organized around data slicers – pre-configured analytical contexts that define which patient population and which data elements are available for exploration. The Encounters slicer allows analysis of visit-level data. The Patients slicer allows analysis of patient demographics and longitudinal health history. The Orders slicer allows analysis of ordering patterns. Build analysts configure which slicers are available to which user groups, and what population filters apply – a department head should see their department’s patients, not the entire organization’s patient population.

SlicerDicer access configuration is a patient privacy decision as much as a technical one. A slicer that allows unrestricted patient population exploration across the entire organization exposes PHI to anyone with access. Build analysts must configure population-level access controls that restrict each user’s explorable population to their clinical or administrative scope of responsibility. This requires coordination with the HIPAA privacy officer, the CISO, and department leadership.

SlicerDicer vs Reporting Workbench: When to Use Which

| Dimension | Reporting Workbench | SlicerDicer |
|---|---|---|
| Primary user | Report consumers and trained builders | Clinicians, quality staff, operations managers |
| Build requirement | Report built by IT or trained analyst | Self-service – end user builds analysis interactively |
| Output format | Tabular reports, scheduled delivery, export to Excel | Interactive charts, population counts, patient lists |
| Best for | Operational metrics, regulatory reporting, scheduled dashboards | Cohort identification, ad hoc population analysis, hypothesis testing |
| Data freshness | Clarity-based – nightly or near-real-time per table | Clarity-based – same latency as Workbench |
| PHI exposure risk | Medium – controlled by report design and access | High if not carefully scoped – population filter config critical |
| SQL skill required | No (for end users) / Yes (for custom data models) | No |

Cogito Reporting vs External Analytics Platforms

Most health systems do not rely exclusively on Epic Cogito for analytics. They connect Caboodle or Clarity to external Business Intelligence platforms – Microsoft Power BI, Tableau, Qlik, or cloud-based analytics services like AWS QuickSight. This hybrid model uses Cogito’s data infrastructure (Caboodle as the warehouse) while using more capable visualization and analytics tools for the user experience layer.

The choice between Cogito-native reporting and external BI platforms depends on the use case. Operational reports that run within Epic’s workflow – a provider’s daily patient list, a charge capture report for a department manager, a quality measure for accreditation – belong in Reporting Workbench. Enterprise analytics that combine Epic data with financial systems, HR data, or claims data belong in an external platform connected to Caboodle. Self-service exploration for clinical research or quality improvement belongs in SlicerDicer.

Integration between Caboodle and external BI platforms requires a database connection (typically a dedicated read-only SQL Server connection to Caboodle) and HIPAA-compliant data access controls. The external BI platform must be covered under a Business Associate Agreement (BAA) with the health system if it stores or processes PHI. Many organizations use a de-identification layer between Caboodle and external analytics for research use cases – Epic’s de-identification features and the Cosmos research network facilitate this. Build analysts and data architects must design the external platform connection architecture with the HIPAA privacy and security teams before implementation.

Data Governance and Report Library Management in Cogito

Without data governance, Cogito implementations generate hundreds of duplicate and conflicting reports within months of go-live. Every department builds their own version of the same metric using slightly different filters. The CFO sees a different readmission rate than the CMO. The quality team reports a different surgical infection rate than the infection control committee. This is not a technical problem. It is a governance problem.

Metric Definitions and Canonical Report Ownership

Data governance for Cogito starts with establishing canonical metric definitions. What is the organization’s official definition of “readmission”? What is the official definition of “length of stay”? Which encounter types are included in “patient volume”? These definitions must be documented, approved by the relevant clinical and operational leadership, and codified in report logic before any report using those metrics goes live.

Each canonical report must have a named owner – the person responsible for its accuracy, update schedule, and stakeholder communication when the metric changes. In practice this is typically a clinical informatics analyst or a quality analyst, not an IT developer. The report owner is accountable for the data itself – a distinct responsibility from merely building the query.

Report Library Organization and Naming Conventions

Epic Reporting Workbench allows reports to be organized in folders by department, function, or reporting domain. Build analysts should establish a folder and naming convention before any reports are created. A naming convention that includes department prefix, metric name, and data period (e.g., “QD – ED Door-to-Provider Time – Monthly – CMS OP-18”) makes reports discoverable and reduces duplication. Without naming standards, Workbench becomes a flat list of 300 reports named “ED Report” or “Dashboard” within 6 months of go-live.
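A naming convention is only useful if it is enforced. A small lint script can flag non-conforming report names during periodic library reviews – the pattern, period vocabulary, and department prefixes below are illustrative assumptions for one organization's convention:

```python
import re

# Sketch of a report-name lint for a convention of the form:
# "<dept prefix> – <metric name> – <period>[ – <measure id>]".
# Pattern and allowed periods are illustrative assumptions.
NAME_PATTERN = re.compile(
    r"^[A-Z]{2,4} – .+ – (Daily|Weekly|Monthly|Quarterly|Annual)( – .+)?$"
)

def check_report_names(names):
    """Return the report names that violate the naming convention."""
    return [n for n in names if not NAME_PATTERN.match(n)]

names = [
    "QD – ED Door-to-Provider Time – Monthly – CMS OP-18",
    "ED Report",                      # fails: no structure at all
    "FIN – Denial Rate – Weekly",
]
print(check_report_names(names))  # ['ED Report']
```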

Report certification is the process of marking a report as governance-approved and production-quality. Build analysts configure a report certification workflow in Workbench that distinguishes certified reports from draft or personal reports. Users can trust certified reports to use correct metric definitions and validated data logic. Uncertified reports are clearly marked as in-development or personal use. This prevents the proliferation of unofficial reports that create metric confusion.

Security and Data Access Controls in Epic Cogito

Cogito reporting tools expose patient data at scale. A single Reporting Workbench report can return data on tens of thousands of patients. A SlicerDicer session can expose population-level patterns across an entire organization. HIPAA’s minimum necessary standard applies to reporting just as it applies to clinical care – users should have access to the minimum patient data necessary to perform their job function. Cogito security configuration must enforce this.

Row-Level Security and Population Filters

Row-level security in Clarity and Caboodle restricts which rows a user can access based on their role – a department manager can only see data for their department’s patients. A physician can only see data for their own patient panel. Row-level security is implemented through database views, stored procedures, or application-level filters in Reporting Workbench’s data model configuration. Build analysts must configure row-level security in collaboration with the HIPAA privacy officer and department leadership.
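One common implementation pattern is a filtered database view granted to the reporting account in place of the base table. The sketch below uses illustrative table names and sqlite for self-containment; production deployments on SQL Server may instead use native row-level security policies or Workbench data model filters:

```python
import sqlite3

# Sketch of row-level security as a per-department view. Table and
# column names are illustrative assumptions, not the Clarity schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PAT_ENC (
    PAT_ENC_CSN_ID INTEGER PRIMARY KEY,
    DEPARTMENT_ID INTEGER,
    PAT_ID TEXT
);
INSERT INTO PAT_ENC VALUES (1, 10, 'Z1'), (2, 10, 'Z2'), (3, 20, 'Z3');

-- The ED manager's reporting account is granted SELECT on this view
-- only, never on PAT_ENC itself, confining queries to department 10.
CREATE VIEW ED_PAT_ENC AS
    SELECT * FROM PAT_ENC WHERE DEPARTMENT_ID = 10;
""")
visible = conn.execute("SELECT COUNT(*) FROM ED_PAT_ENC").fetchone()[0]
print(visible)  # 2 of the 3 encounters are visible through the view
```

The design choice here is that the filter lives in the database layer, so every tool connecting with that account – Workbench, an external BI platform, ad hoc SQL – inherits the same restriction.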

Column-level security restricts which data columns are visible in a report or data model. A report used by front-desk staff may expose appointment times and visit types but not diagnosis codes or medication lists. A report used by clinical quality staff may expose diagnosis codes and clinical documentation content. Build analysts configure column-level security in data model definitions and report templates. Missing column-level security in a data model that exposes sensitive PHI to the wrong user population is a HIPAA compliance failure. The clinical documentation workflows that generate the data flowing into Cogito are described in the EpicCare Inpatient ClinDoc guide.

Audit Logging for Reporting Access

HIPAA requires audit logging of PHI access. Reporting Workbench access is logged in Epic’s audit trail – every report run, every patient record accessed through a report, and every SlicerDicer session generates an audit entry. Build analysts must ensure the audit logging configuration captures the required information and that the logs are retained for the HIPAA-required minimum period (6 years for covered entities). The HIPAA privacy officer must review and approve the audit logging configuration before Cogito reporting goes live.

Testing and Validation Strategy for Epic Cogito

Cogito testing differs from clinical module testing in one critical dimension: a report defect does not directly endanger a patient, so there is no patient safety threshold. But there is an analytics quality threshold. An inaccurate clinical report used for performance management, accreditation, or regulatory submission can have serious institutional consequences – reputational damage, incorrect strategic decisions, failed accreditation surveys, and CMS payment penalties.

Data Validation: Reconciling Reports Against Source Data

Every Cogito report must be validated against a known source of truth before it is used for decision-making. For a patient volume report, the validation compares the report’s patient count against the ADT system’s admit count for the same period. For a medication administration report, the validation compares the report’s results against a sample of manually reviewed medication administration records in the clinical application.

Parallel reporting – running the Cogito report simultaneously with the legacy system report for the same metric and comparing results – is the gold standard validation for reports that replace legacy system reports. The parallel period should be long enough to cover at least one full reporting cycle (typically 30-90 days). Discrepancies identified during parallel reporting must be investigated and resolved before the Cogito report replaces the legacy report as the authoritative source. Teams familiar with BAT vs UAT methodology will recognize parallel reporting as a form of acceptance testing for analytics – the clinical user is the acceptance authority, not the IT team.
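The reconciliation step of parallel reporting can be automated: compare the two reports' values per period and flag any discrepancy beyond an agreed tolerance for investigation. The values and the 1% tolerance below are illustrative assumptions:

```python
# Sketch of a parallel-reporting reconciliation: the same metric
# produced by the legacy report and the new Cogito report for each
# period, with discrepancies beyond tolerance flagged for review.
def reconcile(legacy, cogito, rel_tolerance=0.01):
    """Return periods where the two reports disagree beyond tolerance."""
    discrepancies = []
    for period in sorted(legacy):
        old, new = legacy[period], cogito[period]
        if abs(new - old) > rel_tolerance * old:
            discrepancies.append((period, old, new))
    return discrepancies

# Illustrative monthly patient-volume figures from both systems.
legacy = {"2025-01": 412, "2025-02": 388, "2025-03": 401}
cogito = {"2025-01": 412, "2025-02": 389, "2025-03": 377}
print(reconcile(legacy, cogito))  # [('2025-03', 401, 377)]
```

Each flagged period becomes an investigation item; the Cogito report replaces the legacy report only once every discrepancy is explained or fixed.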

ETL Validation and Data Completeness Testing

The Clarity and Caboodle ETL processes must be validated after initial configuration and after any significant Epic upgrade. ETL validation confirms that records from Chronicles arrive in Clarity/Caboodle without data loss, incorrect transformation, or truncation. A typical ETL validation checks record counts (are the same number of encounters in Chronicles and Clarity?), field completeness (are all expected fields populated?), and value accuracy (do sample record field values match between Chronicles and Clarity?).
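The record-count portion of that validation can be sketched as a simple comparison, assuming both counts are available from ETL audit metadata (the table names and figures below are illustrative):

```python
# Sketch of an ETL completeness check: compare source-side extract
# counts against target-side row counts per table. In practice both
# sides come from the ETL job's audit metadata.
def validate_counts(source_counts, target_counts):
    """Return tables where source and target row counts disagree."""
    return {
        table: (source_counts[table], target_counts.get(table, 0))
        for table in source_counts
        if source_counts[table] != target_counts.get(table, 0)
    }

source_counts = {"PAT_ENC": 15_204, "PAT_ENC_DX": 41_882, "ORDER_RESULTS": 98_310}
target_counts = {"PAT_ENC": 15_204, "PAT_ENC_DX": 41_880, "ORDER_RESULTS": 98_310}
print(validate_counts(source_counts, target_counts))
# {'PAT_ENC_DX': (41882, 41880)} -- two rows lost between source and target
```

Field completeness and value accuracy checks follow the same pattern, comparing per-column null rates and sampled field values instead of row counts.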

After Epic upgrades, new Chronicles data elements introduced by the upgrade may not immediately appear in Clarity or Caboodle. Build analysts must check the upgrade’s release notes for new data elements, confirm whether Clarity/Caboodle ETL mappings were updated, and validate that any reports depending on new data elements receive the expected data after the upgrade is applied to Clarity.

Go-Live Planning and Common Cogito Implementation Failures

Cogito go-live is often treated as a secondary workstream that follows clinical go-live. This is the wrong model. Critical operational metrics – daily census, ED throughput, OR utilization, pharmacy charge capture – need to be available and validated from day one of clinical go-live. If Cogito reporting is not ready when clinical operations start in Epic, operations leaders have no data visibility during the highest-risk period of the implementation. The Epic EHR Go-Live Support framework covers the command center model – add a Cogito reporting analyst with access to real-time Clarity data to the go-live command center structure.

| Failure Point | Impact | Mitigation |
|---|---|---|
| Clarity ETL hold timer not disabled | High – stale data in all Cogito reports | ETL configuration audit before go-live |
| Shipped report using wrong metric definition | High – inaccurate CMS/regulatory reporting | Validate every shipped report against metric specification |
| SlicerDicer population filter too broad | High – HIPAA PHI exposure | Privacy officer review of all SlicerDicer slicer access configs |
| No data governance – metric proliferation | Medium – conflicting metrics, leadership distrust | Canonical metric definitions and report certification before go-live |
| Caboodle pre-built metric definition mismatch | Medium – wrong readmission/LOS calculations | Validate all Caboodle derived metrics against org definitions |
| Near-real-time extract not configured | Medium – operational dashboards show stale data | Extract schedule reviewed per reporting use case latency requirement |
| ETL fails silently after upgrade | Medium – new data elements missing from reports | Post-upgrade ETL validation checklist – run after every Epic upgrade |

Roles, Certifications, and Career Path for Epic Cogito Specialists

Cogito Reporting Analyst
Builds Reporting Workbench reports, configures data models, validates shipped reports, manages report library. Requires Epic Cogito certification. Works with clinical operations and quality teams.
Clarity / Caboodle Data Analyst
Writes SQL queries against Clarity and Caboodle, develops custom data models, supports external BI platform integration. Requires Epic Cogito certification plus SQL proficiency and data warehouse experience.
Data Architect
Designs Caboodle EDW architecture, manages ETL configuration, integrates non-Epic data sources, designs external BI platform connections. Senior role requiring both Epic Cogito knowledge and enterprise data warehouse expertise.
Clinical Informatics Analyst
Bridges clinical operations and data analytics. Owns canonical metric definitions, validates report logic against clinical intent, supports data governance committee. Often holds both clinical and analytics credentials.
| Role | Certification | Key Skills | Salary Range (2026) |
|---|---|---|---|
| Cogito Reporting Analyst | Epic Cogito | Workbench, data models, report validation, governance | $80,000 – $115,000 |
| Clarity / Caboodle Analyst | Epic Cogito | SQL, Clarity tables, Caboodle star schema, ETL | $90,000 – $130,000 |
| Data Architect / EDW Lead | Epic Cogito + Azure/DWH cert | Caboodle architecture, Azure Synapse, non-Epic integration | $115,000 – $155,000 |
| Clinical Informatics Analyst | Epic Cogito + clinical background | Metric governance, CMS quality measures, BI platforms | $100,000 – $140,000 |
| Cogito Consultant (Contract) | Epic Cogito | Clarity/Caboodle, Workbench, governance, 2+ implementations | $75 – $120+/hr |

The Cogito Principle That Prevents the Most Problems

Validate every shipped report against the actual metric specification before it goes live – not after leadership asks why the numbers are wrong. Epic’s pre-built reports are starting points, not finished products. A report named “CMS OP-18 Door-to-Provider Time” still requires that you verify the door time source, the provider contact timestamp, and the included patient population match CMS’s published measure specification exactly. Ten hours of validation work before go-live prevents six months of incorrect performance data after it.

Downloads: Epic Cogito Templates and Checklists

Epic Cogito Go-Live Readiness Checklist (PDF)
Pre-go-live gates for Clarity/Caboodle ETL validation, shipped report review, data model access controls, SlicerDicer population filters, HIPAA audit logging, metric definitions, report certification, and parallel reporting confirmation.


Cogito Report Library and Metric Definition Tracker (Excel)
Track every report: report name, metric definition, data model used, Clarity table source, owner, certification status, stakeholders, validation method, parallel period results, and scheduled delivery configuration.


Cogito ETL and Report Validation Test Script (Excel)
Structured test cases: Clarity ETL record count validation, Caboodle derived metric verification, Workbench report data accuracy, SlicerDicer access control, HIPAA audit log confirmation, and parallel reporting reconciliation.
