Dataset Health Dashboard Overview
This dashboard evaluates your dataset across multiple quality dimensions:
Overall Health Score
- A 0–100 rating that summarizes your dataset’s quality
- Based on completeness, uniqueness, and format consistency
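The dashboard does not state how the three components are combined. As a minimal sketch, assuming each component is already expressed as a fraction between 0 and 1 and that all three carry equal weight (the function name `health_score` and the weights are hypothetical), the 0–100 rating could be computed like this:

```python
def health_score(completeness, uniqueness, format_consistency,
                 weights=(1.0, 1.0, 1.0)):
    """Combine three component scores (each in [0, 1]) into a 0-100 rating.

    Equal weights are an assumption for illustration; the dashboard's
    actual weighting is not documented here.
    """
    components = (completeness, uniqueness, format_consistency)
    for value in components:
        if not 0.0 <= value <= 1.0:
            raise ValueError("component scores must be between 0 and 1")
    total_weight = sum(weights)
    weighted = sum(w * v for w, v in zip(weights, components))
    # Normalize by the total weight, then scale to the 0-100 range.
    return round(100 * weighted / total_weight, 1)


if __name__ == "__main__":
    # 90% complete, 80% unique, fully consistent format.
    print(health_score(0.9, 0.8, 1.0))
```

Passing different `weights` lets one dimension count for more than the others, for example if duplicates matter more than formatting in a given project.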
Basic Metrics
- Row Count: Total number of instruction records
- Column Count: Number of fields (e.g., prompt, output)
- File Size: Disk space occupied by the dataset
- Memory Usage: Estimated RAM required for processing/training
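The basic metrics above can be approximated with the standard library alone. The sketch below assumes the dataset is a UTF-8 CSV file with a header row; the function name `basic_metrics` and the memory estimate (summing the size of each cell as a Python object) are illustrative assumptions, not the dashboard's actual method.

```python
import csv
import os
import sys


def basic_metrics(path):
    """Return row count, column count, file size, and a rough memory estimate."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        columns = reader.fieldnames or []
    # Rough in-memory footprint: the size of every cell as a Python object.
    # Real processing or training pipelines will differ, so treat this as
    # an order-of-magnitude figure only.
    memory_bytes = sum(
        sys.getsizeof(value) for row in rows for value in row.values()
    )
    return {
        "row_count": len(rows),
        "column_count": len(columns),
        "file_size_bytes": os.path.getsize(path),
        "memory_estimate_bytes": memory_bytes,
    }
```

Calling `basic_metrics("dataset.csv")` (a hypothetical file name) returns a dictionary with one entry per metric.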
Quality Metrics
- Missing Data: % of empty or null values in required columns
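- Duplicate Entries: % of rows that exactly duplicate another row (duplicates over-weight repeated examples and may bias the model)
- Format Consistency: Checks that every row follows the same structure (e.g., the same set of fields)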
- Column Structure: Shows each column and its data type (e.g., text, number, boolean)
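The quality metrics can be sketched in the same way. This example assumes rows are plain dictionaries, that `prompt` and `output` are the required columns, that a duplicate is any row beyond the first occurrence of an identical row, and that "consistent" means sharing the most common set of fields. The function name `quality_metrics` and these definitions are assumptions for illustration; the dashboard may define them differently.

```python
import json
from collections import Counter


def quality_metrics(rows, required=("prompt", "output")):
    """Compute missing %, duplicate %, format consistency %, and column types."""
    total = len(rows)
    if total == 0:
        return {"missing_pct": 0.0, "duplicate_pct": 0.0,
                "consistent_pct": 100.0, "column_types": {}}

    # Missing data: required cells that are absent, None, or blank strings.
    cells = [row.get(col) for row in rows for col in required]
    missing = sum(
        1 for v in cells
        if v is None or (isinstance(v, str) and not v.strip())
    )

    # Duplicates: rows beyond the first occurrence of an identical row.
    distinct = {json.dumps(row, sort_keys=True, default=str) for row in rows}
    duplicates = total - len(distinct)

    # Format consistency: rows whose field set matches the most common one.
    key_sets = Counter(frozenset(row) for row in rows)
    consistent = key_sets.most_common(1)[0][1]

    # Column structure: most common Python type among non-null values.
    column_types = {}
    for col in sorted({key for row in rows for key in row}):
        names = Counter(
            type(row[col]).__name__ for row in rows
            if row.get(col) is not None
        )
        column_types[col] = names.most_common(1)[0][0] if names else "unknown"

    return {
        "missing_pct": round(100 * missing / len(cells), 1) if cells else 0.0,
        "duplicate_pct": round(100 * duplicates / total, 1),
        "consistent_pct": round(100 * consistent / total, 1),
        "column_types": column_types,
    }
```

The percentages returned here, expressed as fractions, could serve as inputs to an overall score, with completeness taken as one minus the missing fraction and uniqueness as one minus the duplicate fraction.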