<code/constitution>/ kaggle

Every Kaggle dataset — verified.

Compliance-to-architecture coverage for Kaggle datasets: license declaration, license compatibility, PII screening, and source attribution. Same engine and same crosswalk graph as /hf, with a different registry parser.

Powered by the same Code Constitution engine — extended to Kaggle's dataset registry via the kaggle.datasets adapter.

Dataset-card findings trend

Sample distribution of fail / warn / ok findings across 12 evaluation windows. Real per-tenant numbers appear in the customer dashboard.

[Chart: Kaggle dataset-card findings · 12 evaluation windows (W1 to W12) · series: fail, warn, ok]

Framework coverage

EU_AI_ACT · EU AI Act Art. 10
Training-data origin + lawful basis when a dataset is consumed by a high-risk AI system

GDPR · GDPR Arts. 6, 9, 30
Lawful basis for processing + special-category-data screening + record-of-processing entry

HIPAA · HIPAA §164.514
De-identification status when a dataset contains PHI

ISO_42001 · ISO/IEC 42001 8.3
Data-governance + lineage documentation

NIST_AI_RMF · NIST AI RMF MEASURE-2.3
Training-data evaluation evidence
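
As a rough illustration, the framework coverage above could be held as a small lookup table keyed by framework ID. This is only a sketch: the `Framework` type, the `FRAMEWORKS` dict, and the `citations_for` helper are hypothetical names chosen for this example and are not taken from the Code Constitution engine.

```python
from typing import NamedTuple


class Framework(NamedTuple):
    citation: str   # clause the finding is reported against
    evidence: str   # what the check is expected to evidence


# Hypothetical in-memory form of the coverage list above.
FRAMEWORKS = {
    "EU_AI_ACT": Framework(
        "EU AI Act Art. 10",
        "Training-data origin + lawful basis for high-risk AI systems",
    ),
    "GDPR": Framework(
        "GDPR Arts. 6, 9, 30",
        "Lawful basis + special-category screening + record of processing",
    ),
    "HIPAA": Framework(
        "HIPAA §164.514",
        "De-identification status when a dataset contains PHI",
    ),
    "ISO_42001": Framework(
        "ISO/IEC 42001 8.3",
        "Data-governance + lineage documentation",
    ),
    "NIST_AI_RMF": Framework(
        "NIST AI RMF MEASURE-2.3",
        "Training-data evaluation evidence",
    ),
}


def citations_for(keys):
    """Return the citation for each requested framework key, in order."""
    unknown = [k for k in keys if k not in FRAMEWORKS]
    if unknown:
        raise KeyError(f"unknown framework keys: {unknown}")
    return [FRAMEWORKS[k].citation for k in keys]
```

For example, `citations_for(["GDPR", "HIPAA"])` returns the two citations a finding on special-category data would be reported against.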

Checks shipped on day one

1. License declared + machine-readable
   Kaggle dataset metadata `licenseName` populated; CC0 / CC-BY / CC-BY-SA / etc.

2. License compatibility with downstream use
   Detect commercial-use-permitted vs research-only constraints

3. Subtitle + description present
   Minimum descriptive text per AI Act Art. 13 transparency

4. No raw PII without de-identification claim
   HIPAA §164.514 / GDPR Art. 9: flag datasets that match special-category patterns without an explicit de-identification claim

5. Source attribution
   Owner / collector identity disclosed (FATF 24 / AI Act Art. 25)

6. Privacy status
   Public / private flag accurate
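
The six checks above can be sketched as one function over a dataset-metadata dict. This is a minimal illustration under stated assumptions, not the shipped implementation: only `licenseName` comes from the text above, while the other field names (`subtitle`, `description`, `ownerName`, `isPrivate`), the license sets, and the keyword patterns are hypothetical placeholders.

```python
import re
from dataclasses import dataclass

# Hypothetical license groupings; a real deployment would configure these.
COMMERCIAL_OK = {"CC0", "CC-BY", "CC-BY-SA"}
RESEARCH_ONLY = {"CC-BY-NC", "CC-BY-NC-SA"}

# Illustrative keyword patterns only, not a complete PII screen.
SENSITIVE = re.compile(r"\b(patient|diagnosis|ssn|ethnicity|religion)\b", re.I)
DEIDENTIFIED = re.compile(r"\b(de-?identified|anonymi[sz]ed)\b", re.I)


@dataclass
class Finding:
    check: str
    status: str  # "ok", "warn", or "fail"


def run_checks(meta: dict, commercial_use: bool = False,
               expected_private: bool = False) -> list[Finding]:
    """Run the six day-one checks against one metadata dict."""
    license_name = (meta.get("licenseName") or "").strip()
    text = f"{meta.get('subtitle') or ''} {meta.get('description') or ''}"

    # 2. Research-only licenses fail only when commercial use is intended;
    #    licenses outside both sets are flagged for manual review.
    if license_name in RESEARCH_ONLY and commercial_use:
        compat = "fail"
    elif license_name in COMMERCIAL_OK or license_name in RESEARCH_ONLY:
        compat = "ok"
    else:
        compat = "warn"

    # 4. Sensitive wording without a de-identification claim is a failure.
    pii_hit = bool(SENSITIVE.search(text)) and not DEIDENTIFIED.search(text)
    has_text = bool(meta.get("subtitle")) and bool(meta.get("description"))

    return [
        Finding("license_declared", "ok" if license_name else "fail"),
        Finding("license_compatibility", compat),
        Finding("description_present", "ok" if has_text else "warn"),
        Finding("pii_screen", "fail" if pii_hit else "ok"),
        Finding("source_attribution", "ok" if meta.get("ownerName") else "warn"),
        # 6. "Accurate" is modelled here as matching the caller's expectation.
        Finding("privacy_status",
                "ok" if meta.get("isPrivate") is expected_private else "fail"),
    ]
```

Returning one `Finding` per check, rather than stopping at the first failure, keeps the output in the same fail / warn / ok shape that the findings trend reports.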