AI & Governance

Training Data

The dataset used to train an AI model, which directly shapes the model's capabilities, biases, and limitations. Training data quality and provenance are under increasing scrutiny: copyright lawsuits (New York Times v. OpenAI), regulatory requirements (the EU AI Act's training data summaries), and certification programs (Fairly Trained) all turn on whether training data was properly sourced and documented.