asapworks
$299

Risk Factor Disclosure Dataset v1.0


📦 Risk Factor Disclosure Dataset v1.0 – 1,869 Enriched Risk Disclosures from SEC Filings

A premium, AI-ready dataset of 1,788 clean, structured Item 1A (Risk Factors) sections extracted from the 10-K filings of 267 publicly traded U.S. companies, covering 2010–2024.

Ideal for:

  • LLM training and retrieval for risk, compliance, and regulatory intelligence
  • Trend analysis across industries and macroeconomic events
  • Risk classification and forward-looking disclosure modeling
  • Grounding GenAI agents with real corporate risk language
  • ESG, litigation, and geopolitical risk research


🧠 What’s Included

  • item1a_enriched.csv – Full dataset with risk category tags, forward score, summaries, and metadata
  • item1a_enriched.jsonl – JSONL version for use in AI pipelines
  • sample_100_item1a.csv – 100-record preview sample
  • README.md – Field descriptions, use cases, and schema
  • LICENSE.txt – Tiered license for individual and enterprise use
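As a starting point for the JSONL file, here is a minimal loading sketch using only the Python standard library. The field names (`ticker`, `year`, `risk_categories`, `forward_score`) and the sample records are assumptions made up for illustration; the actual schema is the one documented in README.md.

```python
import io
import json


def read_jsonl(stream):
    """Yield one dict per non-empty line of a JSONL text stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Two invented records standing in for item1a_enriched.jsonl.
# Field names are hypothetical; check README.md for the real ones.
sample = io.StringIO(
    '{"ticker": "AAA", "year": 2020, "risk_categories": ["Cybersecurity"], "forward_score": 0.012}\n'
    '{"ticker": "BBB", "year": 2021, "risk_categories": ["Climate", "Regulatory"], "forward_score": 0.021}\n'
)
records = list(read_jsonl(sample))
print(len(records), records[1]["risk_categories"])
```

With the real file, the same helper would be called as `with open("item1a_enriched.jsonl", encoding="utf-8") as f: records = list(read_jsonl(f))`.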

📊 Dataset Stats

  • Total Records: 1,788 enriched disclosures
  • Unique Tickers: 267
  • Coverage Years: 2010 – 2024
  • Avg. Risk Section Length: ~520 words
  • Forward Score Range: 0.00 – 0.03
  • Forward Score Mean ± StdDev: 0.0157 ± 0.0066
  • Number “Too Short” (<100w): 3 (filtered out)
  • Risk Categories (multi-label): 7 (see below)


🔖 Risk Categories Covered

Each record is tagged with one or more of the following themes:

  • Regulatory
  • Litigation
  • Geopolitical
  • Technology Disruption
  • Cybersecurity
  • Climate
  • Supply Chain

Also included:

  • Forward-looking language score based on modal verbs and speculative phrasing
  • Model-generated summary (DistilBART)
  • SEC metadata (ticker, CIK, year, filing date)
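The listing does not give the exact formula behind the forward-looking language score. The sketch below shows one plausible reading: the share of word tokens that are modal verbs or speculative cues. The cue list, the tokenization, and the function name `forward_score` are all assumptions, so this may not reproduce the values in the dataset.

```python
import re

# Hypothetical cue list; the dataset's actual vocabulary and weighting are not stated.
FORWARD_CUES = {
    "may", "might", "could", "would", "should", "will",
    "expect", "anticipate", "believe", "intend",
}


def forward_score(text):
    """Return the share of word tokens that are forward-looking cues (0.0 for empty text)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in FORWARD_CUES)
    return hits / len(tokens)


example = "We may face new regulations that could increase our costs."
score = forward_score(example)
print(round(score, 3))
```

A single speculative sentence scores far higher than a full risk section would, since long disclosures dilute the cue words with ordinary text.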

🧠 Use Cases

  • Fine-tune LLMs for legal/financial risk reasoning
  • Train classifiers to detect emerging or underreported risks
  • Build GenAI copilots for compliance, ESG, or investor relations
  • Retrieve top risk disclosures by tag, score, or industry
  • Visualize disclosure evolution across 15 years
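The retrieval and trend use cases above can be sketched with plain Python once the records are loaded as dictionaries. The field names (`ticker`, `year`, `risk_categories`, `forward_score`), the helper names, and the sample records are hypothetical and used only for illustration.

```python
from collections import Counter


def top_disclosures(records, tag, min_score=0.0, limit=5):
    """Return up to `limit` records carrying `tag`, highest forward score first."""
    matches = [
        r for r in records
        if tag in r["risk_categories"] and r["forward_score"] >= min_score
    ]
    return sorted(matches, key=lambda r: r["forward_score"], reverse=True)[:limit]


def tag_counts_by_year(records, tag):
    """Count how many records per filing year carry `tag`."""
    return Counter(r["year"] for r in records if tag in r["risk_categories"])


# Invented records; the real dataset's schema is described in README.md.
records = [
    {"ticker": "AAA", "year": 2019, "risk_categories": ["Cybersecurity"], "forward_score": 0.012},
    {"ticker": "BBB", "year": 2021, "risk_categories": ["Cybersecurity", "Regulatory"], "forward_score": 0.024},
    {"ticker": "CCC", "year": 2022, "risk_categories": ["Climate"], "forward_score": 0.019},
]

top = top_disclosures(records, "Cybersecurity", min_score=0.01)
tickers = [r["ticker"] for r in top]
yearly = tag_counts_by_year(records, "Cybersecurity")
print(tickers, dict(yearly))
```

The per-year counts can be passed to any plotting library to chart how often a theme appears over the coverage period.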

💼 License Tiers

  • Early Bird ($299): Single-user, no updates
  • Individual ($499): Single-user + lifetime updates
  • Enterprise ($999): Org-wide use + internal product rights

Refer to LICENSE.txt for details.


🎁 Sample file: Risk Factor Disclosure Sample

Link: https://drive.google.com/file/d/1rCG_lgHy9LxyhsATElTB1DfkEVf7DfQ2/view?usp=sharing


📬 Questions or enterprise licensing? Contact: Asapuaiworks@gmail.com


A high-quality, AI-ready dataset of 1,869 structured risk factor disclosures (Item 1A) extracted from 10-K filings of 267 U.S. public companies across 15 years (2010–2024).

  • Total Records: 1,869 enriched disclosures
  • Unique Tickers: 267
  • Coverage Years: 2010 – 2024
  • Avg. Risk Section Length: ~520 words
  • Forward Score Range: 0.00 – 0.03
  • Forward Score Mean ± StdDev: 0.0157 ± 0.0066
  • Number “Too Short” (<100w): 3 (filtered out)
  • Size: 76 MB