Frequently Asked Questions

Common questions about the Training Data Registry.

Is registering my content legally binding?

Registration creates a dated record of your AI training preferences. While we cannot provide legal advice, this documentation may help demonstrate that you expressed your preferences on a specific date. The legal weight depends on your jurisdiction and circumstances. For EU users, we also offer meta tags aligned with the text and data mining (TDM) provisions of the EU Copyright Directive (Directive (EU) 2019/790).

How does domain verification work?

You add a meta tag to your website's HTML. When you click 'Verify', our server fetches your homepage and checks for the tag in the HTML response. The tag must be rendered server-side (not injected by JavaScript). This works with WordPress, static sites, and server-rendered frameworks like Next.js. Verified registrations carry significantly more weight than unverified claims.
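As a rough illustration of the check described above, the sketch below parses raw HTML (what a server returns before any JavaScript runs) and looks for a verification meta tag. The tag name `tdr-verification`, the code value, and the function names are hypothetical placeholders, not the registry's actual tag format or implementation. Use the exact tag shown in your dashboard.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class MetaTagFinder(HTMLParser):
    """Collects the content attribute of every <meta> tag with a given name."""

    def __init__(self, meta_name: str) -> None:
        super().__init__()
        self.meta_name = meta_name.lower()
        self.values: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None for bare attributes, so default to "".
        attr_map = {key: (value or "") for key, value in attrs}
        if attr_map.get("name", "").lower() == self.meta_name:
            self.values.append(attr_map.get("content", ""))


def has_verification_tag(html: str, meta_name: str, expected_code: str) -> bool:
    """Return True if the raw HTML contains the meta tag with the exact code."""
    finder = MetaTagFinder(meta_name)
    finder.feed(html)
    return expected_code in finder.values


def fetch_homepage(url: str) -> str:
    """Fetch the raw HTML response, with no JavaScript execution."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")
```

Running `has_verification_tag(fetch_homepage("https://example.com"), "tdr-verification", "your-code")` against your own site approximates what a server-side fetch would see. A tag that only appears inside a script is not found, because the parser never executes JavaScript.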

Why isn't my domain verification working?

Common issues: (1) The meta tag must be in the server-rendered HTML, not added by JavaScript after page load. View your page source (Ctrl+U) to confirm. (2) Your site must be accessible over HTTPS. (3) The verification code must match exactly. (4) If you just deployed, wait a minute for caches to clear. (5) Some hosting platforms may block our verification bot. Contact support if issues persist.

Can I register content I don't own?

No. You should only register content you own or have authority to register. Fraudulent registrations may be removed, and repeated abuse will result in account termination.

How do AI-use controls work?

You set three independent permissions for your content: Training (whether AI models can learn from it), Inference (whether AI can reference it to answer queries), and Archive (whether it can be stored long-term in datasets). By default, all three are off, so no AI use is permitted. You can selectively enable specific uses. The Training permission is available to all registered users. Inference and Archive permissions require domain verification, which prevents bad actors from opting competitors' content out of AI use.
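One way to picture these rules is as three booleans that default to off, plus a check that Inference and Archive can only be set by a verified domain owner. The sketch below is illustrative only: `AIUsePermissions` and `apply_permissions` are hypothetical names, not part of the registry's actual data model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AIUsePermissions:
    """Three independent permissions; all default to off (no AI use)."""
    training: bool = False
    inference: bool = False
    archive: bool = False


def apply_permissions(requested: AIUsePermissions, domain_verified: bool) -> AIUsePermissions:
    """Accept the requested permissions, enforcing the verification rule.

    Training can be set by any registered user. Inference and Archive
    can only be set once the domain has been verified.
    """
    if not domain_verified and (requested.inference or requested.archive):
        raise PermissionError(
            "Inference and Archive permissions require domain verification"
        )
    return requested
```

For example, `apply_permissions(AIUsePermissions(training=True), domain_verified=False)` succeeds, while requesting `inference=True` without verification raises `PermissionError`.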

How do AI companies check the registry?

The public registry is searchable at trainingdataregistry.org/search. We are also building an API for automated checking. AI companies can join the waitlist on our For AI Companies page.

Is the registry free?

Basic registration is free for individual creators. We're developing paid tiers for publishers with large catalogs and for AI companies needing API access.

What if an AI company ignores my registration?

The registry provides documentation, not enforcement. However, documented opt-outs may strengthen your legal position if you need to take action. We're also working with responsible AI companies to integrate registry checking into their workflows.

Can I change my preferences later?

Yes. You can update your AI-use permissions at any time. The registry maintains a history of your preferences with timestamps, which can be useful for establishing when changes were made.

How can I independently verify my timestamp?

Every domain verification and URL registration from a verified domain owner is included in a daily Merkle tree. The root hash is committed to a public GitHub repository, creating an independent, tamper-evident record. You can view the public commits at github.com/trainingdataregistry/timestamp-proofs. To verify your specific record, request a Merkle proof via the API. The proof includes the leaf hash, proof path, and the GitHub commit link so you can confirm the root hash matches.
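To show what checking a Merkle proof involves, here is a minimal sketch. It assumes SHA-256, that each step in the proof path is a sibling hash tagged with the side it sits on, and that parent hashes are computed by hashing the left and right child hashes concatenated in that order. These details are assumptions for illustration. The registry's actual hash function, proof format, and encoding may differ, so check the API response before adapting this.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def verify_merkle_proof(
    leaf_hash: bytes,
    proof_path: list[tuple[str, bytes]],
    expected_root: bytes,
) -> bool:
    """Recompute the root from a leaf hash and its proof path.

    Each proof step is (side, sibling_hash), where side says whether the
    sibling sits to the "left" or "right" of the current node (assumed format).
    """
    current = leaf_hash
    for side, sibling in proof_path:
        if side == "left":
            current = sha256(sibling + current)
        elif side == "right":
            current = sha256(current + sibling)
        else:
            raise ValueError(f"unknown side: {side!r}")
    # The proof is valid only if the recomputed root matches the published one.
    return current == expected_root
```

In practice, `expected_root` would be the root hash you read yourself from the public GitHub commit, not a value taken from the same API response as the proof. Reading it independently is what makes the check meaningful.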

What's the difference between this and robots.txt?

Robots.txt is a crawling directive with no ownership verification and no dated record. This registry provides verified, dated documentation of your preferences with granular AI-use controls and a searchable public registry.

Do you store my content?

No. We only store the URL, your preferences, and metadata. We never store, host, or cache your actual content.

Still have questions?

We're here to help.

Contact Support