Every time you type a prompt into an AI tool, you are potentially sharing data with a third party. For personal use, this might mean your casual questions become training data. For business use, it could mean proprietary information, customer data, or trade secrets end up in a model’s training set. Understanding what happens to your data when you use AI tools is no longer optional—it is a fundamental requirement for responsible AI adoption.

This guide provides a practical overview of AI privacy: what major providers do with your data, how to configure opt-out settings, when to consider self-hosted alternatives, the differences between enterprise and personal-tier privacy protections, and how GDPR and other regulations affect AI tool usage in practice.

Data Policies by Major AI Provider

Understanding each provider’s data policy is the first step in making informed privacy decisions. Policies differ significantly across providers and even across tiers within the same provider.

OpenAI (ChatGPT). On the free and Plus tiers, OpenAI may use your conversations to train and improve its models unless you opt out. On the Team, Enterprise, and API tiers, OpenAI does not train on your data by default. Free and Plus users can disable training in Settings under Data Controls. Even then, OpenAI may retain conversations for up to 30 days for abuse monitoring. Temporary Chat mode creates conversations that are not saved to your history or used for training.

Anthropic (Claude). Anthropic’s free and Pro tiers may use conversations for model improvement, with opt-out available. The Team and Enterprise tiers do not use customer data for training. Anthropic has published detailed data handling practices and allows users to request deletion of their data. Claude’s constitutional AI approach includes privacy as a design consideration, but the practical data handling depends on tier and settings.

Google (Gemini). Google’s Gemini uses conversations from free-tier users for training, with opt-out available in Google’s activity settings. Workspace customers (Gemini for Google Workspace) get stronger privacy protections: Google does not use Workspace data for model training. However, Google may retain prompts and responses temporarily for operational purposes. The Google Workspace data processing amendment provides contractual privacy commitments for enterprise customers.

Microsoft (Copilot). Microsoft’s consumer Copilot may use conversations for improvement, with opt-out available. Copilot for Microsoft 365 (enterprise) does not use customer data for training and inherits the privacy protections of the Microsoft 365 Enterprise data processing agreement. Data stays within the Microsoft 365 compliance boundary.

The pattern. Across all major providers, the rule of thumb is: free and consumer tiers train on your data by default (with opt-out options); enterprise tiers do not. If privacy is critical, use enterprise tiers or APIs with clear data-processing agreements.

How to Configure Opt-Out Settings

Every major AI provider offers opt-out mechanisms, but they are not always easy to find or well-explained. Here is how to configure the most important privacy settings for each provider.

ChatGPT opt-out. Go to Settings > Data Controls > “Improve the model for everyone.” Toggle this off. This prevents your conversations from being used for training. In earlier versions of ChatGPT this control was bundled with chat history, so opting out also disabled your history; the current toggle leaves conversation history intact. If you need contractual guarantees rather than a settings toggle, use ChatGPT Team or Enterprise. Alternatively, use the API, which does not use inputs for training by default.
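
If you route work through the API, a minimal call looks like the sketch below, using the official openai Python SDK. The model name is illustrative; check your own agreement and OpenAI’s current data-usage terms before sending anything sensitive.

    # Minimal API sketch (pip install openai; OPENAI_API_KEY set in the environment).
    # API inputs are not used for model training by default, unlike consumer chats.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative; choose per your data-processing terms
        messages=[{"role": "user", "content": "Summarize our meeting notes."}],
    )
    print(response.choices[0].message.content)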

Claude opt-out. In Claude’s settings, you can opt out of having your conversations used for model training. Anthropic also provides an email-based data deletion request process for users who want their data removed entirely. For organization accounts, admins can set data-usage policies that apply to all members.

Gemini opt-out. Go to your Google Account > Data and Privacy > Gemini Apps Activity. You can pause this activity, which prevents new conversations from being saved and used for training. You can also delete past Gemini activity from this page. For Workspace users, the admin console provides organization-wide controls.

General best practices. Opt out of training on every AI tool you use regularly. Review your privacy settings after every major update—providers sometimes reset settings or add new data-sharing options that default to on. Use separate accounts for personal and professional AI use so that personal conversations do not leak into professional contexts and vice versa.

Self-Hosted AI: When and How to Run Your Own Models

For organizations with strict privacy requirements, self-hosted AI models offer the strongest data protection because data never leaves your infrastructure. The trade-off is higher operational complexity and often lower model capability.

When self-hosting makes sense. Consider self-hosting when: you handle highly regulated data (health records, financial data, classified information) that cannot leave your network, even temporarily; your industry regulations prohibit sending data to third-party AI providers; you need complete audit trails of all AI interactions; or you want to fine-tune models on proprietary data without sharing that data with anyone.

Open-source model options. The open-source AI ecosystem has matured significantly. Models like Llama 3 (Meta), Mistral, Qwen (Alibaba), and Gemma (Google) can run on-premises and deliver performance that approaches commercial APIs for many use cases. For code generation, Code Llama and StarCoder are strong options. For embeddings and retrieval, open-source models from Sentence Transformers and Nomic are production-ready.

Infrastructure requirements. Running large language models requires GPU hardware. For inference only (no training), a single server with one or two NVIDIA A100 or H100 GPUs can serve a 70B-parameter model for a small to medium team. Quantized models (4-bit or 8-bit) reduce hardware requirements significantly—a 70B model quantized to 4-bit needs roughly 40GB of VRAM (two 24GB consumer GPUs or one 48GB card), and models in the 7B to 13B range run on a single consumer GPU with acceptable quality for many tasks. Tools like Ollama, vLLM, and llama.cpp make local deployment increasingly accessible.
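
To make the local option concrete, here is a minimal Python sketch that queries a model served by Ollama on the same machine, so prompts never leave localhost. It assumes Ollama is running on its default port and that a model such as llama3 has already been pulled.

    # Local inference sketch: the prompt travels only to localhost.
    # Assumes Ollama is running on its default port and llama3 has been pulled.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3",
            "prompt": "Draft an internal privacy notice for our AI chatbot.",
            "stream": False,  # return one JSON object instead of a token stream
        },
        timeout=300,
    )
    resp.raise_for_status()
    print(resp.json()["response"])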

Hybrid approaches. Many organizations use a hybrid approach: self-hosted models for sensitive data and commercial APIs for non-sensitive tasks. This lets you benefit from the superior capabilities of frontier models (GPT-4, Claude, Gemini) for general work while keeping sensitive data on-premises. Implement a data classification policy that defines which data categories can use external APIs and which must use self-hosted models.
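
A classification policy can be enforced in code as well as on paper. The sketch below is one possible routing shim; the function bodies are hypothetical placeholders for your actual on-premises and commercial endpoints, and the threshold is a policy choice, not a standard.

    # Hypothetical routing shim for a hybrid deployment: prompts classified
    # CONFIDENTIAL or above never leave your own infrastructure.
    from enum import IntEnum

    class Sensitivity(IntEnum):
        PUBLIC = 0
        INTERNAL = 1
        CONFIDENTIAL = 2
        REGULATED = 3

    SELF_HOST_THRESHOLD = Sensitivity.CONFIDENTIAL  # this organization's choice

    def call_local_model(prompt: str) -> str:
        # Placeholder: call your self-hosted endpoint (Ollama, vLLM, ...) here.
        return f"[self-hosted] {prompt}"

    def call_external_api(prompt: str) -> str:
        # Placeholder: call a commercial API under an enterprise-tier DPA here.
        return f"[external] {prompt}"

    def route(prompt: str, level: Sensitivity) -> str:
        if level >= SELF_HOST_THRESHOLD:
            return call_local_model(prompt)
        return call_external_api(prompt)

    print(route("Summarize this press release.", Sensitivity.PUBLIC))
    print(route("Summarize this patient record.", Sensitivity.REGULATED))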

Enterprise vs. Personal Tier Privacy Protections

The privacy gap between consumer and enterprise AI tiers is significant. If you are using AI for anything beyond purely personal tasks, understanding this gap is essential.

Data processing agreements. Enterprise tiers come with data processing agreements (DPAs) that contractually bind the provider to specific data handling practices. Consumer tiers are governed by terms of service, which give the provider much more latitude. A DPA specifies exactly how data is stored, processed, retained, and deleted—and gives you legal recourse if the provider violates those commitments.

Training data exclusion. Enterprise tiers uniformly exclude customer data from model training. Consumer tiers include data in training by default, with opt-out options. This is the single most important privacy difference. If you input proprietary information, customer data, or trade secrets into a consumer-tier AI tool without opting out of training, that information may influence future model outputs for other users.

Access controls and audit logs. Enterprise tiers provide admin controls over who can use the AI tool, what data they can input, and how outputs are monitored. They also provide audit logs that track all interactions—essential for compliance. Consumer tiers typically have no admin controls and limited audit capabilities.

Compliance certifications. Enterprise AI products often carry compliance certifications and attestations (SOC 2, ISO 27001) and can sign HIPAA business associate agreements (BAAs); consumer products offer none of these. Such commitments verify that the provider’s security practices meet specific standards. If your organization has compliance requirements, enterprise tiers are not optional—they are necessary.

The business case for enterprise tiers. Enterprise AI subscriptions are more expensive, but the cost of a data breach or compliance violation is orders of magnitude higher. For any organization handling customer data, regulated data, or proprietary information, the enterprise tier is the minimum viable option for responsible AI usage.

GDPR, CCPA, and Regulatory Compliance

AI tool usage intersects with data protection regulations in ways that many organizations have not fully addressed. Here is what you need to know about the major frameworks.

GDPR (EU/EEA). If you process EU residents’ personal data through an AI tool, GDPR applies. Key requirements: you need a lawful basis for processing (typically consent or legitimate interest), you must inform users that their data may be processed by AI, you must ensure the AI provider has adequate safeguards for cross-border data transfers, and you must respond to data subject access and deletion requests. GDPR’s right to explanation may also apply when AI is used for automated decision-making that significantly affects individuals.

CCPA/CPRA (California). Similar to GDPR but focused on California residents. Key requirements: disclose what personal information is collected and how it is used (including AI processing), provide an opt-out mechanism for data sharing with AI providers, and respond to consumer requests to know, delete, or correct their personal information.

EU AI Act. The EU AI Act entered into force in 2024, with obligations phasing in from 2025 through 2027. It introduces risk-based regulation of AI systems: high-risk systems (those used for hiring, credit scoring, law enforcement, and similar domains) face strict requirements for transparency, human oversight, and data quality, and even general-purpose AI models must meet transparency requirements, including publishing summaries of their training data.

Practical compliance steps. For most organizations, compliance means: (1) inventorying all AI tools in use, (2) classifying what data flows through each tool, (3) ensuring enterprise-tier agreements or adequate DPAs are in place, (4) updating privacy notices to mention AI processing, (5) implementing data minimization practices (send only the data each task actually requires), and (6) training employees on what data can and cannot be shared with AI tools.
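
Step (5) is the most amenable to automation. As a deliberately simplistic illustration, a pre-send filter might strip obvious identifiers before a prompt leaves your environment; production systems typically use a dedicated PII-detection library rather than two regular expressions.

    # Simplistic data-minimization sketch: redact obvious emails and phone
    # numbers before a prompt is sent to any external AI tool. Not a
    # substitute for a proper PII-detection pipeline.
    import re

    EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
    PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

    def minimize(prompt: str) -> str:
        prompt = EMAIL.sub("[EMAIL]", prompt)
        prompt = PHONE.sub("[PHONE]", prompt)
        return prompt

    print(minimize("Contact Jane at jane@example.com or +1 555 010 1234."))
    # -> Contact Jane at [EMAIL] or [PHONE].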

A Practical AI Privacy Checklist

Whether you are an individual user or managing AI usage for an organization, this checklist covers the essential privacy actions you should take.

For individual users:

  • Opt out of training on every AI tool you use regularly
  • Never input passwords, API keys, or financial account numbers into AI tools
  • Use separate accounts for personal and professional AI use
  • Avoid inputting other people’s personal data (names, emails, phone numbers) into AI tools
  • Review and delete your AI conversation history periodically
  • Read the privacy policy of any new AI tool before using it
  • Use browser-based tools in incognito or private mode for sensitive topics (this limits traces on your device, not what the provider receives)

For organizations:

  • Create an approved list of AI tools with documented data policies
  • Require enterprise tiers for any tool handling business or customer data
  • Implement a data classification policy that defines what data can be used with which AI tools
  • Train all employees on AI privacy policies and acceptable use
  • Conduct quarterly audits of AI tool usage and data flows
  • Ensure DPAs are in place with all AI providers
  • Appoint an AI privacy lead (this could be an extension of your existing data protection officer’s role)
  • Test incident-response procedures for AI-related data breaches

Conclusion

AI privacy is not about avoiding AI tools—it is about using them with your eyes open. Every AI tool involves a privacy trade-off: you share data in exchange for capability. The goal is to make that trade-off consciously, with full awareness of what data you are sharing, who has access to it, and what protections are in place.

The practical steps are straightforward: opt out of training, use enterprise tiers for business data, classify your data before sending it to any AI tool, and stay informed about the evolving regulatory landscape. AI is too valuable to avoid, but too powerful to use carelessly.

Frequently Asked Questions

Can AI companies read my conversations?

Technically, yes. AI providers can access your conversations for abuse monitoring, safety review, and in some cases model training. Enterprise tiers have stricter access controls and contractual limitations on who can access data and under what circumstances. The key is to treat AI conversations as you would any cloud service: assume the provider has theoretical access and do not input data you would not share with a trusted third party.

Does opting out of training actually protect my data?

It prevents your data from being used to improve the model, which is the most significant privacy concern. However, the provider may still retain your data temporarily for abuse monitoring and service operation. Opting out reduces your privacy exposure significantly but does not eliminate it. For maximum protection, use enterprise tiers with DPAs or self-hosted models.

Is it safe to use AI tools for work?

It can be, with appropriate safeguards. Use enterprise-tier products with DPAs for any business data. Never input highly sensitive data (trade secrets, customer PII, passwords) into consumer-tier AI tools. Implement a data classification policy so employees know what they can and cannot share. With these guardrails, AI tools can be used safely for the vast majority of business tasks.

What happens to my data if an AI company is acquired or goes bankrupt?

This depends on the company’s privacy policy and any DPAs in place. Most privacy policies include provisions for data handling during corporate changes, but enforcement varies. Enterprise DPAs typically include stronger protections, including data deletion requirements. For maximum protection, minimize the data you share with any single AI provider and maintain the ability to delete your data at any time.

Larry Meiswell
Senior Technology Analyst, Dat4
Larry Meiswell is a senior technology analyst at Dat4, covering enterprise software, AI infrastructure, and digital marketing technology. With over a decade in B2B tech journalism, Larry specializes in translating complex vendor landscapes into actionable intelligence for decision-makers.