Sensitive Data and AI

ChatGPT vs the API

Conversations entered into the online ChatGPT app (both the free and paid Pro versions) are used by OpenAI to train its AI models. Sensitive data, including customer information, should never be entered into ChatGPT.

AI BUSINESS SOLUTIONS does not use ChatGPT. All our solutions access OpenAI through its API (a programming interface that queries the AI directly). Data sent through the API is not used to train OpenAI's models unless the customer explicitly opts in.

Offline AI Advances

Investing in a proprietary Large Language Model (LLM) is an option for organisations with considerable budgets, offering secure, fully offline operation. However, the significant costs, the need for vast computational power and the potential for lengthy response times make it less viable for most businesses as of July 2023. Accessible AI platforms like OpenAI therefore remain the preferred approach for the majority of organisations.

OpenAI and Data Privacy

See: https://openai.com/api-data-privacy

Data submitted through the OpenAI API after 1 March 2023, including questions, answers and file uploads, is not used to train the OpenAI models unless the customer has explicitly opted in. The data is kept for 30 days for abuse monitoring, but can only be accessed if an abuse case is initiated.

Zero Data Retention

OpenAI recognises that passing sensitive data is a worry for clients and has released the following information:

"We recognize that API customers may handle sensitive information, including customer data, which is subject to data protection regulations. You can request zero data retention (ZDR) for eligible endpoints, and may be asked to meet additional requirements. When ZDR is approved, inputs and outputs will not be stored but may still be run through our safety classifiers.

"Some data sent to specific endpoints like [their audio to text transcriber AI] is not retained."

How AI BUSINESS SOLUTIONS uses sensitive data

Wherever possible, AI BUSINESS SOLUTIONS does not pass sensitive information, such as customer details, through OpenAI. Instead, it sends a customer ID number, which is linked back to the customer database only after the answer has been returned from OpenAI.

Example:

PROMPT: Who are my most important customers by regular spending habits and what is each of them worth to me?

[ANSWER FROM API]: Customer 54832 is worth £154,000; Customer 29374 is worth £99,500; Customer 92851 is worth £74,800

[QUERIES YOUR DATABASE]
RESPONSE SHOWN:
Joe Bloggs: £154,000
Jane Smith: £99,500
Fred Jones: £74,800
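
The re-linking step in the example above could be sketched roughly as follows. This is a minimal illustration only, not the actual AI BUSINESS SOLUTIONS implementation: the CUSTOMER_NAMES lookup stands in for a real customer database, and the reidentify function and the assumed wording of the API answer are hypothetical.

```python
import re

# Hypothetical stand-in for the customer database: customer ID -> name.
# These names never leave the business; only the IDs are sent to the API.
CUSTOMER_NAMES = {
    "54832": "Joe Bloggs",
    "29374": "Jane Smith",
    "92851": "Fred Jones",
}

# Assumed answer format: entries such as "Customer 54832 is worth £154,000".
ENTRY = re.compile(r"Customer (\d+) is worth (£[\d,]+)")


def reidentify(api_answer: str, names: dict[str, str]) -> list[str]:
    """Swap the customer IDs in the API's answer for locally held names."""
    lines = []
    for customer_id, worth in ENTRY.findall(api_answer):
        # Fall back to the bare ID if the database has no matching record.
        name = names.get(customer_id, f"Customer {customer_id}")
        lines.append(f"{name}: {worth}")
    return lines


api_answer = (
    "Customer 54832 is worth £154,000; "
    "Customer 29374 is worth £99,500; "
    "Customer 92851 is worth £74,800"
)
response_shown = reidentify(api_answer, CUSTOMER_NAMES)
print("\n".join(response_shown))
```

The point of the design is that the ID-to-name lookup happens entirely on the business's own systems, so OpenAI only ever sees the ID numbers.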

Only the minimum amount of data is sent to OpenAI. For example, if you wanted to know which counties had the most clients, only postcodes and county data would be used. Very few postcodes in the UK cover a single address, so a postcode on its own cannot normally be tied to any one person.
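
Stripping records down to the minimum before anything is sent could look something like the sketch below. It is illustrative only: the field names, the sample records and the minimise function are all invented for this example and are not taken from any real system.

```python
# Fields assumed to be sufficient for a "clients per county" question.
ALLOWED_FIELDS = ("postcode", "county")


def minimise(records, allowed=ALLOWED_FIELDS):
    """Return copies of the records that contain only the allowed fields."""
    return [{field: record[field] for field in allowed} for record in records]


# Invented sample records; a real system would read these from its database.
clients = [
    {"name": "Joe Bloggs", "email": "joe@example.com",
     "postcode": "ME1 1AA", "county": "Kent"},
    {"name": "Jane Smith", "email": "jane@example.com",
     "postcode": "YO1 1AA", "county": "North Yorkshire"},
    {"name": "Fred Jones", "email": "fred@example.com",
     "postcode": "ME2 2BB", "county": "Kent"},
]

# Only this reduced payload would be included in the request to the API.
payload = minimise(clients)
print(payload)
```

Using an explicit list of allowed fields, rather than a list of fields to remove, means that any new column added to the customer database later is excluded by default.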

Minimising data in this way is not always possible, especially when whole documents are uploaded as PDFs - CVs, for example. In these cases, it is important to fully understand the OpenAI approach to Data Privacy and be comfortable that your own Data Protection policies align with it. See https://openai.com/api-data-privacy for more information.