It’s been hard to miss coverage of OpenAI’s recent release of ChatGPT. Estimates suggest over 1 million registrations in the first week. ChatGPT presents its AI through a question-and-answer format: you ask a question, ChatGPT responds. The tool allows follow-up questions and understands a wide domain of human experience. We wanted to see if it could suggest steps to develop a data strategy for a retailer.
Our first question was aimed at priming the AI tool with a general request:
How should a retailer develop a data strategy?
ChatGPT’s first response:
A retailer should develop a data strategy by first identifying the key business goals and objectives that the data will support. This will help to ensure that the data collection and analysis efforts are focused on providing the most value to the business.
The retailer should then determine what data needs to be collected, how it will be collected, and how it will be stored and managed. It is also important to establish clear processes for analyzing and interpreting the data, and for using the insights gained to inform business decisions and drive business growth.
In addition, the retailer should establish metrics for measuring the success of the data strategy and regularly review and assess the effectiveness of the strategy to ensure that it continues to support the business’s goals and objectives.
It’s important to note that the text above (and the two further responses that follow) is unedited. This is the exact text produced by ChatGPT. The top-level nature of the question yields a predictably top-level response. The response is well written, easy to digest and follows a logical path. On first impression, it reads like natural copy, not machine-generated text. It is easy to argue that the results are somewhat superficial, but again, this is a machine creating copy.
Our second question aimed for a little more detail, focusing on data quality:
How would a retailer assess the quality of their customer data?
ChatGPT’s second response:
There are a few key factors that a retailer can consider when assessing the quality of their customer data. First, the retailer should ensure that their data is accurate and up to date. This means checking for errors or inconsistencies in the data, and regularly updating the data to reflect any changes in customer information.
Next, the retailer should consider the completeness of their data. This means ensuring that they have sufficient information about each customer, such as their contact details, purchase history, and preferences. Having a complete and accurate picture of each customer will enable the retailer to provide more personalized and relevant experiences.
The retailer should also consider the reliability of their data. This means ensuring that the data is consistently collected and stored, and that it is not subject to external factors that could affect its accuracy. For example, if the data is collected through surveys, the retailer should ensure that the survey questions are clear and unbiased, and that the responses are representative of the entire customer base.
Finally, the retailer should consider the accessibility of their data. This means ensuring that the data can be easily accessed and used by the appropriate teams within the organization, and that it is organized and structured in a way that enables effective analysis and decision making.
As the question gets a little more specific, so does the response. For example, in the second paragraph, ChatGPT gives examples of customer data attributes, including contact details, purchase history and preferences. The AI tool figures out that customers have purchase histories, which is intuitive to us, but pretty smart for a machine. Further, the example that ChatGPT gives in the third paragraph, identifying a real risk in survey data, is valid and helpful.
Our third and final question is a multi-step one, with two establishing sentences before the actual question. How will the AI tool handle that? Note also an error presented in the question: ‘gas’ appears out of context (it should be ‘has’):
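To make the checks ChatGPT describes a little more concrete, here is a minimal Python sketch of how completeness, freshness and consistency might be measured. This is our own illustration, not something ChatGPT produced. The field names (`email`, `postcode`, `last_updated`), the sample records and the two-year staleness threshold are all assumptions.

```python
from collections import Counter
from datetime import date

# Hypothetical customer records; the field names are assumptions for illustration.
customers = [
    {"id": 1, "email": "ann@example.com", "postcode": "AB1 2CD", "last_updated": date(2022, 11, 1)},
    {"id": 2, "email": "", "postcode": "EF3 4GH", "last_updated": date(2019, 5, 20)},
    {"id": 3, "email": "ann@example.com", "postcode": None, "last_updated": date(2021, 2, 14)},
]

def completeness(records, fields):
    """Share of records holding a non-empty value for each field."""
    total = len(records)
    return {f: sum(1 for r in records if r.get(f)) / total for f in fields}

def stale_ids(records, as_of, max_age_days=730):
    """IDs of records not updated within max_age_days (an assumed threshold)."""
    return [r["id"] for r in records if (as_of - r["last_updated"]).days > max_age_days]

def duplicate_emails(records):
    """Normalised email addresses that appear on more than one record."""
    counts = Counter(r["email"].strip().lower() for r in records if r.get("email"))
    return sorted(email for email, n in counts.items() if n > 1)

print(completeness(customers, ["email", "postcode"]))
print(stale_ids(customers, as_of=date(2022, 12, 15)))
print(duplicate_emails(customers))
```

A real assessment would run against the retailer’s own systems and field definitions, but even simple counts like these give a starting measure of accuracy and completeness.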
A retailer gas collected customer data over several years. Many of the customers will have moved to new locations. What key tasks should the retailer now take to clean their customer data?
ChatGPT’s third response:
There are a few key tasks that a retailer should take to clean their customer data if many of the customers have moved to new locations. First, the retailer should identify which customers have moved and update their records with the new addresses. This can be done by cross-referencing the customer data with external sources, such as the national postal service or public records, to determine if the customer’s address has changed.
Next, the retailer should verify the accuracy of the new addresses by contacting the customers directly, either through email, phone, or mail. This will help to ensure that the updated addresses are correct, and that the retailer has the most up-to-date information about their customers’ locations.
Once the customer data has been updated with the new addresses, the retailer should also take steps to remove any duplicate or outdated records. This will help to ensure that the data is clean and accurate, and that the retailer is not wasting resources on outdated or irrelevant information.
Finally, the retailer should consider implementing a regular process for cleaning and updating their customer data. This can involve regularly checking the data against external sources, verifying the accuracy of the information, and removing any outdated or duplicate records. This will help to keep the customer data clean and accurate over time.
ChatGPT has clearly keyed in on ‘new locations’ as a key fact in the question and has formed its response largely around that. Using third-party sources to validate data and checking with customers are both valid suggestions (though we are perhaps a little light on specifics here). Towards the end of the response, some duplication creeps in, with very similar phrases used in paragraphs three and four. The text is still readable and appropriate, but the repetition is clear. Perhaps the compound nature of the question has led to some confusion.
Overall, the response quality is high, and the English prose is well written. A human writer might have cited a third-party source to support a few statements. For example, in the survey data example, might a human writer have referenced a URL listing common issues with survey data? Perhaps. But overall, the writing is of a high quality and not machine-like. As the AI tool gains feedback, the quality will likely continue to improve.
A final thought on ChatGPT: the user experience of registering to use the tool, asking questions and receiving responses is clear and accessible. The complexity of the AI processes is hidden from the user, letting users focus on forming their request rather than figuring out how to manage the request process.
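Since the response is light on specifics, here is a short Python sketch of what the update and de-duplication steps could look like. It is our own illustration under stated assumptions: `known_moves` is a hypothetical stand-in for the result of a lookup against an external source such as a postal service, and the record fields are invented for the example.

```python
from datetime import date

# Hypothetical records; two of them are the same customer with differently cased emails.
records = [
    {"id": 1, "email": "ann@example.com", "address": "1 Old Road", "last_updated": date(2018, 3, 1)},
    {"id": 2, "email": "Ann@Example.com ", "address": "1 Old Road", "last_updated": date(2020, 6, 15)},
    {"id": 3, "email": "bob@example.com", "address": "9 High Street", "last_updated": date(2021, 1, 10)},
]

# Assumed output of an external change-of-address check: customer id -> new address.
known_moves = {2: "5 New Street"}

def apply_address_updates(records, known_moves):
    """Return copies with new addresses applied and flagged for customer verification."""
    updated = []
    for record in records:
        record = dict(record)  # copy so the input list is left untouched
        new_address = known_moves.get(record["id"])
        moved = bool(new_address) and new_address != record["address"]
        if moved:
            record["address"] = new_address
        record["needs_verification"] = moved
        updated.append(record)
    return updated

def deduplicate(records):
    """Keep the most recently updated record for each normalised email address."""
    best = {}
    for record in records:
        key = record["email"].strip().lower()
        if key not in best or record["last_updated"] > best[key]["last_updated"]:
            best[key] = record
    return sorted(best.values(), key=lambda r: r["id"])

cleaned = deduplicate(apply_address_updates(records, known_moves))
for record in cleaned:
    print(record["id"], record["address"], record["needs_verification"])
```

The `needs_verification` flag reflects ChatGPT’s second step, confirming new addresses with the customer before trusting them. Running the same functions on a schedule would be one way to approach the regular cleaning process suggested in the final paragraph.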
The following links may be of interest to further explore this area of digital innovation:
- MIT Technology Review has a good AI section with lots of interesting articles https://www.technologyreview.com/topic/artificial-intelligence/
- In particular, this recent Technology Review article explores some issues around bias in AI using a startling personal account of using an image AI tool https://www.technologyreview.com/2022/12/12/1064751/the-viral-ai-avatar-app-lensa-undressed-me-without-my-consent/
- Episode 757, The Ghost in the Machine, of the This American Life podcast https://www.thisamericanlife.org/757/the-ghost-in-the-machine contains a section entitled Ghostwriter where an author experiments with an earlier version of ChatGPT to interesting effect.