The Blueprint - Part II: Knowledge Management
Taking an agile approach to implementing generative AI
Hello and welcome to part two of The Blueprint series where I’ll be exploring where to start when building a strategy for implementing generative AI in business.
In my previous article I covered some important topics to help business leaders get started with their strategy for generative AI. If you haven't read that yet, I encourage you to take a look here.
In this second part, we'll shift our focus to how businesses can harness the potential of their knowledge, which will be foundational for any successful generative AI strategy. We’re going to delve into the challenges of making knowledge accessible to generative AI: managing fast-moving data, maintaining confidentiality and security, and organising and labelling knowledge for optimal usage.
Business Knowledge
A business’s knowledge is typically spread across many different formats and locations. For example:
Emails (Outlook, Gmail etc.)
Chat channels (Slack, Teams etc.)
Spreadsheets (Excel, Google Sheets etc.)
Written documents (Word, PDF etc.)
Presentations (PowerPoint, Google Slides etc.)
Databases (Customer/supplier data, Financial data, Employee data etc.)
There are many challenges to work through in making this knowledge accessible to a generative AI model, but the obvious first question is: why would we want to do this in the first place? For me there are five big reasons:
Knowledge Discovery: A business generates and accumulates vast amounts of data over time, much of which is hard to find or search. A generative AI model can address this by collecting and sifting through the data to identify trends, insights, and valuable information that enhance decision making.
Efficiency: Generative AI can automate routine tasks such as sorting through emails, handling customer enquiries, managing databases, or even creating drafts for presentations or documents. This would free up time for employees to engage in higher value tasks.
Customer Interaction: Generative AI can help improve customer and supplier interactions by providing timely responses and personalised experiences based on the knowledge a business has on them. It could do this 24/7, providing real-time help and support.
Data Management: Generative AI can categorise and store information in a structured manner, making it easy to retrieve when needed.
Collaboration: Generative AI can improve collaboration by helping to manage and coordinate communication across various platforms. For instance, it could summarise key points from a Slack conversation, an email thread, or a shared document.
Knowledge Challenges
So, what are the challenges in making the business’ knowledge available to a generative AI model?
Data Confidentiality & Security
Data confidentiality is a big concern. Businesses have legal and ethical obligations to protect sensitive information, and giving a generative AI model access to business knowledge can introduce new risks. For example:
Financial & supplier data - this is usually restricted within an organisation, with only specific teams or individuals having access. We need to ensure that these access restrictions remain in place and that financial data is safeguarded in a generative AI model.
Customer data - this is often subject to strict privacy laws, such as the GDPR in Europe, so it must be handled with care and in compliance with local regulations. There are also some unknowns here - do we need additional consents to share a customer’s data with a generative AI model? How do we remove a customer’s data from a generative AI model if we receive a ‘right to be forgotten’ request?
Employee data - similar to customer data, giving a generative AI model access to employee data should be done in a manner that respects privacy rights and adheres to local employment laws. Also, employees should be clearly informed about what data is being used, why, and how it's being protected. It's also essential to consider the ethics of using this data, particularly if it's used in ways that could impact employee evaluations or job security.
Email/chat data - this data can contain sensitive business information, personal data, or confidential correspondence. A generative AI model accessing this data should have robust privacy safeguards. For example, natural language processing can be used to redact sensitive information automatically. Lastly, a clear usage policy should be communicated to all employees, customers and suppliers so they understand the nature and extent of the AI's access to their communications.
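To make the redaction idea above a little more concrete, here is a minimal sketch in Python. It uses simple pattern matching as a stand-in for natural language processing, and the two patterns and the `redact` function are assumptions made purely for illustration. A real safeguard would need far broader coverage (names, addresses, account numbers) and careful testing.

```python
import re

# Hypothetical patterns for illustration only. Real redaction needs much
# broader coverage and would likely rely on an NLP model, not just regexes.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{8,}\d"),
}


def redact(text: str) -> str:
    """Replace every match of each pattern with a placeholder such as [EMAIL]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


message = "Contact jane.doe@example.com or call +44 20 7946 0958 about the renewal."
print(redact(message))
```

The point of the sketch is the ordering: sensitive details are replaced before the text is passed anywhere near a generative AI model, so the model never sees them.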
Data Fluidity
Put simply, some data is more fluid than others. What I mean by data fluidity is the frequency and speed at which data is changed or updated. This has important implications for how we train generative AI on it or give generative AI access to it.
Data that is static is much simpler to manage and give generative AI access to. We have options - we could use it to pre-train a model, fine-tune a model, create embeddings, give a generative AI model direct access to it or use it as part of prompt engineering.
I’ll be covering all of these options in more depth in the next part of The Blueprint series as well as recommending the best approach for different types of data.
However, data that is highly fluid is much more complicated, and we have fewer options when it comes to generative AI. With the current technology it is too expensive, time-consuming and bad for the environment to be constantly re-training or fine-tuning a generative AI model with highly fluid data. This might change in the future as hardware costs come down and smaller models become more adept, but I suspect that we will be living with this constraint on highly fluid data for some time.
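As a rough illustration of why fluid data pushes us away from re-training, the sketch below looks up the latest value at the moment a question is asked and places it in the prompt. It is only one of the options listed earlier (using data as part of the prompt), not a recommendation. The `store` dictionary, the topic keys and the `build_prompt` function are hypothetical, and the keyword match is a deliberately simple stand-in for a proper search.

```python
import re


def build_prompt(question: str, store: dict[str, str]) -> str:
    """Assemble a prompt from whatever the store holds right now."""
    words = set(re.findall(r"\w+", question.lower()))
    # Keep only the entries whose topic word appears in the question.
    relevant = [text for topic, text in store.items() if topic in words]
    context = "\n".join(relevant) or "(no matching knowledge found)"
    return f"Context:\n{context}\n\nQuestion: {question}"


# Hypothetical fast-moving data, keyed by topic.
store = {"stock": "Stock level for item A: 120 units."}
before = build_prompt("What is the current stock level?", store)

# The data changes; no model is re-trained, only the store is updated.
store["stock"] = "Stock level for item A: 95 units."
after = build_prompt("What is the current stock level?", store)
print(after)
```

Because the prompt is rebuilt on every request, the second prompt carries the new figure even though nothing about the model itself has changed.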
Knowledge Management
I believe knowledge management in businesses will become an important, distinct, and highly-skilled discipline in the near future. I like the idea of every business having a Knowledge Library that is maintained and curated by Knowledge Librarians, whose job it will be to ensure that all the knowledge in a business is readily available for generative AI models. This will undoubtedly be something that becomes more automated and streamlined over time, but until then Knowledge Librarians will need to catalogue and integrate knowledge across the business to use with generative AI models.
This skillset is something I’ll cover in more depth in a future instalment of The Blueprint.
Below are the important areas that I believe Knowledge Librarians will be responsible for and will need to address when a business starts to build out its Knowledge Library:
Knowledge Integration: Knowledge in a business is typically spread across many different systems and formats. Integrating this data in a central repository that can be processed by a generative AI model can be a significant challenge, but having a modern business technology stack can greatly help with this. I’ll have a whole article dedicated to this topic in a future instalment of The Blueprint.
Knowledge Transformation & Harmonisation: As part of integrating knowledge into a central repository, it will need to be transformed into unified formats that are easily accessed by generative AI models.
Knowledge Quality: The effectiveness of a generative AI model depends greatly on the quality of the data it's given. Inconsistent, incomplete, or erroneous data can compromise the model's performance and lead to inaccurate outputs. Therefore, it's essential to have robust data cleaning and validation processes.
Knowledge Labelling: It is useful for data to have metadata appended to describe the content, location, quality, and other characteristics of each data set. This will also help if the data is being used by a supervised learning model.
Knowledge Governance: Implementing a robust governance strategy will ensure that knowledge is consistently classified, stored, and managed across the business. Clear policies and procedures for knowledge handling will need to be established.
Maintenance and Upkeep: Generative AI models are not set-and-forget tools. They require ongoing maintenance, updating, and validation to ensure they continue to perform effectively and don't 'drift' from their intended purpose.
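The labelling, quality and governance items above can be sketched as a single record in a Knowledge Library. This is a minimal sketch under stated assumptions: the `KnowledgeItem` fields, the one-year staleness threshold and the team names are all hypothetical, chosen only to show how metadata, basic validation and access restrictions might sit together.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class KnowledgeItem:
    content: str
    source: str            # where the item came from, e.g. "email" (hypothetical label)
    owner: str             # who is accountable for keeping it accurate
    last_updated: date
    allowed_teams: set[str] = field(default_factory=set)


def quality_issues(item: KnowledgeItem, today: date, max_age_days: int = 365) -> list[str]:
    """Return a list of problems found; an empty list means the item passed."""
    issues = []
    if not item.content.strip():
        issues.append("empty content")
    if not item.owner:
        issues.append("missing owner")
    if (today - item.last_updated).days > max_age_days:
        issues.append("stale")
    return issues


def visible_to(items: list[KnowledgeItem], team: str) -> list[KnowledgeItem]:
    """Keep only the items the given team is allowed to see."""
    return [item for item in items if team in item.allowed_teams]


library = [
    KnowledgeItem("Q3 supplier pricing", "spreadsheet", "finance-lead",
                  date(2024, 1, 10), {"finance"}),
    KnowledgeItem("Holiday policy", "document", "",
                  date(2021, 5, 1), {"finance", "sales"}),
]
today = date(2024, 6, 1)
for item in library:
    print(item.content, quality_issues(item, today))
print([item.content for item in visible_to(library, "sales")])
```

Filtering by team before anything reaches a generative AI model is one way the access restrictions discussed under Data Confidentiality & Security could stay in place.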
Summary
In this article we’ve covered the importance of knowledge management, which forms an essential part of any generative AI strategy. Knowledge is the foundation that any generative AI system will be built on, so getting it right is vital.
In the next article in The Blueprint series, I’ll be looking at the different training methods for generative AI, other options for giving generative AI access to your business’s knowledge, and how we can tackle the knowledge challenges around data confidentiality, security, and fluidity.
"The future is already here, it's just not evenly distributed."
William Gibson
This article was researched and written with help from ChatGPT, but was lovingly reviewed, edited and fine-tuned by a human.