Generative AI: A New World of Risk
Open AI is now over 8 years old and worth around $80 billion following its recent investment from Thrive Capital. It was originally created by Sam Altman and Elon Musk to apparently challenge the dominance of Google. Despite the length of its gestation period ($10 billion a year capital growth is still pretty impressive) it only arrived on the world scene just over a year ago.
When Generative AI first hit the market many people started looking at the risks. They were right to do so and many of these were potentially worrying. These included hallucinations, potential breaches of copyright, no awareness of where information had come from, the bias of the internet (much of the internet’s content being developed by the Western World), attributing certain jobs to certain sexes, the fact various countries banned it and that private information could be consumed by the model. The list was fairly endless albeit some of these points have been addressed to an extent by private LLMs and models such as Microsoft Copilot and Azure AI.
As things are progressing, risk management of AI (and Gen AI) is becoming more sophisticated. It is a key area people are trying to address. We have set out below some key steps more advanced firms and legal departments are investigating
A) Data Hygiene
Preconditioning, people are giving a lot more thought to matter management to ensure that data is pre-conditioned to ensure quality. We are seeing the increased use of embeddings (industry focused data and wording) to ensure appropriate terminology and context.
B) File Opening
People are recognising the need to really raise the bar in file opening to ensure correct categorisation of matters from a generative AI perspective. People are giving a lot more thought to naming conventions in documents (which is a win win decision) for the same reason.
C) Colour Coding
This is a basic point but people are recognising that we need to draw distinctions between data that has come out of a corroborated database which has been verified and algorithmic data. Some systems have bizarrely not been showing the distinction. Colour coding of algorithmic data is growing as a concept.
D) Governance
1. People are recognising they need to up the ante on supplier due diligence in this area. We have heard some great things from some suppliers but equally some horror stories. As part of this at www.litig.org we and the members have produced a Due Diligence AI Questionnaire which also incorporates points on the European AI Act.
2. People are giving a lot more thought to the requirements of ISO 42001-23AI.
3. Company policies are being enhanced to reflect sensible uses of Gen AI given the risk strategy of each business.
4. There seems to be a rise in interest in AI monitoring tools which allow enterprise supervision and the application of rules/stripping of personal data from outputs.
E) Defensive measures
1. We are seeing a range of defensive measures being looked at, including data poisoning tools. There is a debate - on the one hand it is not right to steal someone else’s copyrighted information but on the other hand is it right to use a data poisoning tool to corrupt the algorithm for other innocent users?
2. We are seeing the rise in interest in data lineage tools to track data origin and trackable data sets.
4. People are trying to develop defences again prompt injection tools which populate your Gen AI model with hidden commands. This could be particularly dangerous when Gen AI tools either become enterprise based or agent based allowing actions to be taken.
5. We are seeing a rise in interest in deep fake screening software, this is particularly of interest to financial institution clients who are quite rightly worried about both fraud and KYC given the recent Hong Kong fraud and developments in voice and video replication Company worker in Hong Kong pays out £20m in deepfake video call scam | Hong Kong | The Guardian.
F) Microsoft
1. People are putting a lot of effort into how they are leveraging Microsoft Gen AI. As a general rule, Microsoft Gen AI keeps your data confidential within your own environment. There are challenges though. Even though you have clarity that information is being generated from your data you still do not know how the underlying LLM has been trained and whether or not it has bias etc (albeit there is now copyright guarantee for Copilot usage).
2. One of the main things that firms are driving in this area is helping people with prompts to ensure that they get the most out of Copilot. They are also doing increased screening to look at the type of information that can be surfaced (e.g. at the moment some information may not be visible albeit it is not locked down meaning it can be inadvertently surfaced when using Gen AI).
3. People are giving a lot of thought about what outputs they want from Azure AI (i.e. if you want to interpret previous deals you put in previous deal documentation, if you want to generate documents you add precedent information. Whatever you do you need to use strong naming conventions, the increased use of prompt gateways to ensure content is pre-configured for particular audiences. Ideally outputs from precedents need to be marked-up to save time for users as opposed to requiring them to review documents from scratch making them less efficient.
G) Business Approaches
1. Again, we are seeing a whole range of sensible approaches emerging. To face facts in the legal market we have a situation where nobody takes any material liability for the output of Gen AI apart from lawyers. We are seeing the rise of discussions relating to liability pricing with different price points for different types of service, each with a different risk profile.
2. We are seeing risk modelling approaches to ensure that projects with an appropriate level of risk are focused on rather than applying Generative AI to everything.
3. We are seeing enhanced terms and conditions being developed and are seeing different information being stored in systems to evidence to insurers that proper supervision is taking place.
4. In relation to email there are currently around 364 billion emails being sent a day in the world. It is worrying that one of the most cited uses of Copilot is to generate more emails. We are now working with some savvy firms who look much more closely at how they use email and putting in place better email practices to ensure that volumes do not rise exponentially driving down efficiency. For more on this please see Exposing 20 Hidden Efficiency Killers in Law - 4. Email and Death by 1000 Cuts — Hyperscale Group Limited
H) Wider Legal Ops approaches
We are seeing people adopt more of a bricolage approach where they link together various systems to effectively develop a “Gen AI machine” with a RAG approach (Retrieval Augmented Generation) verifying outputs. There is also increased demand for playbook and document hygiene tools as firms recognise they are likely to have to review more content and will need help with this. People are recognising that they need to educate people much better with tools such as www.theprofessionalalternative.com
People need to be educated on areas such as prompt, engineering, what is appropriate and not appropriate from a risk perspective, pricing as well as other Legal Ops approaches.
Conclusion
What is becoming apparent is that although Gen AI has many great things to bring to the table it also materially alters the risks to which businesses are exposed. For lawyers this is compounded in that they have a responsibility to get things right (i.e. you go to a lawyer when you want something done correctly and they are often liable if this is not the case).
Many of the AI providers in the market are doing great work and although there is quite rightly a price for their output and investment, very few suppliers take responsibility for the accuracy of their output and we can’t be left in a situation where lawyers are expected to both invest and then spend less chargeable time on work but to still be held accountable for the accuracy when no one else is. We need new approaches.
Importantly however, one thing we need to bear in mind is that there is a whole new array of risks arising, whether this is down to confidentiality of data or deep fake avatars and voice recordings. Also, we need new tools such as playbook and document hygiene tools to help us deal with the increased volumes of content coming our way. Anybody also responsible for risk within businesses perhaps need a holistic AI monitoring platform which allows them to manage core risks, strip out personal data etc. Many things are going to accelerate and we need to equip ourselves with the foundational tools to be ready for this.
To find out more how these areas can be tackled please feel free to contact dereksouthall@hyperscalegroup.com
Hyperscale Group are an Independent Digital, Innovation, Operational Advisory and Implementation Business with over 30 years plus deep market experience. We work for In-House Legal teams and Professional Services Firms all around the world and support them developing and implementing their strategies. We help our clients to make the right things happen.
Our Experience: Click here - Our Services: Click here - Client Testimonials: Click here
How We Work: Click here
For more information, please contact DerekSouthall@hyperscalegroup.com