Key takeaways
- Inventors and applicants need to ensure that the data used to support their patent applications complies with laws governing the use of patient data, such as the Privacy Act 1988 (Cth)
- De-identification of patient data may be sufficient in most cases, but this needs to be assessed on a case-by-case basis
- A "permitted health situation" exception may be relied upon, but imposes additional regulatory burdens
"Can I use my patient data to train the AI used in my invention?"
Artificial intelligence (AI), including large language models (LLMs), offers exciting opportunities for innovation in the medical and healthcare space. Recently, clients have increasingly asked how to secure patent protection for diagnostic methods that implement AI.
In Australia, patient data is generally considered to be owned by the healthcare service provider that creates the medical record. However, ownership of the medical record does not mean that the healthcare provider can use the patient data it contains however it wishes.
Privacy obligations and qualified exceptions
The Privacy Act 1988 (Cth) (the Act) regulates how personal information is collected, stored and disclosed by government agencies and commercial organisations. The Act does not have any specific provisions relating to the use of patient data in training an AI or LLM.
However, organisations need to comply with general privacy principles as set out in the Act. These principles include ensuring that people are informed when personal information is collected and stored, protecting that personal information (which may involve destroying or de-identifying personal information when it is no longer needed) and limiting the use or disclosure of personal information.
Generally, personal information can only be used or disclosed for the purpose for which it was collected, called the "primary purpose". However, an exception allows personal information to be used for a "secondary purpose" in limited circumstances. To qualify, the individual must usually have consented, or there must have been a "reasonable expectation" that the personal information would be used for the secondary purpose.
There is also an exception for a "permitted health situation", which arises in three situations; the most relevant to this topic is where the use or disclosure of personal information is "necessary for research, or the compilation or analysis of statistics, relevant to public health or public safety". To rely on this permitted health situation, it must have been impracticable to obtain consent, and the use must comply with guidelines issued under Section 95 of the Act (the Guidelines). The Guidelines require that the research activity, or the compilation or analysis of statistics, be approved by a Human Research Ethics Committee (HREC), which in turn must be registered with the National Health and Medical Research Council (NHMRC).
Privacy and using patient data
The use of patient data for training AI is likely to fall under a "secondary purpose", since the primary purpose for collecting patient data is typically a medical or health-related one, such as the diagnostic test or screening that creates the medical record.
Therefore, an organisation needs to either obtain patient consent to use their personal data for training an AI model, or establish that such use would have been reasonably expected by the patient at the time the data was collected. It is doubtful that a patient would reasonably expect that personal data collected for a diagnostic purpose (such as a pathology test or X-ray) would also be used to train an AI model. Consequently, using patient data to train an AI model will generally require patient consent.
This is the situation in which Australian medical start-up Harrison.ai found itself when it launched its Annalise.ai diagnostic tool in September 2024. Annalise.ai uses an AI or LLM to analyse chest X-rays, trained on patient data obtained from I-MED Radiology Network, which operated around 250 radiology clinics in Australia at the time. After the launch, questions were raised as to whether patient consent had been obtained to use the X-rays to train Annalise.ai. This sparked an investigation by the Office of the Australian Information Commissioner, but did not result in any adverse finding.
In this instance, it appears that de-identification of the patient data may have been sufficient to avoid breaching legal obligations under the Act, since information that is sufficiently de-identified no longer falls within the definition of "personal information" under the Act.
However, the issue for any organisation seeking to use patient data to train an AI or LLM is the degree to which that data must be de-identified to avoid breaching privacy obligations. In some applications, personal information may remain essential to training the AI or LLM, in which case compliance with privacy obligations will be required.
Even where data is de-identified, there remains a significant risk of re-identification based on the combination of characteristics taken from an individual patient's data that may be used in training the AI. This risk is increased where multiple datasets containing data from the same patient are used together.
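To illustrate the re-identification risk, the following minimal Python sketch uses entirely hypothetical records and field names (postcode, birth year, sex, scan date); it is not a compliance tool. It shows how a record stripped of direct identifiers can nevertheless be linked back to a named individual when a second dataset shares the same quasi-identifiers.

```python
# Illustrative sketch only: hypothetical records and field names.
# Removing direct identifiers (name, Medicare number) may not prevent
# re-identification when quasi-identifiers are combined across datasets.

# A "de-identified" imaging record: direct identifiers removed, but
# quasi-identifiers (postcode, birth year, sex, scan date) retained.
imaging_record = {
    "postcode": "2000",
    "birth_year": 1958,
    "sex": "F",
    "scan_date": "2024-03-14",
    "finding": "nodule, right upper lobe",
}

# A second, seemingly unrelated dataset (e.g. an appointment log) that
# shares the same quasi-identifiers plus a direct identifier.
appointment_log = [
    {"name": "Patient A", "postcode": "2000", "birth_year": 1958,
     "sex": "F", "visit_date": "2024-03-14"},
    {"name": "Patient B", "postcode": "2000", "birth_year": 1990,
     "sex": "M", "visit_date": "2024-03-14"},
]

# Linking on the shared quasi-identifiers re-attaches an identity to the
# "de-identified" scan: the combination singles out one individual.
matches = [
    entry for entry in appointment_log
    if entry["postcode"] == imaging_record["postcode"]
    and entry["birth_year"] == imaging_record["birth_year"]
    and entry["sex"] == imaging_record["sex"]
    and entry["visit_date"] == imaging_record["scan_date"]
]

if len(matches) == 1:
    print(f"Re-identified: {matches[0]['name']}")  # Re-identified: Patient A
```

In practice, the more attributes a training dataset retains, and the more external datasets it can be joined against, the higher this linkage risk becomes.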
An alternative option for organisations is to rely on the "permitted health situation" exception discussed above to use patient data in AI training without patient consent. The public health requirement would be readily met, and training an AI model arguably involves the compilation or analysis of statistics. However, added regulatory burdens are imposed by the need to demonstrate that it was impracticable to obtain consent and to obtain approval from an HREC registered with the NHMRC.
Intersection with patents
These complex issues surrounding the use of patient data in training AI and LLMs also extend to the drafting of patent applications for inventions that incorporate them.
Patent applications frequently require experimental data to be disclosed to support the claimed invention. For AI-based inventions, there is an increasing demand for detail as to how the AI works, which may include the algorithms used, the architecture of the AI or LLM model, and the training data.
Consequently, inventors and applicants need to ensure that any patient data used or disclosed in the training of an AI or LLM is sufficiently de-identified, or that its use complies with obligations under the Act, before incorporating that data into their patent applications.
How can we help?
If your organisation is developing AI or LLM systems and would like advice on patentability requirements, or assistance with your applications, our attorneys are highly skilled and experienced - reach out to Andrew or your Spruson & Ferguson contact for more information.
The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.