Generative AI Lawyers: Copyright Implications
Generative AI is in the courtroom. Artists are suing over training data. Developers are questioning the code Copilot generates. The legal landscape is uncertain.
The Core Issue
AI models learn from data. That data includes:
- Copyrighted images
- Licensed code
- Published text
- Commercial artwork
The questions: Is training on this data legal? Is the output infringing?
The Arguments
Against AI Training
Artist creates work → Work posted online
↓
Scraped for training
↓
AI can replicate style
↓
Artist loses commissions to AI
Arguments:
- Training is unauthorized copying
- Outputs can replicate specific styles
- Economic harm to creators
- No consent, no compensation
For AI Training
Arguments:
- Fair use (transformative use)
- Models don’t store copies
- Similar to how humans learn
- Promotes innovation
- Information wants to be free
The Lawsuits
Getty Images vs. Stability AI
Claim: Stability AI scraped millions of Getty images without license.
Evidence: Outputs sometimes include garbled Getty watermarks.
Implications: Commercial image training may require licensing.
Artists vs. Stability/Midjourney
Class action by artists claiming:
- Unauthorized reproduction (training)
- Creating derivative works (outputs)
- Trademark violations
- Right of publicity infringement
GitHub Copilot Lawsuit
Programmers sued GitHub/Microsoft/OpenAI claiming:
- Copilot reproduces licensed code
- Violates open source licenses (GPL, MIT, etc.)
- Strips the attribution and license notices those licenses require
The Code Problem
License Violations?
Copilot was trained on public GitHub repos. Some were GPL licensed:
GPL Code → Training → Copilot suggests similar code
↓
Is this GPL-derived?
↓
Must the using project be GPL?
Attribution
MIT license requires attribution:
Copyright (c) [year] [author]
Permission is hereby granted...
When Copilot suggests code, where’s the attribution?
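One way to make the attribution gap concrete is to scan a piece of source text for the notices a license would normally carry. The sketch below is a minimal illustration, not a compliance tool: the function names `find_license_markers` and `has_attribution` are hypothetical, and the three regular expressions are assumptions that cover only a few common notice formats (a `Copyright (c) <year>` line, an `SPDX-License-Identifier:` tag, and the opening words of the MIT grant).

```python
import re
from typing import Dict, List

# Illustrative patterns only (an assumption, not an exhaustive list of notice formats).
NOTICE_PATTERNS = {
    "copyright": re.compile(r"copyright\s+(\(c\)|©)?\s*\d{4}", re.IGNORECASE),
    "spdx": re.compile(r"SPDX-License-Identifier:\s*\S+"),
    "mit_grant": re.compile(r"permission is hereby granted", re.IGNORECASE),
}


def find_license_markers(source: str) -> Dict[str, List[str]]:
    """Return the lines in `source` that look like license or copyright notices."""
    hits: Dict[str, List[str]] = {name: [] for name in NOTICE_PATTERNS}
    for line in source.splitlines():
        for name, pattern in NOTICE_PATTERNS.items():
            if pattern.search(line):
                hits[name].append(line.strip())
    return hits


def has_attribution(source: str) -> bool:
    """True if at least one notice-like line was found."""
    return any(find_license_markers(source).values())


# A file as its author published it, and the same function body as a bare suggestion.
original_file = (
    "# Copyright (c) 2021 Jane Doe\n"
    "# SPDX-License-Identifier: MIT\n"
    "def add(a, b):\n"
    "    return a + b\n"
)
bare_suggestion = "def add(a, b):\n    return a + b\n"

print(has_attribution(original_file))    # notice lines present
print(has_attribution(bare_suggestion))  # no notice lines
```

A check like this only shows whether notices are present in text you already have. It cannot tell you where a suggestion came from or which license applied to its source.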
Practical Reality
# Copilot might suggest this for "implement fizzbuzz"
def fizzbuzz(n):
    for i in range(1, n + 1):
        if i % 15 == 0:
            print("FizzBuzz")
        elif i % 3 == 0:
            print("Fizz")
        elif i % 5 == 0:
            print("Buzz")
        else:
            print(i)
Is this someone’s copyrighted code? Or so generic it’s uncopyrightable?
Current Legal Framework
Fair Use (US)
Four factors:
- Purpose: Commercial or educational? Transformative?
- Nature: Creative or factual work?
- Amount: How much was used?
- Effect: Market impact on original?
For AI training:
- Purpose: Commercial (mostly)
- Nature: Creative works used
- Amount: Entire works used
- Effect: Debatable
No clear answer yet.
EU Copyright
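Directive on Copyright in the Digital Single Market (2019):
- Text and data mining exceptions exist
- One covers research organizations, with no opt-out
- The other covers any use, including commercial
- But rights holders can opt out of the second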
More restrictive than US fair use.
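In practice, an opt-out has to be expressed in a form a crawler can read. One common signal today is a site's `robots.txt`. Whether that satisfies any particular legal opt-out is unsettled, so treat the sketch below as an illustration of the mechanism only. It uses Python's standard `urllib.robotparser`. The helper name `crawler_allowed`, the sample rules, and the URLs are assumptions, and `GPTBot` appears simply as an example of a crawler user-agent token.

```python
from urllib.robotparser import RobotFileParser


def crawler_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check whether `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt for an artist's site: block one AI crawler, allow the rest.
EXAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

image_url = "https://example.com/gallery/piece-01.png"
print(crawler_allowed(EXAMPLE_ROBOTS_TXT, "GPTBot", image_url))        # blocked by the first rule group
print(crawler_allowed(EXAMPLE_ROBOTS_TXT, "SomeOtherBot", image_url))  # falls through to the wildcard group
```

The signal only works if the crawler chooses to honor it, which is part of why dedicated opt-out registries are being discussed.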
Implications for Developers
Using AI Tools
| Risk Level | Use Case |
|---|---|
| Lower | Boilerplate, common patterns |
| Medium | Library-specific code |
| Higher | Novel implementations |
| Highest | Code matching specific projects |
Mitigation Strategies
# 1. Review AI suggestions
# Don't blindly accept
# 2. Check for distinctive patterns
# If it looks too specific, search for the source
# 3. Document your process
# Show you made modifications
# 4. Know your company policy
# Many have AI tool policies now
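Step 2 can be partly automated. The sketch below compares an AI suggestion against a reference snippet (for example, code from a project whose license you need to respect) using Jaccard similarity over token shingles. It is a rough heuristic under stated assumptions: the tokenizer is a simple regular expression, the shingle size `k=5` is arbitrary, and the names `shingles` and `jaccard` are hypothetical. A high score is a prompt to investigate, not proof of copying.

```python
import re
from typing import Set, Tuple


def shingles(code: str, k: int = 5) -> Set[Tuple[str, ...]]:
    """Split code into identifier, number, and symbol tokens, then take every run of k tokens."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", code)
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a: str, b: str, k: int = 5) -> float:
    """Jaccard similarity of the two shingle sets; 0.0 if either snippet is too short."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


reference = "def f(a, b):\n    return a+b"
suggestion = "def f(a,b):\n  return a + b"   # same tokens, different whitespace
unrelated = "print('hello world')"

print(jaccard(reference, suggestion))  # whitespace is ignored by the tokenizer
print(jaccard(reference, unrelated))   # no shared 5-token runs
```

Because the tokenizer drops whitespace, reformatting alone does not hide a match. Renamed identifiers would lower the score, so this catches only fairly literal reuse.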
Enterprise Considerations
- Some companies ban Copilot for IP reasons
- Code review should scrutinize AI suggestions
- Consider indemnification (GitHub offers some)
What’s Coming
Likely Outcomes
- Settlement templates: Big AI companies may create licensing frameworks
- Opt-out registries: Ways for rights holders to exclude their work
- Compensation pools: Like music royalties, but for training data
- Disclosure requirements: AI-generated content must be labeled
What Won’t Happen
- Generative AI won’t be banned
- Large models won’t be unwound
- Innovation won’t stop
The genie is out.
Practical Advice
For Developers
1. Treat AI suggestions as starting points
2. Modify substantially before using
3. Run code-similarity checks on critical code
4. Follow company policies
For AI Users
1. Don't prompt for a specific artist's style by name
2. Modify outputs before commercial use
3. Disclose AI assistance when appropriate
4. Stay informed about legal developments
For Companies
1. Establish AI usage policies
2. Train employees on IP considerations
3. Consider indemnification clauses
4. Document AI tool usage
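Documenting AI tool usage does not need heavy tooling. As a minimal sketch, a team could append one JSON line per AI-assisted change to a log file. Everything in this example is an assumption for illustration: the `AIUsageRecord` fields, the `to_jsonl_line` helper, and the sample values are hypothetical, not an established schema.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class AIUsageRecord:
    """One AI-assisted change; field names are illustrative, not a standard."""
    file_path: str
    tool: str
    prompt_summary: str
    human_modified: bool
    reviewer: str
    timestamp: str = ""

    def __post_init__(self) -> None:
        # Fill in a UTC timestamp when the caller does not supply one.
        if not self.timestamp:
            self.timestamp = datetime.now(timezone.utc).isoformat()


def to_jsonl_line(record: AIUsageRecord) -> str:
    """Serialize a record as a single JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


record = AIUsageRecord(
    file_path="billing/invoice.py",
    tool="Copilot",
    prompt_summary="generate date-range validation helper",
    human_modified=True,
    reviewer="a.reviewer",
    timestamp="2024-01-15T10:00:00+00:00",
)
print(to_jsonl_line(record))
```

Appending these lines to a file kept alongside the repository gives reviewers and legal teams a simple record of where AI assistance was used and whether a human changed the result.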
Final Thoughts
The law is catching up to the technology. The outcome will shape AI development for years.
For now: Use AI tools thoughtfully. Understand the uncertainty. Don’t pretend AI output is solely your creation.
The lawsuits will be resolved. The ethical questions will remain.
The code may be generated. The responsibility is still yours.