Generative AI Lawyers: Copyright Implications


Generative AI is in the courtroom. Artists are suing over training data. Developers are questioning the code Copilot generates. The legal landscape is uncertain.

The Core Issue

AI models learn from data. That data includes:

The question: Is training on this data legal? Is the output infringing?

The Arguments

Against AI Training

Artist creates work → Work posted online
                              ↓
                    Scraped for training
                              ↓
                    AI can replicate style
                              ↓
                Artist loses commissions to AI

Arguments:

For AI Training

Arguments:

The Lawsuits

Getty Images vs. Stability AI

Claim: Stability AI scraped millions of Getty images without license.

Evidence: Outputs sometimes include garbled Getty watermarks.

Implications: Commercial image training may require licensing.

Artists vs. Stability/Midjourney

Class action by artists claiming:

GitHub Copilot Lawsuit

Programmers sued GitHub/Microsoft/OpenAI claiming:

The Code Problem

License Violations?

Copilot was trained on public GitHub repos. Some were GPL licensed:

GPL Code  →  Training  →  Copilot suggests similar code
                                    ↓
                          Is this GPL-derived?
                                    ↓
                     Must the using project be GPL?

Attribution

The MIT license requires that its copyright and permission notice stay with copies of the code:

Copyright (c) [year] [author]
Permission is hereby granted...

When Copilot suggests code, where’s the attribution?
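Part of what makes the attribution question hard is that license notices usually live in file headers, while a suggestion arrives as a few bare lines. The following is a minimal sketch of that gap, assuming a tiny, hypothetical set of marker patterns (`LICENSE_MARKERS` and `detect_license_markers` are illustrative names, not part of Copilot or any real license scanner):

```python
import re
from typing import List

# Hypothetical marker patterns for illustration only.
# Real license scanners rely on much larger rule sets.
LICENSE_MARKERS = {
    "MIT": re.compile(r"Permission is hereby granted, free of charge", re.IGNORECASE),
    "GPL": re.compile(r"GNU General Public License", re.IGNORECASE),
    "copyright-notice": re.compile(r"Copyright \(c\) \d{4}", re.IGNORECASE),
}


def detect_license_markers(text: str) -> List[str]:
    """Return the sorted names of all license markers found in the text."""
    return sorted(
        name for name, pattern in LICENSE_MARKERS.items() if pattern.search(text)
    )


# A full source file typically carries its notice in the header...
full_file = (
    "# Copyright (c) 2021 Example Author\n"
    "# Permission is hereby granted, free of charge, to any person...\n"
    "def add(a, b):\n"
    "    return a + b\n"
)

# ...but a suggested snippet contains only the code itself.
snippet = "def add(a, b):\n    return a + b\n"

print(detect_license_markers(full_file))
print(detect_license_markers(snippet))
```

The header yields markers and the bare snippet yields none, so any attribution has to be reconstructed by the person accepting the suggestion.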

Practical Reality

# Copilot might suggest this for "implement fizzbuzz"
def fizzbuzz(n):
    for i in range(1, n + 1):
        if i % 15 == 0:
            print("FizzBuzz")
        elif i % 3 == 0:
            print("Fizz")
        elif i % 5 == 0:
            print("Buzz")
        else:
            print(i)

Is this someone’s copyrighted code? Or so generic it’s uncopyrightable?

Fair Use (US)

Four factors:

  1. Purpose: Commercial or educational?
  2. Nature: Creative or factual work?
  3. Amount: How much was used?
  4. Effect: Market impact on original?

For AI training:

No clear answer yet.

EU Directive on Copyright in the Digital Single Market (2019):

More restrictive than US fair use.

Implications for Developers

Using AI Tools

Risk Level    Use Case
Lower         Boilerplate, common patterns
Medium        Library-specific code
Higher        Novel implementations
Highest       Code matching specific projects

Mitigation Strategies

# 1. Review AI suggestions
# Don't blindly accept

# 2. Check for distinctive patterns
# If it looks too specific, search for the source

# 3. Document your process
# Show you made modifications

# 4. Know your company policy
# Many have AI tool policies now
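Strategy 2 can be partly automated. The sketch below is a rough illustration rather than a real plagiarism checker: it compares a suggestion against reference snippets you already hold, using Python's standard `difflib`. The function names, the 0.9 threshold, and the sample file names are all assumptions made for this example, and the comment stripping is deliberately crude.

```python
import difflib
import re
from typing import Dict, List


def normalize(code: str) -> str:
    """Drop '#' comments and collapse whitespace so formatting does not hide a match.

    Crude on purpose: a '#' inside a string literal is also treated as a comment.
    """
    no_comments = re.sub(r"#.*", "", code)
    return " ".join(no_comments.split())


def similarity(suggestion: str, reference: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means identical after normalization."""
    matcher = difflib.SequenceMatcher(None, normalize(suggestion), normalize(reference))
    return matcher.ratio()


def flag_close_matches(
    suggestion: str, references: Dict[str, str], threshold: float = 0.9
) -> List[str]:
    """Return the names of reference snippets whose similarity meets the threshold."""
    return [
        name
        for name, text in references.items()
        if similarity(suggestion, text) >= threshold
    ]


# Hypothetical reference corpus: file name -> source text.
references = {
    "lib/math.py": "def add(a, b):\n    return a + b\n",
    "lib/io.py": "def read(path):\n    with open(path) as f:\n        return f.read()\n",
}
suggestion = "def add(a, b):\n    return a + b  # sum\n"

print(flag_close_matches(suggestion, references))
```

A high ratio is a prompt to go looking for the source and its license. It is not a legal conclusion, and a low ratio does not prove the code is original.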

Enterprise Considerations

What’s Coming

Likely Outcomes

  1. Settlement templates: Big AI companies may create licensing frameworks
  2. Opt-out registries: Ways for rights holders to exclude their work
  3. Compensation pools: Like music royalties, but for training data
  4. Disclosure requirements: AI-generated content must be labeled

What Won’t Happen

The genie is out of the bottle.

Practical Advice

For Developers

1. Treat AI suggestions as starting points
2. Modify substantially before using
3. Run plagiarism checks for critical code
4. Follow company policies

For AI Users

1. Don't prompt for a specific artist's style by name
2. Modify outputs before commercial use
3. Disclose AI assistance when appropriate
4. Stay informed about legal developments

For Companies

1. Establish AI usage policies
2. Train employees on IP considerations
3. Consider indemnification clauses
4. Document AI tool usage
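Documenting AI tool usage does not need heavy tooling. As one possible approach, the sketch below appends a JSON line per AI-assisted change to a log file. The record fields, function names, and log format are assumptions chosen for illustration, not an established standard or a requirement of any policy.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class AIUsageRecord:
    """One AI-assisted change. All fields are illustrative."""
    tool: str
    file_path: str
    description: str
    modified_by_human: bool
    timestamp: str


def make_record(
    tool: str, file_path: str, description: str, modified_by_human: bool
) -> AIUsageRecord:
    """Build a record stamped with the current UTC time in ISO 8601 format."""
    return AIUsageRecord(
        tool=tool,
        file_path=file_path,
        description=description,
        modified_by_human=modified_by_human,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )


def to_json_line(record: AIUsageRecord) -> str:
    """Serialize a record as a single JSON line."""
    return json.dumps(asdict(record))


def append_record(record: AIUsageRecord, log_path: str) -> None:
    """Append the record to a JSON Lines log file, creating it if needed."""
    with open(log_path, "a", encoding="utf-8") as log_file:
        log_file.write(to_json_line(record) + "\n")


record = make_record(
    tool="Copilot",
    file_path="src/utils.py",
    description="Suggested retry helper, rewritten to use our backoff policy",
    modified_by_human=True,
)
print(to_json_line(record))
```

Calling `append_record(record, "ai_usage.jsonl")` at commit time would give a simple audit trail showing which code was AI-assisted and whether a person modified it.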

Final Thoughts

The law is catching up to the technology. The outcome will shape AI development for years.

For now: Use AI tools thoughtfully. Understand the uncertainty. Don’t pretend AI output is solely your creation.

The lawsuits will be resolved. The ethical questions will remain.


The code may be generated. The responsibility is still yours.
