
Downloading Bulk Images: ThisPersonDoesNotExist with Python and urllib2

Articles in Web Scraping, Data Analysis | By August R. Garcia

Published Thu, 14 Mar 2019 06:25:36 -0700 | Last updated Thu, 14 Mar 2019 08:05:08 -0700

TFW no API


You probably remember the website thispersondoesnotexist.com, since it was posted on every content aggregator on the planet like two days ago.

TFW No ThisPersonDoesNotExist API

Search Result Autocomplete for ThisPersonDoesNotExist... API

Downloading Bulk Images/Faces with Python and urllib2 

import urllib2

import time 
from PIL import Image

from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

import os

import csv
import base64

# The directory to put the downloaded files into
d  = "./images/"

# Create the directory, if it doesn't already exist
if not os.path.exists(d):
    os.makedirs(d)


# The url that images are shown at
u  = "https://thispersondoesnotexist.com/image"

# The user agent. Replace this with something descriptive to your use case, probably.
ua = "Some bot: https://www.256kilobytes.com/content/show/4903/"

# The HTTP_REFERER metadata in the request header. Optionally replace with your own URL.
r  = "https://www.256kilobytes.com/content/show/4903/"

for x in range(0,99):
        print ""
        print "// ===== ===== File " + str(x) + " ===== ===== //"
        print "Downloading content from url: " + u

        req     = urllib2.Request(u)
        req.add_header('Referer'   , r)
        req.add_header('User-Agent', ua)
        
        resp    = urllib2.urlopen(req)
        content = resp.read()

        fn = d + 'fucking-image-' + str(x).zfill(5) + '.jpeg'
        print "Writing to " + fn + "..."
        # Write in binary mode and close the file before Pillow reopens it
        with open(fn, "wb") as f:
                f.write(content)


        # Scale the image and make a copy of it as a thumbnail
        # "base" rather than "file", which would shadow the Python 2 builtin
        base, ext = os.path.splitext(fn)
        im        = Image.open(fn)
        #im.thumbnail([128, 128])
        im.thumbnail([256, 256])
        thumb = base + ".thumbnail.jpeg"
        print "Resizing the file to " + thumb + "..."
        im.save(thumb, "JPEG")


        # Base64 Encode the Image's Thumbnail (Optional, to Put Into CSV)
        print "Base64 encoding the thumbnail..."
        with open(thumb, "rb") as image_file:
                b64 = "data:image/jpeg;base64," + base64.b64encode(image_file.read())


        # Put the image details into a cocksucking CSV
        print "Putting the file details into a cocksucking CSV..."
        csv_fn = "output-summary.csv"

        row = [x, fn, thumb, b64]
        # Python 2's csv module expects binary mode; "with" closes the file
        with open(csv_fn, 'ab') as csvFile:
                writer = csv.writer(csvFile)
                writer.writerow(row)


        # Wait five seconds between requests to not DDoS/make excessive requests to the server
        time.sleep(5)
        print ""
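
The listing above is Python 2 only: `urllib2` and the `print` statement are both gone in Python 3. As a rough sketch (not the author's code), the same request could probably be built with `urllib.request` along these lines. The helper names `build_request`, `fetch_image`, and `save_image`, and the `timeout` default, are assumptions made for illustration; the URL, user agent, and referer are the values from the script.

```python
import urllib.request

# Values taken from the script above
URL = "https://thispersondoesnotexist.com/image"
USER_AGENT = "Some bot: https://www.256kilobytes.com/content/show/4903/"
REFERER = "https://www.256kilobytes.com/content/show/4903/"


def build_request(url, user_agent, referer):
    """Return a Request carrying the same two headers the script sets."""
    req = urllib.request.Request(url)
    req.add_header("Referer", referer)
    req.add_header("User-Agent", user_agent)
    return req


def fetch_image(url=URL, user_agent=USER_AGENT, referer=REFERER, timeout=30):
    """Download one image and return its raw bytes (hypothetical helper)."""
    req = build_request(url, user_agent, referer)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def save_image(content, path):
    """Write the bytes in binary mode; the file is closed when the block ends."""
    with open(path, "wb") as f:
        f.write(content)
```

Calling `save_image(fetch_image(), path)` inside the loop, followed by the same `time.sleep(5)`, should reproduce the download step.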

The Result

Bulk Downloaded People Who Do Not Exist

And so on and so forth.
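
The other result is the summary CSV that the script writes alongside the images. Two details of that step change under Python 3: `base64.b64encode` returns `bytes`, which has to be decoded before it can be joined to the `data:image/jpeg;base64,` prefix, and the `csv` module expects a text-mode file opened with `newline=""`. A minimal sketch, with `to_data_uri` and `append_row` as hypothetical helper names:

```python
import base64
import csv


def to_data_uri(jpeg_bytes):
    """Encode raw JPEG bytes as a data: URI string."""
    encoded = base64.b64encode(jpeg_bytes).decode("ascii")
    return "data:image/jpeg;base64," + encoded


def append_row(csv_path, row):
    """Append one row to the summary CSV, creating the file if needed."""
    with open(csv_path, "a", newline="") as csv_file:
        csv.writer(csv_file).writerow(row)
```

A row would then look like `append_row("output-summary.csv", [x, fn, thumb, to_data_uri(thumb_bytes)])`, where `thumb_bytes` stands for the thumbnail file's contents read in binary mode.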

ThisPersonDoesNotExist Copyright

It's unclear exactly what license applies to the generated images, but here's what has been stated elsewhere on the Internet:

There's a generated sample set of images on the same site as the paper. I'm assuming the guy who set up the site is serving up those sample images.

Here's what the NVIDIA github repo has to say about the datasets:

"All material, excluding the Flickr-Faces-HQ dataset, is made available under Creative Commons BY-NC 4.0 license by NVIDIA Corporation. You can use, redistribute, and adapt the material for non-commercial purposes, as long as you give appropriate credit by citing our paper and indicating any changes that you've made."

Source:  https://tech.slashdot.org/story/19/02/14/199200/this-person-does-not-exist-website-uses-ai-to-create-realistic-yet-horrifying-faces

And some other related documents:

If anyone has a clearer/more definitive answer regarding copyright, feel free to post it in the comments.


Also, this is a crime against God:

A Crime Against God or Lesbian Rodeo Clown or Something



