256 Kilobytes

Web Scraping, Data Analysis Comments

Type: Comments (responses to top-level threads). Category: Web Scraping, Data Analysis (web scraping, crawling, and analyzing related data). Tag: All

Reply to Analyzing the Web: Downloading the Majestic Million, Setting up SQLite, Crawling the Web, and Generating Reports

Comments in Web Scraping, Data Analysis | By August R. Garcia

Published 3 weeks ago (Mon, 29 Apr 2019 09:10:47 -0700) | Last update 3 weeks ago (Mon, 29 Apr 2019 09:16:15 -0700)

The longest domain names in the Majestic Million, most of which are expired and basically all of which are terrible garbage: 255461 ...
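
For anyone who wants to reproduce a "longest domains" list, here is a minimal sketch using Python's built-in sqlite3 module. The table name `majestic_million` and the column names `global_rank` and `domain` are assumptions made for this example, not necessarily the schema used in the original thread, and the rows are placeholder data.

```python
import sqlite3


def longest_domains(conn, limit=10):
    """Return the `limit` longest domain names, longest first."""
    # Table and column names are hypothetical; adjust to your own schema.
    query = (
        "SELECT domain FROM majestic_million "
        "ORDER BY LENGTH(domain) DESC, domain ASC LIMIT ?"
    )
    return [row[0] for row in conn.execute(query, (limit,))]


# Placeholder data in an in-memory database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE majestic_million (global_rank INTEGER, domain TEXT)")
conn.executemany(
    "INSERT INTO majestic_million VALUES (?, ?)",
    [
        (1, "google.com"),
        (2, "facebook.com"),
        (3, "a-very-long-placeholder-domain-name.example"),
    ],
)
print(longest_domains(conn, limit=2))
```

Sorting inside SQLite avoids loading every row into Python, which should matter more as the table grows.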
🗨 2 | 🐏 2 | 👁 40
By Hash Brown

This is excellent!
🗨 2 | 🐏 2 | 👁 40
Reply to Downloading Bulk "ThisPersonDoesNotExist" Images with Python and urllib2

Comments in Web Scraping, Data Analysis | By August R. Garcia

Published 2 months ago (Thu, 14 Mar 2019 06:40:22 -0700) | Last update 2 months ago (Thu, 14 Mar 2019 06:41:03 -0700)

Also, this is a crime against God:
🗨 1 | 🐏 0 | 👁 31
Reply to How to replace NA with 0 in R?

Comments in Web Scraping, Data Analysis | By August R. Garcia

Published 4 months ago (Mon, 21 Jan 2019 14:28:21 -0800) | Last update 4 months ago (Mon, 21 Jan 2019 14:50:22 -0800)

some_data_with_nas[is.na(some_data_with_nas)] <- 0

Also, see this duplicate thread: https://www.256kilobytes.com/content/show/900/how-t...
🗨 1 | 🐏 0 | 👁 103
Reply to What is ScrapeBox used for?

Comments in Web Scraping, Data Analysis | By August R. Garcia

Published 4 months ago (Wed, 16 Jan 2019 15:43:47 -0800)

This is ScrapeBox: http://www.scrapebox.com/

It is a relatively popular tool used to gather data from the internet, as well as to do some...
🗨 1 | 🐏 0 | 👁 122
Reply to What is a regex non-capturing group?

Comments in Web Scraping, Data Analysis | By August R. Garcia

Published 4 months ago (Tue, 15 Jan 2019 14:58:59 -0800)

As you might expect, a non-capturing group is not captured in the match. For example, if you want to match phone numbers, you might require that a whi...
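
To make the difference concrete, here is a small sketch using Python's re module. The phone number format (three digits, three digits, four digits, separated by a dash, dot, or whitespace) and the names `CAPTURING`, `NON_CAPTURING`, and `phone_parts` are assumptions made for this example, since the comment above is cut off before its own pattern.

```python
import re

# Capturing version: the separators land in the match groups too.
CAPTURING = re.compile(r"(\d{3})(-|\.|\s)(\d{3})(-|\.|\s)(\d{4})")

# Non-capturing version: (?:...) still groups the alternatives,
# but the separators are left out of the captured groups.
NON_CAPTURING = re.compile(r"(\d{3})(?:-|\.|\s)(\d{3})(?:-|\.|\s)(\d{4})")


def phone_parts(text):
    """Return the three digit groups of the first phone number, or None."""
    match = NON_CAPTURING.search(text)
    return match.groups() if match else None


print(CAPTURING.search("555-867-5309").groups())
# ('555', '-', '867', '-', '5309')
print(phone_parts("Call 555-867-5309 today"))
# ('555', '867', '5309')
```

Both patterns match the same strings. The only difference is which pieces end up in `groups()`.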
🗨 1 | 🐏 0 | 👁 108
By August R. Garcia

Just tested this from a clean Ubuntu install and it worked fine. Run this from the terminal to install:

sudo apt-get install r-base

Then...
🗨 1 | 🐏 0 | 👁 89
By August R. Garcia

If you need one-off graphs that contain data that is not expected to change, it can be easiest to generate that data locally to image files and to the...
🗨 1 | 🐏 0 | 👁 93
By August R. Garcia

Very nice, as they say.

Other scraping methods

For those who don’t perhaps have the skills needed to code something there are also other...
🗨 1 | 🐏 1 | 👁 805
Reply to Scraping results from Google Search

Comments in Web Scraping, Data Analysis | By Hash Brown

Published 5 months ago (Mon, 03 Dec 2018 13:46:23 -0800)

If you're going to do this in any real volume, you're going to get IP blocked pretty quickly, even on Google Sheets. If I were you I would loo...
🗨 1 | 🐏 0 | 👁 99
Reply to I downloaded a bunch of stock photos for testing. Is there a quick way to bulk rename in Bash (or something similar)?

Comments in Web Scraping, Data Analysis | By August R. Garcia

Published 5 months ago (Sun, 02 Dec 2018 03:46:44 -0800) | Last update 5 months ago (Mon, 03 Dec 2018 06:33:20 -0800)

A command like this, run from the terminal or from a Bash script, works for this type of file renaming:

INDEX=1; for i in *.jpg; do mv "$i" "${INDEX}_cover.jpg";...
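
If Python is more convenient than Bash, the same kind of rename can be sketched with pathlib. This is not the original script. It assumes the counter starts at 1 and goes up by one per file (the snippet above is cut off before it shows that), and `bulk_rename` and its parameters are names made up for this example.

```python
import tempfile
from pathlib import Path


def bulk_rename(directory, suffix="_cover.jpg", pattern="*.jpg"):
    """Rename matching files to 1<suffix>, 2<suffix>, ... in sorted order."""
    files = sorted(Path(directory).glob(pattern))
    renamed = []
    for index, path in enumerate(files, start=1):
        target = path.with_name(f"{index}{suffix}")
        # Refuse to overwrite an existing file rather than lose data.
        if target.exists():
            raise FileExistsError(f"{target} already exists")
        path.rename(target)
        renamed.append(target.name)
    return renamed


# Demo in a throwaway directory with placeholder file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("beach.jpg", "city.jpg", "notes.txt"):
        (Path(tmp) / name).touch()
    print(bulk_rename(tmp))
    # ['1_cover.jpg', '2_cover.jpg']
```

Sorting the file list first keeps the numbering predictable between runs, and the existence check stops the run instead of silently clobbering a file.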
🗨 1 | 🐏 0 | 👁 1